Commit Graph

1440 Commits

Author SHA1 Message Date
Vladimir Oltean
bd954b8260 net: dsa: move tagging protocol code to tag.{c,h}
It would be nice if tagging protocol drivers could include just the
header they need, since they are (mostly) data path and isolated from
most of the other DSA core code does.

Create a tag.c and a tag.h file which are meant to support tagging
protocol drivers.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-22 20:41:50 -08:00
Vladimir Oltean
09f9234168 net: dsa: move headers exported by slave.c to slave.h
Minimize the use of the bloated dsa_priv.h by moving the prototypes
exported by slave.c to their own header file.

This is just approximate to get the code structure right. There are some
interdependencies with static inline code left in dsa_priv.h, so leave
slave.h included from there for now.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-22 20:41:49 -08:00
Vladimir Oltean
94ef6fad3b net: dsa: move headers exported by master.c to master.h
Minimize the use of the bloated dsa_priv.h by moving the prototypes
exported by master.c to their own header file.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-22 20:41:49 -08:00
Vladimir Oltean
022bba63c3 net: dsa: move headers exported by port.c to port.h
Minimize the use of the bloated dsa_priv.h by moving the prototypes
exported by port.c to their own header file.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-22 20:41:48 -08:00
Vladimir Oltean
7aea535d40 net: dsa: move rest of devlink setup/teardown to devlink.c
The code that needed further refactoring into dedicated functions in
dsa2.c was left aside. Move it now to devlink.c, and make dsa2.c stop
including net/devlink.h.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-22 20:41:47 -08:00
Vladimir Oltean
d95fa75061 net: dsa: if ds->setup is true, ds->devlink is always non-NULL
Simplify dsa_switch_teardown() to remove the NULL checking for
ds->devlink.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-22 20:41:47 -08:00
Vladimir Oltean
5cf2c75b5b net: dsa: move bulk of devlink code to devlink.{c,h}
dsa.c and dsa2.c are bloated with too much off-topic code. Identify all
code related to devlink and move it to a new devlink.c file.

Steer clear of the dsa_priv.h dumping ground antipattern and create a
dedicated devlink.h for it, which will be included only by the C files
which need it. Usage of dsa_priv.h will be minimized in later patches.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-22 20:41:46 -08:00
Vladimir Oltean
d2be320495 net: dsa: modularize DSA_TAG_PROTO_NONE
There is no reason that I can see why the no-op tagging protocol should
be registered manually, so make it a module and make all drivers which
have any sort of reference to DSA_TAG_PROTO_NONE select it.

Note that I don't know if ksz_get_tag_protocol() really needs this,
or if it's just the logic which is poorly written. All switches seem to
have their own tagging protocol, and DSA_TAG_PROTO_NONE is just a
fallback that never gets used.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-22 20:41:45 -08:00
Vladimir Oltean
c5fb8ead32 net: dsa: unexport dsa_dev_to_net_device()
dsa.o and dsa2.o are linked into the same dsa_core.o, there is no reason
to export this symbol when its only caller is local.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-22 20:41:45 -08:00
Felix Fietkau
d169ecb536 net: dsa: tag_mtk: assign per-port queues
Keeps traffic sent to the switch within link speed limits

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20221116080734.44013-6-nbd@nbd.name
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-18 19:46:06 -08:00
Vladimir Oltean
0184c07a11 net: dsa: autoload tag driver module on tagging protocol change
Issue a request_module() call when an attempt to change the tagging
protocol is made, either by sysfs or by device tree. In the case of
ocelot (the only driver for which the default and the alternative
tagging protocol are compiled as different modules), the user is now no
longer required to insert tag_ocelot_8021q.ko manually.

In the particular case of ocelot, this solves a problem where
tag_ocelot_8021q.ko is built as module, and this is present in the
device tree:

&mscc_felix_port4 {
	dsa-tag-protocol = "ocelot-8021q";
};

&mscc_felix_port5 {
	dsa-tag-protocol = "ocelot-8021q";
};

Because no one attempts to load the module into the kernel at boot time,
the switch driver will fail to probe (actually forever defer) until
someone manually inserts tag_ocelot_8021q.ko. This is now no longer
necessary and happens automatically.

Rename dsa_find_tagger_by_name() to denote the change in functionality:
there is now feature parity with dsa_tag_driver_get_by_id(), i.o.w. we
also load the module if it's missing.

Link: https://lore.kernel.org/lkml/20221027113248.420216-1-michael@walle.cc/
Suggested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Michael Walle <michael@walle.cc> # on kontron-sl28 w/ ocelot_8021q
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-17 21:16:42 -08:00
Vladimir Oltean
54c087e839 net: dsa: rename dsa_tag_driver_get() to dsa_tag_driver_get_by_id()
A future patch will introduce one more way of getting a reference on a
tagging protocl driver (by name). Rename the current method to "by_id".

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-17 21:16:42 -08:00
Vladimir Oltean
e8666130b9 net: dsa: strip sysfs "tagging" string of trailing newline
Currently, dsa_find_tagger_by_name() uses sysfs_streq() which works both
with strings that contain \n at the end (echo ocelot > .../dsa/tagging)
and with strings that don't (printf ocelot > .../dsa/tagging).

There will be a problem once we'll want to construct the modalias string
based on which we auto-load the protocol kernel module. If the sysfs
buffer ends in a newline, we need to strip it first. This is a
preparatory patch specifically for that.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-17 21:16:42 -08:00
Vladimir Oltean
94793a56b3 net: dsa: provide a second modalias to tag proto drivers based on their name
Currently, tagging protocol drivers have a modalias of
"dsa_tag:id-<number>", where the number is one of DSA_TAG_PROTO_*_VALUE.

This modalias makes it possible for the request_module() call in
dsa_tag_driver_get() to work, given the input it has - an integer
returned by ds->ops->get_tag_protocol().

It is also possible to change tagging protocols at (pseudo-)runtime, via
sysfs or via device tree, and this works via the name string of the
tagging protocol rather than via its id (DSA_TAG_PROTO_*_VALUE).

In the latter case, there is no request_module() call, because there is
no association that the DSA core has between the string name and the ID,
to construct the modalias. The module is simply assumed to have been
inserted. This is actually slightly problematic when the tagging
protocol change should take place at probe time, since it's expected
that the dependency module should get autoloaded.

For this purpose, let's introduce a second modalias, so that the DSA
core can call request_module() by name. There is no reason to make the
modalias by name optional, so just modify the MODULE_ALIAS_DSA_TAG_DRIVER()
macro to take both the ID and the name as arguments, and generate two
modaliases behind the scenes.

Suggested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Michael Walle <michael@walle.cc> # on kontron-sl28 w/ ocelot_8021q
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-17 21:16:41 -08:00
Vladimir Oltean
2610937d7e net: dsa: rename tagging protocol driver modalias
It's autumn cleanup time, and today's target are modaliases.

Michael says that for users of modinfo, "dsa_tag-20" is not the most
suggestive name, and recommends a change to "dsa_tag-id-20".

Andrew points out that other modaliases have a prefix delimited by
colons, so he recommends "dsa_tag:20" instead of "dsa_tag-20".

To satisfy both proposals, Florian recommends "dsa_tag:id-20".

The modaliases are not stable ABI, and the essential information
(protocol ID) is still conveyed in the new string, which
request_module() must be adapted to form.

Link: 20221027210830.3577793-1-vladimir.oltean@nxp.com
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Suggested-by: Michael Walle <michael@walle.cc>
Suggested-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-17 21:16:41 -08:00
Vladimir Oltean
9999f85ba3 net: dsa: stop exposing tag proto module helpers to the world
The DSA tagging protocol driver macros are in the public include/net/dsa.h
probably because that's also where the DSA_TAG_PROTO_*_VALUE macros are
(MODULE_ALIAS_DSA_TAG_DRIVER hinges on those macro definitions).

But there is no reason to expose these helpers to <net/dsa.h>. That
header is shared between switch drivers (drivers/net/dsa/), tagging
protocol drivers (net/dsa/tag_*.c), the DSA core (net/dsa/ sans tag_*.c),
and the rest of the world (DSA master drivers, network stack, etc).
Too much exposure.

On the other hand, net/dsa/dsa_priv.h is included only by the DSA core
and by DSA tagging protocol drivers (or IOW, "friend" modules). Also a
bit too much exposure - I've contemplated creating a new header which is
only included by tagging protocol drivers, but completely separating a
new dsa_tag_proto.h from dsa_priv.h is not immediately trivial - for
example dsa_slave_to_port() is used both from the fast path and from the
control path.

So for now, move these definitions to dsa_priv.h which at least hides
them from the world.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-17 21:16:40 -08:00
Rasmus Villemoes
b8790661d9 net: dsa: set name_assign_type to NET_NAME_ENUM for enumerated user ports
When a user port does not have a label in device tree, and we thus
fall back to the eth%d scheme, the proper constant to use is
NET_NAME_ENUM. See also commit e9f656b7a2 ("net: ethernet: set
default assignment identifier to NET_NAME_ENUM"), which in turn quoted
commit 685343fc3b ("net: add name_assign_type netdev attribute"):

    ... when the kernel has given the interface a name using global
    device enumeration based on order of discovery (ethX, wlanY, etc)
    ... are labelled NET_NAME_ENUM.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.faineli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-17 19:40:08 -08:00
Rasmus Villemoes
6fdb038420 net: dsa: use NET_NAME_PREDICTABLE for user ports with name given in DT
When a user port has a label in device tree, the corresponding
netdevice is, to quote include/uapi/linux/netdevice.h, "predictably
named by the kernel". This is also explicitly one of the intended use
cases for NET_NAME_PREDICTABLE, quoting 685343fc3b ("net: add
name_assign_type netdev attribute"):

  NET_NAME_PREDICTABLE:
    The ifname has been assigned by the kernel in a predictable way
    [...] Examples include [...] and names deduced from hardware
    properties (including being given explicitly by the firmware).

Expose that information properly for the benefit of userspace tools
that make decisions based on the name_assign_type attribute,
e.g. a systemd-udev rule with "kernel" in NamePolicy.

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.faineli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-17 19:40:08 -08:00
Rasmus Villemoes
0171a1d22b net: dsa: refactor name assignment for user ports
The following two patches each have a (small) chance of causing
regressions for userspace and will in that case of course need to be
reverted.

In order to prepare for that and make those two patches independent
and individually revertable, refactor the code which sets the names
for user ports by moving the "fall back to eth%d if no label is given
in device tree" to dsa_slave_create().

No functional change (at least none intended).

Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.faineli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-17 19:40:07 -08:00
Jakub Kicinski
224b744abf Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
include/linux/bpf.h
  1f6e04a1c7 ("bpf: Fix offset calculation error in __copy_map_value and zero_map_value")
  aa3496accc ("bpf: Refactor kptr_off_tab into btf_record")
  f71b2f6417 ("bpf: Refactor map->off_arr handling")
https://lore.kernel.org/all/20221114095000.67a73239@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-17 18:30:39 -08:00
Vladimir Oltean
4e0c19fcb8 net: dsa: don't leak tagger-owned storage on switch driver unbind
In the initial commit dc452a471d ("net: dsa: introduce tagger-owned
storage for private and shared data"), we had a call to
tag_ops->disconnect(dst) issued from dsa_tree_free(), which is called at
tree teardown time.

There were problems with connecting to a switch tree as a whole, so this
got reworked to connecting to individual switches within the tree. In
this process, tag_ops->disconnect(ds) was made to be called only from
switch.c (cross-chip notifiers emitted as a result of dynamic tag proto
changes), but the normal driver teardown code path wasn't replaced with
anything.

Solve this problem by adding a function that does the opposite of
dsa_switch_setup_tag_protocol(), which is called from the equivalent
spot in dsa_switch_teardown(). The positioning here also ensures that we
won't have any use-after-free in tagging protocol (*rcv) ops, since the
teardown sequence is as follows:

dsa_tree_teardown
-> dsa_tree_teardown_master
   -> dsa_master_teardown
      -> unsets master->dsa_ptr, making no further packets match the
         ETH_P_XDSA packet type handler
-> dsa_tree_teardown_ports
   -> dsa_port_teardown
      -> dsa_slave_destroy
         -> unregisters DSA net devices, there is even a synchronize_net()
            in unregister_netdevice_many()
-> dsa_tree_teardown_switches
   -> dsa_switch_teardown
      -> dsa_switch_teardown_tag_protocol
         -> finally frees the tagger-owned storage

Fixes: 7f2973149c ("net: dsa: make tagging protocols connect to individual switches from a tree")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Saeed Mahameed <saeed@kernel.org>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20221114143551.1906361-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-15 20:50:52 -08:00
Vladimir Oltean
53d04b9811 net: dsa: remove phylink_validate() method
As of now, no DSA driver uses a custom link mode validation procedure
anymore. So remove this DSA operation and let phylink determine what is
supported based on config->mac_capabilities (if provided by the driver).
Leave a comment why we left the code that we did, and that there is more
work to do.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-15 20:34:27 -08:00
Felix Fietkau
570d0a588d net: dsa: add support for DSA rx offloading via metadata dst
If a metadata dst is present with the type METADATA_HW_PORT_MUX on a dsa cpu
port netdev, assume that it carries the port number and that there is no DSA
tag present in the skb data.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-15 20:22:07 -08:00
Vladimir Oltean
ed1fe1bebe net: dsa: make dsa_master_ioctl() see through port_hwtstamp_get() shims
There are multi-generational drivers like mv88e6xxx which have code like
this:

int mv88e6xxx_port_hwtstamp_get(struct dsa_switch *ds, int port,
				struct ifreq *ifr)
{
	if (!chip->info->ptp_support)
		return -EOPNOTSUPP;

	...
}

DSA wants to deny PTP timestamping on the master if the switch supports
timestamping too. However it currently relies on the presence of the
port_hwtstamp_get() callback to determine PTP capability, and this
clearly does not work in that case (method is present but returns
-EOPNOTSUPP).

We should not deny PTP on the DSA master for those switches which truly
do not support hardware timestamping.

Create a dsa_port_supports_hwtstamp() method which actually probes for
support by calling port_hwtstamp_get() and seeing whether that returned
-EOPNOTSUPP or not.

Fixes: f685e609a3 ("net: dsa: Deny PTP on master if switch supports it")
Link: https://patchwork.kernel.org/project/netdevbpf/patch/20221110124345.3901389-1-festevam@gmail.com/
Reported-by: Fabio Estevam <festevam@gmail.com>
Reported-by: Steffen Bätz <steffen@innosonix.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-14 11:30:49 +00:00
Jiri Pirko
77df1db80d net: remove unused ndo_get_devlink_port
Remove ndo_get_devlink_port which is no longer used alongside with the
implementations in drivers.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03 20:48:36 -07:00
Jiri Pirko
ac73d4bf2c net: make drivers to use SET_NETDEV_DEVLINK_PORT to set devlink_port
Benefit from the previously implemented tracking of netdev events in
devlink code and instead of calling  devlink_port_type_eth_set() and
devlink_port_type_clear() to set devlink port type and link to related
netdev, use SET_NETDEV_DEVLINK_PORT() macro to assign devlink_port
pointer to netdevice which is about to be registered.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03 20:48:34 -07:00
Jakub Kicinski
fbeb229a66 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03 13:21:54 -07:00
Vladimir Oltean
a2c65a9d05 net: dsa: fall back to default tagger if we can't load the one from DT
DSA tagging protocol drivers can be changed at runtime through sysfs and
at probe time through the device tree (support for the latter was added
later).

When changing through sysfs, it is assumed that the module for the new
tagging protocol was already loaded into the kernel (in fact this is
only a concern for Ocelot/Felix switches, where we have tag_ocelot.ko
and tag_ocelot_8021q.ko; for every other switch, the default and
alternative protocols are compiled within the same .ko, so there is
nothing for the user to load).

The kernel cannot currently call request_module(), because it has no way
of constructing the modalias name of the tagging protocol driver
("dsa_tag-%d", where the number is one of DSA_TAG_PROTO_*_VALUE).
The device tree only contains the string name of the tagging protocol
("ocelot-8021q"), and the only mapping between the string and the
DSA_TAG_PROTO_OCELOT_8021Q_VALUE is present in tag_ocelot_8021q.ko.
So this is a chicken-and-egg situation and dsa_core.ko has nothing based
on which it can automatically request the insertion of the module.

As a consequence, if CONFIG_NET_DSA_TAG_OCELOT_8021Q is built as module,
the switch will forever defer probing.

The long-term solution is to make DSA call request_module() somehow,
but that probably needs some refactoring.

What we can do to keep operating with existing device tree blobs is to
cancel the attempt to change the tagging protocol with the one specified
there, and to remain operating with the default one. Depending on the
situation, the default protocol might still allow some functionality
(in the case of ocelot, it does), and it's better to have that than to
fail to probe.

Fixes: deff710703 ("net: dsa: Allow default tag protocol to be overridden from DT")
Link: https://lore.kernel.org/lkml/20221027113248.420216-1-michael@walle.cc/
Reported-by: Heiko Thiery <heiko.thiery@gmail.com>
Reported-by: Michael Walle <michael@walle.cc>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Michael Walle <michael@walle.cc>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20221027145439.3086017-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28 21:58:30 -07:00
Thomas Gleixner
d120d1a63b net: Remove the obsolte u64_stats_fetch_*_irq() users (net).
Now that the 32bit UP oddity is gone and 32bit uses always a sequence
count, there is no need for the fetch_irq() variants anymore.

Convert to the regular interface.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-28 20:13:54 -07:00
Dan Carpenter
0a6d58a70a net: dsa: uninitialized variable in dsa_slave_netdevice_event()
Return zero if both dsa_slave_dev_check() and netdev_uses_dsa() are false.

Fixes: acc43b7bf5 ("net: dsa: allow masters to join a LAG")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-15 11:15:27 +01:00
Yang Yingliang
557f050166 net: dsa: fix wrong pointer passed to PTR_ERR() in dsa_port_phylink_create()
Fix wrong pointer passed to PTR_ERR() in dsa_port_phylink_create() to print
error message.

Fixes: cf5ca4ddc3 ("net: dsa: don't leave dangling pointers in dp->pl when failing")
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-09 20:01:32 +01:00
Vladimir Oltean
61e4a51621 net: dsa: remove bool devlink_port_setup
Since dsa_port_devlink_setup() and dsa_port_devlink_teardown() are
already called from code paths which only execute once per port (due to
the existing bool dp->setup), keeping another dp->devlink_port_setup is
redundant, because we can already manage to balance the calls properly
(and not call teardown when setup was never called, or call setup twice,
or things like that).

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 18:17:17 -07:00
Jiri Pirko
c698a5fbf7 net: dsa: don't do devlink port setup early
Commit 3122433eb5 ("net: dsa: Register devlink ports before calling DSA driver setup()")
moved devlink port setup to be done early before driver setup()
was called. That is no longer needed, so move the devlink port
initialization back to dsa_port_setup(), as the first thing done there.

Note there is no longer needed to reinit port as unused if
dsa_port_setup() fails, as it unregisters the devlink port instance on
the error path.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 18:17:16 -07:00
Vladimir Oltean
cf5ca4ddc3 net: dsa: don't leave dangling pointers in dp->pl when failing
There is a desire to simplify the dsa_port registration path with
devlink, and this involves reworking a bit how user ports which fail to
connect to their PHY (because it's missing) get reinitialized as UNUSED
devlink ports.

The desire is for the change to look something like this; basically
dsa_port_setup() has failed, we just change dp->type and call
dsa_port_setup() again.

-/* Destroy the current devlink port, and create a new one which has the UNUSED
- * flavour.
- */
-static int dsa_port_reinit_as_unused(struct dsa_port *dp)
+static int dsa_port_setup_as_unused(struct dsa_port *dp)
 {
-	dsa_port_devlink_teardown(dp);
 	dp->type = DSA_PORT_TYPE_UNUSED;
-	return dsa_port_devlink_setup(dp);
+	return dsa_port_setup(dp);
 }

For an UNUSED port, dsa_port_setup() mostly only calls dsa_port_devlink_setup()
anyway, so we could get away with calling just that. But if we call the
full blown dsa_port_setup(dp) (which will be needed to properly set
dp->setup = true), the callee will have the tendency to go through this
code block too, and call dsa_port_disable(dp):

	switch (dp->type) {
	case DSA_PORT_TYPE_UNUSED:
		dsa_port_disable(dp);
		break;

That is not very good, because dsa_port_disable() has this hidden inside
of it:

	if (dp->pl)
		phylink_stop(dp->pl);

Fact is, we are not prepared to handle a call to dsa_port_disable() with
a struct dsa_port that came from a previous (and failed) call to
dsa_port_setup(). We do not clean up dp->pl, and this will make the
second call to dsa_port_setup() call phylink_stop() on a dangling dp->pl
pointer.

Solve this by creating an API for phylink destruction which is symmetric
to the phylink creation, and never leave dp->pl set to anything except
NULL or a valid phylink structure.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 18:17:16 -07:00
Jiri Pirko
d82acd85cc net: dsa: move port_setup/teardown to be called outside devlink port registered area
Move port_setup() op to be called before devlink_port_register() and
port_teardown() after devlink_port_unregister().

Note it makes sense to move this alongside the rest of the devlink port
code, the reinit() function also gets much nicer, as clearly the fact that
port_setup()->devlink_port_region_create() was called in dsa_port_setup
did not fit the flow.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-30 18:17:16 -07:00
Vladimir Oltean
56378f3ccb net: dsa: make user ports return to init_net on netns deletion
As pointed out during review, currently the following set of commands
crashes the kernel:

$ ip netns add ns0
$ ip link set swp0 netns ns0
$ ip netns del ns0
WARNING: CPU: 1 PID: 27 at net/core/dev.c:10884 unregister_netdevice_many+0xaa4/0xaec
Workqueue: netns cleanup_net
pstate: 20000005 (nzCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : unregister_netdevice_many+0xaa4/0xaec
lr : unregister_netdevice_many+0x700/0xaec
Call trace:
 unregister_netdevice_many+0xaa4/0xaec
 default_device_exit_batch+0x294/0x340
 ops_exit_list+0xac/0xc4
 cleanup_net+0x2e4/0x544
 process_one_work+0x4ec/0xb40
---[ end trace 0000000000000000 ]---
unregister_netdevice: waiting for swp0 to become free. Usage count = 2

This is because since DSA user ports, since they started populating
dev->rtnl_link_ops in the blamed commit, gained a different treatment
from default_device_exit_net(), which thinks these interfaces can now be
unregistered.

They can't; so set netns_refund = true to restore the behavior prior to
populating dev->rtnl_link_ops.

Fixes: 95f510d0b7 ("net: dsa: allow the DSA master to be seen and changed through rtnetlink")
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220921185428.1767001-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-26 11:28:57 -07:00
Vladimir Oltean
acc43b7bf5 net: dsa: allow masters to join a LAG
There are 2 ways in which a DSA user port may become handled by 2 CPU
ports in a LAG:

(1) its current DSA master joins a LAG

 ip link del bond0 && ip link add bond0 type bond mode 802.3ad
 ip link set eno2 master bond0

When this happens, all user ports with "eno2" as DSA master get
automatically migrated to "bond0" as DSA master.

(2) it is explicitly configured as such by the user

 # Before, the DSA master was eno3
 ip link set swp0 type dsa master bond0

The design of this configuration is that the LAG device dynamically
becomes a DSA master through dsa_master_setup() when the first physical
DSA master becomes a LAG slave, and stops being so through
dsa_master_teardown() when the last physical DSA master leaves.

A LAG interface is considered as a valid DSA master only if it contains
existing DSA masters, and no other lower interfaces. Therefore, we
mainly rely on method (1) to enter this configuration.

Each physical DSA master (LAG slave) retains its dev->dsa_ptr for when
it becomes a standalone DSA master again. But the LAG master also has a
dev->dsa_ptr, and this is actually duplicated from one of the physical
LAG slaves, and therefore needs to be balanced when LAG slaves come and
go.

To the switch driver, putting DSA masters in a LAG is seen as putting
their associated CPU ports in a LAG.

We need to prepare cross-chip host FDB notifiers for CPU ports in a LAG,
by calling the driver's ->lag_fdb_add method rather than ->port_fdb_add.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-20 10:32:36 +02:00
Vladimir Oltean
2e359b00a1 net: dsa: propagate extack to port_lag_join
Drivers could refuse to offload a LAG configuration for a variety of
reasons, mainly having to do with its TX type. Additionally, since DSA
masters may now also be LAG interfaces, and this will translate into a
call to port_lag_join on the CPU ports, there may be extra restrictions
there. Propagate the netlink extack to this DSA method in order for
drivers to give a meaningful error message back to the user.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-20 10:32:36 +02:00
Vladimir Oltean
13eccc1bbb net: dsa: suppress device links to LAG DSA masters
These don't work (print a harmless error about the operation failing)
and make little sense to have anyway, because when a LAG DSA master goes
away, we will introduce logic to move our CPU port back to the first
physical DSA master. So suppress these device links in preparation for
adding support for LAG DSA masters.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-20 10:32:36 +02:00
Vladimir Oltean
cfeb84a52f net: dsa: suppress appending ethtool stats to LAG DSA masters
Similar to the discussion about tracking the admin/oper state of LAG DSA
masters, we have the problem here that struct dsa_port *cpu_dp caches a
single pair of orig_ethtool_ops and netdev_ops pointers.

So if we call dsa_master_setup(bond0, cpu_dp) where cpu_dp is also the
dev->dsa_ptr of one of the physical DSA masters, we'd effectively
overwrite what we cached from that physical netdev with what replaced
from the bonding interface.

We don't need DSA ethtool stats on the bonding interface when used as
DSA master, it's good enough to have them just on the physical DSA
masters, so suppress this logic.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-20 10:32:36 +02:00
Vladimir Oltean
6e61b55c6d net: dsa: don't keep track of admin/oper state on LAG DSA masters
We store information about the DSA master's state in
cpu_dp->master_admin_up and cpu_dp->master_oper_up, and this assumes a
bijective association between a CPU port and a DSA master.

However, when we have CPU ports in a LAG (and DSA masters in a LAG too),
the way in which we set up things is that the physical DSA masters still
have dev->dsa_ptr pointing to our cpu_dp, but the bonding/team device
itself also has its dev->dsa_ptr pointing towards one of the CPU port
structures (the first one).

So logically speaking, that first cpu_dp can't keep track of both the
physical master's admin/oper state, and of the bonding master's state.

This isn't even needed; the reason why we keep track of the DSA master's
state is to know when it is available for Ethernet-based register access.
For that use case, we don't even need LAG; we just need to decide upon
one of the physical DSA masters (if there is more than 1 available) and
use that.

This change suppresses dsa_tree_master_{admin,oper}_state_change() calls
on LAG DSA masters (which will be supported in a future change), to
allow the tracking of just physical DSA masters.

Link: https://lore.kernel.org/netdev/628cc94d.1c69fb81.15b0d.422d@mx.google.com/
Suggested-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-20 10:32:35 +02:00
Vladimir Oltean
95f510d0b7 net: dsa: allow the DSA master to be seen and changed through rtnetlink
Some DSA switches have multiple CPU ports, which can be used to improve
CPU termination throughput, but DSA, through dsa_tree_setup_cpu_ports(),
sets up only the first one, leading to suboptimal use of hardware.

The desire is to not change the default configuration but to permit the
user to create a dynamic mapping between individual user ports and the
CPU port that they are served by, configurable through rtnetlink. It is
also intended to permit load balancing between CPU ports, and in that
case, the foreseen model is for the DSA master to be a bonding interface
whose lowers are the physical DSA masters.

To that end, we create a struct rtnl_link_ops for DSA user ports with
the "dsa" kind. We expose the IFLA_DSA_MASTER link attribute that
contains the ifindex of the newly desired DSA master.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-20 10:32:35 +02:00
Vladimir Oltean
8f6a19c031 net: dsa: introduce dsa_port_get_master()
There is a desire to support for DSA masters in a LAG.

That configuration is intended to work by simply enslaving the master to
a bonding/team device. But the physical DSA master (the LAG slave) still
has a dev->dsa_ptr, and that cpu_dp still corresponds to the physical
CPU port.

However, we would like to be able to retrieve the LAG that's the upper
of the physical DSA master. In preparation for that, introduce a helper
called dsa_port_get_master() that replaces all occurrences of the
dp->cpu_dp->master pattern. The distinction between LAG and non-LAG will
be made later within the helper itself.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-20 10:32:35 +02:00
Jakub Kicinski
60ad1100d5 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
tools/testing/selftests/net/.gitignore
  sort the net-next version and use it

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-01 12:58:02 -07:00
Kurt Kanzenbach
52267ce25f net: dsa: hellcreek: Print warning only once
In case the source port cannot be decoded, print the warning only once. This
still brings attention to the user and does not spam the logs at the same time.

Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20220830163448.8921-1-kurt@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31 19:54:04 -07:00
Jakub Kicinski
880b0dd94f Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/ethernet/mellanox/mlx5/core/en_fs.c
  21234e3a84 ("net/mlx5e: Fix use after free in mlx5e_fs_init()")
  c7eafc5ed0 ("net/mlx5e: Convert ethtool_steering member of flow_steering struct to pointer")
https://lore.kernel.org/all/20220825104410.67d4709c@canb.auug.org.au/
https://lore.kernel.org/all/20220823055533.334471-1-saeed@kernel.org/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-25 16:07:42 -07:00
Vladimir Oltean
855a28f9c9 net: dsa: don't dereference NULL extack in dsa_slave_changeupper()
When a driver returns -EOPNOTSUPP in dsa_port_bridge_join() but failed
to provide a reason for it, DSA attempts to set the extack to say that
software fallback will kick in.

The problem is, when we use brctl and the legacy bridge ioctls, the
extack will be NULL, and DSA dereferences it in the process of setting
it.

Sergei Antonov proves this using the following stack trace:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
PC is at dsa_slave_changeupper+0x5c/0x158

 dsa_slave_changeupper from raw_notifier_call_chain+0x38/0x6c
 raw_notifier_call_chain from __netdev_upper_dev_link+0x198/0x3b4
 __netdev_upper_dev_link from netdev_master_upper_dev_link+0x50/0x78
 netdev_master_upper_dev_link from br_add_if+0x430/0x7f4
 br_add_if from br_ioctl_stub+0x170/0x530
 br_ioctl_stub from br_ioctl_call+0x54/0x7c
 br_ioctl_call from dev_ifsioc+0x4e0/0x6bc
 dev_ifsioc from dev_ioctl+0x2f8/0x758
 dev_ioctl from sock_ioctl+0x5f0/0x674
 sock_ioctl from sys_ioctl+0x518/0xe40
 sys_ioctl from ret_fast_syscall+0x0/0x1c

Fix the problem by only overriding the extack if non-NULL.

Fixes: 1c6e8088d9 ("net: dsa: allow port_bridge_join() to override extack message")
Link: https://lore.kernel.org/netdev/CABikg9wx7vB5eRDAYtvAm7fprJ09Ta27a4ZazC=NX5K4wn6pWA@mail.gmail.com/
Reported-by: Sergei Antonov <saproj@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Tested-by: Sergei Antonov <saproj@gmail.com>
Link: https://lore.kernel.org/r/20220819173925.3581871-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-23 07:54:16 -07:00
Vladimir Oltean
5dc760d120 net: dsa: use dsa_tree_for_each_cpu_port in dsa_tree_{setup,teardown}_master
More logic will be added to dsa_tree_setup_master() and
dsa_tree_teardown_master() in upcoming changes.

Reduce the indentation by one level in these functions by introducing
and using a dedicated iterator for CPU ports of a tree.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-23 11:39:22 +02:00
Vladimir Oltean
f41ec1fd1c net: dsa: all DSA masters must be down when changing the tagging protocol
The fact that the tagging protocol is set and queried from the
/sys/class/net/<dsa-master>/dsa/tagging file is a bit of a quirk from
the single CPU port days which isn't aging very well now that DSA can
have more than a single CPU port. This is because the tagging protocol
is a switch property, yet in the presence of multiple CPU ports it can
be queried and set from multiple sysfs files, all of which are handled
by the same implementation.

The current logic ensures that the net device whose sysfs file we're
changing the tagging protocol through must be down. That net device is
the DSA master, and this is fine for single DSA master / CPU port setups.

But exactly because the tagging protocol is per switch [ tree, in fact ]
and not per DSA master, this isn't fine any longer with multiple CPU
ports, and we must iterate through the tree and find all DSA masters,
and make sure that all of them are down.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-23 11:39:22 +02:00
Vladimir Oltean
7136097e11 net: dsa: only bring down user ports assigned to a given DSA master
This is an adaptation of commit c0a8a9c274 ("net: dsa: automatically
bring user ports down when master goes down") for multiple DSA masters.
When a DSA master goes down, only the user ports under its control
should go down too, the others can still send/receive traffic.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-23 11:39:22 +02:00
Vladimir Oltean
4f03dcc6b9 net: dsa: existing DSA masters cannot join upper interfaces
All the traffic to/from a DSA master is supposed to be distributed among
its DSA switch upper interfaces, so we should not allow other upper
device kinds.

An exception to this is DSA_TAG_PROTO_NONE (switches with no DSA tags),
and in that case it is actually expected to create e.g. VLAN interfaces
on the master. But for those, netdev_uses_dsa(master) returns false, so
the restriction doesn't apply.

The motivation for this change is to allow LAG interfaces of DSA masters
to be DSA masters themselves. We want to restrict the user's degrees of
freedom by 1: the LAG should already have all DSA masters as lowers, and
while lower ports of the LAG can be removed, none can be added after the
fact.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-23 11:39:22 +02:00
Vladimir Oltean
920a33cd72 net: bridge: move DSA master bridging restriction to DSA
When DSA gains support for multiple CPU ports in a LAG, it will become
mandatory to monitor the changeupper events for the DSA master.

In fact, there are already some restrictions to be imposed in that area,
namely that a DSA master cannot be a bridge port except in some special
circumstances.

Centralize the restrictions at the level of the DSA layer as a
preliminary step.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-23 11:39:22 +02:00
Vladimir Oltean
0498277ee1 net: dsa: don't stop at NOTIFY_OK when calling ds->ops->port_prechangeupper
dsa_slave_prechangeupper_sanity_check() is supposed to enforce some
adjacency restrictions, and calls ds->ops->port_prechangeupper if the
driver implements it.

We convert the error code from the port_prechangeupper() call to a
notifier code, and 0 is converted to NOTIFY_OK, but the caller of
dsa_slave_prechangeupper_sanity_check() stops at any notifier code
different from NOTIFY_DONE.

Avoid this by converting back the notifier code to an error code, so
that both NOTIFY_OK and NOTIFY_DONE will be seen as 0. This allows more
parallel sanity check functions to be added.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-23 11:39:22 +02:00
Vladimir Oltean
4c3f80d22b net: dsa: walk through all changeupper notifier functions
Traditionally, DSA has had a single netdev notifier handling function
for each device type.

For the sake of code cleanliness, we would like to introduce more
handling functions which do one thing, but the conditions for entering
these functions start to overlap. Example: a handling function which
tracks whether any bridges contain both DSA and non-DSA interfaces.
Either this is placed before dsa_slave_changeupper(), case in which it
will prevent that function from executing, or we place it after
dsa_slave_changeupper(), case in which we will prevent it from
executing. The other alternative is to ignore errors from the new
handling function (not ideal).

To support this usage, we need to change the pattern. In the new model,
we enter all notifier handling sub-functions, and exit with NOTIFY_DONE
if there is nothing to do. This allows the sub-functions to be
relatively free-form and independent from each other.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-08-23 11:39:22 +02:00
Vladimir Oltean
b18e04e362 net: dsa: tag_8021q: remove old comment regarding dsa_8021q_netdev_ops
Since commit 129bd7ca8a ("net: dsa: Prevent usage of NET_DSA_TAG_8021Q
as tagging protocol"), dsa_8021q_netdev_ops no longer exists, so remove
the comment that talks about it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220818143808.2808393-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-22 18:11:58 -07:00
Wolfram Sang
e4d44b3d27 dsa: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220818210216.8419-1-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-22 18:06:35 -07:00
Vladimir Oltean
e09e987315 net: dsa: make phylink-related OF properties mandatory on DSA and CPU ports
Early DSA drivers were kind of simplistic in that they assumed a fairly
narrow hardware layout. User ports would have integrated PHYs at an
internal MDIO address that is derivable from the port number, and shared
(DSA and CPU) ports would have an MII-style (serial or parallel)
connection to another MAC. Phylib and then phylink were used to drive
the internal PHYs, and this needed little to no description through the
platform data structures. Bringing up the shared ports at the maximum
supported link speed was the responsibility of the drivers.

As a result of this, when these early drivers were converted from
platform data to the new DSA OF bindings, there was no link information
translated into the first DT bindings.

https://lore.kernel.org/all/YtXFtTsf++AeDm1l@lunn.ch/

Later, phylink was adopted for shared ports as well, and today we have a
workaround in place, introduced by commit a20f997010 ("net: dsa: Don't
instantiate phylink for CPU/DSA ports unless needed"). There, DSA checks
for the presence of phy-handle/fixed-link/managed OF properties, and if
missing, phylink registration would be skipped. This is because phylink
is optional for some drivers (the shared ports already work without it),
but the process of starting to register a port with phylink is
irreversible: if phylink_create() fails to find the fwnode properties it
needs, it bails out and it leaves the ports inoperational (because
phylink expects ports to be initially down, so DSA necessarily takes
them down, and doesn't know how to put them back up again).

DSA being a common framework, new drivers opt into this workaround
willy-nilly, but the ideal behavior from the DSA core's side would have
been to not interfere with phylink's process of failing at all. This
isn't possible because of regression concerns with pre-phylink DT blobs,
but at least DSA should put a stop to the proliferation of more of such
cases that rely on the workaround to skip phylink registration, and
sanitize the environment that new drivers work in.

To that end, create a list of compatible strings for which the
workaround is preserved, and don't apply the workaround for any drivers
outside that list (this includes new drivers).

In some cases, we make the assumption that even existing drivers don't
rely on DSA's workaround, and we do this by looking at the device trees
in which they appear. We can't fully know what is the situation with
downstream DT blobs, but we can guess the overall trend by studying the
DT blobs that were submitted upstream. If there are upstream blobs that
have lacking descriptions, we take it as very likely that there are many
more downstream blobs that do so too. If all upstream blobs have
complete descriptions, we take that as a hint that the driver is a
candidate for enforcing strict DT bindings (considering that most
bindings are copy-pasted). If there are no upstream DT blobs, we take
the conservative route of allowing the workaround, unless the driver
maintainer instructs us otherwise.

The driver situation is as follows:

ar9331
~~~~~~

    compatible strings:
    - qca,ar9331-switch

    1 occurrence in mainline device trees, part of SoC dtsi
    (arch/mips/boot/dts/qca/ar9331.dtsi), description is not problematic.

    Verdict: opt into strict DT bindings and out of workarounds.

b53
~~~

    compatible strings:
    - brcm,bcm5325
    - brcm,bcm53115
    - brcm,bcm53125
    - brcm,bcm53128
    - brcm,bcm5365
    - brcm,bcm5389
    - brcm,bcm5395
    - brcm,bcm5397
    - brcm,bcm5398

    - brcm,bcm53010-srab
    - brcm,bcm53011-srab
    - brcm,bcm53012-srab
    - brcm,bcm53018-srab
    - brcm,bcm53019-srab
    - brcm,bcm5301x-srab
    - brcm,bcm11360-srab
    - brcm,bcm58522-srab
    - brcm,bcm58525-srab
    - brcm,bcm58535-srab
    - brcm,bcm58622-srab
    - brcm,bcm58623-srab
    - brcm,bcm58625-srab
    - brcm,bcm88312-srab
    - brcm,cygnus-srab
    - brcm,nsp-srab
    - brcm,omega-srab

    - brcm,bcm3384-switch
    - brcm,bcm6328-switch
    - brcm,bcm6368-switch
    - brcm,bcm63xx-switch

    I've found at least these mainline DT blobs with problems:

    arch/arm/boot/dts/bcm47094-linksys-panamera.dts
    - lacks phy-mode
    arch/arm/boot/dts/bcm47189-tenda-ac9.dts
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/bcm47081-luxul-xap-1410.dts
    arch/arm/boot/dts/bcm47081-luxul-xwr-1200.dts
    arch/arm/boot/dts/bcm47081-buffalo-wzr-600dhp2.dts
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/bcm47094-luxul-xbr-4500.dts
    arch/arm/boot/dts/bcm4708-smartrg-sr400ac.dts
    arch/arm/boot/dts/bcm4708-luxul-xap-1510.dts
    arch/arm/boot/dts/bcm953012er.dts
    arch/arm/boot/dts/bcm4708-netgear-r6250.dts
    arch/arm/boot/dts/bcm4708-buffalo-wzr-1166dhp-common.dtsi
    arch/arm/boot/dts/bcm4708-luxul-xwc-1000.dts
    arch/arm/boot/dts/bcm47094-luxul-abr-4500.dts
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/bcm53016-meraki-mr32.dts
    - lacks phy-mode

    Verdict: opt into DSA workarounds.

bcm_sf2
~~~~~~~

    compatible strings:
    - brcm,bcm4908-switch
    - brcm,bcm7445-switch-v4.0
    - brcm,bcm7278-switch-v4.0
    - brcm,bcm7278-switch-v4.8

    A single occurrence in mainline
    (arch/arm64/boot/dts/broadcom/bcm4908/bcm4908.dtsi), part of a SoC
    dtsi, valid description. Florian Fainelli explains that most of the
    bcm_sf2 device trees lack a full description for the internal IMP
    ports.

    Verdict: opt the BCM4908 into strict DT bindings, and opt the rest
    into the workarounds. Note that even though BCM4908 has strict DT
    bindings, it still does not register with phylink on the IMP port
    due to it implementing ->adjust_link().

hellcreek
~~~~~~~~~

    compatible strings:
    - hirschmann,hellcreek-de1soc-r1

    No occurrence in mainline device trees. Kurt Kanzenbach explains
    that the downstream device trees lacked phy-mode and fixed link, and
    needed work, but were fixed in the meantime.

    Verdict: opt into strict DT bindings and out of workarounds.

lan9303
~~~~~~~

    compatible strings:
    - smsc,lan9303-mdio
    - smsc,lan9303-i2c

    1 occurrence in mainline device trees:
    arch/arm/boot/dts/imx53-kp-hsc.dts
    - no phy-mode, no fixed-link

    Verdict: opt out of strict DT bindings and into workarounds.

lantiq_gswip
~~~~~~~~~~~~

    compatible strings:
    - lantiq,xrx200-gswip
    - lantiq,xrx300-gswip
    - lantiq,xrx330-gswip

    No occurrences in mainline device trees. Martin Blumenstingl
    confirms that the downstream OpenWrt device trees lack a proper
    fixed-link and need work, and that the incomplete description can
    even be seen in the example from
    Documentation/devicetree/bindings/net/dsa/lantiq-gswip.txt.

    Verdict: opt out of strict DT bindings and into workarounds.

microchip ksz
~~~~~~~~~~~~~

    compatible strings:
    - microchip,ksz8765
    - microchip,ksz8794
    - microchip,ksz8795
    - microchip,ksz8863
    - microchip,ksz8873
    - microchip,ksz9477
    - microchip,ksz9897
    - microchip,ksz9893
    - microchip,ksz9563
    - microchip,ksz8563
    - microchip,ksz9567
    - microchip,lan9370
    - microchip,lan9371
    - microchip,lan9372
    - microchip,lan9373
    - microchip,lan9374

    5 occurrences in mainline device trees, all descriptions are valid.
    But we had a snafu for the ksz8795 and ksz9477 drivers where the
    phy-mode property would be expected to be located directly under the
    'switch' node rather than under a port OF node. It was fixed by
    commit edecfa98f6 ("net: dsa: microchip: look for phy-mode in port
    nodes"). The driver still has compatibility with the old DT blobs.
    The lan937x support was added later than the above snafu was fixed,
    and even though it has support for the broken DT blobs by virtue of
    sharing a common probing function, I'll take it that its DT blobs
    are correct.

    Verdict: opt lan937x into strict DT bindings, and the others out.

mt7530
~~~~~~

    compatible strings
    - mediatek,mt7621
    - mediatek,mt7530
    - mediatek,mt7531

    Multiple occurrences in mainline device trees, one is part of an SoC
    dtsi (arch/mips/boot/dts/ralink/mt7621.dtsi), all descriptions are fine.

    Verdict: opt into strict DT bindings and out of workarounds.

mv88e6060
~~~~~~~~~

    compatible string:
    - marvell,mv88e6060

    no occurrences in mainline, nobody knows anybody who uses it.

    Verdict: opt out of strict DT bindings and into workarounds.

mv88e6xxx
~~~~~~~~~

    compatible strings:
    - marvell,mv88e6085
    - marvell,mv88e6190
    - marvell,mv88e6250

    Device trees that have incomplete descriptions of CPU or DSA ports:
    arch/arm64/boot/dts/freescale/imx8mq-zii-ultra.dtsi
    - lacks phy-mode
    arch/arm64/boot/dts/marvell/cn9130-crb.dtsi
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/vf610-zii-ssmb-spu3.dts
    - lacks phy-mode
    arch/arm/boot/dts/kirkwood-mv88f6281gtw-ge.dts
    - lacks phy-mode
    arch/arm/boot/dts/vf610-zii-spb4.dts
    - lacks phy-mode
    arch/arm/boot/dts/vf610-zii-cfu1.dts
    - lacks phy-mode
    arch/arm/boot/dts/vf610-zii-dev-rev-c.dts
    - lacks phy-mode on CPU port, fixed-link on DSA ports
    arch/arm/boot/dts/vf610-zii-dev-rev-b.dts
    - lacks phy-mode on CPU port
    arch/arm/boot/dts/armada-381-netgear-gs110emx.dts
    - lacks phy-mode
    arch/arm/boot/dts/vf610-zii-scu4-aib.dts
    - lacks fixed-link on xgmii DSA ports and/or in-band-status on
      2500base-x DSA ports, and phy-mode on CPU port
    arch/arm/boot/dts/imx6qdl-gw5904.dtsi
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/armada-385-clearfog-gtr-l8.dts
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/vf610-zii-ssmb-dtu.dts
    - lacks phy-mode
    arch/arm/boot/dts/kirkwood-dir665.dts
    - lacks phy-mode
    arch/arm/boot/dts/kirkwood-rd88f6281.dtsi
    - lacks phy-mode
    arch/arm/boot/dts/orion5x-netgear-wnr854t.dts
    - lacks phy-mode and fixed-link
    arch/arm/boot/dts/armada-388-clearfog.dts
    - lacks phy-mode
    arch/arm/boot/dts/armada-xp-linksys-mamba.dts
    - lacks phy-mode
    arch/arm/boot/dts/armada-385-linksys.dtsi
    - lacks phy-mode
    arch/arm/boot/dts/imx6q-b450v3.dts
    arch/arm/boot/dts/imx6q-b850v3.dts
    - has a phy-handle but not a phy-mode?
    arch/arm/boot/dts/armada-370-rd.dts
    - lacks phy-mode
    arch/arm/boot/dts/kirkwood-linksys-viper.dts
    - lacks phy-mode
    arch/arm/boot/dts/imx51-zii-rdu1.dts
    - lacks phy-mode
    arch/arm/boot/dts/imx51-zii-scu2-mezz.dts
    - lacks phy-mode
    arch/arm/boot/dts/imx6qdl-zii-rdu2.dtsi
    - lacks phy-mode
    arch/arm/boot/dts/armada-385-clearfog-gtr-s4.dts
    - lacks phy-mode and fixed-link

    Verdict: opt out of strict DT bindings and into workarounds.

ocelot
~~~~~~

    compatible strings:
    - mscc,vsc9953-switch
    - felix (arch/arm64/boot/dts/freescale/fsl-ls1028a.dtsi) is a PCI
      device, has no compatible string

    2 occurrences in mainline, both are part of SoC dtsi and complete.

    Verdict: opt into strict DT bindings and out of workarounds.

qca8k
~~~~~

    compatible strings:
    - qca,qca8327
    - qca,qca8328
    - qca,qca8334
    - qca,qca8337

    5 occurrences in mainline device trees, none of the descriptions are
    problematic.

    Verdict: opt into strict DT bindings and out of workarounds.

realtek
~~~~~~~

    compatible strings:
    - realtek,rtl8366rb
    - realtek,rtl8365mb

    2 occurrences in mainline, both descriptions are fine, additionally
    rtl8365mb.c has a comment "The device tree firmware should also
    specify the link partner of the extension port - either via a
    fixed-link or other phy-handle."

    Verdict: opt into strict DT bindings and out of workarounds.

rzn1_a5psw
~~~~~~~~~~

    compatible strings:
    - renesas,rzn1-a5psw

    One single occurrence, part of SoC dtsi
    (arch/arm/boot/dts/r9a06g032.dtsi), description is fine.

    Verdict: opt into strict DT bindings and out of workarounds.

sja1105
~~~~~~~

    Driver already validates its port OF nodes in
    sja1105_parse_ports_node().

    Verdict: opt into strict DT bindings and out of workarounds.

vsc73xx
~~~~~~~

    compatible strings:
    - vitesse,vsc7385
    - vitesse,vsc7388
    - vitesse,vsc7395
    - vitesse,vsc7398

    2 occurrences in mainline device trees, both descriptions are fine.

    Verdict: opt into strict DT bindings and out of workarounds.

xrs700x
~~~~~~~

    compatible strings:
    - arrow,xrs7003e
    - arrow,xrs7003f
    - arrow,xrs7004e
    - arrow,xrs7004f

    no occurrences in mainline, we don't know.

    Verdict: opt out of strict DT bindings and into workarounds.

Because there is a pattern where newly added switches reuse existing
drivers more often than introducing new ones, I've opted for deciding
who gets to opt into the workaround based on an OF compatible match
table in the DSA core. The alternative would have been to add another
boolean property to struct dsa_switch, like configure_vlan_while_not_filtering.
But this avoids situations where sometimes driver maintainers obfuscate
what goes on by sharing a common probing function, and therefore making
new switches inherit old quirks.

Side note, we also warn about missing properties for drivers that rely
on the workaround. This isn't an indication that we'll break
compatibility with those DT blobs any time soon, but is rather done to
raise awareness about the change, for future DT blob authors.

Cc: Rob Herring <robh+dt@kernel.org>
Cc: Frank Rowand <frowand.list@gmail.com>
Acked-by: Alvin Šipraga <alsi@bang-olufsen.dk> # realtek
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-22 17:45:47 -07:00
Vladimir Oltean
770375ff33 net: dsa: rename dsa_port_link_{,un}register_of
There is a subset of functions that applies only to shared (DSA and CPU)
ports, yet this is difficult to comprehend by looking at their code alone.
These are dsa_port_link_register_of(), dsa_port_link_unregister_of(),
and the functions that only these 2 call.

Rename this class of functions to dsa_shared_port_* to make this fact
more evident, even if this goes against the apparent convention that
function names in port.c must start with dsa_port_.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-22 17:45:47 -07:00
Vladimir Oltean
da2c398e59 net: dsa: avoid dsa_port_link_{,un}register_of() calls with platform data
dsa_port_link_register_of() and dsa_port_link_unregister_of() are not
written with the fact in mind that they can be called with a dp->dn that
is NULL (as evidenced even by the _of suffix in their name), but this is
exactly what happens.

How this behaves will differ depending on whether the backing driver
implements ->adjust_link() or not.

If it doesn't, the "if (of_phy_is_fixed_link(dp->dn) || phy_np)"
condition will return false, and dsa_port_link_register_of() will do
nothing and return 0.

If the driver does implement ->adjust_link(), the
"if (of_phy_is_fixed_link(dp->dn))" condition will return false
(dp->dn is NULL) and we will call dsa_port_setup_phy_of(). This will
call dsa_port_get_phy_device(), which will also return NULL, and we will
also do nothing and return 0.

It is hard to maintain this code and make future changes to it in this
state, so just suppress calls to these 2 functions if dp->dn is NULL.
The only functional effect is that if the driver does implement
->adjust_link(), we'll stop printing this to the console:

Using legacy PHYLIB callbacks. Please migrate to PHYLINK!

but instead we'll always print:

[    8.539848] dsa-loop fixed-0:1f: skipping link registration for CPU port 5

This is for the better anyway, since "using legacy phylib callbacks"
was misleading information - we weren't issuing _any_ callbacks due to
dsa_port_get_phy_device() returning NULL.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-22 17:45:47 -07:00
Vladimir Oltean
211987f3ac net: dsa: don't warn in dsa_port_set_state_now() when driver doesn't support it
ds->ops->port_stp_state_set() is, like most DSA methods, optional, and
if absent, the port is supposed to remain in the forwarding state (as
standalone). Such is the case with the mv88e6060 driver, which does not
offload the bridge layer. DSA warns that the STP state can't be changed
to FORWARDING as part of dsa_port_enable_rt(), when in fact it should not.

The error message is also not up to modern standards, so take the
opportunity to make it more descriptive.

Fixes: fd36454131 ("net: dsa: change scope of STP state setter")
Reported-by: Sergei Antonov <saproj@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Sergei Antonov <saproj@gmail.com>
Link: https://lore.kernel.org/r/20220816201445.1809483-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-17 21:58:22 -07:00
Xie Shaowen
062cf5ebc2 net: dsa: Fix spelling mistakes and cleanup code
fix follow spelling misktakes:
	desconstructed ==> deconstructed
	enforcment ==> enforcement

Reported-by: Hacash Robot <hacashRobot@santino.com>
Signed-off-by: Xie Shaowen <studentxswpy@163.com>
Link: https://lore.kernel.org/r/20220730092254.3102875-1-studentxswpy@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-01 12:23:06 -07:00
Jakub Kicinski
272ac32f56 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 18:21:16 -07:00
Vladimir Oltean
c7560d1203 net: dsa: fix reference counting for LAG FDBs
Due to an invalid conflict resolution on my side while working on 2
different series (LAG FDBs and FDB isolation), dsa_switch_do_lag_fdb_add()
does not store the database associated with a dsa_mac_addr structure.

So after adding an FDB entry associated with a LAG, dsa_mac_addr_find()
fails to find it while deleting it, because &a->db is zeroized memory
for all stored FDB entries of lag->fdbs, and dsa_switch_do_lag_fdb_del()
returns -ENOENT rather than deleting the entry.

Fixes: c26933639b ("net: dsa: request drivers to perform FDB isolation")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220723012411.1125066-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-25 19:37:06 -07:00
Jakub Kicinski
6e0e846ee2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-21 13:03:39 -07:00
Vladimir Oltean
1699b4d502 net: dsa: fix NULL pointer dereference in dsa_port_reset_vlan_filtering
The "ds" iterator variable used in dsa_port_reset_vlan_filtering() ->
dsa_switch_for_each_port() overwrites the "dp" received as argument,
which is later used to call dsa_port_vlan_filtering() proper.

As a result, switches which do enter that code path (the ones with
vlan_filtering_is_global=true) will dereference an invalid dp in
dsa_port_reset_vlan_filtering() after leaving a VLAN-aware bridge.

Use a dedicated "other_dp" iterator variable to avoid this from
happening.

Fixes: d0004a020b ("net: dsa: remove the "dsa_to_port in a loop" antipattern from the core")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:14:23 -07:00
Vladimir Oltean
4db2a5ef4c net: dsa: fix dsa_port_vlan_filtering when global
The blamed refactoring commit changed a "port" iterator with "other_dp",
but still looked at the slave_dev of the dp outside the loop, instead of
other_dp->slave from the loop.

As a result, dsa_port_vlan_filtering() would not call
dsa_slave_manage_vlan_filtering() except for the port in cause, and not
for all switch ports as expected.

Fixes: d0004a020b ("net: dsa: remove the "dsa_to_port in a loop" antipattern from the core")
Reported-by: Lucian Banu <Lucian.Banu@westermo.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-18 20:14:23 -07:00
Prasanna Vengateshan
092f875131 net: dsa: tag_ksz: add tag handling for Microchip LAN937x
The Microchip LAN937X switches have a tagging protocol which is
very similar to KSZ tagging. So that the implementation is added to
tag_ksz.c and reused common APIs

Signed-off-by: Prasanna Vengateshan <prasanna.vengateshan@microchip.com>
Signed-off-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-02 16:34:05 +01:00
Oleksij Rempel
3d410403a5 net: dsa: add get_pause_stats support
Add support for pause stats

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-29 20:17:11 -07:00
Clément Léger
a08d6a6dc8 net: dsa: add Renesas RZ/N1 switch tag driver
The switch that is present on the Renesas RZ/N1 SoC uses a specific
VLAN value followed by 6 bytes which contains forwarding configuration.

Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-27 11:37:55 +01:00
Clément Léger
67f38b1c73 net: dsa: add support for ethtool get_rmon_stats()
Add support to allow dsa drivers to specify the .get_rmon_stats()
operation.

Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-27 11:37:55 +01:00
Clément Léger
1c6e8088d9 net: dsa: allow port_bridge_join() to override extack message
Some drivers might report that they are unable to bridge ports by
returning -EOPNOTSUPP, but still wants to override extack message.
In order to do so, in dsa_slave_changeupper(), if port_bridge_join()
returns -EOPNOTSUPP, check if extack message is set and if so, do not
override it.

Signed-off-by: Clément Léger <clement.leger@bootlin.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-27 11:37:54 +01:00
Eric Dumazet
9962acefbc net: adopt u64_stats_t in struct pcpu_sw_netstats
As explained in commit 316580b69d ("u64_stats: provide u64_stats_t type")
we should use u64_stats_t and related accessors to avoid load/store tearing.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-09 21:53:11 -07:00
Luiz Angelo Daros de Luca
fe7324b932 net: dsa: OF-ware slave_mii_bus
If found, register the DSA internally allocated slave_mii_bus with an OF
"mdio" child object. It can save some drivers from creating their
custom internal MDIO bus.

Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-23 12:27:53 +01:00
Vladimir Oltean
bacf93b056 net: dsa: remove port argument from ->change_tag_protocol()
DSA has not supported (and probably will not support in the future
either) independent tagging protocols per CPU port.

Different switch drivers have different requirements, some may need to
replicate some settings for each CPU port, some may need to apply some
settings on a single CPU port, while some may have to configure some
global settings and then some per-CPU-port settings.

In any case, the current model where DSA calls ->change_tag_protocol for
each CPU port turns out to be impractical for drivers where there are
global things to be done. For example, felix calls dsa_tag_8021q_register(),
which makes no sense per CPU port, so it suppresses the second call.

Let drivers deal with replication towards all CPU ports, and remove the
CPU port argument from the function prototype.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-12 16:38:55 -07:00
Vladimir Oltean
72c3b0c735 net: dsa: felix: manage host flooding using a specific driver callback
At the time - commit 7569459a52 ("net: dsa: manage flooding on the CPU
ports") - not introducing a dedicated switch callback for host flooding
made sense, because for the only user, the felix driver, there was
nothing different to do for the CPU port than set the flood flags on the
CPU port just like on any other bridge port.

There are 2 reasons why this approach is not good enough, however.

(1) Other drivers, like sja1105, support configuring flooding as a
    function of {ingress port, egress port}, whereas the DSA
    ->port_bridge_flags() function only operates on an egress port.
    So with that driver we'd have useless host flooding from user ports
    which don't need it.

(2) Even with the felix driver, support for multiple CPU ports makes it
    difficult to piggyback on ->port_bridge_flags(). The way in which
    the felix driver is going to support host-filtered addresses with
    multiple CPU ports is that it will direct these addresses towards
    both CPU ports (in a sort of multicast fashion), then restrict the
    forwarding to only one of the two using the forwarding masks.
    Consequently, flooding will also be enabled towards both CPU ports.
    However, ->port_bridge_flags() gets passed the index of a single CPU
    port, and that leaves the flood settings out of sync between the 2
    CPU ports.

This is to say, it's better to have a specific driver method for host
flooding, which takes the user port as argument. This solves problem (1)
by allowing the driver to do different things for different user ports,
and problem (2) by abstracting the operation and letting the driver do
whatever, rather than explicitly making the DSA core point to the CPU
port it thinks needs to be touched.

This new method also creates a problem, which is that cross-chip setups
are not handled. However I don't have hardware right now where I can
test what is the proper thing to do, and there isn't hardware compatible
with multi-switch trees that supports host flooding. So it remains a
problem to be tackled in the future.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-12 16:38:55 -07:00
Jakub Kicinski
9b19e57a3c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Build issue in drivers/net/ethernet/sfc/ptp.c
  54fccfdd7c ("sfc: efx_default_channel_type APIs can be static")
  49e6123c65 ("net: sfc: fix memory leak due to ptp channel")
https://lore.kernel.org/all/20220510130556.52598fe2@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-12 16:15:30 -07:00
Vladimir Oltean
630fd4822a net: dsa: flush switchdev workqueue on bridge join error path
There is a race between switchdev_bridge_port_offload() and the
dsa_port_switchdev_sync_attrs() call right below it.

When switchdev_bridge_port_offload() finishes, FDB entries have been
replayed by the bridge, but are scheduled for deferred execution later.

However dsa_port_switchdev_sync_attrs -> dsa_port_can_apply_vlan_filtering()
may impose restrictions on the vlan_filtering attribute and refuse
offloading.

When this happens, the delayed FDB entries will dereference dp->bridge,
which is a NULL pointer because we have stopped the process of
offloading this bridge.

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
Workqueue: dsa_ordered dsa_slave_switchdev_event_work
pc : dsa_port_bridge_host_fdb_del+0x64/0x100
lr : dsa_slave_switchdev_event_work+0x130/0x1bc
Call trace:
 dsa_port_bridge_host_fdb_del+0x64/0x100
 dsa_slave_switchdev_event_work+0x130/0x1bc
 process_one_work+0x294/0x670
 worker_thread+0x80/0x460
---[ end trace 0000000000000000 ]---
Error: dsa_core: Must first remove VLAN uppers having VIDs also present in bridge.

Fix the bug by doing what we do on the normal bridge leave path as well,
which is to wait until the deferred FDB entries complete executing, then
exit.

The placement of dsa_flush_workqueue() after switchdev_bridge_port_unoffload()
guarantees that both the FDB additions and deletions on rollback are waited for.

Fixes: d7d0d423db ("net: dsa: flush switchdev workqueue when leaving the bridge")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220507134550.1849834-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-09 18:08:04 -07:00
Vladimir Oltean
fe5233b0ba net: dsa: delete dsa_port_walk_{fdbs,mdbs}
All the users of these functions are gone, delete them before they gain
new ones.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-06 21:00:12 -07:00
Jakub Kicinski
0e55546b18 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
include/linux/netdevice.h
net/core/dev.c
  6510ea973d ("net: Use this_cpu_inc() to increment net->core_stats")
  794c24e992 ("net-core: rx_otherhost_dropped to core_stats")
https://lore.kernel.org/all/20220428111903.5f4304e0@canb.auug.org.au/

drivers/net/wan/cosa.c
  d48fea8401 ("net: cosa: fix error check return value of register_chrdev()")
  89fbca3307 ("net: wan: remove support for COSA and SRP synchronous serial boards")
https://lore.kernel.org/all/20220428112130.1f689e5e@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28 13:02:01 -07:00
Marcin Wojtas
df1cc21152 net: dsa: remove unused headers
Reduce a number of included headers to a necessary minimum.

Signed-off-by: Marcin Wojtas <mw@semihalf.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25 12:08:40 +01:00
Vladimir Oltean
7c762e70c5 net: dsa: flood multicast to CPU when slave has IFF_PROMISC
Certain DSA switches can eliminate flooding to the CPU when none of the
ports have the IFF_ALLMULTI or IFF_PROMISC flags set. This is done by
synthesizing a call to dsa_port_bridge_flags() for the CPU port, a call
which normally comes from the bridge driver via switchdev.

The bridge port flags and IFF_PROMISC|IFF_ALLMULTI have slightly
different semantics, and due to inattention/lack of proper testing, the
IFF_PROMISC flag allows unknown unicast to be flooded to the CPU, but
not unknown multicast.

This must be fixed by setting both BR_FLOOD (unicast) and BR_MCAST_FLOOD
in the synthesized dsa_port_bridge_flags() call, since IFF_PROMISC means
that packets should not be filtered regardless of their MAC DA.

Fixes: 7569459a52 ("net: dsa: manage flooding on the CPU ports")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25 11:46:24 +01:00
Miaoqian Lin
fc06b2867f net: dsa: Add missing of_node_put() in dsa_port_link_register_of
The device_node pointer is returned by of_parse_phandle()  with refcount
incremented. We should use of_node_put() on it when done.
of_node_put() will check for NULL value.

Fixes: a20f997010 ("net: dsa: Don't instantiate phylink for CPU/DSA ports unless needed")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-22 13:06:58 +01:00
Paolo Abeni
f70925bf99 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/ethernet/microchip/lan966x/lan966x_main.c
  d08ed85256 ("net: lan966x: Make sure to release ptp interrupt")
  c834963932 ("net: lan966x: Add FDMA functionality")

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-22 09:56:00 +02:00
Vladimir Oltean
be6ff9665d net: dsa: don't emit targeted cross-chip notifiers for MTU change
A cross-chip notifier with "targeted_match=true" is one that matches
only the local port of the switch that emitted it. In other words,
passing through the cross-chip notifier layer serves no purpose.

Eliminate this concept by calling directly ds->ops->port_change_mtu
instead of emitting a targeted cross-chip notifier. This leaves the
DSA_NOTIFIER_MTU event being emitted only for MTU updates on the CPU
port, which need to be reflected also across all DSA links.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:34:34 +01:00
Vladimir Oltean
4715029fa7 net: dsa: drop dsa_slave_priv from dsa_slave_change_mtu
We can get a hold of the "ds" pointer directly from "dp", no need for
the dsa_slave_priv.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:34:34 +01:00
Vladimir Oltean
cf1c39d3b3 net: dsa: avoid one dsa_to_port() in dsa_slave_change_mtu
We could retrieve the cpu_dp pointer directly from the "dp" we already
have, no need to resort to dsa_to_port(ds, port).

This change also removes the need for an "int port", so that is also
deleted.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:34:34 +01:00
Vladimir Oltean
b2033a05a7 net: dsa: use dsa_tree_for_each_user_port in dsa_slave_change_mtu
Use the more conventional iterator over user ports instead of explicitly
ignoring them, and use the more conventional name "other_dp" instead of
"dp_iter", for readability.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:34:34 +01:00
Vladimir Oltean
726816a129 net: dsa: make cross-chip notifiers more efficient for host events
To determine whether a given port should react to the port targeted by
the notifier, dsa_port_host_vlan_match() and dsa_port_host_address_match()
look at the positioning of the switch port currently executing the
notifier relative to the switch port for which the notifier was emitted.

To maintain stylistic compatibility with the other match functions from
switch.c, the host address and host VLAN match functions take the
notifier information about targeted port, switch and tree indices as
argument. However, these functions only use that information to retrieve
the struct dsa_port *targeted_dp, which is an invariant for the outer
loop that calls them. So it makes more sense to calculate the targeted
dp only once, and pass it to them as argument.

But furthermore, the targeted dp is actually known at the time the call
to dsa_port_notify() is made. It is just that we decide to only save the
indices of the port, switch and tree in the notifier structure, just to
retrace our steps and find the dp again using dsa_switch_find() and
dsa_to_port().

But both the above functions are relatively expensive, since they need
to iterate through lists. It appears more straightforward to make all
notifiers just pass the targeted dp inside their info structure, and
have the code that needs the indices to look at info->dp->index instead
of info->port, or info->dp->ds->index instead of info->sw_index, or
info->dp->ds->dst->index instead of info->tree_index.

For the sake of consistency, all cross-chip notifiers are converted to
pass the "dp" directly.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:34:34 +01:00
Vladimir Oltean
8e9e678e47 net: dsa: move reset of VLAN filtering to dsa_port_switchdev_unsync_attrs
In dsa_port_switchdev_unsync_attrs() there is a comment that resetting
the VLAN filtering isn't done where it is expected. And since commit
108dc8741c ("net: dsa: Avoid cross-chip syncing of VLAN filtering"),
there is no reason to handle this in switch.c either.

Therefore, move the logic to port.c, and adapt it slightly to the data
structures and naming conventions from there.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 10:34:34 +01:00
Kurt Kanzenbach
0763120b09 net: dsa: hellcreek: Calculate checksums in tagger
In case the checksum calculation is offloaded to the DSA master network
interface, it will include the switch trailing tag. As soon as the switch strips
that tag on egress, the calculated checksum is wrong.

Therefore, add the checksum calculation to the tagger (if required) before
adding the switch tag. This way, the hellcreek code works with all DSA master
interfaces regardless of their declared feature set.

Fixes: 01ef09caad ("net: dsa: Add tag handling for Hirschmann Hellcreek switches")
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220415103320.90657-1-kurt@linutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-19 09:49:38 +02:00
Vladimir Oltean
762c2998c9 Revert "net: dsa: setup master before ports"
This reverts commit 11fd667dac.

dsa_slave_change_mtu() updates the MTU of the DSA master and of the
associated CPU port, but only if it detects a change to the master MTU.

The blamed commit in the Fixes: tag below addressed a regression where
dsa_slave_change_mtu() would return early and not do anything due to
ds->ops->port_change_mtu() not being implemented.

However, that commit also had the effect that the master MTU got set up
to the correct value by dsa_master_setup(), but the associated CPU port's
MTU did not get updated. This causes breakage for drivers that rely on
the ->port_change_mtu() DSA call to account for the tagging overhead on
the CPU port, and don't set up the initial MTU during the setup phase.

Things actually worked before because they were in a fragile equilibrium
where dsa_slave_change_mtu() was called before dsa_master_setup() was.
So dsa_slave_change_mtu() could actually detect a change and update the
CPU port MTU too.

Restore the code to the way things used to work by reverting the reorder
of dsa_tree_setup_master() and dsa_tree_setup_ports(). That change did
not have a concrete motivation going for it anyway, it just looked
better.

Fixes: 066dfc4290 ("Revert "net: dsa: stop updating master MTU from master.c"")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13 12:38:06 +01:00
Vladimir Oltean
066dfc4290 Revert "net: dsa: stop updating master MTU from master.c"
This reverts commit a1ff94c297.

Switch drivers that don't implement ->port_change_mtu() will cause the
DSA master to remain with an MTU of 1500, since we've deleted the other
code path. In turn, this causes a regression for those systems, where
MTU-sized traffic can no longer be terminated.

Revert the change taking into account the fact that rtnl_lock() is now
taken top-level from the callers of dsa_master_setup() and
dsa_master_teardown(). Also add a comment in order for it to be
absolutely clear why it is still needed.

Fixes: a1ff94c297 ("net: dsa: stop updating master MTU from master.c")
Reported-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-01 11:59:01 +01:00
Jakub Kicinski
89695196f0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge in overtime fixes, no conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-23 10:53:49 -07:00
Vladimir Oltean
5077e2c8cf net: dsa: fix missing host-filtered multicast addresses
DSA ports are stacked devices, so they use dev_mc_add() to sync their
address list to their lower interface (DSA master). But they are also
hardware devices, so they program those addresses to hardware using the
__dev_mc_add() sync and unsync callbacks.

Unfortunately both cannot work at the same time, and it seems that the
multicast addresses which are already present on the DSA master, like
33:33:00:00:00:01 (added by addrconf.c as in6addr_linklocal_allnodes)
are synced to the master via dev_mc_sync(), but not to hardware by
__dev_mc_sync().

This happens because both the dev_mc_sync() -> __hw_addr_sync_one()
code path, as well as __dev_mc_sync() -> __hw_addr_sync_dev(), operate
on the same variable: ha->sync_cnt, in a way that causes the "sync"
method (dsa_slave_sync_mc) to no longer be called.

To fix the issue we need to work with the API in the way in which it was
intended to be used, and therefore, call dev_uc_add() and friends for
each individual hardware address, from the sync and unsync callbacks.

Fixes: 5e8a1e03aa ("net: dsa: install secondary unicast and multicast addresses as host FDB/MDB")
Link: https://lore.kernel.org/netdev/20220321163213.lrn5sk7m6grighbl@skbuf/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220322003701.2056895-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-22 22:19:35 -07:00
Vladimir Oltean
8fd36358ce net: dsa: fix panic on shutdown if multi-chip tree failed to probe
DSA probing is atypical because a tree of devices must probe all at
once, so out of N switches which call dsa_tree_setup_routing_table()
during probe, for (N - 1) of them, "complete" will return false and they
will exit probing early. The Nth switch will set up the whole tree on
their behalf.

The implication is that for (N - 1) switches, the driver binds to the
device successfully, without doing anything. When the driver is bound,
the ->shutdown() method may run. But if the Nth switch has failed to
initialize the tree, there is nothing to do for the (N - 1) driver
instances, since the slave devices have not been created, etc. Moreover,
dsa_switch_shutdown() expects that the calling @ds has been in fact
initialized, so it jumps at dereferencing the various data structures,
which is incorrect.

Avoid the ensuing NULL pointer dereferences by simply checking whether
the Nth switch has previously set "ds->setup = true" for the switch
which is currently shutting down. The entire setup is serialized under
dsa2_mutex which we already hold.

Fixes: 0650bf52b3 ("net: dsa: be compatible with masters which unregister on shutdown")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220318195443.275026-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-21 22:31:28 -07:00
Vladimir Oltean
0148bb50b8 net: dsa: pass extack to dsa_switch_ops :: port_mirror_add()
Drivers might have error messages to propagate to user space, most
common being that they support a single mirror port.

Propagate the netlink extack so that they can inform user space in a
verbal way of their limitations.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 17:42:47 -07:00
Tobias Waldekranz
7414af30b7 net: dsa: Handle MST state changes
Add the usual trampoline functionality from the generic DSA layer down
to the drivers for MST state changes.

When a state changes to disabled/blocking/listening, make sure to fast
age any dynamic entries in the affected VLANs (those controlled by the
MSTI in question).

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 16:49:59 -07:00
Tobias Waldekranz
8e6598a7b0 net: dsa: Pass VLAN MSTI migration notifications to driver
Add the usual trampoline functionality from the generic DSA layer down
to the drivers for VLAN MSTI migrations.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 16:49:59 -07:00
Tobias Waldekranz
332afc4c8c net: dsa: Validate hardware support for MST
When joining a bridge where MST is enabled, we validate that the
proper offloading support is in place, otherwise we fallback to
software bridging.

When then mode is changed on a bridge in which we are members, we
refuse the change if offloading is not supported.

At the moment we only check for configurable learning, but this will
be further restricted as we support more MST related switchdev events.

Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 16:49:58 -07:00
Jakub Kicinski
e243f39685 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 13:56:58 -07:00
Miaoqian Lin
cb0b430b4e net: dsa: Add missing of_node_put() in dsa_port_parse_of
The device_node pointer is returned by of_parse_phandle()  with refcount
incremented. We should use of_node_put() on it when done.

Fixes: 6d4e5c570c ("net: dsa: get port type at parse time")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Link: https://lore.kernel.org/r/20220316082602.10785-1-linmq006@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-17 13:13:27 +01:00
Tobias Waldekranz
a860352e9d net: dsa: Never offload FDB entries on standalone ports
If a port joins a bridge that it can't offload, it will fallback to
standalone mode and software bridging. In this case, we never want to
offload any FDB entries to hardware either.

Previously, for host addresses, we would eventually end up in
dsa_port_bridge_host_fdb_add, which would unconditionally dereference
dp->bridge and cause a segfault.

Fixes: c26933639b ("net: dsa: request drivers to perform FDB isolation")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220315233033.1468071-1-tobias@waldekranz.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-16 19:36:56 -07:00
Vladimir Oltean
47d75f7822 net: dsa: report and change port dscp priority using dcbnl
Similar to the port-based default priority, IEEE 802.1Q-2018 allows the
Application Priority Table to define QoS classes (0 to 7) per IP DSCP
value (0 to 63).

In the absence of an app table entry for a packet with DSCP value X,
QoS classification for that packet falls back to other methods (VLAN PCP
or port-based default). The presence of an app table for DSCP value X
with priority Y makes the hardware classify the packet to QoS class Y.

As opposed to the default-prio where DSA exposes only a "set" in
dsa_switch_ops (because the port-based default is the fallback, it
always exists, either implicitly or explicitly), for DSCP priorities we
expose an "add" and a "del". The addition of a DSCP entry means trusting
that DSCP priority, the deletion means ignoring it.

Drivers that already trust (at least some) DSCP values can describe
their configuration in dsa_switch_ops :: port_get_dscp_prio(), which is
called for each DSCP value from 0 to 63.

Again, there can be more than one dcbnl app table entry for the same
DSCP value, DSA chooses the one with the largest configured priority.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-14 10:36:15 +00:00
Vladimir Oltean
d538eca85c net: dsa: report and change port default priority using dcbnl
The port-based default QoS class is assigned to packets that lack a
VLAN PCP (or the port is configured to not trust the VLAN PCP),
an IP DSCP (or the port is configured to not trust IP DSCP), and packets
on which no tc-skbedit action has matched.

Similar to other drivers, this can be exposed to user space using the
DCB Application Priority Table. IEEE 802.1Q-2018 specifies in Table
D-8 - Sel field values that when the Selector is 1, the Protocol ID
value of 0 denotes the "Default application priority. For use when
application priority is not otherwise specified."

The way in which the dcbnl integration in DSA has been designed has to
do with its requirements. Andrew Lunn explains that SOHO switches are
expected to come with some sort of pre-configured QoS profile, and that
it is desirable for this to come pre-loaded into the DSA slave interfaces'
DCB application priority table.

In the dcbnl design, this is possible because calls to dcb_ieee_setapp()
can be initiated by anyone including being self-initiated by this device
driver.

However, what makes this challenging to implement in DSA is that the DSA
core manages the net_devices (effectively hiding them from drivers),
while drivers manage the hardware. The DSA core has no knowledge of what
individual drivers' QoS policies are. DSA could export to drivers a
wrapper over dcb_ieee_setapp() and these could call that function to
pre-populate the app priority table, however drivers don't have a good
moment in time to do this. The dsa_switch_ops :: setup() method gets
called before the net_devices are created (dsa_slave_create), and so is
dsa_switch_ops :: port_setup(). What remains is dsa_switch_ops ::
port_enable(), but this gets called upon each ndo_open. If we add app
table entries on every open, we'd need to remove them on close, to avoid
duplicate entry errors. But if we delete app priority entries on close,
what we delete may not be the initial, driver pre-populated entries, but
rather user-added entries.

So it is clear that letting drivers choose the timing of the
dcb_ieee_setapp() call is inappropriate. The alternative which was
chosen is to introduce hardware-specific ops in dsa_switch_ops, and
effectively hide dcbnl details from drivers as well. For pre-populating
the application table, dsa_slave_dcbnl_init() will call
ds->ops->port_get_default_prio() which is supposed to read from
hardware. If the operation succeeds, DSA creates a default-prio app
table entry. The method is called as soon as the slave_dev is
registered, but before we release the rtnl_mutex. This is done such that
user space sees the app table entries as soon as it sees the interface
being registered.

The fact that we populate slave_dev->dcbnl_ops with a non-NULL pointer
changes behavior in dcb_doit() from net/dcb/dcbnl.c, which used to
return -EOPNOTSUPP for any dcbnl operation where netdev->dcbnl_ops is
NULL. Because there are still dcbnl-unaware DSA drivers even if they
have dcbnl_ops populated, the way to restore the behavior is to make all
dcbnl_ops return -EOPNOTSUPP on absence of the hardware-specific
dsa_switch_ops method.

The dcbnl framework absurdly allows there to be more than one app table
entry for the same selector and protocol (in other words, more than one
port-based default priority). In the iproute2 dcb program, there is a
"replace" syntactical sugar command which performs an "add" and a "del"
to hide this away. But we choose the largest configured priority when we
call ds->ops->port_set_default_prio(), using __fls(). When there is no
default-prio app table entry left, the port-default priority is restored
to 0.

Link: https://patchwork.kernel.org/project/netdevbpf/patch/20210113154139.1803705-2-olteanv@gmail.com/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-14 10:36:15 +00:00
Jakub Kicinski
1e8a3f0d2a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
net/dsa/dsa2.c
  commit afb3cc1a39 ("net: dsa: unlock the rtnl_mutex when dsa_master_setup() fails")
  commit e83d565378 ("net: dsa: replay master state events in dsa_tree_{setup,teardown}_master")
https://lore.kernel.org/all/20220307101436.7ae87da0@canb.auug.org.au/

drivers/net/ethernet/intel/ice/ice.h
  commit 97b0129146 ("ice: Fix error with handling of bonding MTU")
  commit 43113ff734 ("ice: add TTY for GNSS module for E810T device")
https://lore.kernel.org/all/20220310112843.3233bcf1@canb.auug.org.au/

drivers/staging/gdm724x/gdm_lte.c
  commit fc7f750dc9 ("staging: gdm724x: fix use after free in gdm_lte_rx()")
  commit 4bcc4249b4 ("staging: Use netif_rx().")
https://lore.kernel.org/all/20220308111043.1018a59d@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 17:16:56 -08:00
Luiz Angelo Daros de Luca
3126b731ce net: dsa: tag_rtl8_4: fix typo in modalias name
DSA_TAG_PROTO_RTL8_4L is not defined. It should be
DSA_TAG_PROTO_RTL8_4T.

Fixes: cd87fecded ("net: dsa: tag_rtl8_4: add rtl8_4t trailing variant")
Reported-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220309175641.12943-1-luizluca@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-09 20:36:24 -08:00
Vladimir Oltean
7e580490ac net: dsa: felix: avoid early deletion of host FDB entries
The Felix driver declares FDB isolation but puts all standalone ports in
VID 0. This is mostly problem-free as discussed with Alvin here:
https://patchwork.kernel.org/project/netdevbpf/cover/20220302191417.1288145-1-vladimir.oltean@nxp.com/#24763870

however there is one catch. DSA still thinks that FDB entries are
installed on the CPU port as many times as there are user ports, and
this is problematic when multiple user ports share the same MAC address.

Consider the default case where all user ports inherit their MAC address
from the DSA master, and then the user runs:

ip link set swp0 address 00:01:02:03:04:05

The above will make dsa_slave_set_mac_address() call
dsa_port_standalone_host_fdb_add() for 00:01:02:03:04:05 in port 0's
standalone database, and dsa_port_standalone_host_fdb_del() for the old
address of swp0, again in swp0's standalone database.

Both the ->port_fdb_add() and ->port_fdb_del() will be propagated down
to the felix driver, which will end up deleting the old MAC address from
the CPU port. But this is still in use by other user ports, so we end up
breaking unicast termination for them.

There isn't a problem in the fact that DSA keeps track of host
standalone addresses in the individual database of each user port: some
drivers like sja1105 need this. There also isn't a problem in the fact
that some drivers choose the same VID/FID for all standalone ports.
It is just that the deletion of these host addresses must be delayed
until they are known to not be in use any longer, and only the driver
has this knowledge. Since DSA keeps these addresses in &cpu_dp->fdbs and
&cpu_db->mdbs, it is just a matter of walking over those lists and see
whether the same MAC address is present on the CPU port in the port db
of another user port.

I have considered reusing the generic dsa_port_walk_fdbs() and
dsa_port_walk_mdbs() schemes for this, but locking makes it difficult.
In the ->port_fdb_add() method and co, &dp->addr_lists_lock is held, but
dsa_port_walk_fdbs() also acquires that lock. Also, even assuming that
we introduce an unlocked variant of the address iterator, we'd still
need some relatively complex data structures, and a void *ctx in the
dsa_fdb_walk_cb_t which we don't currently pass, such that drivers are
able to figure out, after iterating, whether the same MAC address is or
isn't present in the port db of another port.

All the above, plus the fact that I expect other drivers to follow the
same model as felix where all standalone ports use the same FID, made me
conclude that a generic method provided by DSA is necessary:
dsa_fdb_present_in_other_db() and the mdb equivalent. Felix calls this
from the ->port_fdb_del() handler for the CPU port, when the database
was classified to either a port db, or a LAG db.

For symmetry, we also call this from ->port_fdb_add(), because if the
address was installed once, then installing it a second time serves no
purpose: it's already in hardware in VID 0 and it affects all standalone
ports.

This change moves dsa_db_equal() from switch.c to dsa.c, since it now
has one more caller.

Fixes: 54c3198460 ("net: mscc: ocelot: enforce FDB isolation when VLAN-unaware")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-09 11:12:10 +00:00
Vladimir Oltean
e2d0576f0c net: dsa: be mostly no-op in dsa_slave_set_mac_address when down
Since the slave unicast address is synced to hardware and to the DSA
master during dsa_slave_open(), this means that a call to
dsa_slave_set_mac_address() while the slave interface is down will
result to a call to dsa_port_standalone_host_fdb_del() and to
dev_uc_del() for the MAC address while there was no previous
dsa_port_standalone_host_fdb_add() or dev_uc_add().

This is a partial revert of the blamed commit below, which was too
aggressive.

Fixes: 35aae5ab91 ("net: dsa: remove workarounds for changing master promisc/allmulti only while up")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-09 11:12:09 +00:00
Vladimir Oltean
fe95784fb1 net: dsa: move port lists initialization to dsa_port_touch
&cpu_db->fdbs and &cpu_db->mdbs may be uninitialized lists during some
call paths of felix_set_tag_protocol().

There was an attempt to avoid calling dsa_port_walk_fdbs() during setup
by using a "bool change" in the felix driver, but this doesn't work when
the tagging protocol is defined in the device tree, and a change is
triggered by DSA at pseudo-runtime:

dsa_tree_setup_switches
-> dsa_switch_setup
   -> dsa_switch_setup_tag_protocol
      -> ds->ops->change_tag_protocol
dsa_tree_setup_ports
-> dsa_port_setup
   -> &dp->fdbs and &db->mdbs only get initialized here

So it seems like the only way to fix this is to move the initialization
of these lists earlier.

dsa_port_touch() is called from dsa_switch_touch_ports() which is called
from dsa_switch_parse_of(), and this runs completely before
dsa_tree_setup(). Similarly, dsa_switch_release_ports() runs after
dsa_tree_teardown().

Fixes: f9cef64fa2 ("net: dsa: felix: migrate host FDB and MDB entries when changing tag proto")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-09 11:12:09 +00:00
Vladimir Oltean
0832cd9f1f net: dsa: warn if port lists aren't empty in dsa_port_teardown
There has been recent work towards matching each switchdev object
addition with a corresponding deletion.

Therefore, having elements in the fdbs, mdbs, vlans lists at the time of
a shared (DSA, CPU) port's teardown is indicative of a bug somewhere
else, and not something that is to be expected.

We shouldn't try to silently paper over that. Instead, print a warning
and a stack trace.

This change is a prerequisite for moving the initialization/teardown of
these lists. Make it clear that clearing the lists isn't needed.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-09 11:12:09 +00:00
Tobias Waldekranz
6c43a920a5 net: dsa: tag_dsa: Fix tx from VLAN uppers on non-filtering bridges
In this situation (VLAN filtering disabled on br0):

    br0.10
     /
   br0
   / \
swp0 swp1

When a frame is transmitted from the VLAN upper, the bridge will send
it down to one of the switch ports with forward offloading
enabled. This will cause tag_dsa to generate a FORWARD tag. Before
this change, that tag would have it's VID set to 10, even though VID
10 is not loaded in the VTU.

Before the blamed commit, the frame would trigger a VTU miss and be
forwarded according to the PVT configuration. Now that all fabric
ports are in 802.1Q secure mode, the frame is dropped instead.

Therefore, restrict the condition under which we rewrite an 802.1Q tag
to a DSA tag. On standalone port's, reuse is always safe since we will
always generate FROM_CPU tags in that case. For bridged ports though,
we must ensure that VLAN filtering is enabled, which in turn
guarantees that the VID in question is loaded into the VTU.

Fixes: d352b20f41 ("net: dsa: mv88e6xxx: Improve multichip isolation of standalone ports")
Signed-off-by: Tobias Waldekranz <tobias@waldekranz.com>
Tested-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20220307110548.812455-1-tobias@waldekranz.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-08 11:12:28 +01:00
Tom Rix
cd5169841c net: dsa: return success if there was nothing to do
Clang static analysis reports this representative issue
dsa.c:486:2: warning: Undefined or garbage value
  returned to caller
  return err;
  ^~~~~~~~~~

err is only set in the loop.  If the loop is empty,
garbage will be returned.  So initialize err to 0
to handle this noop case.

Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-07 12:25:59 +00:00
Vladimir Oltean
afb3cc1a39 net: dsa: unlock the rtnl_mutex when dsa_master_setup() fails
After the blamed commit, dsa_tree_setup_master() may exit without
calling rtnl_unlock(), fix that.

Fixes: c146f9bc19 ("net: dsa: hold rtnl_mutex when calling dsa_master_{setup,teardown}")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-06 10:55:54 +00:00
Luiz Angelo Daros de Luca
cd87fecded net: dsa: tag_rtl8_4: add rtl8_4t trailing variant
Realtek switches supports the same tag both before ethertype or between
payload and the CRC.

Signed-off-by: Luiz Angelo Daros de Luca <luizluca@gmail.com>
Reviewed-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-05 11:04:25 +00:00
Jakub Kicinski
80901bff81 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
net/batman-adv/hard-interface.c
  commit 690bb6fb64 ("batman-adv: Request iflink once in batadv-on-batadv check")
  commit 6ee3c393ee ("batman-adv: Demote batadv-on-batadv skip error message")
https://lore.kernel.org/all/20220302163049.101957-1-sw@simonwunderlich.de/

net/smc/af_smc.c
  commit 4d08b7b57e ("net/smc: Fix cleanup when register ULP fails")
  commit 462791bbfa ("net/smc: add sysctl interface for SMC")
https://lore.kernel.org/all/20220302112209.355def40@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03 11:55:12 -08:00
Vladimir Oltean
e1bec7fa1c net: dsa: make dsa_tree_change_tag_proto actually unwind the tag proto change
The blamed commit said one thing but did another. It explains that we
should restore the "return err" to the original "goto out_unwind_tagger",
but instead it replaced it with "goto out_unlock".

When DSA_NOTIFIER_TAG_PROTO fails after the first switch of a
multi-switch tree, the switches would end up not using the same tagging
protocol.

Fixes: 0b0e2ff103 ("net: dsa: restore error path of dsa_tree_change_tag_proto")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220303154249.1854436-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03 08:39:12 -08:00
Vladimir Oltean
f9cef64fa2 net: dsa: felix: migrate host FDB and MDB entries when changing tag proto
The "ocelot" and "ocelot-8021q" tagging protocols make use of different
hardware resources, and host FDB entries have different destination
ports in the switch analyzer module, practically speaking.

So when the user requests a tagging protocol change, the driver must
migrate all host FDB and MDB entries from the NPI port (in fact CPU port
module) towards the same physical port, but this time used as a regular
port.

It is pointless for the felix driver to keep a copy of the host
addresses, when we can create and export DSA helpers for walking through
the addresses that it already needs to keep on the CPU port, for
refcounting purposes.

felix_classify_db() is moved up to avoid a forward declaration.

We pass "bool change" because dp->fdbs and dp->mdbs are uninitialized
lists when felix_setup() first calls felix_set_tag_protocol(), so we
need to avoid calling dsa_port_walk_fdbs() during probe time.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03 14:15:31 +00:00
Vladimir Oltean
7569459a52 net: dsa: manage flooding on the CPU ports
DSA can treat IFF_PROMISC and IFF_ALLMULTI on standalone user ports as
signifying whether packets with an unknown MAC DA will be received or
not. Since known MAC DAs are handled by FDB/MDB entries, this means that
promiscuity is analogous to including/excluding the CPU port from the
flood domain of those packets.

There are two ways to signal CPU flooding to drivers.

The first (chosen here) is to synthesize a call to
ds->ops->port_bridge_flags() for the CPU port, with a mask of
BR_FLOOD | BR_MCAST_FLOOD. This has the effect of turning on egress
flooding on the CPU port regardless of source.

The alternative would be to create a new ds->ops->port_host_flood()
which is called per user port. Some switches (sja1105) have a flood
domain that is managed per {ingress port, egress port} pair, so it would
make more sense for this kind of switch to not flood the CPU from port A
if just port B requires it. Nonetheless, the sja1105 has other quirks
that prevent it from making use of unicast filtering, and without a
concrete user making use of this feature, I chose not to implement it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03 14:15:31 +00:00
Vladimir Oltean
499aa9e1b3 net: dsa: install the primary unicast MAC address as standalone port host FDB
To be able to safely turn off CPU flooding for standalone ports, we need
to ensure that the dev_addr of each DSA slave interface is installed as
a standalone host FDB entry for compatible switches.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03 14:15:31 +00:00
Vladimir Oltean
5e8a1e03aa net: dsa: install secondary unicast and multicast addresses as host FDB/MDB
In preparation of disabling flooding towards the CPU in standalone ports
mode, identify the addresses requested by upper interfaces and use the
new API for DSA FDB isolation to request the hardware driver to offload
these as FDB or MDB objects. The objects belong to the user port's
database, and are installed pointing towards the CPU port.

Because dev_uc_add()/dev_mc_add() is VLAN-unaware, we offload to the
port standalone database addresses with VID 0 (also VLAN-unaware).
So this excludes switches with global VLAN filtering from supporting
unicast filtering, because there, it is possible for a port of a switch
to join a VLAN-aware bridge, and this changes the VLAN awareness of
standalone ports, requiring VLAN-aware standalone host FDB entries.
For the same reason, hellcreek, which requires VLAN awareness in
standalone mode, is also exempted from unicast filtering.

We create "standalone" variants of dsa_port_host_fdb_add() and
dsa_port_host_mdb_add() (and the _del coresponding functions).

We also create a separate work item type for handling deferred
standalone host FDB/MDB entries compared to the switchdev one.
This is done for the purpose of clarity - the procedure for offloading a
bridge FDB entry is different than offloading a standalone one, and
the switchdev event work handles only FDBs anyway, not MDBs.
Deferral is needed for standalone entries because ndo_set_rx_mode runs
in atomic context. We could probably optimize things a little by first
queuing up all entries that need to be offloaded, and scheduling the
work item just once, but the data structures that we can pass through
__dev_uc_sync() and __dev_mc_sync() are limiting (there is nothing like
a void *priv), so we'd have to keep the list of queued events somewhere
in struct dsa_switch, and possibly a lock for it. Too complicated for
now.

Adding the address to the master is handled by dev_uc_sync(), adding it
to the hardware is handled by __dev_uc_sync(). So this is the reason why
dsa_port_standalone_host_fdb_add() does not call dev_uc_add(). Not that
it had the rtnl_mutex anyway - ndo_set_rx_mode has it, but is atomic.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03 14:15:31 +00:00
Vladimir Oltean
68d6d71eaf net: dsa: rename the host FDB and MDB methods to contain the "bridge" namespace
We are preparing to add API in port.c that adds FDB and MDB entries that
correspond to the port's standalone database. Rename the existing
methods to make it clear that the FDB and MDB entries offloaded come
from the bridge database.

Since the function names lengthen in dsa_slave_switchdev_event_work(),
we place "addr" and "vid" in temporary variables, to shorten those.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03 14:15:31 +00:00
Vladimir Oltean
35aae5ab91 net: dsa: remove workarounds for changing master promisc/allmulti only while up
Lennert Buytenhek explains in commit df02c6ff2e ("dsa: fix master
interface allmulti/promisc handling"), dated Nov 2008, that changing the
promiscuity of interfaces that are down (here the master) is broken.

This fact regarding promisc/allmulti has changed since commit
b6c40d68ff ("net: only invoke dev->change_rx_flags when device is UP")
by Vlad Yasevich, dated Nov 2013.

Therefore, DSA now has unnecessary complexity to handle master state
transitions from down to up. In fact, syncing the unicast and multicast
addresses can happen completely asynchronously to the administrative
state changes.

This change reduces that complexity by effectively fully reverting
commit df02c6ff2e ("dsa: fix master interface allmulti/promisc
handling").

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03 14:15:31 +00:00
Vladimir Oltean
0b0e2ff103 net: dsa: restore error path of dsa_tree_change_tag_proto
When the DSA_NOTIFIER_TAG_PROTO returns an error, the user space process
which initiated the protocol change exits the kernel processing while
still holding the rtnl_mutex. So any other process attempting to lock
the rtnl_mutex would deadlock after such event.

The error handling of DSA_NOTIFIER_TAG_PROTO was inadvertently changed
by the blamed commit, introducing this regression. We must still call
rtnl_unlock(), and we must still call DSA_NOTIFIER_TAG_PROTO for the old
protocol. The latter is due to the limiting design of notifier chains
for cross-chip operations, which don't have a built-in error recovery
mechanism - we should look into using notifier_call_chain_robust for that.

Fixes: dc452a471d ("net: dsa: introduce tagger-owned storage for private and shared data")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220228141715.146485-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-01 18:26:21 -08:00
Vladimir Oltean
06b9cce426 net: dsa: pass extack to .port_bridge_join driver methods
As FDB isolation cannot be enforced between VLAN-aware bridges in lack
of hardware assistance like extra FID bits, it seems plausible that many
DSA switches cannot do it. Therefore, they need to reject configurations
with multiple VLAN-aware bridges from the two code paths that can
transition towards that state:

- joining a VLAN-aware bridge
- toggling VLAN awareness on an existing bridge

The .port_vlan_filtering method already propagates the netlink extack to
the driver, let's propagate it from .port_bridge_join too, to make sure
that the driver can use the same function for both.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-27 11:06:14 +00:00
Vladimir Oltean
c26933639b net: dsa: request drivers to perform FDB isolation
For DSA, to encourage drivers to perform FDB isolation simply means to
track which bridge does each FDB and MDB entry belong to. It then
becomes the driver responsibility to use something that makes the FDB
entry from one bridge not match the FDB lookup of ports from other
bridges.

The top-level functions where the bridge is determined are:
- dsa_port_fdb_{add,del}
- dsa_port_host_fdb_{add,del}
- dsa_port_mdb_{add,del}
- dsa_port_host_mdb_{add,del}

aka the pre-crosschip-notifier functions.

Changing the API to pass a reference to a bridge is not superfluous, and
looking at the passed bridge argument is not the same as having the
driver look at dsa_to_port(ds, port)->bridge from the ->port_fdb_add()
method.

DSA installs FDB and MDB entries on shared (CPU and DSA) ports as well,
and those do not have any dp->bridge information to retrieve, because
they are not in any bridge - they are merely the pipes that serve the
user ports that are in one or multiple bridges.

The struct dsa_bridge associated with each FDB/MDB entry is encapsulated
in a larger "struct dsa_db" database. Although only databases associated
to bridges are notified for now, this API will be the starting point for
implementing IFF_UNICAST_FLT in DSA. There, the idea is to install FDB
entries on the CPU port which belong to the corresponding user port's
port database. These are supposed to match only when the port is
standalone.

It is better to introduce the API in its expected final form than to
introduce it for bridges first, then to have to change drivers which may
have made one or more assumptions.

Drivers can use the provided bridge.num, but they can also use a
different numbering scheme that is more convenient.

DSA must perform refcounting on the CPU and DSA ports by also taking
into account the bridge number. So if two bridges request the same local
address, DSA must notify the driver twice, once for each bridge.

In fact, if the driver supports FDB isolation, DSA must perform
refcounting per bridge, but if the driver doesn't, DSA must refcount
host addresses across all bridges, otherwise it would be telling the
driver to delete an FDB entry for a bridge and the driver would delete
it for all bridges. So introduce a bool fdb_isolation in drivers which
would make all bridge databases passed to the cross-chip notifier have
the same number (0). This makes dsa_mac_addr_find() -> dsa_db_equal()
say that all bridge databases are the same database - which is
essentially the legacy behavior.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-27 11:06:14 +00:00
Vladimir Oltean
b6362bdf75 net: dsa: tag_8021q: rename dsa_8021q_bridge_tx_fwd_offload_vid
The dsa_8021q_bridge_tx_fwd_offload_vid is no longer used just for
bridge TX forwarding offload, it is the private VLAN reserved for
VLAN-unaware bridging in a way that is compatible with FDB isolation.

So just rename it dsa_tag_8021q_bridge_vid.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-27 11:06:14 +00:00
Vladimir Oltean
04b67e18ce net: dsa: tag_8021q: merge RX and TX VLANs
In the old Shared VLAN Learning mode of operation that tag_8021q
previously used for forwarding, we needed to have distinct concepts for
an RX and a TX VLAN.

An RX VLAN could be installed on all ports that were members of a given
bridge, so that autonomous forwarding could still work, while a TX VLAN
was dedicated for precise packet steering, so it just contained the CPU
port and one egress port.

Now that tag_8021q uses Independent VLAN Learning and imprecise RX/TX
all over, those lines have been blurred and we no longer have the need
to do precise TX towards a port that is in a bridge. As for standalone
ports, it is fine to use the same VLAN ID for both RX and TX.

This patch changes the tag_8021q format by shifting the VLAN range it
reserves, and halving it. Previously, our DIR bits were encoding the
VLAN direction (RX/TX) and were set to either 1 or 2. This meant that
tag_8021q reserved 2K VLANs, or 50% of the available range.

Change the DIR bits to a hardcoded value of 3 now, which makes tag_8021q
reserve only 1K VLANs, and a different range now (the last 1K). This is
done so that we leave the old format in place in case we need to return
to it.

In terms of code, the vid_is_dsa_8021q_rxvlan and vid_is_dsa_8021q_txvlan
functions go away. Any vid_is_dsa_8021q is both a TX and an RX VLAN, and
they are no longer distinct. For example, felix which did different
things for different VLAN types, now needs to handle the RX and the TX
logic for the same VLAN.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-27 11:06:14 +00:00
Vladimir Oltean
d7f9787a76 net: dsa: tag_8021q: add support for imprecise RX based on the VBID
The sja1105 switch can't populate the PORT field of the tag_8021q header
when sending a frame to the CPU with a non-zero VBID.

Similar to dsa_find_designated_bridge_port_by_vid() which performs
imprecise RX for VLAN-aware bridges, let's introduce a helper in
tag_8021q for performing imprecise RX based on the VLAN that it has
allocated for a VLAN-unaware bridge.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-27 11:06:13 +00:00
Vladimir Oltean
91495f21fc net: dsa: tag_8021q: replace the SVL bridging with VLAN-unaware IVL bridging
For VLAN-unaware bridging, tag_8021q uses something perhaps a bit too
tied with the sja1105 switch: each port uses the same pvid which is also
used for standalone operation (a unique one from which the source port
and device ID can be retrieved when packets from that port are forwarded
to the CPU). Since each port has a unique pvid when performing
autonomous forwarding, the switch must be configured for Shared VLAN
Learning (SVL) such that the VLAN ID itself is ignored when performing
FDB lookups. Without SVL, packets would always be flooded, since FDB
lookup in the source port's VLAN would never find any entry.

First of all, to make tag_8021q more palatable to switches which might
not support Shared VLAN Learning, let's just use a common VLAN for all
ports that are under the same bridge.

Secondly, using Shared VLAN Learning means that FDB isolation can never
be enforced. But if all ports under the same VLAN-unaware bridge share
the same VLAN ID, it can.

The disadvantage is that the CPU port can no longer perform precise
source port identification for these packets. But at least we have a
mechanism which has proven to be adequate for that situation: imprecise
RX (dsa_find_designated_bridge_port_by_vid), which is what we use for
termination on VLAN-aware bridges.

The VLAN ID that VLAN-unaware bridges will use with tag_8021q is the
same one as we were previously using for imprecise TX (bridge TX
forwarding offload). It is already allocated, it is just a matter of
using it.

Note that because now all ports under the same bridge share the same
VLAN, the complexity of performing a tag_8021q bridge join decreases
dramatically. We no longer have to install the RX VLAN of a newly
joining port into the port membership of the existing bridge ports.
The newly joining port just becomes a member of the VLAN corresponding
to that bridge, and the other ports are already members of it from when
they joined the bridge themselves. So forwarding works properly.

This means that we can unhook dsa_tag_8021q_bridge_{join,leave} from the
cross-chip notifier level dsa_switch_bridge_{join,leave}. We can put
these calls directly into the sja1105 driver.

With this new mode of operation, a port controlled by tag_8021q can have
two pvids whereas before it could only have one. The pvid for standalone
operation is different from the pvid used for VLAN-unaware bridging.
This is done, again, so that FDB isolation can be enforced.
Let tag_8021q manage this by deleting the standalone pvid when a port
joins a bridge, and restoring it when it leaves it.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-27 11:06:13 +00:00
Vladimir Oltean
e212fa7c54 net: dsa: support FDB events on offloaded LAG interfaces
This change introduces support for installing static FDB entries towards
a bridge port that is a LAG of multiple DSA switch ports, as well as
support for filtering towards the CPU local FDB entries emitted for LAG
interfaces that are bridge ports.

Conceptually, host addresses on LAG ports are identical to what we do
for plain bridge ports. Whereas FDB entries _towards_ a LAG can't simply
be replicated towards all member ports like we do for multicast, or VLAN.
Instead we need new driver API. Hardware usually considers a LAG to be a
"logical port", and sets the entire LAG as the forwarding destination.
The physical egress port selection within the LAG is made by hashing
policy, as usual.

To represent the logical port corresponding to the LAG, we pass by value
a copy of the dsa_lag structure to all switches in the tree that have at
least one port in that LAG.

To illustrate why a refcounted list of FDB entries is needed in struct
dsa_lag, it is enough to say that:
- a LAG may be a bridge port and may therefore receive FDB events even
  while it isn't yet offloaded by any DSA interface
- DSA interfaces may be removed from a LAG while that is a bridge port;
  we don't want FDB entries lingering around, but we don't want to
  remove entries that are still in use, either

For all the cases below to work, the idea is to always keep an FDB entry
on a LAG with a reference count equal to the DSA member ports. So:
- if a port joins a LAG, it requests the bridge to replay the FDB, and
  the FDB entries get created, or their refcount gets bumped by one
- if a port leaves a LAG, the FDB replay deletes or decrements refcount
  by one
- if an FDB is installed towards a LAG with ports already present, that
  entry is created (if it doesn't exist) and its refcount is bumped by
  the amount of ports already present in the LAG

echo "Adding FDB entry to bond with existing ports"
ip link del bond0
ip link add bond0 type bond mode 802.3ad
ip link set swp1 down && ip link set swp1 master bond0 && ip link set swp1 up
ip link set swp2 down && ip link set swp2 master bond0 && ip link set swp2 up
ip link del br0
ip link add br0 type bridge
ip link set bond0 master br0
bridge fdb add dev bond0 00:01:02:03:04:05 master static

ip link del br0
ip link del bond0

echo "Adding FDB entry to empty bond"
ip link del bond0
ip link add bond0 type bond mode 802.3ad
ip link del br0
ip link add br0 type bridge
ip link set bond0 master br0
bridge fdb add dev bond0 00:01:02:03:04:05 master static
ip link set swp1 down && ip link set swp1 master bond0 && ip link set swp1 up
ip link set swp2 down && ip link set swp2 master bond0 && ip link set swp2 up

ip link del br0
ip link del bond0

echo "Adding FDB entry to empty bond, then removing ports one by one"
ip link del bond0
ip link add bond0 type bond mode 802.3ad
ip link del br0
ip link add br0 type bridge
ip link set bond0 master br0
bridge fdb add dev bond0 00:01:02:03:04:05 master static
ip link set swp1 down && ip link set swp1 master bond0 && ip link set swp1 up
ip link set swp2 down && ip link set swp2 master bond0 && ip link set swp2 up

ip link set swp1 nomaster
ip link set swp2 nomaster
ip link del br0
ip link del bond0

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 21:31:44 -08:00
Vladimir Oltean
93c798230a net: dsa: call SWITCHDEV_FDB_OFFLOADED for the orig_dev
When switchdev_handle_fdb_event_to_device() replicates a FDB event
emitted for the bridge or for a LAG port and DSA offloads that, we
should notify back to switchdev that the FDB entry on the original
device is what was offloaded, not on the DSA slave devices that the
event is replicated on.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 21:31:43 -08:00
Vladimir Oltean
e35f12e993 net: dsa: remove "ds" and "port" from struct dsa_switchdev_event_work
By construction, the struct net_device *dev passed to
dsa_slave_switchdev_event_work() via struct dsa_switchdev_event_work
is always a DSA slave device.

Therefore, it is redundant to pass struct dsa_switch and int port
information in the deferred work structure. This can be retrieved at all
times from the provided struct net_device via dsa_slave_to_port().

For the same reason, we can drop the dsa_is_user_port() check in
dsa_fdb_offload_notify().

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 21:31:43 -08:00
Vladimir Oltean
ec638740fc net: switchdev: remove lag_mod_cb from switchdev_handle_fdb_event_to_device
When the switchdev_handle_fdb_event_to_device() event replication helper
was created, my original thought was that FDB events on LAG interfaces
should most likely be special-cased, not just replicated towards all
switchdev ports beneath that LAG. So this replication helper currently
does not recurse through switchdev lower interfaces of LAG bridge ports,
but rather calls the lag_mod_cb() if that was provided.

No switchdev driver uses this helper for FDB events on LAG interfaces
yet, so that was an assumption which was yet to be tested. It is
certainly usable for that purpose, as my RFC series shows:

https://patchwork.kernel.org/project/netdevbpf/cover/20220210125201.2859463-1-vladimir.oltean@nxp.com/

however this approach is slightly convoluted because:

- the switchdev driver gets a "dev" that isn't its own net device, but
  rather the LAG net device. It must call switchdev_lower_dev_find(dev)
  in order to get a handle of any of its own net devices (the ones that
  pass check_cb).

- in order for FDB entries on LAG ports to be correctly refcounted per
  the number of switchdev ports beneath that LAG, we haven't escaped the
  need to iterate through the LAG's lower interfaces. Except that is now
  the responsibility of the switchdev driver, because the replication
  helper just stopped half-way.

So, even though yes, FDB events on LAG bridge ports must be
special-cased, in the end it's simpler to let switchdev_handle_fdb_*
just iterate through the LAG port's switchdev lowers, and let the
switchdev driver figure out that those physical ports are under a LAG.

The switchdev_handle_fdb_event_to_device() helper takes a
"foreign_dev_check" callback so it can figure out whether @dev can
autonomously forward to @foreign_dev. DSA fills this method properly:
if the LAG is offloaded by another port in the same tree as @dev, then
it isn't foreign. If it is a software LAG, it is foreign - forwarding
happens in software.

Whether an interface is foreign or not decides whether the replication
helper will go through the LAG's switchdev lowers or not. Since the
lan966x doesn't properly fill this out, FDB events on software LAG
uppers will get called. By changing lan966x_foreign_dev_check(), we can
suppress them.

Whereas DSA will now start receiving FDB events for its offloaded LAG
uppers, so we need to return -EOPNOTSUPP, since we currently don't do
the right thing for them.

Cc: Horatiu Vultur <horatiu.vultur@microchip.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 21:31:43 -08:00
Vladimir Oltean
dedd6a009f net: dsa: create a dsa_lag structure
The main purpose of this change is to create a data structure for a LAG
as seen by DSA. This is similar to what we have for bridging - we pass a
copy of this structure by value to ->port_lag_join and ->port_lag_leave.
For now we keep the lag_dev, id and a reference count in it. Future
patches will add a list of FDB entries for the LAG (these also need to
be refcounted to work properly).

The LAG structure is created using dsa_port_lag_create() and destroyed
using dsa_port_lag_destroy(), just like we have for bridging.

Because now, the dsa_lag itself is refcounted, we can simplify
dsa_lag_map() and dsa_lag_unmap(). These functions need to keep a LAG in
the dst->lags array only as long as at least one port uses it. The
refcounting logic inside those functions can be removed now - they are
called only when we should perform the operation.

dsa_lag_dev() is renamed to dsa_lag_by_id() and now returns the dsa_lag
structure instead of the lag_dev net_device.

dsa_lag_foreach_port() now takes the dsa_lag structure as argument.

dst->lags holds an array of dsa_lag structures.

dsa_lag_map() now also saves the dsa_lag->id value, so that linear
walking of dst->lags in drivers using dsa_lag_id() is no longer
necessary. They can just look at lag.id.

dsa_port_lag_id_get() is a helper, similar to dsa_port_bridge_num_get(),
which can be used by drivers to get the LAG ID assigned by DSA to a
given port.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 21:31:43 -08:00
Vladimir Oltean
3d4a0a2a46 net: dsa: make LAG IDs one-based
The DSA LAG API will be changed to become more similar with the bridge
data structures, where struct dsa_bridge holds an unsigned int num,
which is generated by DSA and is one-based. We have a similar thing
going with the DSA LAG, except that isn't stored anywhere, it is
calculated dynamically by dsa_lag_id() by iterating through dst->lags.

The idea of encoding an invalid (or not requested) LAG ID as zero for
the purpose of simplifying checks in drivers means that the LAG IDs
passed by DSA to drivers need to be one-based too. So back-and-forth
conversion is needed when indexing the dst->lags array, as well as in
drivers which assume a zero-based index.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 21:31:42 -08:00
Vladimir Oltean
46a76724e4 net: dsa: rename references to "lag" as "lag_dev"
In preparation of converting struct net_device *dp->lag_dev into a
struct dsa_lag *dp->lag, we need to rename, for consistency purposes,
all occurrences of the "lag" variable in the DSA core to "lag_dev".

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 21:31:42 -08:00
Jakub Kicinski
aaa25a2fa7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
tools/testing/selftests/net/mptcp/mptcp_join.sh
  34aa6e3bcc ("selftests: mptcp: add ip mptcp wrappers")

  857898eb4b ("selftests: mptcp: add missing join check")
  6ef84b1517 ("selftests: mptcp: more robust signal race test")
https://lore.kernel.org/all/20220221131842.468893-1-broonie@kernel.org/

drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.h
drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ct.c
  fb7e76ea3f ("net/mlx5e: TC, Skip redundant ct clear actions")
  c63741b426 ("net/mlx5e: Fix MPLSoUDP encap to use MPLS action information")

  09bf979232 ("net/mlx5e: TC, Move pedit_headers_action to parse_attr")
  84ba8062e3 ("net/mlx5e: Test CT and SAMPLE on flow attr")
  efe6f961cd ("net/mlx5e: CT, Don't set flow flag CT for ct clear flow")
  3b49a7edec ("net/mlx5e: TC, Reject rules with multiple CT actions")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-24 17:54:25 -08:00
Hans Schultz
b9e8b58fd2 net: dsa: Include BR_PORT_LOCKED in the list of synced brport flags
Ensures that the DSA switch driver gets notified of changes to the
BR_PORT_LOCKED flag as well, for the case when a DSA port joins or
leaves a LAG that is a bridge port.

Signed-off-by: Hans Schultz <schultz.hans+netdev@gmail.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-23 12:52:34 +00:00
Alvin Šipraga
342b641919 net: dsa: fix panic when removing unoffloaded port from bridge
If a bridged port is not offloaded to the hardware - either because the
underlying driver does not implement the port_bridge_{join,leave} ops,
or because the operation failed - then its dp->bridge pointer will be
NULL when dsa_port_bridge_leave() is called. Avoid dereferncing NULL.

This fixes the following splat when removing a port from a bridge:

 Unable to handle kernel access to user memory outside uaccess routines at virtual address 0000000000000000
 Internal error: Oops: 96000004 [#1] PREEMPT_RT SMP
 CPU: 3 PID: 1119 Comm: brctl Tainted: G           O      5.17.0-rc4-rt4 #1
 Call trace:
  dsa_port_bridge_leave+0x8c/0x1e4
  dsa_slave_changeupper+0x40/0x170
  dsa_slave_netdevice_event+0x494/0x4d4
  notifier_call_chain+0x80/0xe0
  raw_notifier_call_chain+0x1c/0x24
  call_netdevice_notifiers_info+0x5c/0xac
  __netdev_upper_dev_unlink+0xa4/0x200
  netdev_upper_dev_unlink+0x38/0x60
  del_nbp+0x1b0/0x300
  br_del_if+0x38/0x114
  add_del_if+0x60/0xa0
  br_ioctl_stub+0x128/0x2dc
  br_ioctl_call+0x68/0xb0
  dev_ifsioc+0x390/0x554
  dev_ioctl+0x128/0x400
  sock_do_ioctl+0xb4/0xf4
  sock_ioctl+0x12c/0x4e0
  __arm64_sys_ioctl+0xa8/0xf0
  invoke_syscall+0x4c/0x110
  el0_svc_common.constprop.0+0x48/0xf0
  do_el0_svc+0x28/0x84
  el0_svc+0x1c/0x50
  el0t_64_sync_handler+0xa8/0xb0
  el0t_64_sync+0x17c/0x180
 Code: f9402f00 f0002261 f9401302 913cc021 (a9401404)
 ---[ end trace 0000000000000000 ]---

Fixes: d3eed0e57d ("net: dsa: keep the bridge_dev and bridge_num as part of the same structure")
Signed-off-by: Alvin Šipraga <alsi@bang-olufsen.dk>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220221203539.310690-1-alvin@pqrs.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-22 17:03:02 -08:00
Russell King (Oracle)
1054457006 net: phy: phylink: fix DSA mac_select_pcs() introduction
Vladimir Oltean reports that probing on DSA drivers that aren't yet
populating supported_interfaces now fails. Fix this by allowing
phylink to detect whether DSA actually provides an underlying
mac_select_pcs() implementation.

Reported-by: Vladimir Oltean <olteanv@gmail.com>
Fixes: bde018222c ("net: dsa: add support for phylink mac_select_pcs()")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/E1nMCD6-00A0wC-FG@rmk-PC.armlinux.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-22 16:57:08 -08:00
Vladimir Oltean
8940e6b669 net: dsa: avoid call to __dev_set_promiscuity() while rtnl_mutex isn't held
If the DSA master doesn't support IFF_UNICAST_FLT, then the following
call path is possible:

dsa_slave_switchdev_event_work
-> dsa_port_host_fdb_add
   -> dev_uc_add
      -> __dev_set_rx_mode
         -> __dev_set_promiscuity

Since the blamed commit, dsa_slave_switchdev_event_work() no longer
holds rtnl_lock(), which triggers the ASSERT_RTNL() from
__dev_set_promiscuity().

Taking rtnl_lock() around dev_uc_add() is impossible, because all the
code paths that call dsa_flush_workqueue() do so from contexts where the
rtnl_mutex is already held - so this would lead to an instant deadlock.

dev_uc_add() in itself doesn't require the rtnl_mutex for protection.
There is this comment in __dev_set_rx_mode() which assumes so:

		/* Unicast addresses changes may only happen under the rtnl,
		 * therefore calling __dev_set_promiscuity here is safe.
		 */

but it is from commit 4417da668c ("[NET]: dev: secondary unicast
address support") dated June 2007, and in the meantime, commit
f1f28aa351 ("netdev: Add addr_list_lock to struct net_device."), dated
July 2008, has added &dev->addr_list_lock to protect this instead of the
global rtnl_mutex.

Nonetheless, __dev_set_promiscuity() does assume rtnl_mutex protection,
but it is the uncommon path of what we typically expect dev_uc_add()
to do. So since only the uncommon path requires rtnl_lock(), just check
ahead of time whether dev_uc_add() would result into a call to
__dev_set_promiscuity(), and handle that condition separately.

DSA already configures the master interface to be promiscuous if the
tagger requires this. We can extend this to also cover the case where
the master doesn't handle dev_uc_add() (doesn't support IFF_UNICAST_FLT),
and on the premise that we'd end up making it promiscuous during
operation anyway, either if a DSA slave has a non-inherited MAC address,
or if the bridge notifies local FDB entries for its own MAC address, the
address of a station learned on a foreign port, etc.

Fixes: 0faf890fc5 ("net: dsa: drop rtnl_lock from dsa_slave_switchdev_event_work")
Reported-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19 21:18:33 +00:00
Russell King (Oracle)
ccfbf44d4c net: dsa: remove pcs_poll
With drivers converted over to using phylink PCS, there is no need for
the struct dsa_switch member "pcs_poll" to exist anymore - there is a
flag in the struct phylink_pcs which indicates whether this PCS needs
to be polled which supersedes this.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-19 16:41:50 +00:00
Russell King (Oracle)
bde018222c net: dsa: add support for phylink mac_select_pcs()
Add DSA support for the phylink mac_select_pcs() method so DSA drivers
can return provide phylink with the appropriate PCS for the PHY
interface mode.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-18 11:28:32 +00:00
Vladimir Oltean
d2b1d186ce net: dsa: delete unused exported symbols for ethtool PHY stats
Introduced in commit cf96357303 ("net: dsa: Allow providing PHY
statistics from CPU port"), it appears these were never used.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220216193726.2926320-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17 20:07:09 -08:00
Jakub Kicinski
6b5567b1b2 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17 11:44:20 -08:00
Mans Rullgard
017b355bbd net: dsa: lan9303: handle hwaccel VLAN tags
Check for a hwaccel VLAN tag on rx and use it if present.  Otherwise,
use __skb_vlan_pop() like the other tag parsers do.  This fixes the case
where the VLAN tag has already been consumed by the master.

Fixes: a1292595e0 ("net: dsa: add new DSA switch driver for the SMSC-LAN9303")
Signed-off-by: Mans Rullgard <mans@mansr.com>
Reviewed-by: Vladimir Oltean <olteanv@gmail.com>
Link: https://lore.kernel.org/r/20220216124634.23123-1-mans@mansr.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-17 09:32:06 -08:00
Vladimir Oltean
29940ce32a net: dsa: tag_ocelot_8021q: calculate TX checksum in software for deferred packets
DSA inherits NETIF_F_CSUM_MASK from master->vlan_features, and the
expectation is that TX checksumming is offloaded and not done in
software.

Normally the DSA master takes care of this, but packets handled by
ocelot_defer_xmit() are a very special exception, because they are
actually injected into the switch through register-based MMIO. So the
DSA master is not involved at all for these packets => no one calculates
the checksum.

This allows PTP over UDP to work using the ocelot-8021q tagging
protocol.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-17 14:06:51 +00:00
Vladimir Oltean
c862033595 net: dsa: tag_8021q: only call skb_push/skb_pull around __skb_vlan_pop
__skb_vlan_pop() needs skb->data to point at the mac_header, while
skb_vlan_tag_present() and skb_vlan_tag_get() don't, because they don't
look at skb->data at all.

So we can avoid uselessly moving around skb->data for the case where the
VLAN tag was offloaded by the DSA master.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220215204722.2134816-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-16 20:35:35 -08:00
Vladimir Oltean
164f861bd4 net: dsa: offload bridge port VLANs on foreign interfaces
DSA now explicitly handles VLANs installed with the 'self' flag on the
bridge as host VLANs, instead of just replicating every bridge port VLAN
also on the CPU port and never deleting it, which is what it did before.

However, this leaves a corner case uncovered, as explained by
Tobias Waldekranz:
https://patchwork.kernel.org/project/netdevbpf/patch/20220209213044.2353153-6-vladimir.oltean@nxp.com/#24735260

Forwarding towards a bridge port VLAN installed on a bridge port foreign
to DSA (separate NIC, Wi-Fi AP) used to work by virtue of the fact that
DSA itself needed to have at least one port in that VLAN (therefore, it
also had the CPU port in said VLAN). However, now that the CPU port may
not be member of all VLANs that user ports are members of, we need to
ensure this isn't the case if software forwarding to a foreign interface
is required.

The solution is to treat bridge port VLANs on standalone interfaces in
the exact same way as host VLANs. From DSA's perspective, there is no
difference between local termination and software forwarding; packets in
that VLAN must reach the CPU in both cases.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-16 11:21:05 +00:00
Vladimir Oltean
134ef2388e net: dsa: add explicit support for host bridge VLANs
Currently, DSA programs VLANs on shared (DSA and CPU) ports each time it
does so on user ports. This is good for basic functionality but has
several limitations:

- the VLAN group which must reach the CPU may be radically different
  from the VLAN group that must be autonomously forwarded by the switch.
  In other words, the admin may want to isolate noisy stations and avoid
  traffic from them going to the control processor of the switch, where
  it would just waste useless cycles. The bridge already supports
  independent control of VLAN groups on bridge ports and on the bridge
  itself, and when VLAN-aware, it will drop packets in software anyway
  if their VID isn't added as a 'self' entry towards the bridge device.

- Replaying host FDB entries may depend, for some drivers like mv88e6xxx,
  on replaying the host VLANs as well. The 2 VLAN groups are
  approximately the same in most regular cases, but there are corner
  cases when timing matters, and DSA's approximation of replicating
  VLANs on shared ports simply does not work.

- If a user makes the bridge (implicitly the CPU port) join a VLAN by
  accident, there is no way for the CPU port to isolate itself from that
  noisy VLAN except by rebooting the system. This is because for each
  VLAN added on a user port, DSA will add it on shared ports too, but
  for each VLAN deletion on a user port, it will remain installed on
  shared ports, since DSA has no good indication of whether the VLAN is
  still in use or not.

Now that the bridge driver emits well-balanced SWITCHDEV_OBJ_ID_PORT_VLAN
addition and removal events, DSA has a simple and straightforward task
of separating the bridge port VLANs (these have an orig_dev which is a
DSA slave interface, or a LAG interface) from the host VLANs (these have
an orig_dev which is a bridge interface), and to keep a simple reference
count of each VID on each shared port.

Forwarding VLANs must be installed on the bridge ports and on all DSA
ports interconnecting them. We don't have a good view of the exact
topology, so we simply install forwarding VLANs on all DSA ports, which
is what has been done until now.

Host VLANs must be installed primarily on the dedicated CPU port of each
bridge port. More subtly, they must also be installed on upstream-facing
and downstream-facing DSA ports that are connecting the bridge ports and
the CPU. This ensures that the mv88e6xxx's problem (VID of host FDB
entry may be absent from VTU) is still addressed even if that switch is
in a cross-chip setup, and it has no local CPU port.

Therefore:
- user ports contain only bridge port (forwarding) VLANs, and no
  refcounting is necessary
- DSA ports contain both forwarding and host VLANs. Refcounting is
  necessary among these 2 types.
- CPU ports contain only host VLANs. Refcounting is also necessary.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-16 11:21:05 +00:00