linux/net/bridge
Vladimir Oltean b28d580e29 net: bridge: switchdev: replay all VLAN groups
The major user of replayed switchdev objects is DSA, and so far it
hasn't needed information about anything other than bridge port VLANs,
so this is all that br_switchdev_vlan_replay() knows to handle.

DSA has managed to get by through replicating every VLAN addition on a
user port such that the same VLAN is also added on all DSA and CPU
ports, but there is a corner case where this does not work.

The mv88e6xxx DSA driver currently prints this error message as soon as
the first port of a switch joins a bridge:

mv88e6085 0x0000000008b96000:00: port 0 failed to add a6:ef:77:c8:5f:3d vid 1 to fdb: -95

where a6:ef:77:c8:5f:3d vid 1 is a local FDB entry corresponding to the
bridge MAC address in the default_pvid.

The -EOPNOTSUPP is returned by mv88e6xxx_port_db_load_purge() because it
tries to map VID 1 to a FID (the ATU is indexed by FID not VID), but
fails to do so. This is because ->port_fdb_add() is called before
->port_vlan_add() for VID 1.

The abridged timeline of the calls is:

br_add_if
-> netdev_master_upper_dev_link
   -> dsa_port_bridge_join
      -> switchdev_bridge_port_offload
         -> br_switchdev_vlan_replay (*)
         -> br_switchdev_fdb_replay
            -> mv88e6xxx_port_fdb_add
-> nbp_vlan_init
   -> nbp_vlan_add
      -> mv88e6xxx_port_vlan_add

and the issue is that at the time of (*), the bridge port isn't in VID 1
(nbp_vlan_init hasn't been called), therefore br_switchdev_vlan_replay()
won't have anything to replay, therefore VID 1 won't be in the VTU by
the time mv88e6xxx_port_fdb_add() is called.

This happens only when the first port of a switch joins. For further
ports, the initial mv88e6xxx_port_vlan_add() is sufficient for VID 1 to
be loaded in the VTU (which is switch-wide, not per port).

The problem is somewhat unique to mv88e6xxx by chance, because most
other drivers offload an FDB entry by VID, so FDBs and VLANs can be
added asynchronously with respect to each other, but addressing the
issue at the bridge layer makes sense, since what mv88e6xxx requires
isn't absurd.

To fix this problem, we need to recognize that it isn't the VLAN group
of the port that we're interested in, but the VLAN group of the bridge
itself (so it isn't a timing issue, but rather insufficient information
being passed from switchdev to drivers).

As mentioned, currently nbp_switchdev_sync_objs() only calls
br_switchdev_vlan_replay() for VLANs corresponding to the port, but the
VLANs corresponding to the bridge itself, for local termination, also
need to be replayed. In this case, VID 1 is not (yet) present in the
port's VLAN group but is present in the bridge's VLAN group.

So to fix this bug, DSA is now obligated to explicitly handle VLANs
pointing towards the bridge in order to "close this race" (which isn't
really a race). As Tobias Waldekranz notices, this also implies that it
must explicitly handle port VLANs on foreign interfaces, something that
worked implicitly before:
https://patchwork.kernel.org/project/netdevbpf/patch/20220209213044.2353153-6-vladimir.oltean@nxp.com/#24735260

So in the end, br_switchdev_vlan_replay() must replay all VLANs from all
VLAN groups: all the ports, and the bridge itself.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-02-16 11:21:04 +00:00
..
netfilter netfilter: nft_reject_bridge: Fix for missing reply from prerouting 2022-01-27 00:06:52 +01:00
br_arp_nd_proxy.c net: bridge: when suppression is enabled exclude RARP packets 2021-03-22 13:30:24 -07:00
br_cfm_netlink.c bridge: cfm: Netlink Notifications. 2020-10-29 18:39:44 -07:00
br_cfm.c bridge: cfm: remove redundant return 2021-06-22 10:35:15 -07:00
br_device.c net: bridge: move bridge ioctls out of .ndo_do_ioctl 2021-07-27 20:11:45 +01:00
br_fdb.c net: bridge: move br_fdb_replay inside br_switchdev.c 2021-10-27 14:54:02 +01:00
br_forward.c net: bridge: fix build when setting skb->offload_fwd_mark with CONFIG_NET_SWITCHDEV=n 2021-07-24 21:48:26 +01:00
br_if.c net: bridge: fix net device refcount tracking issue in error path 2022-01-12 14:44:18 +00:00
br_input.c net: bridge: change return type of br_handle_ingress_vlan_tunnel 2021-08-24 16:51:09 -07:00
br_ioctl.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next 2021-12-31 14:35:40 +00:00
br_mdb.c net: bridge: mdb: move all switchdev logic to br_switchdev.c 2021-10-28 20:05:57 -07:00
br_mrp_netlink.c bridge: mrp: Use hlist_head instead of list_head for mrp 2020-11-09 16:42:12 -08:00
br_mrp_switchdev.c bridge: mrp: Extend br_mrp_switchdev to detect better the errors 2021-02-16 14:47:46 -08:00
br_mrp.c net: bridge: mrp: Update the Test frames for MRA 2021-06-28 15:46:10 -07:00
br_multicast_eht.c net: bridge: multicast: use multicast contexts instead of bridge or port 2021-07-20 05:41:19 -07:00
br_multicast.c net: bridge: mcast: add and enforce startup query interval minimum 2021-12-29 12:59:38 -08:00
br_netfilter_hooks.c netfilter: bridge: add support for pppoe filtering 2021-11-30 23:37:55 +01:00
br_netfilter_ipv6.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2019-06-25 01:32:59 +02:00
br_netlink_tunnel.c net: bridge: notify on vlan tunnel changes done via the old api 2020-07-12 15:18:24 -07:00
br_netlink.c net: bridge: mcast: add and enforce startup query interval minimum 2021-12-29 12:59:38 -08:00
br_nf_core.c net: add bool confirm_neigh parameter for dst_ops.update_pmtu 2019-12-24 22:28:54 -08:00
br_private_cfm.h bridge: cfm: Kernel space implementation of CFM. CCM frame RX added. 2020-10-29 18:39:43 -07:00
br_private_mcast_eht.h net: bridge: multicast: use multicast contexts instead of bridge or port 2021-07-20 05:41:19 -07:00
br_private_mrp.h net: bridge: mrp: Update the Test frames for MRA 2021-06-28 15:46:10 -07:00
br_private_stp.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
br_private_tunnel.h net: bridge: change return type of br_handle_ingress_vlan_tunnel 2021-08-24 16:51:09 -07:00
br_private.h net: bridge: switchdev: differentiate new VLANs from changed ones 2022-02-16 11:21:04 +00:00
br_stp_bpdu.c net: bridge: add STP xstats 2019-12-14 20:02:36 -08:00
br_stp_if.c net: use eth_hw_addr_set() 2021-10-02 14:18:25 +01:00
br_stp_timer.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
br_stp.c net: bridge: constify variables in the replay helpers 2021-06-28 14:09:03 -07:00
br_switchdev.c net: bridge: switchdev: replay all VLAN groups 2022-02-16 11:21:04 +00:00
br_sysfs_br.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2021-12-30 12:12:12 -08:00
br_sysfs_if.c net: bridge: mcast: br_multicast_set_port_router takes multicast context as argument 2021-08-20 15:00:35 +01:00
br_vlan_options.c net: bridge: mcast: add and enforce startup query interval minimum 2021-12-29 12:59:38 -08:00
br_vlan_tunnel.c net: bridge: change return type of br_handle_ingress_vlan_tunnel 2021-08-24 16:51:09 -07:00
br_vlan.c net: bridge: switchdev: differentiate new VLANs from changed ones 2022-02-16 11:21:04 +00:00
br.c net: make use of helper netif_is_bridge_master() 2021-10-16 15:02:56 +01:00
Kconfig bridge: cfm: Add BRIDGE_CFM to Kconfig. 2020-10-29 18:39:43 -07:00
Makefile net: bridge: multicast: add EHT host handling functions 2021-01-22 19:39:56 -08:00