linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2025-01-10 15:54:39 +08:00

Vladimir Oltean 67c3ca2c5c net: mscc: ocelot: use ocelot_xmit_get_vlan_info() also for FDMA and register injection

Problem description
-------------------

On an NXP LS1028A (felix DSA driver) with the following configuration:

- ocelot-8021q tagging protocol
- VLAN-aware bridge (with STP) spanning at least swp0 and swp1
- 8021q VLAN upper interfaces on swp0 and swp1: swp0.700, swp1.700
- ptp4l on swp0.700 and swp1.700

we see that the ptp4l instances do not see each other's traffic, and
they all go to the grand master state due to the
ANNOUNCE_RECEIPT_TIMEOUT_EXPIRES condition.

Jumping to the conclusion for the impatient
-------------------------------------------

There is a zero-day bug in the ocelot switchdev driver in the way it
handles VLAN-tagged packet injection. The correct logic already exists
in the source code, in function ocelot_xmit_get_vlan_info() added by
commit `5ca721c54d` ("net: dsa: tag_ocelot: set the classified VLAN
during xmit"). But it is used only for normal NPI-based injection with
the DSA "ocelot" tagging protocol. The other injection code paths
(register-based and FDMA-based) roll their own wrong logic. This affects
and was noticed on the DSA "ocelot-8021q" protocol because it uses
register-based injection.

By moving ocelot_xmit_get_vlan_info() to a place that's common for both
the DSA tagger and the ocelot switch library, it can also be called from
ocelot_port_inject_frame() in ocelot.c.

We need to touch the lines with ocelot_ifh_port_set()'s prototype
anyway, so let's rename it to something clearer regarding what it does,
and add a kernel-doc. ocelot_ifh_set_basic() should do.

Investigation notes
-------------------

Debugging reveals that PTP event (aka those carrying timestamps, like
Sync) frames injected into swp0.700 (but also swp1.700) hit the wire
with two VLAN tags:

00000000: 01 1b 19 00 00 00 00 01 02 03 04 05 81 00 02 bc
                                              ~~~~~~~~~~~
00000010: 81 00 02 bc 88 f7 00 12 00 2c 00 00 02 00 00 00
          ~~~~~~~~~~~
00000020: 00 00 00 00 00 00 00 00 00 00 00 01 02 ff fe 03
00000030: 04 05 00 01 00 04 00 00 00 00 00 00 00 00 00 00
00000040: 00 00

The second (unexpected) VLAN tag makes felix_check_xtr_pkt() ->
ptp_classify_raw() fail to see these as PTP packets at the link
partner's receiving end, and return PTP_CLASS_NONE (because the BPF
classifier is not written to expect 2 VLAN tags).

The reason why packets have 2 VLAN tags is because the transmission
code treats VLAN incorrectly.

Neither ocelot switchdev, nor felix DSA, declare the
NETIF_F_HW_VLAN_CTAG_TX feature. Therefore, at xmit time, all VLANs
should be in the skb head, and none should be in the hwaccel area.
This is done by:

static struct sk_buff *validate_xmit_vlan(struct sk_buff *skb,
					  netdev_features_t features)
{
	if (skb_vlan_tag_present(skb) &&
	    !vlan_hw_offload_capable(features, skb->vlan_proto))
		skb = __vlan_hwaccel_push_inside(skb);
	return skb;
}

But ocelot_port_inject_frame() handles things incorrectly:

	ocelot_ifh_port_set(ifh, port, rew_op, skb_vlan_tag_get(skb));

void ocelot_ifh_port_set(struct sk_buff *skb, void *ifh, int port, u32 rew_op)
{
	(...)
	if (vlan_tag)
		ocelot_ifh_set_vlan_tci(ifh, vlan_tag);
	(...)
}

The way __vlan_hwaccel_push_inside() pushes the tag inside the skb head
is by calling:

static inline void __vlan_hwaccel_clear_tag(struct sk_buff *skb)
{
	skb->vlan_present = 0;
}

which does _not_ zero out skb->vlan_tci as seen by skb_vlan_tag_get().
This means that ocelot, when it calls skb_vlan_tag_get(), sees
(and uses) a residual skb->vlan_tci, while the same VLAN tag is
_already_ in the skb head.

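The residual-TCI effect described above can be reproduced in a small user-space model. This is only a sketch: struct toy_skb and every toy_* helper are hypothetical stand-ins invented for illustration, not kernel code, and the model assumes that a nonzero TCI in the injection header results in one extra tag on the wire, as in the setup described here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the VLAN-related state of an skb. */
struct toy_skb {
	bool vlan_present;   /* is the hwaccel tag valid? */
	uint16_t vlan_tci;   /* hwaccel TCI; may be stale when !vlan_present */
	int tags_in_head;    /* 802.1Q tags already present in the packet data */
};

/* Models the push-inside step: the tag moves to the head, only the flag is cleared. */
static void toy_push_inside(struct toy_skb *skb)
{
	assert(skb);
	if (!skb->vlan_present)
		return;
	skb->tags_in_head++;
	skb->vlan_present = false;   /* vlan_tci is deliberately left untouched */
}

/* Models the problematic read: returns vlan_tci without checking the flag. */
static uint16_t toy_tag_get_unchecked(const struct toy_skb *skb)
{
	assert(skb);
	return skb->vlan_tci;
}

/* Assumption: a nonzero injection-header TCI adds one more tag on the wire. */
static int toy_tags_on_wire(const struct toy_skb *skb, uint16_t ifh_tci)
{
	assert(skb);
	return skb->tags_in_head + (ifh_tci ? 1 : 0);
}
```

Starting from a toy_skb with vlan_present set and vlan_tci = 700 (0x02bc, the VID seen in the dump), calling toy_push_inside() and then feeding toy_tag_get_unchecked() into toy_tags_on_wire() yields two tags, which matches the double "81 00 02 bc" sequence.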
The trivial fix for double VLAN headers is to replace the content of
ocelot_ifh_port_set() with:

	if (skb_vlan_tag_present(skb))
		ocelot_ifh_set_vlan_tci(ifh, skb_vlan_tag_get(skb));

but this would not be correct either, because, as mentioned,
vlan_hw_offload_capable() is false for us, so we'd be inserting dead
code and we'd always transmit packets with VID=0 in the injection frame
header.

I can't actually test the ocelot switchdev driver and rely exclusively
on code inspection, but I don't think traffic from 8021q uppers has ever
been injected properly, and not double-tagged. Thus I'm blaming the
introduction of VLAN fields in the injection header - early driver code.

As hinted at in the early conclusion, what we _want_ to happen for
VLAN transmission was already described once in commit `5ca721c54d`
("net: dsa: tag_ocelot: set the classified VLAN during xmit").

ocelot_xmit_get_vlan_info() intends to ensure that if the port through
which we're transmitting is under a VLAN-aware bridge, the outer VLAN
tag from the skb head is stripped from there and inserted into the
injection frame header (so that the packet is processed in hardware
through that actual VLAN). And in all other cases, the packet is sent
with VID=0 in the injection frame header, since the port is VLAN-unaware
and has logic to strip this VID on egress (making it invisible to the
wire).

Fixes: `08d02364b1` ("net: mscc: fix the injection header")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

2024-08-16 09:59:32 +01:00
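The intent described for ocelot_xmit_get_vlan_info() can be illustrated on a raw Ethernet frame in user space. This is a rough sketch under stated assumptions, not the kernel function: toy_xmit_get_vlan_tci() and the TOY_* constants are hypothetical, only the 802.1Q TPID (0x8100) is recognized, and the case of an untagged frame on a VLAN-aware port is not modelled (the sketch simply returns 0 there).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_ETH_ALEN    6       /* bytes in a MAC address */
#define TOY_VLAN_HLEN   4       /* TPID + TCI */
#define TOY_ETH_P_8021Q 0x8100  /* 802.1Q TPID */

/*
 * Returns the TCI to place in the injection header. When vlan_aware is
 * true and the frame carries an outer 802.1Q tag, that tag is removed
 * from the frame and *len shrinks by 4. Otherwise the frame is left
 * unchanged and 0 is returned.
 */
static uint16_t toy_xmit_get_vlan_tci(uint8_t *frame, size_t *len,
				      bool vlan_aware)
{
	const size_t tag_off = 2 * TOY_ETH_ALEN;  /* after dst + src MAC */
	uint16_t tpid, tci;

	assert(frame && len);

	if (!vlan_aware || *len < tag_off + TOY_VLAN_HLEN)
		return 0;

	tpid = (uint16_t)(frame[tag_off] << 8 | frame[tag_off + 1]);
	if (tpid != TOY_ETH_P_8021Q)
		return 0;

	tci = (uint16_t)(frame[tag_off + 2] << 8 | frame[tag_off + 3]);

	/* Close the 4-byte gap so the inner EtherType follows the MACs. */
	memmove(frame + tag_off, frame + tag_off + TOY_VLAN_HLEN,
		*len - tag_off - TOY_VLAN_HLEN);
	*len -= TOY_VLAN_HLEN;

	return tci;
}
```

With the first 18 bytes of a correctly single-tagged version of the frame from the dump (MAC addresses, "81 00 02 bc", then "88 f7"), the VLAN-aware case returns 0x02bc and leaves "88 f7" directly after the source MAC, while the VLAN-unaware case returns 0 and leaves the frame untouched.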
..
6lowpan
9p	Two fixes headed to stable trees:	2024-05-29 09:25:15 -07:00
802
8021q	net: Add struct kernel_ethtool_ts_info	2024-07-15 08:02:26 -07:00
appletalk	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2024-05-09 10:01:01 -07:00
atm	atm: clean up a put_user() calls	2024-06-14 19:08:50 -07:00
ax25	ax25: Replace kfree() in ax25_dev_free() with ax25_dev_put()	2024-06-01 15:49:42 -07:00
batman-adv	Revert "batman-adv: prefer kfree_rcu() over call_rcu() with free-only callbacks"	2024-06-12 20:18:00 +02:00
bluetooth	Bluetooth: hci_sync: avoid dup filtering when passive scanning with adv monitor	2024-08-07 16:36:01 -04:00
bpf	bpf-next-for-netdev	2024-07-09 17:01:46 +02:00
bridge	netfilter: nf_queue: drop packets with cloned unconfirmed conntracks	2024-08-14 23:37:23 +02:00
caif	net: caif: remove unused structs	2024-06-05 10:18:06 +01:00
can	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2024-06-27 12:14:11 -07:00
ceph	libceph: fix crush_choose_firstn() kernel-doc warnings	2024-07-11 16:33:07 +02:00
core	net: Make USO depend on CSUM offload	2024-08-09 21:58:08 -07:00
dcb
dccp	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2024-06-27 12:14:11 -07:00
devlink	devlink: Constify the 'table_ops' parameter of devl_dpipe_table_register()	2024-06-05 10:24:57 +01:00
dns_resolver
dsa	net: mscc: ocelot: use ocelot_xmit_get_vlan_info() also for FDMA and register injection	2024-08-16 09:59:32 +01:00
ethernet	netkit: Fix pkt_type override upon netkit pass verdict	2024-05-25 10:48:57 -07:00
ethtool	net: ethtool: Allow write mechanism of LPL and both LPL and EPL	2024-08-15 12:20:14 +02:00
handshake	net/handshake: remove redundant assignment to variable ret	2024-04-16 17:14:55 -07:00
hsr	net: hsr: cosmetic: Remove extra white space	2024-06-19 17:32:57 -07:00
ieee802154	bpf-next-for-netdev	2024-05-28 07:27:29 -07:00
ife
ipv4	tcp: Update window clamping condition	2024-08-14 10:50:49 +01:00
ipv6	netfilter: allow ipv6 fragments to arrive on different devices	2024-08-14 21:16:12 +02:00
iucv	net/iucv: fix use after free in iucv_sock_close()	2024-07-30 15:01:50 +02:00
kcm
key
l2tp	l2tp: fix lockdep splat	2024-08-08 08:28:24 -07:00
l3mdev
lapb
llc	llc: Constify struct llc_sap_state_trans	2024-07-15 08:51:19 -07:00
mac80211	wifi: mac80211: use monitor sdata with driver only if desired	2024-07-26 12:30:49 +02:00
mac802154	net: mac802154: Fix racy device stats updates by DEV_STATS_INC() and DEV_STATS_ADD()	2024-06-03 11:20:56 +02:00
mctp
mpls	sysctl: treewide: constify the ctl_table argument of proc_handlers	2024-07-24 20:59:29 +02:00
mptcp	mptcp: correct MPTCP_SUBFLOW_ATTR_SSN_OFFSET reserved size	2024-08-13 19:13:25 -07:00
ncsi	net/ncsi: Fix the multi thread manner of NCSI driver	2024-06-01 16:21:44 -07:00
netfilter	netfilter: nf_tables: Add locking for NFT_MSG_GETOBJ_RESET requests	2024-08-14 23:44:55 +02:00
netlabel	netlabel: fix RCU annotation for IPv4 options on socket creation	2024-05-13 14:58:12 -07:00
netlink	net: netlink: remove the cb_mutex "injection" from netlink core	2024-06-10 13:15:40 +01:00
netrom	netrom: Fix a memory leak in nr_heartbeat_expiry()	2024-06-17 13:06:23 +01:00
nfc	Quite smaller than usual. Notably it includes the fix for the unix	2024-05-23 12:49:37 -07:00
nsh	nsh: Restore skb->{protocol,data,mac_header} for outer header in nsh_gso_segment().	2024-04-26 12:20:01 +02:00
openvswitch	net: openvswitch: store sampling probability in cb.	2024-07-05 17:45:47 -07:00
packet	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2024-07-15 13:19:17 -07:00
phonet	sysctl: treewide: constify the ctl_table argument of proc_handlers	2024-07-24 20:59:29 +02:00
psample	net: psample: fix flag being set in wrong skb	2024-07-11 18:11:31 -07:00
qrtr	net: qrtr: ns: Ignore ENODEV failures in ns	2024-06-14 13:17:21 +02:00
rds	sysctl: treewide: constify the ctl_table argument of proc_handlers	2024-07-24 20:59:29 +02:00
rfkill	net: rfkill: Correct return value in invalid parameter case	2024-06-26 10:49:01 +02:00
rose	net: change proto and proto_ops accept type	2024-05-13 18:19:09 -06:00
rxrpc	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2024-05-09 10:01:01 -07:00
sched	sched: act_ct: take care of padding in struct zones_ht_key	2024-07-26 11:22:57 +01:00
sctp	sctp: Fix null-ptr-deref in reuseport_add_sock().	2024-08-02 16:25:06 -07:00
smc	net/smc: add the max value of fallback reason count	2024-08-07 19:36:23 -07:00
strparser
sunrpc	nfsd-6.11 fixes:	2024-08-10 10:44:21 -07:00
switchdev	net: bridge: switchdev: Improve error message for port_obj_add/del functions	2024-05-08 12:19:12 +01:00
tipc	A lot of networking people were at a conference last week, busy	2024-07-25 13:32:25 -07:00
tls	net: tls: Pass union tls_crypto_context pointer to memzero_explicit	2024-07-09 11:14:47 -07:00
unix	af_unix: Disable MSG_OOB handling for sockets in sockmap/sockhash	2024-07-17 22:49:00 +02:00
vmw_vsock	vsock: fix recursive ->recvmsg calls	2024-08-15 12:07:04 +02:00
wireless	wifi: cfg80211: correct S1G beacon length calculation	2024-07-26 12:32:47 +02:00
x25	net: change proto and proto_ops accept type	2024-05-13 18:19:09 -06:00
xdp	xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len	2024-07-25 11:57:27 +02:00
xfrm	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	2024-07-15 13:19:17 -07:00
compat.c
devres.c
Kconfig	ethtool: provide customized dim profile management	2024-06-25 17:15:06 -07:00
Kconfig.debug
Makefile
socket.c	net: Split a __sys_listen helper for io_uring	2024-06-19 07:57:21 -06:00
sysctl_net.c	sysctl: Remove check for sentinel element in ctl_table arrays	2024-06-13 10:50:52 +02:00