Unmask the upper DSCP bits in input route lookup so that in the future
the lookup could be performed according to the full DSCP value.
No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-9-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As explained in commit 35ebf65e85 ("ipv4: Create and use
fib_compute_spec_dst() helper."), the function is used - for example -
to determine the source address for an ICMP reply. If we are responding
to a multicast or broadcast packet, the source address is set to the
source address that we would use if we were to send a packet to the
unicast source of the original packet. This address is determined by
performing a FIB lookup and using the preferred source address of the
resulting route.
Unmask the upper DSCP bits of the DS field of the packet that triggered
the reply so that in the future the FIB lookup could be performed
according to the full DSCP value.
No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-8-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Unmask the upper DSCP bits when calling ipmr_fib_lookup() so that in the
future it could perform the FIB lookup according to the full DSCP value.
Note that ipmr_fib_lookup() performs a FIB rule lookup (returning the
relevant routing table) and that IPv4 multicast FIB rules do not support
matching on TOS / DSCP. However, it is still worth unmasking the upper
DSCP bits in case support for DSCP matching is ever added.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-7-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In a similar fashion to the iptables rpfilter match, unmask the upper
DSCP bits of the DS field of the currently tested packet so that in the
future the FIB lookup could be performed according to the full DSCP
value.
No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-6-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The rpfilter match performs a reverse path filter test on a packet by
performing a FIB lookup with the source and destination addresses
swapped.
Unmask the upper DSCP bits of the DS field of the tested packet so that
in the future the FIB lookup could be performed according to the full
DSCP value.
No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-5-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The Record Route IP option records the addresses of the routers that
routed the packet. In the case of forwarded packets, the kernel performs
a route lookup via fib_lookup() and fills in the preferred source
address of the matched route.
Unmask the upper DSCP bits when performing the lookup so that in the
future the lookup could be performed according to the full DSCP value.
No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-4-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The NETLINK_FIB_LOOKUP netlink family can be used to perform a FIB
lookup according to user provided parameters and communicate the result
back to user space.
Unmask the upper DSCP bits of the user-provided DS field before invoking
the IPv4 FIB lookup API so that in the future the lookup could be
performed according to the full DSCP value.
No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-3-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The helper performs a FIB lookup according to the parameters in the
'params' argument, one of which is 'tos'. According to the test in
test_tc_neigh_fib.c, it seems that BPF programs are expected to
initialize the 'tos' field to the full 8 bit DS field from the IPv4
header.
Unmask the upper DSCP bits before invoking the IPv4 FIB lookup APIs so
that in the future the lookup could be performed according to the full
DSCP value.
No functional changes intended since the upper DSCP bits are masked when
comparing against the TOS selectors in FIB rules and routes.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Acked-by: Florian Westphal <fw@strlen.de>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240821125251.1571445-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Abhinav Jain says:
====================
Enhance network interface feature testing
This small series includes fixes for creation of veth pairs for
networkless kernels & adds tests for turning the different network
interface features on and off in selftests/net/netdevice.sh script.
Tested using vng and compiles for network as well as networkless kernel.
# selftests: net: netdevice.sh
# No valid network device found, creating veth pair
# PASS: veth0: set interface up
# PASS: veth0: set MAC address
# XFAIL: veth0: set IP address unsupported for veth*
# PASS: veth0: ethtool list features
# PASS: veth0: Turned off feature: rx-checksumming
# PASS: veth0: Turned on feature: rx-checksumming
# PASS: veth0: Restore feature rx-checksumming to initial state on
# Actual changes:
# tx-checksum-ip-generic: off
...
# PASS: veth0: Turned on feature: rx-udp-gro-forwarding
# PASS: veth0: Restore feature rx-udp-gro-forwarding to initial state off
# Cannot get register dump: Operation not supported
# XFAIL: veth0: ethtool dump not supported
# PASS: veth0: ethtool stats
# PASS: veth0: stop interface
====================
Link: https://patch.msgid.link/20240821171903.118324-1-jain.abhinav177@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Check if veth pair was created and if yes, xfail on setting IP address
logging an informational message.
Use XFAIL instead of SKIP for unsupported ethtool APIs.
Signed-off-by: Abhinav Jain <jain.abhinav177@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240821171903.118324-4-jain.abhinav177@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Implement on/off testing for all non-fixed features via while loop.
Signed-off-by: Abhinav Jain <jain.abhinav177@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240821171903.118324-3-jain.abhinav177@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Check if the netdev list is empty and create veth pair to be used for
feature on/off testing.
Remove the veth pair after testing is complete.
Signed-off-by: Abhinav Jain <jain.abhinav177@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240821171903.118324-2-jain.abhinav177@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
W=1 builds with GCC 14.2.0 warn that:
.../aq_ethtool.c:278:59: warning: ‘%d’ directive output may be truncated writing between 1 and 11 bytes into a region of size 6 [-Wformat-truncation=]
278 | snprintf(tc_string, 8, "TC%d ", tc);
| ^~
.../aq_ethtool.c:278:56: note: directive argument in the range [-2147483641, 254]
278 | snprintf(tc_string, 8, "TC%d ", tc);
| ^~~~~~~
.../aq_ethtool.c:278:33: note: ‘snprintf’ output between 5 and 15 bytes into a destination of size 8
278 | snprintf(tc_string, 8, "TC%d ", tc);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
tc is always in the range 0 - cfg->tcs. And as cfg->tcs is a u8,
the range is 0 - 255. Further, on inspecting the code, it seems
that cfg->tcs will never be more than AQ_CFG_TCS_MAX (8), so
the range is actually 0 - 8.
So, it seems that the condition that GCC flags will not occur.
But, nonetheless, it would be nice if it didn't emit the warning.
It seems that this can be achieved by changing the format specifier
from %d to %u, in which case I believe GCC recognises an upper bound
on the range of tc of 0 - 255. After some experimentation I think
this is due to the combination of the use of %u and the type of
cfg->tcs (u8).
Empirically, updating the type of the tc variable to unsigned int
has the same effect.
As both of these changes seem to make sense in relation to what the code
is actually doing - iterating over unsigned values - do both.
Compile tested only.
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240821-atlantic-str-v1-1-fa2cfe38ca00@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Current release - regressions:
- virtio_net: avoid crash on resume - move netdev_tx_reset_queue()
call before RX napi enable
Current release - new code bugs:
- net/mlx5e: fix page leak and incorrect header release w/ HW GRO
Previous releases - regressions:
- udp: fix receiving fraglist GSO packets
- tcp: prevent refcount underflow due to concurrent execution
of tcp_sk_exit_batch()
Previous releases - always broken:
- ipv6: fix possible UAF when incrementing error counters on output
- ip6: tunnel: prevent merging of packets with different L2
- mptcp: pm: fix IDs not being reusable
- bonding: fix potential crashes in IPsec offload handling
- Bluetooth: HCI:
- MGMT: add error handling to pair_device() to avoid a crash
- invert LE State quirk to be opt-out rather then opt-in
- fix LE quote calculation
- drv: dsa: VLAN fixes for Ocelot driver
- drv: igb: cope with large MAX_SKB_FRAGS Kconfig settings
- drv: ice: fi Rx data path on architectures with PAGE_SIZE >= 8192
Misc:
- netpoll: do not export netpoll_poll_[disable|enable]()
- MAINTAINERS: update the list of networking headers
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmbHpqwACgkQMUZtbf5S
IrtLCQ//ZcAj1aNa68uSBEE8/jAfiKy+L+uZDVcP1aEJfw+B6UFgdclf9mOG5fJY
ZlVyqr0YbNaOl8aGTd7MHNFzdRxgY8rbw93GZAdZfgV+yM2XLec4/cjLxx5vuqMn
rfbLyZ3+yPeHq/PhI1uHdyxzCUDpMlKBxsVybZ4KZTFV1UvwRIw0/k1C3N+E0Bu0
b/qgAC9sp7nxclpjMSbhbK1Tpg41MQeQluDVPSuPI2yZDfcFTHTb1DMfYN2XmaOe
nsVNExwgrJp/bSUx31fatehcsZBk0iIwHbBcfMjig4PYdTaMo8NnBxE9eN0HBkcN
WLb7jpENPcdo1/p86hAajQID5w2C1ubruAtrlzhznDH6ZsvA3ewuxS18a2FEEaZE
ZJdR5eQqNFjgLc0yUOs7Pc3QZpdbELVFN22PYsq8Dg2anaSpBEMSZIxSR469Yeuo
+Uf5WZApoCu92CgXB+pyIzM/X9iCSGTr3Pn5hCp0/zQeuUAqRIKSCBxM+SRhElCu
Qf6WktOiLOMT1Gyrz7m8dNzWtosP657YTQ9QGPgl9fDpgwjQmNLKoWpVYeE0cYTL
wOVxafB++a/n5kRkt3iidNkIF2JkIfrd84/dI7BmCV7NnU5y2H8uLCzG5Z+R4iUy
KLSUAkCHoeIb1VNWMSo7ne6L/pSEl+SIgz58gP2I8LqgsVYA+Hs=
=htcO
-----END PGP SIGNATURE-----
Merge tag 'net-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bluetooth and netfilter.
Current release - regressions:
- virtio_net: avoid crash on resume - move netdev_tx_reset_queue()
call before RX napi enable
Current release - new code bugs:
- net/mlx5e: fix page leak and incorrect header release w/ HW GRO
Previous releases - regressions:
- udp: fix receiving fraglist GSO packets
- tcp: prevent refcount underflow due to concurrent execution of
tcp_sk_exit_batch()
Previous releases - always broken:
- ipv6: fix possible UAF when incrementing error counters on output
- ip6: tunnel: prevent merging of packets with different L2
- mptcp: pm: fix IDs not being reusable
- bonding: fix potential crashes in IPsec offload handling
- Bluetooth: HCI:
- MGMT: add error handling to pair_device() to avoid a crash
- invert LE State quirk to be opt-out rather then opt-in
- fix LE quote calculation
- drv: dsa: VLAN fixes for Ocelot driver
- drv: igb: cope with large MAX_SKB_FRAGS Kconfig settings
- drv: ice: fi Rx data path on architectures with PAGE_SIZE >= 8192
Misc:
- netpoll: do not export netpoll_poll_[disable|enable]()
- MAINTAINERS: update the list of networking headers"
* tag 'net-6.11-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (82 commits)
s390/iucv: Fix vargs handling in iucv_alloc_device()
net: ovs: fix ovs_drop_reasons error
net: xilinx: axienet: Fix dangling multicast addresses
net: xilinx: axienet: Always disable promiscuous mode
MAINTAINERS: Mark JME Network Driver as Odd Fixes
MAINTAINERS: Add header files to NETWORKING sections
MAINTAINERS: Add limited globs for Networking headers
MAINTAINERS: Add net_tstamp.h to SOCKET TIMESTAMPING section
MAINTAINERS: Add sonet.h to ATM section of MAINTAINERS
octeontx2-af: Fix CPT AF register offset calculation
net: phy: realtek: Fix setting of PHY LEDs Mode B bit on RTL8211F
net: ngbe: Fix phy mode set to external phy
netfilter: flowtable: validate vlan header
bnxt_en: Fix double DMA unmapping for XDP_REDIRECT
ipv6: prevent possible UAF in ip6_xmit()
ipv6: fix possible UAF in ip6_finish_output2()
ipv6: prevent UAF in ip6_send_skb()
netpoll: do not export netpoll_poll_[disable|enable]()
selftests: mlxsw: ethtool_lanes: Source ethtool lib from correct path
udp: fix receiving fraglist GSO packets
...
iucv_alloc_device() gets a format string and a varying number of
arguments. This is incorrectly forwarded by calling dev_set_name() with
the format string and a va_list, while dev_set_name() expects also a
varying number of arguments.
Symptoms:
Corrupted iucv device names, which can result in log messages like:
sysfs: cannot create duplicate filename '/devices/iucv/hvc_iucv1827699952'
Fixes: 4452e8ef8c ("s390/iucv: Provide iucv_alloc_device() / iucv_release_device()")
Link: https://bugzilla.suse.com/show_bug.cgi?id=1228425
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Reviewed-by: Thorsten Winkler <twinkler@linux.ibm.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20240821091337.3627068-1-wintera@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There is something wrong with ovs_drop_reasons. ovs_drop_reasons[0] is
"OVS_DROP_LAST_ACTION", but OVS_DROP_LAST_ACTION == __OVS_DROP_REASON + 1,
which means that ovs_drop_reasons[1] should be "OVS_DROP_LAST_ACTION".
And as Adrian tested, without the patch, adding flow to drop packets
results in:
drop at: do_execute_actions+0x197/0xb20 [openvsw (0xffffffffc0db6f97)
origin: software
input port ifindex: 8
timestamp: Tue Aug 20 10:19:17 2024 859853461 nsec
protocol: 0x800
length: 98
original length: 98
drop reason: OVS_DROP_ACTION_ERROR
With the patch, the same results in:
drop at: do_execute_actions+0x197/0xb20 [openvsw (0xffffffffc0db6f97)
origin: software
input port ifindex: 8
timestamp: Tue Aug 20 10:16:13 2024 475856608 nsec
protocol: 0x800
length: 98
original length: 98
drop reason: OVS_DROP_LAST_ACTION
Fix this by initializing ovs_drop_reasons with index.
Fixes: 9d802da40b ("net: openvswitch: add last-action drop reason")
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Tested-by: Adrian Moreno <amorenoz@redhat.com>
Reviewed-by: Adrian Moreno <amorenoz@redhat.com>
Link: https://patch.msgid.link/20240821123252.186305-1-dongml2@chinatelecom.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmbHEBcACgkQ1V2XiooU
IORLYhAAokOmQRxYoSgjbZASfhss57XGI1GzstYCr10kxKPI0AthP26sGk4RoiuP
22+6sDIk7Qo8SGJSPdvHuwHcsqo4wRXxyItuWYJ875cDXErkyhrWNQhbp+5SNcgN
t8f9S5FGmw1+iY/YyBh5v69u5+ycwgiFdUAUogxr7W/bsB0QcOSAI27hnrynj5K7
yTn1ys17pAICnMsrX11xtS41lWrXE8bBr91tW+Ko1CRzEc2/zWgaSN8oEadJfV05
jmNFYPgXWlmhB3KDoTdE/lWvjphUqxe6Jv0ILLNsH15ETgjP9JFbq2J7memAQIjr
1FLFsEX1ErrdrV7OCNI0n4QE8nuDs3eBKpsalg3ey6k9do+bLRCxZbHR86zkJGp2
b1M4Qa5dV0D3mRrKP3RQQt4k7LyAN7C11al4FMj6FU5FdV4CiDwjPazjqhkHYp30
VZmZJOmlJhGiiWDNHV921oMGa7uEb7Eo4CHLAoAsUTojzTZ+XWtkRDsGjPgbdsIZ
6On/Or6LvW/B+XgHdf3LVvGdTve6dLSO2bpU9vTj9lVp6iGxQ08taUlfZcC5P1kp
FeE7jnhiZdNk6SJ1np83zx38un5ckHWqH8O/jr06fDii7RWn/RhKJgmLYaBsV7U7
85CETA9G0OtVEC15f3Gxsvme4aMRXCdoMtK7HI8fTCe8F4IDNko=
=Vnhf
-----END PGP SIGNATURE-----
Merge tag 'nf-24-08-22' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
Patch #1 disable BH when collecting stats via hardware offload to ensure
concurrent updates from packet path do not result in losing stats.
From Sebastian Andrzej Siewior.
Patch #2 uses write seqcount to reset counters serialize against reader.
Also from Sebastian Andrzej Siewior.
Patch #3 ensures vlan header is in place before accessing its fields,
according to KMSAN splat triggered by syzbot.
* tag 'nf-24-08-22' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: flowtable: validate vlan header
netfilter: nft_counter: Synchronize nft_counter_reset() against reader.
netfilter: nft_counter: Disable BH in nft_counter_offload_stats().
====================
Link: https://patch.msgid.link/20240822101842.4234-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sean Anderson says:
====================
net: xilinx: axienet: Multicast fixes and improvements [part]
====================
First two patches of the series which are fixes.
Link: https://patch.msgid.link/20240822154059.1066595-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If a multicast address is removed but there are still some multicast
addresses, that address would remain programmed into the frame filter.
Fix this by explicitly setting the enable bit for each filter.
Fixes: 8a3b7a252d ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240822154059.1066595-3-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If promiscuous mode is disabled when there are fewer than four multicast
addresses, then it will not be reflected in the hardware. Fix this by
always clearing the promiscuous mode flag even when we program multicast
addresses.
Fixes: 8a3b7a252d ("drivers/net/ethernet/xilinx: added Xilinx AXI Ethernet driver")
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240822154059.1066595-2-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This typo in scripts/Makefile.build has been present for more than 20
years. It was accidentally copy-pasted to other scripts/Makefile.* files.
Fix them all.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
This driver only appears to have received sporadic clean-ups, typically
part of some tree-wide activity, and fixes for quite some time. And
according to the maintainer, Guo-Fu Tseng, the device has been EOLed for
a long time (see Link).
Accordingly, it seems appropriate to mark this driver as odd fixes.
Cc: Moon Yeounsu <yyyynoom@gmail.com>
Cc: Guo-Fu Tseng <cooldavid@cooldavid.org>
Link: https://lore.kernel.org/netdev/20240805003139.M94125@cooldavid.org/
Signed-off-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This is part of an effort to assign a section in MAINTAINERS to header
files that relate to Networking. In this case the files with "net" or
"skbuff" in their name.
This patch adds a number of such files to the NETWORKING DRIVERS
and NETWORKING [GENERAL] sections.
Signed-off-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This aims to add limited globs to improve the coverage of header files
in the NETWORKING DRIVERS and NETWORKING [GENERAL] sections.
It is done so in a minimal way to exclude overlap with other sections.
And so as not to require "X" entries to exclude files otherwise
matched by these new globs.
While imperfect, due to it's limited nature, this does extend coverage
of header files by these sections. And aims to automatically cover
new files that seem very likely belong to these sections.
The include/linux/netdev* glob (both sections)
+ Subsumes the entries for:
- include/linux/netdevice.h
+ Extends the sections to cover
- include/linux/netdevice_xmit.h
- include/linux/netdev_features.h
The include/uapi/linux/netdev* globs: (both sections)
+ Subsumes the entries for:
- include/linux/netdevice.h
+ Extends the sections to cover
- include/linux/netdev.h
The include/linux/skbuff* glob (NETWORKING [GENERAL] section only):
+ Subsumes the entry for:
- include/linux/skbuff.h
+ Extends the section to cover
- include/linux/skbuff_ref.h
A include/uapi/linux/net_* glob was not added to the NETWORKING [GENERAL]
section. Although it would subsume the entry for
include/uapi/linux/net_namespace.h, which is fine, it would also extend
coverage to:
- include/uapi/linux/net_dropmon.h, which belongs to the
NETWORK DROP MONITOR section
- include/uapi/linux/net_tstamp.h which, as per an earlier patch in this
series, belongs to the SOCKET TIMESTAMPING section
Signed-off-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This is part of an effort to assign a section in MAINTAINERS to header
files that relate to Networking. In this case the files with "net" in
their name.
Cc: Richard Cochran <richardcochran@gmail.com>
Cc: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Signed-off-by: Simon Horman <horms@kernel.org>
Acked-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This is part of an effort to assign a section in MAINTAINERS to header
files that relate to Networking. In this case the files with "net" in
their name.
It seems that sonet.h is included in ATM related source files,
and thus that ATM is the most relevant section for these files.
Cc: Chas Williams <3chas3@gmail.com>
Signed-off-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Let the kememdup_array() take care about multiplication and possible
overflows.
Signed-off-by: Yu Jiaoliang <yujiaoliang@vivo.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240821081447.12430-1-yujiaoliang@vivo.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
GDM1 port on EN7581 SoC is connected to the lan dsa switch.
GDM{2,3,4} can be used as wan port connected to an external
phy module. Configure hw mac address registers according to the port id.
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://patch.msgid.link/20240821-airoha-eth-wan-mac-addr-v2-1-8706d0cd6cd5@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Some CPT AF registers are per LF and others are global. Translation
of PF/VF local LF slot number to actual LF slot number is required
only for accessing perf LF registers. CPT AF global registers access
do not require any LF slot number. Also, there is no reason CPT
PF/VF to know actual lf's register offset.
Without this fix microcode loading will fail, VFs cannot be created
and hardware is not usable.
Fixes: bc35e28af7 ("octeontx2-af: replace cpt slot with lf id on reg write")
Signed-off-by: Bharat Bhushan <bbhushan2@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240821070558.1020101-1-bbhushan2@marvell.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The current implementation incorrectly sets the mode bit of the PHY chip.
Bit 15 (RTL8211F_LEDCR_MODE) should not be shifted together with the
configuration nibble of a LED- it should be set independently of the
index of the LED being configured.
As a consequence, the RTL8211F LED control is actually operating in Mode A.
Fix the error by or-ing final register value to write with a const-value of
RTL8211F_LEDCR_MODE, thus setting Mode bit explicitly.
Fixes: 17784801d8 ("net: phy: realtek: Add support for PHY LEDs on RTL8211F")
Signed-off-by: Sava Jakovljev <savaj@meyersound.com>
Reviewed-by: Marek Vasut <marex@denx.de>
Link: https://patch.msgid.link/PAWP192MB21287372F30C4E55B6DF6158C38E2@PAWP192MB2128.EURP192.PROD.OUTLOOK.COM
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
A few tests check if nettest exists in the $PATH before adding
$PWD to $PATH and re-checking. They don't discard stderr on
the first check (and nettest is built as part of selftests,
so it's pretty normal for it to not be available in system $PATH).
This leads to output noise:
which: no nettest in (/home/virtme/tools/fs/bin:/home/virtme/tools/fs/sbin:/home/virtme/tools/fs/usr/bin:/home/virtme/tools/fs/usr/sbin:/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin)
Add a common helper for the check which does silence stderr.
There is another small functional change hiding here, because pmtu.sh
and fib_rule_tests.sh used to return from the test case rather than
completely exit. Building nettest is not hard, there should be no need
to maintain the ability to selectively skip cases in its absence.
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20240821012227.1398769-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The MAC only has add the TX delay and it can not be modified.
MAC and PHY are both set the TX delay cause transmission problems.
So just disable TX delay in PHY, when use rgmii to attach to
external phy, set PHY_INTERFACE_MODE_RGMII_RXID to phy drivers.
And it is does not matter to internal phy.
Fixes: bc2426d74a ("net: ngbe: convert phylib to phylink")
Signed-off-by: Mengyuan Lou <mengyuanlou@net-swift.com>
Cc: stable@vger.kernel.org # 6.3+
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/E6759CF1387CF84C+20240820030425.93003-1-mengyuanlou@net-swift.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Ensure there is sufficient room to access the protocol field of the
VLAN header, validate it once before the flowtable lookup.
=====================================================
BUG: KMSAN: uninit-value in nf_flow_offload_inet_hook+0x45a/0x5f0 net/netfilter/nf_flow_table_inet.c:32
nf_flow_offload_inet_hook+0x45a/0x5f0 net/netfilter/nf_flow_table_inet.c:32
nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline]
nf_hook_slow+0xf4/0x400 net/netfilter/core.c:626
nf_hook_ingress include/linux/netfilter_netdev.h:34 [inline]
nf_ingress net/core/dev.c:5440 [inline]
Fixes: 4cd91f7c29 ("netfilter: flowtable: add vlan support")
Reported-by: syzbot+8407d9bb88cd4c6bf61a@syzkaller.appspotmail.com
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Justin Iurman says:
====================
net: ipv6: ioam6: introduce tunsrc
This series introduces a new feature called "tunsrc" (just like seg6
already does).
v3:
- address Jakub's comments
v2:
- add links to performance result figures (see patch#2 description)
- move the ipv6_addr_any() check out of the datapath
====================
Link: https://patch.msgid.link/20240817131818.11834-1-justin.iurman@uliege.be
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This patch provides a new feature (i.e., "tunsrc") for the tunnel (i.e.,
"encap") mode of ioam6. Just like seg6 already does, except it is
attached to a route. The "tunsrc" is optional: when not provided (by
default), the automatic resolution is applied. Using "tunsrc" when
possible has a benefit: performance. See the comparison:
- before (= "encap" mode): https://ibb.co/bNCzvf7
- after (= "encap" mode with "tunsrc"): https://ibb.co/PT8L6yq
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
This patch prepares the next one by correcting the alignment of some
lines.
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Smatch reported a possible off-by-one in tcp_validate_cookie().
However, it's false positive because the possible range of mssind is
limited from 0 to 3 by the preceding calculation.
mssind = (cookie & (3 << 6)) >> 6;
Now, the verifier does not complain without the boundary check.
Let's remove the checks.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Closes: https://lore.kernel.org/bpf/6ae12487-d3f1-488b-9514-af0dac96608f@stanley.mountain/
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/20240821013425.49316-1-kuniyu@amazon.com
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-08-20 (ice)
This series contains updates to ice driver only.
Maciej fixes issues with Rx data path on architectures with
PAGE_SIZE >= 8192; correcting page reuse usage and calculations for
last offset and truesize.
Michal corrects assignment of devlink port number to use PF id.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: use internal pf id instead of function number
ice: fix truesize operations for PAGE_SIZE >= 8192
ice: fix ICE_LAST_OFFSET formula
ice: fix page reuse when PAGE_SIZE is over 8k
====================
Link: https://patch.msgid.link/20240820215620.1245310-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently the parsing code generator assumes that the yaml
specification file name and the main 'name' attribute carried
inside correspond, that is the field is the c-name representation
of the file basename.
The above assumption held true within the current tree, but will be
hopefully broken soon by the upcoming net shaper specification.
Additionally, it makes the field 'name' itself useless.
Lift the assumption, always computing the generated include file
name from the generated c file name.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/24da5a3596d814beeb12bd7139a6b4f89756cc19.1724165948.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sean Anderson says:
====================
net: xilinx: axienet: Add statistics support
Add support for hardware statistics counters (if they are enabled) in
the AXI Ethernet driver. Unfortunately, the implementation is
complicated a bit since the hardware might only support 32-bit counters.
====================
Link: https://patch.msgid.link/20240820175343.760389-1-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add support for reading the statistics counters, if they are enabled.
The counters may be 64-bit, but we can't detect this statically as
there's no ability bit for it and the counters are read-only. Therefore,
we assume the counters are 32-bits by default. To ensure we don't miss
an overflow, we read all counters at 13-second intervals. This should be
often enough to ensure the bytes counters don't wrap at 2.5 Gbit/s.
Another complication is that the counters may be reset when the device
is reset (depending on configuration). To ensure the counters persist
across link up/down (including suspend/resume), we maintain our own
versions along with the last counter value we saw. Because we might wait
up to 100 ms for the reset to complete, we use a mutex to protect
writing hw_stats. We can't sleep in ndo_get_stats64, so we use a seqlock
to protect readers.
We don't bother disabling the refresh work when we detect 64-bit
counters. This is because the reset issue requires us to read
hw_stat_base and reset_in_progress anyway, which would still require the
seqcount. And I don't think skipping the task is worth the extra
bookkeeping.
We can't use the byte counters for either get_stats64 or
get_eth_mac_stats. This is because the byte counters include everything
in the frame (destination address to FCS, inclusive). But
rtnl_link_stats64 wants bytes excluding the FCS, and
ethtool_eth_mac_stats wants to exclude the L2 overhead (addresses and
length/type). It might be possible to calculate the byte values Linux
expects based on the frame counters, but I think it is simpler to use
the existing software counters.
get_ethtool_stats is implemented for nonstandard statistics. This
includes the aforementioned byte counters, VLAN and PFC frame
counters, and user-defined (e.g. with custom RTL) counters.
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Link: https://patch.msgid.link/20240820175343.760389-3-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The Receive Frame Rejected interrupt is asserted whenever there was a
receive error (bad FCS, bad length, etc.) or whenever the frame was
dropped due to a mismatched address. So this is really a combination of
rx_otherhost_dropped, rx_length_errors, rx_frame_errors, and
rx_crc_errors. Mismatched addresses are common and aren't really errors
at all (much like how fragments are normal on half-duplex links). To
avoid confusion, report these events as rx_dropped. This better
reflects what's going on: the packet was received by the MAC but dropped
before being processed.
Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Radhey Shyam Pandey <radhey.shyam.pandey@amd.com>
Link: https://patch.msgid.link/20240820175343.760389-2-sean.anderson@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Adding the NAPI pointer to struct netdev_queue made it grow into another
cacheline, even though there was 44 bytes of padding available.
The struct was historically grouped as follows:
/* read-mostly stuff (align) */
/* ... random control path fields ... */
/* write-mostly stuff (align) */
/* ... 40 byte hole ... */
/* struct dql (align) */
It seems that people want to add control path fields after
the read only fields. struct dql looks pretty innocent
but it forces its own alignment and nothing indicates that
there is a lot of empty space above it.
Move dql above the xmit_lock. This shifts the empty space
to the end of the struct rather than in the middle of it.
Move two example fields there to set an example.
Hopefully people will now add new fields at the end of
the struct. A lot of the read-only stuff is also control
path-only, but if we move it all we'll have another hole
in the middle.
Before:
/* size: 384, cachelines: 6, members: 16 */
/* sum members: 284, holes: 3, sum holes: 100 */
After:
/* size: 320, cachelines: 5, members: 16 */
/* sum members: 284, holes: 1, sum holes: 8 */
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240820205119.1321322-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Eric Dumazet says:
====================
ipv6: fix possible UAF in output paths
First patch fixes an issue spotted by syzbot, and the two
other patches fix error paths after skb_expand_head()
adoption.
====================
Link: https://patch.msgid.link/20240820160859.3786976-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If skb_expand_head() returns NULL, skb has been freed
and the associated dst/idev could also have been freed.
We must use rcu_read_lock() to prevent a possible UAF.
Fixes: 0c9f227bee ("ipv6: use skb_expand_head in ip6_xmit")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Vasily Averin <vasily.averin@linux.dev>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240820160859.3786976-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If skb_expand_head() returns NULL, skb has been freed
and associated dst/idev could also have been freed.
We need to hold rcu_read_lock() to make sure the dst and
associated idev are alive.
Fixes: 5796015fa9 ("ipv6: allocate enough headroom in ip6_finish_output2()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Vasily Averin <vasily.averin@linux.dev>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20240820160859.3786976-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>