linux/drivers/net
Vladimir Oltean 84ce1ca3fe net: enetc: survive memory pressure without crashing
Under memory pressure, enetc_refill_rx_ring() may fail, and when called
during the enetc_open() -> enetc_setup_rxbdr() procedure, this is not
checked for.

An extreme case of memory pressure will result in exactly zero buffers
being allocated for the RX ring, and in such a case it is expected that
hardware drops all RX packets due to lack of buffers.

This does not happen, because the reset-default value of the consumer
and producer index is 0, and this makes the ENETC think that all buffers
have been initialized and that it owns them (when in reality none were).

The hardware guide explains this best:

| Configure the receive ring producer index register RBaPIR with a value
| of 0. The producer index is initially configured by software but owned
| by hardware after the ring has been enabled. Hardware increments the
| index when a frame is received which may consume one or more BDs.
| Hardware is not allowed to increment the producer index to match the
| consumer index since it is used to indicate an empty condition. The ring
| can hold at most RBLENR[LENGTH]-1 received BDs.
|
| Configure the receive ring consumer index register RBaCIR. The
| consumer index is owned by software and updated during operation of the
| BD ring by software, to indicate that any receive data occupied
| in the BD has been processed and it has been prepared for new data.
| - If consumer index and producer index are initialized to the same
|   value, it indicates that all BDs in the ring have been prepared and
|   hardware owns all of the entries.
| - If consumer index is initialized to producer index plus N, it would
|   indicate N BDs have been prepared. Note that hardware cannot start if
|   only a single buffer is prepared due to the restrictions described in
|   (2).
| - Software may write consumer index to match producer index anytime
|   while the ring is operational to indicate all received BDs prior have
|   been processed and new BDs prepared for hardware.
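
The index rules quoted above can be condensed into one modular expression.
What follows is only a sketch derived from that quoted text, not code from
the enetc driver: rx_bds_hw_may_fill() is a hypothetical helper, and "len"
stands for the ring length RBLENR[LENGTH].

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of the enetc driver.
 *
 * Returns how many more BDs hardware may fill before it has to stop,
 * given producer index pi, consumer index ci and ring length len.
 * Hardware may never advance pi to equal ci, so one slot is always
 * kept back: the result is (ci - pi - 1) modulo len.
 */
static unsigned int rx_bds_hw_may_fill(unsigned int pi, unsigned int ci,
				       unsigned int len)
{
	assert(len > 1 && pi < len && ci < len);

	return (ci + len - pi - 1) % len;
}
```

With pi == ci the expression yields len - 1, matching "the ring can hold at
most RBLENR[LENGTH]-1 received BDs". With ci == pi + 1 it yields 0, matching
the note that hardware cannot start if only a single buffer is prepared.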

Normally, the value of rx_ring->rcir (consumer index) is brought in sync
with the rx_ring->next_to_use software index, but this only happens if
page allocation ever succeeded.

When PI==CI==0, the hardware appears to receive frames and write them to
DMA address 0x0 (?!), then set the READY bit in the BD.

The enetc_clean_rx_ring() function (and its XDP derivative) is naturally
not prepared to handle such a condition. It will attempt to process
those frames using the rx_swbd structure associated with index i of the
RX ring, but that structure is not fully initialized (enetc_new_page()
does all of that). So what happens next is undefined behavior.

To operate with no buffers, we must initialize the CI to PI + 1, which
will block the hardware from advancing the PI any further, and make it
drop everything.
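
The difference between the two initial states can be illustrated with a toy
model of the producer side. This is a simplified sketch, not driver or
hardware code: struct rx_ring_model, hw_receive_one() and hw_receive_burst()
are hypothetical names, and every frame is assumed to consume exactly one BD.

```c
#include <stdbool.h>

/* Hypothetical model of the RX ring indices, not the driver's struct. */
struct rx_ring_model {
	unsigned int pi;	/* producer index, advanced by "hardware" */
	unsigned int ci;	/* consumer index, written by software */
	unsigned int len;	/* number of BDs in the ring */
};

/*
 * Model one incoming frame. The producer index may not be advanced to
 * match the consumer index, so the frame is dropped in that case.
 */
static bool hw_receive_one(struct rx_ring_model *r)
{
	unsigned int next = (r->pi + 1) % r->len;

	if (next == r->ci)
		return false;	/* no BD available: frame dropped */

	r->pi = next;
	return true;
}

/* Offer a burst of frames and count how many were written to BDs. */
static unsigned int hw_receive_burst(struct rx_ring_model *r,
				     unsigned int frames)
{
	unsigned int accepted = 0;

	while (frames--)
		if (hw_receive_one(r))
			accepted++;

	return accepted;
}
```

In this model, a ring left at PI == CI == 0 accepts len - 1 frames even
though software never prepared a single buffer, whereas a ring started with
CI == PI + 1 accepts none and leaves PI untouched.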

The issue was seen while adding support for zero-copy AF_XDP sockets,
where buffer memory comes from user space, which can even decide to
supply no buffers at all (example: "xdpsock --txonly"). However, the bug
is present also with the network stack code, even though it would take a
very determined person to trigger a page allocation failure at the
perfect time (a series of ifup/ifdown under memory pressure should
eventually reproduce it given enough retries).

Fixes: d4fd0404c1 ("enetc: Introduce basic PF and VF ENETC ethernet drivers")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Link: https://lore.kernel.org/r/20221027182925.3256653-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-27 11:32:25 -07:00
..
appletalk
arcnet
bonding treewide: use get_random_u32() when possible 2022-10-11 17:42:58 -06:00
caif
can can: rcar_canfd: fix channel specific IRQ handling for RZ/G2L 2022-10-27 09:30:59 +02:00
dsa net: dsa: qca8k: fix ethtool autocast mib for big-endian systems 2022-10-14 08:22:28 +01:00
ethernet net: enetc: survive memory pressure without crashing 2022-10-27 11:32:25 -07:00
fddi
fjes
hamradio treewide: use get_random_{u8,u16}() when possible, part 1 2022-10-11 17:42:58 -06:00
hippi
hyperv net: hv_netvsc: Fix a warning triggered by memcpy in rndis_filter 2022-10-15 11:09:53 +01:00
ieee802154
ipa net: ipa: don't configure IDLE_INDICATION on v3.1 2022-10-25 19:49:13 -07:00
ipvlan
mctp
mdio net: mdiobus: search for PSE nodes by parsing PHY nodes. 2022-10-03 17:33:57 -07:00
netdevsim netdevsim: remove dir in nsim_dev_debugfs_init() when creating ports dir failed 2022-10-27 10:47:29 -07:00
pcs
phy Networking fixes for 6.1-rc2, including fixes from netfilter 2022-10-20 17:24:59 -07:00
plip
ppp
pse-pd net: pse-pd: PSE_REGULATOR should depend on REGULATOR 2022-10-05 20:32:28 -07:00
slip
team
usb TTY/Serial driver update for 6.1-rc1 2022-10-07 16:36:24 -07:00
vmxnet3
vxlan
wan
wireguard treewide: use get_random_bytes() when possible 2022-10-11 17:42:58 -06:00
wireless Random number generator fixes for Linux 6.1-rc1. 2022-10-16 15:27:07 -07:00
wwan wwan_hwsim: fix possible memory leak in wwan_hwsim_dev_new() 2022-10-19 17:25:10 -07:00
xen-netback
amt.c
bareudp.c
dummy.c
eql.c
geneve.c
gtp.c
ifb.c
Kconfig - Yu Zhao's Multi-Gen LRU patches are here. They've been under test in 2022-10-10 17:53:04 -07:00
LICENSE.SRC
loopback.c
macsec.c
macvlan.c net: macvlan: change schedule system_wq to system_unbound_wq 2022-10-14 08:28:19 +01:00
macvtap.c
Makefile net: add framework to support Ethernet PSE and PDs devices 2022-10-03 17:33:56 -07:00
mdio.c
mhi_net.c
mii.c
net_failover.c
netconsole.c
nlmon.c
ntb_netdev.c
rionet.c
sb1000.c
Space.c
sungem_phy.c
tap.c
thunderbolt.c
tun.c net: tun: Convert to use sysfs_emit() APIs 2022-09-30 12:27:43 +01:00
veth.c
virtio_net.c virtio: fixes, features 2022-10-10 14:02:53 -07:00
vrf.c
vsockmon.c
xen-netfront.c