Go to file
Vladimir Oltean bcf2c1dc53 net: enetc: avoid buffer leaks on xdp_do_redirect() failure
[ Upstream commit 628050ec95 ]

Before enetc_clean_rx_ring_xdp() calls xdp_do_redirect(), each software
BD in the RX ring between index orig_i and i can have one of 2 refcount
values on its page.

We are the owner of the current buffer that is being processed, so the
refcount will be at least 1.

If the current owner of the buffer at the diametrically opposed index
in the RX ring (i.o.w, the other half of this page) has not yet called
kfree(), this page's refcount could even be 2.

enetc_page_reusable() in enetc_flip_rx_buff() tests for the page
refcount against 1, and [ if it's 2 ] does not attempt to reuse it.

But if enetc_flip_rx_buff() is put after the xdp_do_redirect() call,
the page refcount can have one of 3 values. It can also be 0, if there
is no owner of the other page half, and xdp_do_redirect() for this
buffer ran so far that it triggered a flush of the devmap/cpumap bulk
queue, and the consumers of those bulk queues also freed the buffer,
all by the time xdp_do_redirect() returns the execution back to enetc.

This is the reason why enetc_flip_rx_buff() is called before
xdp_do_redirect(), but there is a big flaw with that reasoning:
enetc_flip_rx_buff() will set rx_swbd->page = NULL on both sides of the
enetc_page_reusable() branch, and if xdp_do_redirect() returns an error,
we call enetc_xdp_free(), which does not deal gracefully with that.

In fact, what happens is quite special. The page refcounts start as 1.
enetc_flip_rx_buff() figures they're reusable, transfers these
rx_swbd->page pointers to a different rx_swbd in enetc_reuse_page(), and
bumps the refcount to 2. When xdp_do_redirect() later returns an error,
we call the no-op enetc_xdp_free(), but we still haven't lost the
reference to that page. A copy of it is still at rx_ring->next_to_alloc,
but that has refcount 2 (and there are no concurrent owners of it in
flight, to drop the refcount). What really kills the system is when
we'll flip the rx_swbd->page the second time around. With an updated
refcount of 2, the page will not be reusable and we'll really leak it.
Then enetc_new_page() will have to allocate more pages, which will then
eventually leak again on further errors from xdp_do_redirect().

The problem, summarized, is that we zeroize rx_swbd->page before we're
completely done with it, and this makes it impossible for the error path
to do something with it.

Since the packet is potentially multi-buffer and therefore the
rx_swbd->page is potentially an array, manual passing of the old
pointers between enetc_flip_rx_buff() and enetc_xdp_free() is a bit
difficult.

For the sake of going with a simple solution, we accept the possibility
of racing with xdp_do_redirect(), and we move the flip procedure to
execute only on the redirect success path. By racing, I mean that the
page may be deemed as not reusable by enetc (having a refcount of 0),
but there will be no leak in that case, either.

Once we accept that, we have something better to do with buffers on
XDP_REDIRECT failure. Since we haven't performed half-page flipping yet,
we won't, either (and this way, we can avoid enetc_xdp_free()
completely, which gives the entire page to the slab allocator).
Instead, we'll call enetc_xdp_drop(), which will recycle this half of
the buffer back to the RX ring.

Fixes: 9d2b68cc10 ("net: enetc: add support for XDP_REDIRECT")
Suggested-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20221213001908.2347046-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-12-31 13:14:37 +01:00
arch powerpc/pseries/eeh: use correct API for error log size 2022-12-31 13:14:36 +01:00
block block, bfq: fix possible uaf for 'bfqq->bic' 2022-12-31 13:14:37 +01:00
certs certs/blacklist_hashes.c: fix const confusion in certs blacklist 2022-06-22 14:22:01 +02:00
crypto crypto: tcrypt - Fix multibuffer skcipher speed test mem leak 2022-12-31 13:14:24 +01:00
Documentation overflow: Implement size_t saturating arithmetic helpers 2022-12-31 13:14:33 +01:00
drivers net: enetc: avoid buffer leaks on xdp_do_redirect() failure 2022-12-31 13:14:37 +01:00
fs nfsd: under NFSv4.1, fix double svc_xprt_put on rpc_create failure 2022-12-31 13:14:37 +01:00
include dmaengine: idxd: Fix crc_val field for completion record 2022-12-31 13:14:34 +01:00
init init/Kconfig: fix CC_HAS_ASM_GOTO_TIED_OUTPUT test with dash 2022-12-02 17:41:08 +01:00
io_uring io_uring: Fix a null-ptr-deref in io_tctx_exit_cb() 2022-12-14 11:37:31 +01:00
ipc ipc/sem: Fix dangling sem_array access in semtimedop race 2022-12-08 11:28:45 +01:00
kernel tracing/hist: Fix issue of losting command info in error_log 2022-12-31 13:14:31 +01:00
lib overflow: Implement size_t saturating arithmetic helpers 2022-12-31 13:14:33 +01:00
LICENSES LICENSES/dual/CC-BY-4.0: Git rid of "smart quotes" 2021-07-15 06:31:24 -06:00
mm mm/gup: fix gup_pud_range() for dax 2022-12-14 11:37:20 +01:00
net netfilter: flowtable: really fix NAT IPv6 offload 2022-12-31 13:14:36 +01:00
samples samples: vfio-mdev: Fix missing pci_disable_device() in mdpy_fb_probe() 2022-12-31 13:14:31 +01:00
scripts scripts/faddr2line: Fix regression in name resolution on ppc64le 2022-12-08 11:28:38 +01:00
security apparmor: Fix memleak in alloc_ns() 2022-12-31 13:14:22 +01:00
sound ALSA: mts64: fix possible null-ptr-defer in snd_mts64_interrupt 2022-12-31 13:14:16 +01:00
tools selftests/bpf: Add test for unstable CT lookup API 2022-12-31 13:14:37 +01:00
usr usr/include/Makefile: add linux/nfc.h to the compile-test coverage 2022-02-01 17:27:15 +01:00
virt kvm: Add support for arch compat vm ioctls 2022-10-29 10:12:54 +02:00
.clang-format clang-format: Update with the latest for_each macro list 2021-05-12 23:32:39 +02:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: ignore only top-level modules.builtin 2021-05-02 00:43:35 +09:00
.mailmap mailmap: add Andrej Shadura 2021-10-18 20:22:03 -10:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Move Daniel Drake to credits 2021-09-21 08:34:58 +03:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS futex: Move to kernel/futex/ 2022-12-31 13:14:04 +01:00
Makefile Linux 5.15.85 2022-12-21 17:36:38 +01:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.