linux/io_uring
Jens Axboe 09f7520048 io_uring: free io_buffer_list entries via RCU
commit 5cf4f52e6d upstream.

mmap_lock nests under uring_lock out of necessity, as we may be doing
user copies with uring_lock held. However, for mmap of provided buffer
rings, we attempt to grab uring_lock with mmap_lock already held from
do_mmap(). This makes lockdep, rightfully, complain:

WARNING: possible circular locking dependency detected
6.7.0-rc1-00009-gff3337ebaf94-dirty #4438 Not tainted
------------------------------------------------------
buf-ring.t/442 is trying to acquire lock:
ffff00020e1480a8 (&ctx->uring_lock){+.+.}-{3:3}, at: io_uring_validate_mmap_request.isra.0+0x4c/0x140

but task is already holding lock:
ffff0000dc226190 (&mm->mmap_lock){++++}-{3:3}, at: vm_mmap_pgoff+0x124/0x264

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&mm->mmap_lock){++++}-{3:3}:
       __might_fault+0x90/0xbc
       io_register_pbuf_ring+0x94/0x488
       __arm64_sys_io_uring_register+0x8dc/0x1318
       invoke_syscall+0x5c/0x17c
       el0_svc_common.constprop.0+0x108/0x130
       do_el0_svc+0x2c/0x38
       el0_svc+0x4c/0x94
       el0t_64_sync_handler+0x118/0x124
       el0t_64_sync+0x168/0x16c

-> #0 (&ctx->uring_lock){+.+.}-{3:3}:
       __lock_acquire+0x19a0/0x2d14
       lock_acquire+0x2e0/0x44c
       __mutex_lock+0x118/0x564
       mutex_lock_nested+0x20/0x28
       io_uring_validate_mmap_request.isra.0+0x4c/0x140
       io_uring_mmu_get_unmapped_area+0x3c/0x98
       get_unmapped_area+0xa4/0x158
       do_mmap+0xec/0x5b4
       vm_mmap_pgoff+0x158/0x264
       ksys_mmap_pgoff+0x1d4/0x254
       __arm64_sys_mmap+0x80/0x9c
       invoke_syscall+0x5c/0x17c
       el0_svc_common.constprop.0+0x108/0x130
       do_el0_svc+0x2c/0x38
       el0_svc+0x4c/0x94
       el0t_64_sync_handler+0x118/0x124
       el0t_64_sync+0x168/0x16c

From that mmap(2) path, we really just need to ensure that the buffer
list doesn't go away from underneath us. For the lower indexed entries,
they never go away until the ring is freed and we can always sanely
reference those as long as the caller has a file reference. For the
higher indexed ones in our xarray, we just need to ensure that the
buffer list remains valid while we return the address of it.

Free the higher indexed io_buffer_list entries via RCU. With that we can
avoid needing ->uring_lock inside mmap(2), and simply hold the RCU read
lock around the buffer list lookup and address check.

To ensure that the arrayed lookup either returns a valid fully formulated
entry via RCU lookup, add an 'is_ready' flag that we access with store
and release memory ordering. This isn't needed for the xarray lookups,
but doesn't hurt either. Since this isn't a fast path, retain it across
both types. Similarly, for the allocated array inside the ctx, ensure
we use the proper load/acquire as setup could in theory be running in
parallel with mmap.

While in there, add a few lockdep checks for documentation purposes.

Cc: stable@vger.kernel.org
Fixes: c56e022c0a ("io_uring: add support for user mapped provided buffer ring")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-08 08:52:18 +01:00
..
advise.c io_uring: always go async for unsupported fadvise flags 2023-01-29 15:18:26 -07:00
advise.h
alloc_cache.h io_uring/rsrc: consolidate node caching 2023-04-12 12:09:41 -06:00
cancel.c io_uring/cancel: wire up IORING_ASYNC_CANCEL_OP for sync cancel 2023-07-17 10:05:48 -06:00
cancel.h io_uring/cancel: support opcode based lookup and cancelation 2023-07-17 10:05:48 -06:00
epoll.c io_uring: undeprecate epoll_ctl support 2023-05-26 20:22:41 -06:00
epoll.h
fdinfo.c io_uring/fdinfo: remove need for sqpoll lock for thread/pid retrieval 2023-11-28 17:19:52 +00:00
fdinfo.h
filetable.c io_uring: add helpers to decode the fixed file file_ptr 2023-06-20 09:36:22 -06:00
filetable.h io_uring: add helpers to decode the fixed file file_ptr 2023-06-20 09:36:22 -06:00
fs.c io_uring/fs: consider link->flags when getting path for LINKAT 2023-12-03 07:33:07 +01:00
fs.h
io_uring.c io_uring: free io_buffer_list entries via RCU 2023-12-08 08:52:18 +01:00
io_uring.h io_uring: ensure io_lockdep_assert_cq_locked() handles disabled rings 2023-10-03 08:12:54 -06:00
io-wq.c io-wq: fully initialize wqe before calling cpuhp_state_add_instance_nocalls() 2023-10-05 14:11:18 -06:00
io-wq.h io_uring: break out of iowq iopoll on teardown 2023-09-07 09:02:27 -06:00
kbuf.c io_uring: free io_buffer_list entries via RCU 2023-12-08 08:52:18 +01:00
kbuf.h io_uring: free io_buffer_list entries via RCU 2023-12-08 08:52:18 +01:00
Makefile
msg_ring.c io_uring: use io_file_from_index in io_msg_grab_file 2023-06-20 09:36:22 -06:00
msg_ring.h io_uring: get rid of double locking 2022-12-07 06:47:13 -07:00
net.c io_uring/net: ensure socket is marked connected on connect retry 2023-11-20 11:59:38 +01:00
net.h io_uring: Add KASAN support for alloc_caches 2023-04-03 07:16:14 -06:00
nop.c
nop.h
notif.c io_uring/notif: add constant for ubuf_info flags 2023-04-15 14:21:04 -06:00
notif.h io_uring/notif: add constant for ubuf_info flags 2023-04-15 14:21:04 -06:00
opdef.c io_uring: Pass whole sqe to commands 2023-05-04 08:19:05 -06:00
opdef.h io_uring: Split io_issue_def struct 2023-01-29 15:17:41 -07:00
openclose.c io_uring: correct check for O_TMPFILE 2023-08-07 12:34:23 -06:00
openclose.h
poll.c io_uring: never overflow io_aux_cqe 2023-08-11 10:42:57 -06:00
poll.h io_uring: avoid indirect function calls for the hottest task_work 2023-06-02 08:55:37 -06:00
refs.h
rsrc.c io_uring: fix off-by one bvec index 2023-12-03 07:33:07 +01:00
rsrc.h io_uring/rsrc: Annotate struct io_mapped_ubuf with __counted_by 2023-08-17 19:14:47 -06:00
rw.c assorted fixes all over the place 2023-10-27 16:44:58 -10:00
rw.h io_uring: avoid indirect function calls for the hottest task_work 2023-06-02 08:55:37 -06:00
slist.h io_uring: silence variable ‘prev’ set but not used warning 2023-03-09 10:10:58 -07:00
splice.c io_uring/splice: use fput() directly 2023-08-10 10:24:25 -06:00
splice.h
sqpoll.c io_uring/fdinfo: remove need for sqpoll lock for thread/pid retrieval 2023-11-28 17:19:52 +00:00
sqpoll.h io_uring/sqpoll: fix io-wq affinity when IORING_SETUP_SQPOLL is used 2023-08-16 13:40:28 -06:00
statx.c io_uring: for requests that require async, force it 2023-01-29 15:18:26 -07:00
statx.h
sync.c io_uring: for requests that require async, force it 2023-01-29 15:18:26 -07:00
sync.h
tctx.c io_uring: Add io_uring_setup flag to pre-register ring fd and never install it 2023-05-16 08:06:00 -06:00
tctx.h io_uring: simplify __io_uring_add_tctx_node 2022-10-07 12:25:30 -06:00
timeout.c io_uring: never overflow io_aux_cqe 2023-08-11 10:42:57 -06:00
timeout.h io_uring: remove unused return from io_disarm_next 2022-09-21 13:15:01 -06:00
uring_cmd.c io_uring: simplify big_cqe handling 2023-08-24 17:16:19 -06:00
uring_cmd.h io_uring: Remove unnecessary BUILD_BUG_ON 2023-05-04 08:19:05 -06:00
xattr.c io_uring: for requests that require async, force it 2023-01-29 15:18:26 -07:00
xattr.h