linux/io_uring
Darrick J. Wong 377698d4ab Improve iomap/xfs async dio write performance

Merge tag 'xfs-async-dio.6-2023-08-01' of git://git.kernel.dk/linux into iomap-6.6-mergeA

Improve iomap/xfs async dio write performance

iomap always punts async dio write completions to a workqueue, which has
a cost in terms of efficiency (now you need an unrelated worker to
process it) and latency (now you're bouncing a completion through an
async worker, which is a classic slowdown scenario).

io_uring handles IRQ completions via task_work, and for writes that
don't need to do extra IO at completion time, we can safely complete
them inline from that. This patchset adds IOCB_DIO_CALLER_COMP, which an
IO issuer can set to inform the completion side that any extra work that
needs doing for that completion can be punted to a safe task context.

The iomap dio completion will happen in hard/soft irq context, and we
need a saner context to process these completions. IOCB_DIO_CALLER_COMP
is added, which the issuer can set in the kiocb's ki_flags field. If
the completion side of the iocb handling understands this flag, it can
choose to set a kiocb->dio_complete() handler and just call ki_complete
from IRQ context. The issuer must then ensure that this callback is
processed from a task. io_uring punts IRQ completions to task_work
already, so it's trivial to wire it up to run more of the completion
before posting a CQE. This is good for up to a 37% improvement in
throughput/latency for low queue depth IO; patch 5 has the details.
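The handshake above can be sketched as a small userspace model. This is not kernel code: the flag name is borrowed from the patchset, but the struct layout, function names, and the deferred handler are simplified stand-ins for the real iomap/io_uring plumbing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the IOCB_DIO_CALLER_COMP handshake. */
#define IOCB_DIO_CALLER_COMP (1u << 0)

struct kiocb {
	unsigned int ki_flags;
	/* Set by the completion side when it wants the issuer to finish
	 * the remaining completion work from task context. */
	long (*dio_complete)(struct kiocb *iocb);
	long ki_result;
	bool cqe_posted;
};

/* Completion side: runs in (simulated) IRQ context. If the issuer
 * advertised CALLER_COMP, just stash the deferred handler and return;
 * otherwise the work would have to be punted to a workqueue (not
 * modeled here). */
static void dio_bio_end_io(struct kiocb *iocb, long result,
			   long (*deferred)(struct kiocb *))
{
	iocb->ki_result = result;
	if ((iocb->ki_flags & IOCB_DIO_CALLER_COMP) && deferred)
		iocb->dio_complete = deferred;
}

/* Issuer side: runs from task_work, i.e. a safe task context. Run the
 * deferred completion, then post the CQE with the final result. */
static long issuer_task_work(struct kiocb *iocb)
{
	long res = iocb->ki_result;

	if (iocb->dio_complete) {
		res = iocb->dio_complete(iocb);
		iocb->dio_complete = NULL;
	}
	iocb->cqe_posted = true;
	return res;
}

/* Example deferred handler: stands in for the filesystem-side work
 * (e.g. an inode size update) that must not run from IRQ context. */
static long example_dio_complete(struct kiocb *iocb)
{
	return iocb->ki_result;
}
```

The point of the model is the ordering: the IRQ handler only records state, and everything that needs a task context happens in issuer_task_work() before the CQE is posted.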

If we need to do real work at completion time, iomap will clear the
IOMAP_DIO_CALLER_COMP flag.

This work came about when Andres tested low queue depth dio writes for
postgres and compared it to doing sync dio writes, showing that the
async processing slows us down a lot.

* tag 'xfs-async-dio.6-2023-08-01' of git://git.kernel.dk/linux:
  iomap: support IOCB_DIO_CALLER_COMP
  io_uring/rw: add write support for IOCB_DIO_CALLER_COMP
  fs: add IOCB flags related to passing back dio completions
  iomap: add IOMAP_DIO_INLINE_COMP
  iomap: only set iocb->private for polled bio
  iomap: treat a write through cache the same as FUA
  iomap: use an unsigned type for IOMAP_DIO_* defines
  iomap: cleanup up iomap_dio_bio_end_io()

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
2023-08-01 16:41:49 -07:00
advise.c io_uring: always go async for unsupported fadvise flags 2023-01-29 15:18:26 -07:00
advise.h
alloc_cache.h io_uring/rsrc: consolidate node caching 2023-04-12 12:09:41 -06:00
cancel.c io_uring: use io_file_from_index in __io_sync_cancel 2023-06-20 09:36:22 -06:00
cancel.h
epoll.c io_uring: undeprecate epoll_ctl support 2023-05-26 20:22:41 -06:00
epoll.h
fdinfo.c capability: just use a 'u64' instead of a 'u32[2]' array 2023-03-01 10:01:22 -08:00
fdinfo.h
filetable.c io_uring: add helpers to decode the fixed file file_ptr 2023-06-20 09:36:22 -06:00
filetable.h io_uring: add helpers to decode the fixed file file_ptr 2023-06-20 09:36:22 -06:00
fs.c io_uring: for requests that require async, force it 2023-01-29 15:18:26 -07:00
fs.h
io_uring.c io_uring: Fix io_uring mmap() by using architecture-provided get_unmapped_area() 2023-07-21 09:41:29 -06:00
io_uring.h io_uring: make io_cq_unlock_post static 2023-06-23 08:19:40 -06:00
io-wq.c io_uring/io-wq: clear current->worker_private on exit 2023-06-14 12:54:55 -06:00
io-wq.h
kbuf.c for-6.4/io_uring-2023-04-21 2023-04-26 12:40:31 -07:00
kbuf.h io_uring: add support for user mapped provided buffer ring 2023-04-03 07:14:21 -06:00
Makefile
msg_ring.c io_uring: use io_file_from_index in io_msg_grab_file 2023-06-20 09:36:22 -06:00
msg_ring.h io_uring: get rid of double locking 2022-12-07 06:47:13 -07:00
net.c io_uring-6.5-2023-07-03 2023-07-03 18:43:10 -07:00
net.h io_uring: Add KASAN support for alloc_caches 2023-04-03 07:16:14 -06:00
nop.c
nop.h
notif.c io_uring/notif: add constant for ubuf_info flags 2023-04-15 14:21:04 -06:00
notif.h io_uring/notif: add constant for ubuf_info flags 2023-04-15 14:21:04 -06:00
opdef.c io_uring: Pass whole sqe to commands 2023-05-04 08:19:05 -06:00
opdef.h io_uring: Split io_issue_def struct 2023-01-29 15:17:41 -07:00
openclose.c fsnotify: move fsnotify_open() hook into do_dentry_open() 2023-06-12 10:43:45 +02:00
openclose.h
poll.c for-6.5/io_uring-2023-06-23 2023-06-26 12:30:26 -07:00
poll.h io_uring: avoid indirect function calls for the hottest task_work 2023-06-02 08:55:37 -06:00
refs.h
rsrc.c - Yosry Ahmed brought back some cgroup v1 stats in OOM logs. 2023-06-28 10:28:11 -07:00
rsrc.h io_uring/rsrc: disassociate nodes and rsrc_data 2023-04-18 19:38:26 -06:00
rw.c io_uring/rw: add write support for IOCB_DIO_CALLER_COMP 2023-08-01 17:32:45 -06:00
rw.h io_uring: avoid indirect function calls for the hottest task_work 2023-06-02 08:55:37 -06:00
slist.h io_uring: silence variable ‘prev’ set but not used warning 2023-03-09 10:10:58 -07:00
splice.c io_uring: for requests that require async, force it 2023-01-29 15:18:26 -07:00
splice.h
sqpoll.c io_uring: unlock sqd->lock before sq thread release CPU 2023-05-25 09:30:13 -06:00
sqpoll.h io_uring: make io_sqpoll_wait_sq return void 2023-01-29 15:17:40 -07:00
statx.c io_uring: for requests that require async, force it 2023-01-29 15:18:26 -07:00
statx.h
sync.c io_uring: for requests that require async, force it 2023-01-29 15:18:26 -07:00
sync.h
tctx.c io_uring: Add io_uring_setup flag to pre-register ring fd and never install it 2023-05-16 08:06:00 -06:00
tctx.h io_uring: simplify __io_uring_add_tctx_node 2022-10-07 12:25:30 -06:00
timeout.c io_uring: cleanup io_aux_cqe() API 2023-06-07 14:59:22 -06:00
timeout.h io_uring: remove unused return from io_disarm_next 2022-09-21 13:15:01 -06:00
uring_cmd.c io_uring/cmd: add cmd lazy tw wake helper 2023-05-25 08:54:06 -06:00
uring_cmd.h io_uring: Remove unnecessary BUILD_BUG_ON 2023-05-04 08:19:05 -06:00
xattr.c io_uring: for requests that require async, force it 2023-01-29 15:18:26 -07:00
xattr.h