linux/fs/btrfs
Qu Wenruo 0391c9085a btrfs: do not wait for short bulk allocation
commit 1db7959aac upstream.

[BUG]
There is a recent report that when memory pressure is high (including
cached pages), btrfs can spend most of its time on memory allocation in
btrfs_alloc_page_array() for compressed read/write.

[CAUSE]
For btrfs_alloc_page_array() we always go through
alloc_pages_bulk_array(), and even if the bulk allocation comes up short
(falling back to single page allocation) we still retry, but with an
extra memalloc_retry_wait() before each retry.

If the bulk allocation only returns one page at a time, we spend a lot
of time in the retry wait.

The behavior was introduced in commit 395cb57e85 ("btrfs: wait between
incomplete batch memory allocations").

[FIX]
Although that commit mentioned that other filesystems do the wait, this
is not the case, at least nowadays.

All the mainlined filesystems only call memalloc_retry_wait() if they
failed to allocate any page at all (and not only for bulk allocation).
If there is any progress, they do not call memalloc_retry_wait().

For example, xfs_buf_alloc_pages() would only call memalloc_retry_wait()
if there is no allocation progress at all, and the call is not for
metadata readahead.

So I don't believe we should call memalloc_retry_wait() unconditionally
for short allocation.

Only call memalloc_retry_wait() if no page at all could be allocated
for a tree block allocation (which goes with __GFP_NOFAIL and may not
need the special handling anyway), and reduce the latency for
btrfs_alloc_page_array().
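
The difference between the two retry policies can be illustrated with a
small userspace simulation. This is only a sketch under stated
assumptions, not the kernel change itself: sim_bulk_alloc(),
sim_retry_wait(), sim_alloc_array() and count_waits() are hypothetical
stand-ins invented for illustration, malloc() stands in for page
allocation, and the wait is counted rather than slept.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_SLOTS 64

enum wait_policy {
	WAIT_ON_SHORT,		/* wait after every short allocation ([CAUSE]) */
	WAIT_ON_NO_PROGRESS,	/* wait only if nothing was allocated ([FIX]) */
};

static unsigned int nr_waits;	/* number of simulated waits */

/* Hypothetical stand-in for memalloc_retry_wait(): counts, does not sleep. */
static void sim_retry_wait(void)
{
	nr_waits++;
}

/*
 * Hypothetical stand-in for alloc_pages_bulk_array(): fills empty slots,
 * but at most per_call of them per invocation to mimic memory pressure.
 * Returns the total number of populated slots.
 */
static size_t sim_bulk_alloc(size_t nr, void **slots, size_t per_call)
{
	size_t populated = 0, added = 0;

	for (size_t i = 0; i < nr; i++) {
		if (!slots[i] && added < per_call) {
			slots[i] = malloc(1);
			if (slots[i])
				added++;
		}
		if (slots[i])
			populated++;
	}
	return populated;
}

static void sim_free_all(size_t nr, void **slots)
{
	for (size_t i = 0; i < nr; i++) {
		free(slots[i]);
		slots[i] = NULL;
	}
}

/*
 * Returns 0 on success, -1 if a call made no progress and nofail is 0.
 * With nofail set, a call without progress waits and retries, so the
 * loop only ends once the allocator makes progress again.
 */
static int sim_alloc_array(size_t nr, void **slots, size_t per_call,
			   enum wait_policy policy, int nofail)
{
	size_t allocated = 0;

	while (allocated < nr) {
		size_t last = allocated;

		allocated = sim_bulk_alloc(nr, slots, per_call);
		if (allocated == last) {
			if (!nofail) {
				sim_free_all(nr, slots);
				return -1;
			}
			sim_retry_wait();
			continue;
		}
		/* Progress was made: only the old policy still waits. */
		if (allocated < nr && policy == WAIT_ON_SHORT)
			sim_retry_wait();
	}
	return 0;
}

/* Runs one non-nofail allocation and reports how many waits it took. */
static unsigned int count_waits(enum wait_policy policy, size_t nr,
				size_t per_call)
{
	void *slots[MAX_SLOTS] = { NULL };

	assert(nr <= MAX_SLOTS);
	nr_waits = 0;
	if (sim_alloc_array(nr, slots, per_call, policy, 0) == 0)
		sim_free_all(nr, slots);
	return nr_waits;
}
```

In this model, filling 16 slots when each call yields a single page
costs 15 waits under WAIT_ON_SHORT and none under WAIT_ON_NO_PROGRESS,
which is the latency the fix aims to remove.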

Reported-by: Julian Taylor <julian.taylor@1und1.de>
Tested-by: Julian Taylor <julian.taylor@1und1.de>
Link: https://lore.kernel.org/all/8966c095-cbe7-4d22-9784-a647d1bf27c3@1und1.de/
Fixes: 395cb57e85 ("btrfs: wait between incomplete batch memory allocations")
CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-05-17 11:56:23 +02:00
tests btrfs: convert btrfs_block_group::needs_free_space to runtime flag 2023-08-23 17:52:28 +02:00
acl.c btrfs: reserve correct number of items for inode creation 2022-05-16 17:03:08 +02:00
async-thread.c btrfs: simplify WQ_HIGHPRI handling in struct btrfs_workqueue 2022-05-16 17:03:15 +02:00
async-thread.h btrfs: remove unused typedefs get_extent_t and btrfs_work_func_t 2022-07-25 17:45:36 +02:00
backref.c btrfs: fix information leak in btrfs_ioctl_logical_to_ino() 2024-05-02 16:29:28 +02:00
backref.h btrfs: ignore fiemap path cache if we have multiple leaves for a data extent 2022-10-11 14:48:07 +02:00
block-group.c btrfs: zoned: don't skip block groups with 100% zone unusable 2024-04-03 15:19:48 +02:00
block-group.h btrfs: add and use helper to check if block group is used 2024-02-23 09:12:28 +01:00
block-rsv.c btrfs: fix data race at btrfs_use_block_rsv() when accessing block reserve 2024-03-26 18:20:26 -04:00
block-rsv.h btrfs: fix data race at btrfs_use_block_rsv() when accessing block reserve 2024-03-26 18:20:26 -04:00
btrfs_inode.h btrfs: use a runtime flag to indicate an inode is a free space inode 2022-09-26 12:28:07 +02:00
check-integrity.c fs/btrfs: Use the enum req_op and blk_opf_t types 2022-07-14 12:14:32 -06:00
check-integrity.h btrfs: check-integrity: split submit_bio from btrfsic checking 2022-05-16 17:03:12 +02:00
compression.c fs: fix leaked psi pressure state 2022-11-08 15:57:25 -08:00
compression.h for-5.20-tag 2022-08-03 14:54:52 -07:00
ctree.c btrfs: error out when reallocating block for defrag using a stale transaction 2023-10-25 12:03:11 +02:00
ctree.h btrfs: fix infinite directory reads 2024-01-31 16:17:05 -08:00
delalloc-space.c btrfs: don't reserve space for checksums when writing to nocow files 2024-02-23 09:12:29 +01:00
delalloc-space.h btrfs: add the ability to use NO_FLUSH for data reservations 2022-09-29 17:08:28 +02:00
delayed-inode.c btrfs: record delayed inode root in transaction 2024-04-17 11:18:26 +02:00
delayed-inode.h btrfs: fix infinite directory reads 2024-01-31 16:17:05 -08:00
delayed-ref.c btrfs: prevent transaction block reserve underflow when starting transaction 2023-10-25 12:03:09 +02:00
delayed-ref.h btrfs: prevent transaction block reserve underflow when starting transaction 2023-10-25 12:03:09 +02:00
dev-replace.c btrfs: dev-replace: properly validate device names 2024-03-06 14:45:10 +00:00
dev-replace.h btrfs: add struct declarations in dev-replace.h 2022-09-26 12:28:07 +02:00
dir-item.c btrfs: use struct fscrypt_str instead of struct qstr 2023-10-10 22:00:36 +02:00
discard.c btrfs: hold block group refcount during async discard 2023-03-10 09:34:06 +01:00
discard.h
disk-io.c btrfs: fix double free of anonymous device after snapshot creation failure 2024-03-06 14:45:10 +00:00
disk-io.h btrfs: fix double free of anonymous device after snapshot creation failure 2024-03-06 14:45:10 +00:00
export.c btrfs: export: handle invalid inode or root reference in btrfs_get_parent() 2024-04-13 13:05:01 +02:00
export.h btrfs: fix type of parameter generation in btrfs_get_dentry 2022-10-24 15:28:58 +02:00
extent_io.c btrfs: do not wait for short bulk allocation 2024-05-17 11:56:23 +02:00
extent_io.h btrfs: move extent io tree unrelated prototypes to their appropriate header 2022-09-26 12:28:04 +02:00
extent_map.c btrfs: fix incorrect splitting in btrfs_drop_extent_map_range 2023-08-23 17:52:31 +02:00
extent_map.h btrfs: get the next extent map during fiemap/lseek more efficiently 2023-04-26 14:28:38 +02:00
extent-io-tree.c btrfs: fix off-by-one in delalloc search during lseek 2023-01-12 12:01:56 +01:00
extent-io-tree.h btrfs: stop tracking failed reads in the I/O tree 2022-09-26 12:28:05 +02:00
extent-tree.c btrfs: zoned: optimize hint byte for zoned allocator 2024-01-31 16:17:10 -08:00
file-item.c btrfs: mark the len field in struct btrfs_ordered_sum as unsigned 2024-01-10 17:10:35 +01:00
file.c btrfs: fix qgroup_free_reserved_data int overflow 2024-01-10 17:10:35 +01:00
free-space-cache.c btrfs: zoned: no longer count fresh BG region as zone unusable 2024-01-01 12:39:06 +00:00
free-space-cache.h btrfs: remove use btrfs_remove_free_space_cache instead of variant 2022-09-26 12:27:58 +02:00
free-space-tree.c btrfs: convert btrfs_block_group::needs_free_space to runtime flag 2023-08-23 17:52:28 +02:00
free-space-tree.h btrfs: make clear_cache mount option to rebuild FST without disabling it 2023-05-17 11:53:42 +02:00
inode-item.c btrfs: use struct fscrypt_str instead of struct qstr 2023-10-10 22:00:36 +02:00
inode-item.h btrfs: use struct fscrypt_str instead of struct qstr 2023-10-10 22:00:36 +02:00
inode.c btrfs: make btrfs_clear_delalloc_extent() free delalloc reserve 2024-05-17 11:56:06 +02:00
ioctl.c btrfs: fix double free of anonymous device after snapshot creation failure 2024-03-06 14:45:10 +00:00
Kconfig btrfs: use generic Kconfig option for 256kB page size limit 2022-01-20 08:52:55 +02:00
locking.c btrfs: add block-group tree to lockdep classes 2023-07-19 16:22:13 +02:00
locking.h btrfs: implement a nowait option for tree searches 2022-09-26 12:46:42 +02:00
lzo.c btrfs: replace kmap() with kmap_local_page() in lzo.c 2022-07-25 17:45:33 +02:00
Makefile btrfs: move extent state init and alloc functions to their own file 2022-09-26 12:28:03 +02:00
misc.h btrfs: convert the io_failure_tree to a plain rb_tree 2022-09-26 12:28:02 +02:00
ordered-data.c btrfs: fix qgroup_free_reserved_data int overflow 2024-01-10 17:10:35 +01:00
ordered-data.h btrfs: mark the len field in struct btrfs_ordered_sum as unsigned 2024-01-10 17:10:35 +01:00
orphan.c
print-tree.c btrfs: print-tree: parent bytenr must be aligned to sector size 2023-05-17 11:53:42 +02:00
print-tree.h
props.c btrfs: remove the unnecessary result variables 2022-09-26 12:28:00 +02:00
props.h btrfs: move common inode creation code into btrfs_create_new_inode() 2022-05-16 17:03:08 +02:00
qgroup.c btrfs: qgroup: correctly model root qgroup rsv in convert 2024-04-17 11:18:26 +02:00
qgroup.h btrfs: fix qgroup_free_reserved_data int overflow 2024-01-10 17:10:35 +01:00
raid56.c btrfs: raid56: avoid double freeing for rbio if full_stripe_write() failed 2022-10-24 15:26:56 +02:00
raid56.h btrfs: properly abstract the parity raid bio handling 2022-09-26 12:27:59 +02:00
rcu-string.h btrfs: replace strncpy() with strscpy() 2023-01-12 12:01:55 +01:00
ref-verify.c btrfs: ref-verify: free ref cache before clearing mount opt 2024-01-31 16:17:07 -08:00
ref-verify.h
reflink.c btrfs: replace delete argument with EXTENT_CLEAR_ALL_BITS 2022-09-26 12:28:05 +02:00
reflink.h
relocation.c btrfs: set page extent mapped after read_folio in relocate_one_page 2023-09-19 12:28:06 +02:00
root-tree.c btrfs: use struct fscrypt_str instead of struct qstr 2023-10-10 22:00:36 +02:00
scrub.c btrfs: zoned: use zone aware sb location for scrub 2024-04-03 15:19:48 +02:00
send.c btrfs: fix kvcalloc() arguments order in btrfs_ioctl_send() 2024-05-17 11:56:16 +02:00
send.h btrfs: send: allow protocol version 3 with CONFIG_BTRFS_DEBUG 2022-10-11 14:46:55 +02:00
space-info.c btrfs: fix data races when accessing the reserved amount of block reserves 2024-03-26 18:20:26 -04:00
space-info.h btrfs: move btrfs_init_async_reclaim_work prototype to space-info.h 2022-09-26 12:28:06 +02:00
struct-funcs.c btrfs: remove redundant check in up check_setget_bounds 2022-07-25 17:45:33 +02:00
subpage.c btrfs: convert process_page_range() to use filemap_get_folios_contig() 2022-09-11 20:26:03 -07:00
subpage.h btrfs: make nodesize >= PAGE_SIZE case to reuse the non-subpage routine 2022-05-16 17:03:11 +02:00
super.c btrfs: add dmesg output for first mount and last unmount of a filesystem 2023-12-08 08:51:16 +01:00
sysfs.c btrfs: sysfs: validate scrub_speed_max value 2024-01-31 16:16:58 -08:00
sysfs.h
transaction.c btrfs: always clear PERTRANS metadata during commit 2024-05-17 11:56:06 +02:00
transaction.h btrfs: pass btrfs_fs_info for deleting snapshots and cleaner 2022-03-14 13:13:52 +01:00
tree-checker.c btrfs: tree-checker: fix inline ref size in error messages 2024-01-31 16:17:07 -08:00
tree-checker.h btrfs: tree-checker: check extent buffer owner against owner rootid 2022-05-16 17:03:09 +02:00
tree-defrag.c btrfs: move the auto defrag code to defrag.c 2023-02-22 12:59:40 +01:00
tree-log.c btrfs: initialize start_slot in btrfs_log_prealloc_extents 2023-10-25 12:03:09 +02:00
tree-log.h btrfs: use struct fscrypt_str instead of struct qstr 2023-10-10 22:00:36 +02:00
tree-mod-log.c btrfs: fix race when picking most recent mod log operation for an old root 2021-04-20 19:27:17 +02:00
tree-mod-log.h
ulist.c
ulist.h
uuid-tree.c btrfs: drop the _nr from the item helpers 2022-01-03 15:09:43 +01:00
verity.c btrfs: send: add support for fs-verity 2022-09-26 12:27:55 +02:00
volumes.c btrfs: add missing mutex_unlock in btrfs_relocate_sys_chunks() 2024-05-17 11:56:19 +02:00
volumes.h btrfs: add a helper to read the superblock metadata_uuid 2023-09-23 11:11:08 +02:00
xattr.c btrfs: check if root is readonly while setting security xattr 2022-08-22 18:06:30 +02:00
xattr.h
zlib.c btrfs: zlib: zero-initialize zlib workspace 2023-02-14 19:11:40 +01:00
zoned.c btrfs: zoned: no longer count fresh BG region as zone unusable 2024-01-01 12:39:06 +00:00
zoned.h btrfs: zoned: clone zoned device info when cloning a device 2022-11-07 14:35:21 +01:00
zstd.c btrfs: zstd: replace kmap() with kmap_local_page() 2022-07-25 17:45:40 +02:00