linux/fs/ext4
Theodore Ts'o 80fa46d6b9 ext4: limit the number of retries after discarding preallocations blocks
This patch avoids threads live-locking for hours when a large number
threads are competing over the last few free extents as they blocks
getting added and removed from preallocation pools.  From our bug
reporter:

   A reliable way for triggering this has multiple writers
   continuously write() to files when the filesystem is full, while
   small amounts of space are freed (e.g. by truncating a large file
   -1MiB at a time). In the local filesystem, this can be done by
   simply not checking the return code of write (0) and/or the error
   (ENOSPACE) that is set. Over NFS with an async mount, even clients
   with proper error checking will behave this way since the linux NFS
   client implementation will not propagate the server errors [the
   write syscalls immediately return success] until the file handle is
   closed. This leads to a situation where NFS clients send a
   continuous stream of WRITE rpcs which result in ERRNOSPACE -- but
   since the client isn't seeing this, the stream of writes continues
   at maximum network speed.

   When some space does appear, multiple writers will all attempt to
   claim it for their current write. For NFS, we may see dozens to
   hundreds of threads that do this.

   The real-world scenario of this is database backup tooling (in
   particular, github.com/mdkent/percona-xtrabackup) which may write
   large files (>1TiB) to NFS for safe keeping. Some temporary files
   are written, rewound, and read back -- all before closing the file
   handle (the temp file is actually unlinked, to trigger automatic
   deletion on close/crash.) An application like this operating on an
   async NFS mount will not see an error code until TiB have been
   written/read.

   The lockup was observed when running this database backup on large
   filesystems (64 TiB in this case) with a high number of block
   groups and no free space. Fragmentation is generally not a factor
   in this filesystem (~thousands of large files, mostly contiguous
   except for the parts written while the filesystem is at capacity.)

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cc: stable@kernel.org
2022-09-22 10:51:19 -04:00
..
.kunitconfig ext4: add .kunitconfig fragment to enable ext4-specific tests 2021-02-11 23:16:30 -05:00
acl.c fs/ext4: fix comments mentioning i_mutex 2022-02-03 10:57:53 -05:00
acl.h vfs: add rcu argument to ->get_acl() callback 2021-08-18 22:08:24 +02:00
balloc.c ext4: use ext4_debug() instead of jbd_debug() 2022-08-02 23:52:19 -04:00
bitmap.c
block_validity.c ext4: add ext4_sb_block_valid() refactored out of ext4_inode_block_valid() 2022-02-25 21:34:56 -05:00
crypto.c ext4: refactor and move ext4_ioctl_get_encryption_pwsalt() 2022-05-21 22:24:24 -04:00
dir.c ext4: fix spelling errors in comments 2022-05-11 15:19:06 -04:00
ext4_extents.h ext4: fix sparse warnings 2021-08-30 23:36:50 -04:00
ext4_jbd2.c ext4: use ext4_debug() instead of jbd_debug() 2022-08-02 23:52:19 -04:00
ext4_jbd2.h fs/ext4: fix comments mentioning i_mutex 2022-02-03 10:57:53 -05:00
ext4.h ext4: use buckets for cr 1 block scan instead of rbtree 2022-09-21 22:12:03 -04:00
extents_status.c mm: shrinkers: provide shrinkers with names 2022-07-03 18:08:40 -07:00
extents_status.h ext4: fix extent_status trace points 2020-01-25 02:03:03 -05:00
extents.c ext4: fix bug in extents parsing when eh_entries == 0 and eh_depth > 0 2022-09-22 10:50:54 -04:00
fast_commit.c Add new ioctls to set and get the file system UUID in the ext4 2022-08-04 20:13:46 -07:00
fast_commit.h flexible-array transformations for 5.18-rc1 2022-03-24 11:39:32 -07:00
file.c iomap: add per-iomap_iter private data 2022-05-16 17:17:32 +02:00
fsmap.c treewide: Change list_sort to use const pointers 2021-04-08 16:04:22 -07:00
fsmap.h ext4: fsmap: fix the block/inode bitmap comment 2021-06-24 09:48:29 -04:00
fsync.c block: use an on-stack bio in blkdev_issue_flush 2021-01-27 09:51:48 -07:00
hash.c unicode: clean up the Kconfig symbol confusion 2022-01-20 19:57:24 -05:00
ialloc.c ext4: make directory inode spreading reflect flexbg size 2022-09-21 22:11:55 -04:00
indirect.c ext4: use ext4_debug() instead of jbd_debug() 2022-08-02 23:52:19 -04:00
inline.c ext4: correct max_inline_xattr_value_size computing 2022-08-02 23:52:44 -04:00
inode-test.c fs: ext4: Modify inode-test.c to use KUnit parameterized testing feature 2020-12-02 16:07:25 -07:00
inode.c Add new ioctls to set and get the file system UUID in the ext4 2022-08-04 20:13:46 -07:00
ioctl.c ext4: add ioctls to get/set the ext4 superblock uuid 2022-08-02 23:56:26 -04:00
Kconfig ext: EXT4_KUNIT_TESTS should depend on EXT4_FS instead of selecting it 2021-02-11 23:12:59 -05:00
Makefile ext4: move ext4 crypto code to its own file crypto.c 2022-05-21 22:24:24 -04:00
mballoc.c ext4: limit the number of retries after discarding preallocations blocks 2022-09-22 10:51:19 -04:00
mballoc.h ext4: use buckets for cr 1 block scan instead of rbtree 2022-09-21 22:12:03 -04:00
migrate.c ext4: recover csum seed of tmp_inode after migrating to extents 2022-08-02 23:56:02 -04:00
mmp.c fs/buffer: Combine two submit_bh() and ll_rw_block() arguments 2022-07-14 12:14:32 -06:00
move_extent.c ext4: Convert ext4 to read_folio 2022-05-09 16:21:45 -04:00
namei.c ext4: make sure ext4_append() always allocates new block 2022-08-02 23:56:17 -04:00
orphan.c ext4: use ext4_debug() instead of jbd_debug() 2022-08-02 23:52:19 -04:00
page-io.c ext4: fix incorrect comment in ext4_bio_write_page() 2022-06-16 11:03:16 -04:00
readpage.c fs: Convert block_read_full_page() to block_read_full_folio() 2022-05-09 16:21:44 -04:00
resize.c ext4: avoid resizing to a partial cluster size 2022-08-02 23:56:26 -04:00
super.c - The usual batches of cleanups from Baoquan He, Muchun Song, Miaohe 2022-08-05 16:32:45 -07:00
symlink.c ext4: fix reading leftover inlined symlinks 2022-08-02 23:37:50 -04:00
sysfs.c unicode: clean up the Kconfig symbol confusion 2022-01-20 19:57:24 -05:00
truncate.h ext4: Convert to use mapping->invalidate_lock 2021-07-13 14:29:00 +02:00
verity.c ext4: Call aops write_begin() and write_end() directly 2022-05-08 14:45:56 -04:00
xattr_hurd.c acl: handle idmapped mounts 2021-01-24 14:27:17 +01:00
xattr_security.c acl: handle idmapped mounts 2021-01-24 14:27:17 +01:00
xattr_trusted.c acl: handle idmapped mounts 2021-01-24 14:27:17 +01:00
xattr_user.c acl: handle idmapped mounts 2021-01-24 14:27:17 +01:00
xattr.c ext4: fix race when reusing xattr blocks 2022-08-02 23:56:25 -04:00
xattr.h ext4: remove EA inode entry from mbcache on inode eviction 2022-08-02 23:56:25 -04:00