linux/fs
Omar Sandoval f63a5b3769 xfs: fix internal error from AGFL exhaustion
We've been seeing XFS errors like the following:

XFS: Internal error i != 1 at line 3526 of file fs/xfs/libxfs/xfs_btree.c.  Caller xfs_btree_insert+0x1ec/0x280
...
Call Trace:
 xfs_corruption_error+0x94/0xa0
 xfs_btree_insert+0x221/0x280
 xfs_alloc_fixup_trees+0x104/0x3e0
 xfs_alloc_ag_vextent_size+0x667/0x820
 xfs_alloc_fix_freelist+0x5d9/0x750
 xfs_free_extent_fix_freelist+0x65/0xa0
 __xfs_free_extent+0x57/0x180
...

This is the XFS_IS_CORRUPT() check in xfs_btree_insert() when
xfs_btree_insrec() fails.

After converting this into a panic and dissecting the core dump, I found
that xfs_btree_insrec() is failing because it's trying to split a leaf
node in the cntbt when the AG free list is empty. In particular, it's
failing to get a block from the AGFL _while trying to refill the AGFL_.

If a single operation splits every level of the bnobt and the cntbt (and
the rmapbt if it is enabled) at once, the free list will be empty. Then,
when the next operation tries to refill the free list, it allocates
space. If the allocation does not use a full extent, it will need to
insert records for the remaining space in the bnobt and cntbt. And if
those new records go in full leaves, the leaves (and potentially more
nodes up to the old root) need to be split.

Fix it by accounting for the additional splits that may be required to
refill the free list in the calculation for the minimum free list size.

P.S. As far as I can tell, this bug has existed for a long time -- maybe
back to xfs-history commit afdf80ae7405 ("Add XFS_AG_MAXLEVELS macros
...") in April 1994! It requires a very unlucky sequence of events, and
in fact we didn't hit it until a particular sparse mmap workload updated
from 5.12 to 5.19. But this bug existed in 5.12, so it must've been
exposed by some other change in allocation or writeback patterns. It's
also much less likely to be hit with the rmapbt enabled, since that
increases the minimum free list size and is unlikely to split at the
same time as the bnobt and cntbt.

Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-11-13 09:11:40 +05:30
..
9p Bunch of small fixes: 2023-11-04 09:20:04 -10:00
adfs adfs: convert to new timestamp accessors 2023-10-18 13:26:18 +02:00
affs vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
afs asm-generic updates for v6.7 2023-11-01 15:28:33 -10:00
autofs vfs-6.7.ctime 2023-10-30 09:47:13 -10:00
bcachefs Second bcachefs pull request for 6.7-rc1 2023-11-07 11:38:38 -08:00
befs vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
bfs bfs: convert to new timestamp accessors 2023-10-18 13:26:19 +02:00
btrfs Many singleton patches against the MM code. The patch series which are 2023-11-02 19:38:47 -10:00
cachefiles - Some swap cleanups from Ma Wupeng ("fix WARN_ON in add_to_avail_list") 2023-08-29 14:25:26 -07:00
ceph Two items: 2023-11-10 09:52:56 -08:00
coda coda: convert to new timestamp accessors 2023-10-18 13:26:19 +02:00
configfs configfs: convert to new timestamp accessors 2023-10-18 13:26:19 +02:00
cramfs vfs-6.7.ctime 2023-10-30 09:47:13 -10:00
crypto This update includes the following changes: 2023-11-02 16:15:30 -10:00
debugfs Driver core changes for 6.7-rc1 2023-11-03 15:15:47 -10:00
devpts devpts: convert to new timestamp accessors 2023-10-18 13:26:20 +02:00
dlm dlm: slow down filling up processing queue 2023-10-12 15:21:00 -05:00
ecryptfs ecryptfs: move ecryptfs_xattr_handlers to .rodata 2023-10-09 16:24:17 +02:00
efivarfs vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
efs vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
erofs vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
exfat exfat: fix ctime is not updated 2023-11-03 22:24:11 +09:00
exportfs fs: fix build error with CONFIG_EXPORTFS=m or not defined 2023-10-28 16:16:19 +02:00
ext2 vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
ext4 vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
f2fs vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
fat vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
freevxfs vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
fscache fscache: Use clear_and_wake_up_bit() in fscache_create_volume_work() 2023-01-30 12:51:54 +00:00
fuse vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
gfs2 gfs2 fixes 2023-11-07 11:54:17 -08:00
hfs vfs-6.7.ctime 2023-10-30 09:47:13 -10:00
hfsplus vfs-6.7.ctime 2023-10-30 09:47:13 -10:00
hostfs hostfs: convert to new timestamp accessors 2023-10-18 14:08:22 +02:00
hpfs hpfs: convert to new timestamp accessors 2023-10-18 14:08:22 +02:00
hugetlbfs vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
iomap Many singleton patches against the MM code. The patch series which are 2023-11-02 19:38:47 -10:00
isofs isofs: convert to new timestamp accessors 2023-10-18 14:08:22 +02:00
jbd2 Many singleton patches against the MM code. The patch series which are 2023-11-02 19:38:47 -10:00
jffs2 vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
jfs vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
kernfs Driver core changes for 6.7-rc1 2023-11-03 15:15:47 -10:00
lockd SUNRPC: change how svc threads are asked to exit. 2023-10-16 12:44:04 -04:00
minix minix: convert to new timestamp accessors 2023-10-18 14:08:23 +02:00
netfs netfs: Only call folio_start_fscache() one time for each folio 2023-09-18 12:03:46 -07:00
nfs NFS client updates for Linux 6.7 2023-11-08 13:39:16 -08:00
nfs_common NFSv4.2: remove MODULE_LICENSE in non-modules 2023-04-13 13:13:52 -07:00
nfsd vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
nilfs2 Many singleton patches against the MM code. The patch series which are 2023-11-02 19:38:47 -10:00
nls nls: Hide new NLS_UCS2_UTILS 2023-08-31 12:07:34 -05:00
notify vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
ntfs vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
ntfs3 vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
ocfs2 As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
omfs omfs: convert to new timestamp accessors 2023-10-18 14:08:25 +02:00
openpromfs openpromfs: convert to new timestamp accessors 2023-10-18 14:08:25 +02:00
orangefs vfs-6.7.ctime 2023-10-30 09:47:13 -10:00
overlayfs vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
proc As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
pstore pstore updates for v6.7-rc1 2023-10-30 19:26:39 -10:00
qnx4 qnx4: convert to new timestamp accessors 2023-10-18 14:08:26 +02:00
qnx6 qnx6: convert to new timestamp accessors 2023-10-18 14:08:26 +02:00
quota Many singleton patches against the MM code. The patch series which are 2023-11-02 19:38:47 -10:00
ramfs ramfs: convert to new timestamp accessors 2023-10-18 14:08:26 +02:00
reiserfs Many singleton patches against the MM code. The patch series which are 2023-11-02 19:38:47 -10:00
romfs vfs-6.7.ctime 2023-10-30 09:47:13 -10:00
smb Sixteen smb3/cifs client fixes 2023-11-11 17:17:22 -08:00
squashfs vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
sysfs kernfs: sysfs: support custom llseek method for sysfs entries 2023-10-05 13:42:11 +02:00
sysv sysv: convert to new timestamp accessors 2023-10-18 14:08:28 +02:00
tracefs Tracing updates for v6.7: 2023-11-03 07:41:18 -10:00
ubifs This pull request contains updates for UBI and UBIFS 2023-11-05 08:28:32 -10:00
udf \n 2023-11-02 08:19:51 -10:00
ufs vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
unicode unicode: remove MODULE_LICENSE in non-modules 2023-04-13 13:13:54 -07:00
vboxsf vboxsf: convert to new timestamp accessors 2023-10-18 14:08:29 +02:00
verity fsverity: skip PKCS#7 parser when keyring is empty 2023-08-20 10:33:43 -07:00
xfs xfs: fix internal error from AGFL exhaustion 2023-11-13 09:11:40 +05:30
zonefs zonefs: convert to new timestamp accessors 2023-10-18 14:08:29 +02:00
aio.c aio: Annotate struct kioctx_table with __counted_by 2023-09-20 14:22:01 +02:00
anon_inodes.c treewide: mark stuff as __ro_after_init 2023-10-18 14:43:23 -07:00
attr.c fs: convert core infrastructure to new timestamp accessors 2023-10-18 13:26:15 +02:00
bad_inode.c fs: convert core infrastructure to new timestamp accessors 2023-10-18 13:26:15 +02:00
binfmt_elf_fdpic.c execve updates for v6.7-rc1 2023-10-30 19:28:19 -10:00
binfmt_elf_test.c
binfmt_elf.c binfmt_elf: Only report padzero() errors when PROT_WRITE 2023-10-03 19:48:44 -07:00
binfmt_flat.c
binfmt_misc.c execve updates for v6.7-rc1 2023-10-30 19:28:19 -10:00
binfmt_script.c
buffer.c As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
char_dev.c As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
compat_binfmt_elf.c
coredump.c v6.5/vfs.misc 2023-06-26 09:50:21 -07:00
d_path.c fs: d_path: include internal.h 2023-05-17 09:16:59 +02:00
dax.c mm: convert DAX lock/unlock page to lock/unlock folio 2023-10-04 10:32:20 -07:00
dcache.c As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
direct-io.c treewide: mark stuff as __ro_after_init 2023-10-18 14:43:23 -07:00
drop_caches.c fs: drop_caches: draining pages before dropping caches 2023-08-18 10:12:11 -07:00
eventfd.c eventfd: prevent underflow for eventfd semaphores 2023-07-11 11:41:34 +02:00
eventpoll.c treewide: mark stuff as __ro_after_init 2023-10-18 14:43:23 -07:00
exec.c mm/mremap: allow moves within the same VMA for stack moves 2023-10-04 10:32:20 -07:00
fcntl.c treewide: mark stuff as __ro_after_init 2023-10-18 14:43:23 -07:00
fhandle.c exportfs: add helpers to check if filesystem can encode/decode file handles 2023-10-24 17:57:45 +02:00
file_table.c As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
file.c file, i915: fix file reference for mmap_singleton() 2023-10-25 22:17:04 +02:00
filesystems.c
fs_context.c fs: factor out vfs_parse_monolithic_sep() helper 2023-10-12 18:53:36 +03:00
fs_parser.c ext4: journal_path mount options should follow links 2022-12-01 10:46:54 -05:00
fs_pin.c
fs_struct.c kill do_each_thread() 2023-08-21 13:46:25 -07:00
fs_types.c
fs-writeback.c vfs-6.7.misc 2023-10-30 09:14:19 -10:00
fsopen.c fsconfig: ensure that dirfd is set to aux 2023-09-22 14:09:06 +02:00
init.c fs: add a new SB_I_NOUMASK flag 2023-10-19 11:02:47 +02:00
inode.c As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
internal.h fs: store real path instead of fake path in backing file f_path 2023-10-19 11:03:15 +02:00
ioctl.c v6.6-vfs.super 2023-08-28 11:04:18 -07:00
Kconfig asm-generic updates for v6.7 2023-11-01 15:28:33 -10:00
Kconfig.binfmt riscv: support the elf-fdpic binfmt loader 2023-08-23 14:17:43 -07:00
kernel_read_file.c fs: Fix kernel-doc warnings 2023-08-19 12:12:12 +02:00
libfs.c vfs-6.7.fsid 2023-11-07 12:11:26 -08:00
locks.c As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
Makefile bcachefs: Initial commit 2023-10-22 17:08:07 -04:00
mbcache.c mbcache: dynamically allocate the mbcache shrinker 2023-10-04 10:32:25 -07:00
mnt_idmapping.c fs: export mnt_idmap_get/mnt_idmap_put 2023-11-03 23:28:33 +01:00
mount.h
mpage.c buffer: remove folio_create_empty_buffers() 2023-10-25 16:47:10 -07:00
namei.c vfs-6.7.misc 2023-10-30 09:14:19 -10:00
namespace.c As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
nsfs.c fs: convert core infrastructure to new timestamp accessors 2023-10-18 13:26:15 +02:00
open.c fs: store real path instead of fake path in backing file f_path 2023-10-19 11:03:15 +02:00
pipe.c As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
pnode.c fs: allow to mount beneath top mount 2023-05-19 04:30:22 +02:00
pnode.h fs: allow to mount beneath top mount 2023-05-19 04:30:22 +02:00
posix_acl.c fs: convert to ctime accessor functions 2023-07-13 10:28:04 +02:00
proc_namespace.c tty, proc, kernfs, random: Use copy_splice_read() 2023-05-24 08:42:16 -06:00
read_write.c fs: Fix one kernel-doc comment 2023-08-15 08:32:45 +02:00
readdir.c vfs: get rid of old '->iterate' directory operation 2023-08-06 15:08:35 +02:00
remap_range.c fs: use UB-safe check for signed addition overflow in remap_verify_area 2023-05-24 11:03:59 +02:00
select.c
seq_file.c use less confusing names for iov_iter direction initializers 2022-11-25 13:01:55 -05:00
signalfd.c
splice.c - Some swap cleanups from Ma Wupeng ("fix WARN_ON in add_to_avail_list") 2023-08-29 14:25:26 -07:00
stack.c fs: convert core infrastructure to new timestamp accessors 2023-10-18 13:26:15 +02:00
stat.c fs: convert core infrastructure to new timestamp accessors 2023-10-18 13:26:15 +02:00
statfs.c statfs: enforce statfs[64] structure initialization 2023-05-17 15:20:17 +02:00
super.c overlayfs update for 6.7-rc1 2023-11-07 11:46:31 -08:00
sync.c
sysctls.c sysctl: Refactor base paths registrations 2023-05-23 21:43:26 -07:00
timerfd.c
userfaultfd.c As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
utimes.c fs.idmapped.v6.3 2023-02-20 11:53:11 -08:00
xattr.c xattr: make the xattr array itself const 2023-10-09 16:24:16 +02:00