linux/fs
Darrick J. Wong 86d40f1e49 xfs: purge dquots after inode walk fails during quotacheck
xfs/434 and xfs/436 have been reporting occasional memory leaks of
xfs_dquot objects.  These tests themselves were the messenger, not the
culprit, since they unload the xfs module, which trips the slub
debugging code while tearing down all the xfs slab caches:

=============================================================================
BUG xfs_dquot (Tainted: G        W        ): Objects remaining in xfs_dquot on __kmem_cache_shutdown()
-----------------------------------------------------------------------------

Slab 0xffffea000606de00 objects=30 used=5 fp=0xffff888181b78a78 flags=0x17ff80000010200(slab|head|node=0|zone=2|lastcpupid=0xfff)
CPU: 0 PID: 3953166 Comm: modprobe Tainted: G        W         5.18.0-rc6-djwx #rc6 d5824be9e46a2393677bda868f9b154d917ca6a7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20171121_152543-x86-ol7-builder-01.us.oracle.com-4.el7.1 04/01/2014

Since we don't generally rmmod the xfs module between fstests, this
means that xfs/434 is really just the canary in the coal mine --
something leaked a dquot, but we don't know who.  After days of pounding
on fstests with kmemleak enabled, I finally got it to spit this out:

unreferenced object 0xffff8880465654c0 (size 536):
  comm "u10:4", pid 88, jiffies 4294935810 (age 29.512s)
  hex dump (first 32 bytes):
    60 4a 56 46 80 88 ff ff 58 ea e4 5c 80 88 ff ff  `JVF....X..\....
    00 e0 52 49 80 88 ff ff 01 00 01 00 00 00 00 00  ..RI............
  backtrace:
    [<ffffffffa0740f6c>] xfs_dquot_alloc+0x2c/0x530 [xfs]
    [<ffffffffa07443df>] xfs_qm_dqread+0x6f/0x330 [xfs]
    [<ffffffffa07462a2>] xfs_qm_dqget+0x132/0x4e0 [xfs]
    [<ffffffffa0756bb0>] xfs_qm_quotacheck_dqadjust+0xa0/0x3e0 [xfs]
    [<ffffffffa075724d>] xfs_qm_dqusage_adjust+0x35d/0x4f0 [xfs]
    [<ffffffffa06c9068>] xfs_iwalk_ag_recs+0x348/0x5d0 [xfs]
    [<ffffffffa06c95d3>] xfs_iwalk_run_callbacks+0x273/0x540 [xfs]
    [<ffffffffa06c9e8d>] xfs_iwalk_ag+0x5ed/0x890 [xfs]
    [<ffffffffa06ca22f>] xfs_iwalk_ag_work+0xff/0x170 [xfs]
    [<ffffffffa06d22c9>] xfs_pwork_work+0x79/0x130 [xfs]
    [<ffffffff81170bb2>] process_one_work+0x672/0x1040
    [<ffffffff81171b1b>] worker_thread+0x59b/0xec0
    [<ffffffff8118711e>] kthread+0x29e/0x340
    [<ffffffff810032bf>] ret_from_fork+0x1f/0x30

Now we know that quotacheck is at fault, but even this report was
canaryish -- it was triggered by xfs/494, which doesn't actually mount
any filesystems.  (kmemleak can be a little slow to notice leaks, even
with fstests repeatedly whacking it to look for them.)  The test that
ran immediately before xfs/494, however, was xfs/117.  The tipoff to
the problem is in this excerpt from dmesg:

XFS (sda4): Quotacheck needed: Please wait.
XFS (sda4): Metadata corruption detected at xfs_dinode_verify.part.0+0xdb/0x7b0 [xfs], inode 0x119 dinode
XFS (sda4): Unmount and run xfs_repair
XFS (sda4): First 128 bytes of corrupted metadata buffer:
00000000: 49 4e 81 a4 03 02 00 00 00 00 00 00 00 00 00 00  IN..............
00000010: 00 00 00 01 00 00 00 00 00 90 57 54 54 1a 4c 68  ..........WTT.Lh
00000020: 81 f9 7d e1 6d ee 16 00 34 bd 7d e1 6d ee 16 00  ..}.m...4.}.m...
00000030: 34 bd 7d e1 6d ee 16 00 00 00 00 00 00 00 00 00  4.}.m...........
00000040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
00000050: 00 00 00 02 00 00 00 00 00 00 00 00 96 80 f3 ab  ................
00000060: ff ff ff ff da 57 7b 11 00 00 00 00 00 00 00 03  .....W{.........
00000070: 00 00 00 01 00 00 00 10 00 00 00 00 00 00 00 08  ................
XFS (sda4): Quotacheck: Unsuccessful (Error -117): Disabling quotas.

The dinode verifier decided that the inode was corrupt, which caused
iget to return EFSCORRUPTED.  Since this happened during quotacheck,
the kernel aborted the inode walk on account of the corruption error
and disabled quotas.  Unfortunately, we neglect to purge the dquot
cache before doing that, which is how the dquots leaked.
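
The leak described above can be modelled in user space.  The sketch below
is not the kernel patch: every name in it (model_dquot, model_cache,
model_dqget, model_dqpurge_all, model_iwalk, model_quotacheck) is a
hypothetical stand-in, and the "cache" is just a linked list.  It assumes
only what the text says: the walk allocates dquots as it goes, one inode
fails verification with EFSCORRUPTED, and the error path either purges the
cache or does not.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#ifndef EFSCORRUPTED
#define EFSCORRUPTED 117	/* matches "Error -117" in the dmesg excerpt */
#endif

/* Hypothetical stand-in for a cached dquot. */
struct model_dquot {
	struct model_dquot *next;
	unsigned int id;
};

/* Hypothetical stand-in for the per-mount dquot cache. */
struct model_cache {
	struct model_dquot *head;
	int live;		/* objects allocated and not yet freed */
};

/* Allocate a dquot and attach it to the cache. */
static int model_dqget(struct model_cache *c, unsigned int id)
{
	struct model_dquot *dq = malloc(sizeof(*dq));

	if (!dq)
		return -ENOMEM;
	dq->id = id;
	dq->next = c->head;
	c->head = dq;
	c->live++;
	return 0;
}

/* Free everything still attached to the cache. */
static void model_dqpurge_all(struct model_cache *c)
{
	while (c->head) {
		struct model_dquot *dq = c->head;

		c->head = dq->next;
		free(dq);
		c->live--;
	}
	assert(c->live == 0);
}

/* Walk inodes 0..ninodes-1; inode bad_ino fails verification. */
static int model_iwalk(struct model_cache *c, int ninodes, int bad_ino)
{
	for (int ino = 0; ino < ninodes; ino++) {
		int error;

		if (ino == bad_ino)
			return -EFSCORRUPTED;
		error = model_dqget(c, (unsigned int)ino);
		if (error)
			return error;
	}
	return 0;
}

/*
 * Run the walk.  On failure the caller goes on to disable quotas, so
 * anything left in the cache at that point is never freed.
 */
static int model_quotacheck(struct model_cache *c, int ninodes, int bad_ino,
			    int purge_on_error)
{
	int error = model_iwalk(c, ninodes, bad_ino);

	if (error && purge_on_error)
		model_dqpurge_all(c);
	return error;
}
```

In this model, a walk of 8 inodes that fails at inode 5 leaves 5 objects
live when purge_on_error is 0 and none when it is 1.  Both calls return
the same error; only the cleanup differs.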

The problems started 10 years ago in commit b84a3a, when the dquot lists
were converted to a radix tree, but the error handling behavior was not
correctly preserved -- in that commit, if the bulkstat failed and
usrquota was enabled, the bulkstat failure code would be overwritten by
the result of flushing all the dquots to disk.  As long as the flush
succeeded, we'd continue the quota mount as if everything were ok, even
though we were operating with a corrupt inode and incorrect quota usage
counts.  I didn't notice this bug in 2019 when I wrote commit ebd126a,
which changed quotacheck to skip the dqflush when the scan doesn't
complete due to inode walk failures.
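
The error-code clobbering described above comes down to a few lines of
control flow.  The sketch below is a user-space illustration, not code
from either commit.  The function names (overwrite_quotacheck,
skip_flush_quotacheck) and their parameters are hypothetical, and the
walk and flush results are passed in as plain integers so that the two
behaviors can be compared directly.

```c
#include <assert.h>

/*
 * Behavior attributed to b84a3a in the text: with usrquota enabled, the
 * flush result replaces whatever the walk returned.
 */
static int overwrite_quotacheck(int walk_error, int flush_error, int usrquota)
{
	int error = walk_error;

	/* Kernel-style errors are zero or negative. */
	assert(walk_error <= 0 && flush_error <= 0);

	if (usrquota)
		error = flush_error;	/* walk_error is lost here */
	return error;
}

/*
 * Behavior attributed to ebd126a in the text: a failed walk skips the
 * flush entirely, so its error is what the caller sees.
 */
static int skip_flush_quotacheck(int walk_error, int flush_error, int usrquota)
{
	assert(walk_error <= 0 && flush_error <= 0);

	if (walk_error)
		return walk_error;
	return usrquota ? flush_error : 0;
}
```

Given a failed walk (-117) and a successful flush (0) with usrquota
enabled, the first function returns 0 and the mount would proceed, while
the second returns -117.  Neither function models the cache purge, which
is the separate problem this commit addresses.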

Introduced-by: b84a3a9675 ("xfs: remove the per-filesystem list of dquots")
Fixes: ebd126a651 ("xfs: convert quotacheck to use the new iwalk functions")
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2022-05-27 10:21:43 +10:00
9p Netfs prep for write helpers 2022-03-31 15:49:36 -07:00
adfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
affs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
afs Netfs prep for write helpers 2022-03-31 15:49:36 -07:00
autofs
befs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
bfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
btrfs for-5.18-rc1-tag 2022-04-05 08:59:37 -07:00
cachefiles Netfs prep for write helpers 2022-03-31 15:49:36 -07:00
ceph Filesystem/VFS changes for 5.18, part two 2022-04-01 13:50:50 -07:00
cifs cifs: update internal module number 2022-04-04 22:40:14 -05:00
coda Folio changes for 5.18 2022-03-22 17:03:12 -07:00
configfs configfs: fix a race in configfs_{,un}register_subsystem() 2022-02-22 18:30:28 +01:00
cramfs
crypto fs: Remove ->readpages address space operation 2022-04-01 13:45:33 -04:00
debugfs debugfs: Document that debugfs_create functions need not be error checked 2022-02-25 11:56:13 +01:00
devpts fsnotify: fix fsnotify hooks in pseudo filesystems 2022-01-24 14:17:02 +01:00
dlm driver core changes for 5.17-rc1 2022-01-12 11:11:34 -08:00
ecryptfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
efivarfs
efs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
erofs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
exfat Description for this pull request: 2022-04-01 14:20:24 -07:00
exportfs
ext2 \n 2022-03-25 17:38:15 -07:00
ext4 ext4: Correct ext4_journalled_dirty_folio() conversion 2022-04-01 14:40:44 -04:00
f2fs f2fs: Get the superblock from the mapping instead of the page 2022-04-01 14:40:44 -04:00
fat Merge branch 'akpm' (patches from Andrew) 2022-03-24 14:14:07 -07:00
freevxfs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
fscache Netfs prep for write helpers 2022-03-31 15:49:36 -07:00
fuse fs: Remove ->readpages address space operation 2022-04-01 13:45:33 -04:00
gfs2 gfs2 fixes 2022-03-31 15:57:50 -07:00
hfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
hfsplus Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
hostfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
hpfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
hugetlbfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
iomap iomap: Simplify is_partially_uptodate a little 2022-04-01 14:40:43 -04:00
isofs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
jbd2 Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
jffs2 This pull request contains fixes for JFFS2, UBI and UBIFS 2022-03-31 16:09:41 -07:00
jfs A couple bug fixes 2022-03-29 18:17:30 -07:00
kernfs Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2022-04-01 19:57:03 -07:00
ksmbd six ksmbd server fixes 2022-04-01 14:39:28 -07:00
lockd NFSD: Move svc_serv_ops::svo_function into struct svc_serv 2022-02-28 10:26:40 -05:00
minix Merge branch 'akpm' (patches from Andrew) 2022-03-24 14:14:07 -07:00
netfs netfs: Split some core bits out into their own file 2022-03-18 09:29:05 +00:00
nfs NFS client bugfixes for Linux 5.18 2022-04-08 07:39:17 -10:00
nfs_common
nfsd Folio changes for 5.18 2022-03-22 17:03:12 -07:00
nilfs2 nilfs2: get rid of nilfs_mapping_init() 2022-04-01 11:46:09 -07:00
nls
notify fsnotify: remove redundant parameter judgment 2022-03-14 09:05:25 +01:00
ntfs ntfs: Correct mark_ntfs_record_dirty() folio conversion 2022-04-01 14:40:44 -04:00
ntfs3 Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
ocfs2 ocfs2: fix crash when mount with quota enabled 2022-04-01 11:46:09 -07:00
omfs fs: Convert __set_page_dirty_buffers to block_dirty_folio 2022-03-16 13:37:04 -04:00
openpromfs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
orangefs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
overlayfs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
proc Updates to Tracing: 2022-04-03 12:26:01 -07:00
pstore pstore: Don't use semaphores in always-atomic-context code 2022-03-15 11:08:23 -07:00
qnx4 fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
qnx6 fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
quota quota: make dquot_quota_sync return errors from ->sync_fs 2022-01-30 08:59:47 -08:00
ramfs
reiserfs \n 2022-03-25 17:38:15 -07:00
romfs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
smbfs_common smb3: fix ksmbd bigendian bug in oplock break, and move its struct to smbfs_common 2022-03-31 09:38:53 -05:00
squashfs Merge branch 'akpm' (patches from Andrew) 2022-03-22 16:11:53 -07:00
sysfs kobject: kobj_type: remove default_attrs 2022-04-05 15:39:19 +02:00
sysv Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
tracefs tracefs: Set the group ownership in apply_options() not parse_options() 2022-02-25 21:05:04 -05:00
ubifs This pull request contains fixes for JFFS2, UBI and UBIFS 2022-03-31 16:09:41 -07:00
udf \n 2022-03-25 17:38:15 -07:00
ufs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
unicode kbuild: unify cmd_copy and cmd_shipped 2022-02-14 10:37:32 +09:00
vboxsf Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
verity fs: Remove ->readpages address space operation 2022-04-01 13:45:33 -04:00
xfs xfs: purge dquots after inode walk fails during quotacheck 2022-05-27 10:21:43 +10:00
zonefs for-5.18/write-streams-2022-03-18 2022-03-26 11:51:46 -07:00
aio.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2022-04-01 19:57:03 -07:00
anon_inodes.c
attr.c
bad_inode.c
binfmt_aout.c
binfmt_elf_fdpic.c coredump: Snapshot the vmas in do_coredump 2022-03-08 12:55:29 -06:00
binfmt_elf_test.c binfmt_elf: Introduce KUnit test 2022-03-03 20:38:56 -08:00
binfmt_elf.c execve updates for v5.18-rc1 2022-03-21 19:16:02 -07:00
binfmt_flat.c coredump: Don't compile flat_core_dump when coredumps are disabled 2022-03-09 10:37:07 -06:00
binfmt_misc.c Fix regression due to "fs: move binfmt_misc sysctl to its own file" 2022-02-09 09:50:02 -08:00
binfmt_script.c
buffer.c filemap: Remove AOP_FLAG_CONT_EXPAND 2022-04-01 14:40:44 -04:00
char_dev.c
compat_binfmt_elf.c binfmt_elf: Introduce KUnit test 2022-03-03 20:38:56 -08:00
coredump.c ptrace: Cleanups for v5.18 2022-03-28 17:29:53 -07:00
d_path.c
dax.c dax for 5.18 2022-03-24 18:12:09 -07:00
dcache.c mm: dcache: use kmem_cache_alloc_lru() to allocate dentry 2022-03-22 15:57:03 -07:00
direct-io.c block: remove the per-bio/request write hint 2022-03-07 12:45:57 -07:00
drop_caches.c
eventfd.c
eventpoll.c eventpoll: simplify sysctl declaration with register_sysctl() 2022-01-22 08:33:35 +02:00
exec.c ptrace: Cleanups for v5.18 2022-03-28 17:29:53 -07:00
fcntl.c fs: remove fs.f_write_hint 2022-03-08 17:55:03 -07:00
fhandle.c
file_table.c SUNRPC: Ensure we flush any closed sockets before xs_xprt_free() 2022-04-07 16:19:47 -04:00
file.c fs: fix fd table size alignment properly 2022-03-29 23:29:18 -07:00
filesystems.c
fs_context.c vfs: fs_context: fix up param length parsing in legacy_parse_param 2022-01-18 09:23:19 +02:00
fs_parser.c fs_parse: allow parameter value to be empty 2021-12-09 14:09:36 -05:00
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c Merge branch 'akpm' (patches from Andrew) 2022-03-22 16:11:53 -07:00
fsopen.c
init.c
inode.c fs: introduce alloc_inode_sb() to allocate filesystems specific inode 2022-03-22 15:57:03 -07:00
internal.h Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2022-04-01 19:57:03 -07:00
io_uring.c io_uring: fix race between timeout flush and removal 2022-04-08 14:50:05 -06:00
io-wq.c ptrace: Cleanups for v5.18 2022-03-28 17:29:53 -07:00
io-wq.h io_uring: defer file assignment 2022-04-07 11:17:28 -06:00
ioctl.c Fixes for 5.18-rc1: 2022-04-01 19:35:56 -07:00
Kconfig Folio changes for 5.18 2022-03-22 17:03:12 -07:00
Kconfig.binfmt execve updates for v5.18-rc1 2022-03-21 19:16:02 -07:00
kernel_read_file.c
libfs.c fs: Convert __set_page_dirty_no_writeback to noop_dirty_folio 2022-03-16 13:37:05 -04:00
locks.c fs: move locking sysctls where they are used 2022-01-22 08:33:36 +02:00
Makefile Fix from Christoph Hellwig merging the CONFIG_UNICODE_UTF8_DATA into the 2022-02-01 11:13:24 -08:00
mbcache.c
mount.h
mpage.c for-5.18/alloc-cleanups-2022-03-25 2022-03-26 11:59:30 -07:00
namei.c \n 2022-01-28 17:51:31 +02:00
namespace.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2022-04-01 19:57:03 -07:00
no-block.c
nsfs.c
open.c fs: remove fs.f_write_hint 2022-03-08 17:55:03 -07:00
pipe.c fs/pipe.c: local vars have to match types of proper pipe_inode_info fields 2022-03-23 19:00:34 -07:00
pnode.c
pnode.h
posix_acl.c fs: support mapped mounts of mapped filesystems 2021-12-05 10:28:57 +01:00
proc_namespace.c
read_write.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2022-04-01 19:57:03 -07:00
readdir.c
remap_range.c Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
select.c select: Fix indefinitely sleeping task in poll_schedule_timeout() 2022-01-11 09:03:05 -08:00
seq_file.c seq_file: fix NULL pointer arithmetic warning 2022-02-01 11:31:55 -05:00
signalfd.c Merge branch 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2022-01-17 05:49:30 +02:00
splice.c mm: Convert remove_mapping() to take a folio 2022-03-21 12:59:01 -04:00
stack.c
stat.c io-uring: Make statx API stable 2022-03-10 09:33:55 -07:00
statfs.c
super.c vfs: make freeze_super abort when sync_filesystem returns error 2022-01-30 08:59:47 -08:00
sync.c vfs: make sync_filesystem return errors from ->sync_fs 2022-01-30 08:59:47 -08:00
sysctls.c fs: move namespace sysctls and declare fs base directory 2022-01-22 08:33:36 +02:00
timerfd.c
userfaultfd.c userfaultfd: provide unmasked address on page-fault 2022-03-22 15:57:08 -07:00
utimes.c
xattr.c