linux/fs
Zhihao Cheng 68f4c6eba7 fs-writeback: writeback_sb_inodes:Recalculate 'wrote' according skipped pages
Commit 505a666ee3 ("writeback: plug writeback in wb_writeback() and
writeback_inodes_wb()") has us holding a plug during wb_writeback, which
may cause a potential ABBA dead lock:

    wb_writeback		fat_file_fsync
blk_start_plug(&plug)
for (;;) {
  iter i-1: some reqs have been added into plug->mq_list  // LOCK A
  iter i:
    progress = __writeback_inodes_wb(wb, work)
    . writeback_sb_inodes // fat's bdev
    .   __writeback_single_inode
    .   . generic_writepages
    .   .   __block_write_full_page
    .   .   . . 	    __generic_file_fsync
    .   .   . . 	      sync_inode_metadata
    .   .   . . 	        writeback_single_inode
    .   .   . . 		  __writeback_single_inode
    .   .   . . 		    fat_write_inode
    .   .   . . 		      __fat_write_inode
    .   .   . . 		        sync_dirty_buffer	// fat's bdev
    .   .   . . 			  lock_buffer(bh)	// LOCK B
    .   .   . . 			    submit_bh
    .   .   . . 			      blk_mq_get_tag	// LOCK A
    .   .   . trylock_buffer(bh)  // LOCK B
    .   .   .   redirty_page_for_writepage
    .   .   .     wbc->pages_skipped++
    .   .   --wbc->nr_to_write
    .   wrote += write_chunk - wbc.nr_to_write  // wrote > 0
    .   requeue_inode
    .     redirty_tail_locked
    if (progress)    // progress > 0
      continue;
  iter i+1:
      queue_io
      // similar process with iter i, infinite for-loop !
}
blk_finish_plug(&plug)   // flush plug won't be called

Above process triggers a hungtask like:
[  399.044861] INFO: task bb:2607 blocked for more than 30 seconds.
[  399.046824]       Not tainted 5.18.0-rc1-00005-gefae4d9eb6a2-dirty
[  399.051539] task:bb              state:D stack:    0 pid: 2607 ppid:
2426 flags:0x00004000
[  399.051556] Call Trace:
[  399.051570]  __schedule+0x480/0x1050
[  399.051592]  schedule+0x92/0x1a0
[  399.051602]  io_schedule+0x22/0x50
[  399.051613]  blk_mq_get_tag+0x1d3/0x3c0
[  399.051640]  __blk_mq_alloc_requests+0x21d/0x3f0
[  399.051657]  blk_mq_submit_bio+0x68d/0xca0
[  399.051674]  __submit_bio+0x1b5/0x2d0
[  399.051708]  submit_bio_noacct+0x34e/0x720
[  399.051718]  submit_bio+0x3b/0x150
[  399.051725]  submit_bh_wbc+0x161/0x230
[  399.051734]  __sync_dirty_buffer+0xd1/0x420
[  399.051744]  sync_dirty_buffer+0x17/0x20
[  399.051750]  __fat_write_inode+0x289/0x310
[  399.051766]  fat_write_inode+0x2a/0xa0
[  399.051783]  __writeback_single_inode+0x53c/0x6f0
[  399.051795]  writeback_single_inode+0x145/0x200
[  399.051803]  sync_inode_metadata+0x45/0x70
[  399.051856]  __generic_file_fsync+0xa3/0x150
[  399.051880]  fat_file_fsync+0x1d/0x80
[  399.051895]  vfs_fsync_range+0x40/0xb0
[  399.051929]  __x64_sys_fsync+0x18/0x30

In my test, 'need_resched()' (which is imported by 590dca3a71 "fs-writeback:
unplug before cond_resched in writeback_sb_inodes") in function
'writeback_sb_inodes()' seldom comes true, unless cond_resched() is deleted
from write_cache_pages().

Fix it by correcting wrote number according number of skipped pages
in writeback_sb_inodes().

Goto Link to find a reproducer.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215837
Cc: stable@vger.kernel.org # v4.3
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220510133805.1988292-1-chengzhihao1@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-05-19 06:29:41 -06:00
..
9p Netfs prep for write helpers 2022-03-31 15:49:36 -07:00
adfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
affs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
afs fscache: Remove the cookie parameter from fscache_clear_page_bits() 2022-04-08 23:54:37 +01:00
autofs
befs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
bfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
btrfs for-5.18-rc5-tag 2022-05-06 14:32:16 -07:00
cachefiles cachefiles: Fix KASAN slab-out-of-bounds in cachefiles_set_volume_xattr 2022-04-08 23:32:40 +01:00
ceph ceph: check folio PG_private bit instead of folio->private 2022-05-10 09:48:31 +02:00
cifs cifs: destage any unwritten data to the server before calling copychunk_write 2022-04-20 22:54:54 -05:00
coda Folio changes for 5.18 2022-03-22 17:03:12 -07:00
configfs configfs: fix a race in configfs_{,un}register_subsystem() 2022-02-22 18:30:28 +01:00
cramfs
crypto fs: Remove ->readpages address space operation 2022-04-01 13:45:33 -04:00
debugfs debugfs: Document that debugfs_create functions need not be error checked 2022-02-25 11:56:13 +01:00
devpts fsnotify: fix fsnotify hooks in pseudo filesystems 2022-01-24 14:17:02 +01:00
dlm driver core changes for 5.17-rc1 2022-01-12 11:11:34 -08:00
ecryptfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
efivarfs
efs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
erofs erofs: fix use-after-free of on-stack io[] 2022-04-15 23:51:43 +08:00
exfat Description for this pull request: 2022-04-01 14:20:24 -07:00
exportfs
ext2 \n 2022-03-25 17:38:15 -07:00
ext4 Fix some syzbot-detected bugs, as well as other bugs found by I/O 2022-04-22 18:18:27 -07:00
f2fs f2fs: should not truncate blocks during roll-forward recovery 2022-04-21 18:57:09 -07:00
fat Merge branch 'akpm' (patches from Andrew) 2022-03-24 14:14:07 -07:00
freevxfs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
fscache fscache: remove FSCACHE_OLD_API Kconfig option 2022-04-08 23:54:37 +01:00
fuse fs: Remove ->readpages address space operation 2022-04-01 13:45:33 -04:00
gfs2 gfs2: Stop using glock holder auto-demotion for now 2022-05-13 22:32:52 +02:00
hfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
hfsplus Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
hostfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
hpfs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
hugetlbfs mm, hugetlb: allow for "high" userspace addresses 2022-04-21 20:01:09 -07:00
iomap iomap: Simplify is_partially_uptodate a little 2022-04-01 14:40:43 -04:00
isofs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
jbd2 Fix some syzbot-detected bugs, as well as other bugs found by I/O 2022-04-22 18:18:27 -07:00
jffs2 This pull request contains fixes for JFFS2, UBI and UBIFS 2022-03-31 16:09:41 -07:00
jfs A couple bug fixes 2022-03-29 18:17:30 -07:00
kernfs kernfs: fix NULL dereferencing in kernfs_remove 2022-04-27 19:32:07 +02:00
ksmbd ksmbd: set fixed sector size to FS_SECTOR_SIZE_INFORMATION 2022-04-14 20:56:13 -05:00
lockd NFSD: Move svc_serv_ops::svo_function into struct svc_serv 2022-02-28 10:26:40 -05:00
minix Merge branch 'akpm' (patches from Andrew) 2022-03-24 14:14:07 -07:00
netfs netfs: Split some core bits out into their own file 2022-03-18 09:29:05 +00:00
nfs nfs: fix broken handling of the softreval mount option 2022-05-09 13:02:54 -04:00
nfs_common
nfsd NFSD bug fixes for 5.18-rc: 2022-04-12 14:23:19 -10:00
nilfs2 nilfs2: get rid of nilfs_mapping_init() 2022-04-01 11:46:09 -07:00
nls
notify fanotify: do not allow setting dirent events in mask of non-dir 2022-05-09 11:49:09 +02:00
ntfs ntfs: Correct mark_ntfs_record_dirty() folio conversion 2022-04-01 14:40:44 -04:00
ntfs3 Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
ocfs2 ocfs2: fix crash when mount with quota enabled 2022-04-01 11:46:09 -07:00
omfs fs: Convert __set_page_dirty_buffers to block_dirty_folio 2022-03-16 13:37:04 -04:00
openpromfs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
orangefs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
overlayfs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
proc procfs: prevent unprivileged processes accessing fdinfo dir 2022-05-09 17:34:28 -07:00
pstore pstore: Don't use semaphores in always-atomic-context code 2022-03-15 11:08:23 -07:00
qnx4 fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
qnx6 fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
quota quota: make dquot_quota_sync return errors from ->sync_fs 2022-01-30 08:59:47 -08:00
ramfs
reiserfs \n 2022-03-25 17:38:15 -07:00
romfs fs: allocate inode by using alloc_inode_sb() 2022-03-22 15:57:03 -07:00
smbfs_common smb3: fix ksmbd bigendian bug in oplock break, and move its struct to smbfs_common 2022-03-31 09:38:53 -05:00
squashfs Merge branch 'akpm' (patches from Andrew) 2022-03-22 16:11:53 -07:00
sysfs kobject: kobj_type: remove default_attrs 2022-04-05 15:39:19 +02:00
sysv Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
tracefs tracefs: Set the group ownership in apply_options() not parse_options() 2022-02-25 21:05:04 -05:00
ubifs This pull request contains fixes for JFFS2, UBI and UBIFS 2022-03-31 16:09:41 -07:00
udf udf: Avoid using stale lengthOfImpUse 2022-05-10 13:30:32 +02:00
ufs Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
unicode kbuild: unify cmd_copy and cmd_shipped 2022-02-14 10:37:32 +09:00
vboxsf Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
verity fs: Remove ->readpages address space operation 2022-04-01 13:45:33 -04:00
xfs xfs: reorder iunlink remove operation in xfs_ifree 2022-04-21 08:45:16 +10:00
zonefs zonefs: Fix management of open zones 2022-04-21 08:39:20 +09:00
aio.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2022-04-01 19:57:03 -07:00
anon_inodes.c
attr.c
bad_inode.c
binfmt_aout.c
binfmt_elf_fdpic.c coredump: Snapshot the vmas in do_coredump 2022-03-08 12:55:29 -06:00
binfmt_elf_test.c binfmt_elf: Introduce KUnit test 2022-03-03 20:38:56 -08:00
binfmt_elf.c revert "fs/binfmt_elf: use PT_LOAD p_align values for static PIE" 2022-04-15 14:49:56 -07:00
binfmt_flat.c coredump: Don't compile flat_core_dump when coredumps are disabled 2022-03-09 10:37:07 -06:00
binfmt_misc.c Fix regression due to "fs: move binfmt_misc sysctl to its own file" 2022-02-09 09:50:02 -08:00
binfmt_script.c
buffer.c filemap: Remove AOP_FLAG_CONT_EXPAND 2022-04-01 14:40:44 -04:00
char_dev.c
compat_binfmt_elf.c binfmt_elf: Introduce KUnit test 2022-03-03 20:38:56 -08:00
coredump.c ptrace: Cleanups for v5.18 2022-03-28 17:29:53 -07:00
d_path.c
dax.c dax for 5.18 2022-03-24 18:12:09 -07:00
dcache.c mm: dcache: use kmem_cache_alloc_lru() to allocate dentry 2022-03-22 15:57:03 -07:00
direct-io.c block: remove the per-bio/request write hint 2022-03-07 12:45:57 -07:00
drop_caches.c
eventfd.c
eventpoll.c eventpoll: simplify sysctl declaration with register_sysctl() 2022-01-22 08:33:35 +02:00
exec.c ptrace: Cleanups for v5.18 2022-03-28 17:29:53 -07:00
fcntl.c fs: remove fs.f_write_hint 2022-03-08 17:55:03 -07:00
fhandle.c
file_table.c SUNRPC: Ensure we flush any closed sockets before xs_xprt_free() 2022-04-07 16:19:47 -04:00
file.c fs: fix fd table size alignment properly 2022-03-29 23:29:18 -07:00
filesystems.c
fs_context.c vfs: fs_context: fix up param length parsing in legacy_parse_param 2022-01-18 09:23:19 +02:00
fs_parser.c
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c fs-writeback: writeback_sb_inodes:Recalculate 'wrote' according skipped pages 2022-05-19 06:29:41 -06:00
fsopen.c
init.c
inode.c fs: introduce alloc_inode_sb() to allocate filesystems specific inode 2022-03-22 15:57:03 -07:00
internal.h Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2022-04-01 19:57:03 -07:00
io_uring.c io_uring: assign non-fixed early for async work 2022-05-02 08:09:39 -06:00
io-wq.c ptrace: Cleanups for v5.18 2022-03-28 17:29:53 -07:00
io-wq.h io_uring: stop using io_wq_work as an fd placeholder 2022-04-11 17:06:20 -06:00
ioctl.c Fixes for 5.18-rc1: 2022-04-01 19:35:56 -07:00
Kconfig Folio changes for 5.18 2022-03-22 17:03:12 -07:00
Kconfig.binfmt execve updates for v5.18-rc1 2022-03-21 19:16:02 -07:00
kernel_read_file.c
libfs.c fs: Convert __set_page_dirty_no_writeback to noop_dirty_folio 2022-03-16 13:37:05 -04:00
locks.c fs: move locking sysctls where they are used 2022-01-22 08:33:36 +02:00
Makefile Fix from Christoph Hellwig merging the CONFIG_UNICODE_UTF8_DATA into the 2022-02-01 11:13:24 -08:00
mbcache.c
mount.h
mpage.c for-5.18/alloc-cleanups-2022-03-25 2022-03-26 11:59:30 -07:00
namei.c VFS: filename_create(): fix incorrect intent. 2022-04-14 15:53:43 -07:00
namespace.c fs: unset MNT_WRITE_HOLD on failure 2022-04-21 17:57:37 +02:00
no-block.c
nsfs.c
open.c fs: remove fs.f_write_hint 2022-03-08 17:55:03 -07:00
pipe.c Revert "fs/pipe: use kvcalloc to allocate a pipe_buffer array" 2022-04-20 12:07:53 -07:00
pnode.c
pnode.h
posix_acl.c fs: fix acl translation 2022-04-19 10:19:02 -07:00
proc_namespace.c
read_write.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2022-04-01 19:57:03 -07:00
readdir.c
remap_range.c Filesystem folio changes for 5.18 2022-03-22 18:26:56 -07:00
select.c select: Fix indefinitely sleeping task in poll_schedule_timeout() 2022-01-11 09:03:05 -08:00
seq_file.c seq_file: fix NULL pointer arithmetic warning 2022-02-01 11:31:55 -05:00
signalfd.c Merge branch 'signal-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2022-01-17 05:49:30 +02:00
splice.c mm: Convert remove_mapping() to take a folio 2022-03-21 12:59:01 -04:00
stack.c
stat.c stat: fix inconsistency between struct stat and struct compat_stat 2022-04-12 13:35:08 -10:00
statfs.c
super.c vfs: make freeze_super abort when sync_filesystem returns error 2022-01-30 08:59:47 -08:00
sync.c vfs: make sync_filesystem return errors from ->sync_fs 2022-01-30 08:59:47 -08:00
sysctls.c fs: move namespace sysctls and declare fs base directory 2022-01-22 08:33:36 +02:00
timerfd.c
userfaultfd.c userfaultfd: provide unmasked address on page-fault 2022-03-22 15:57:08 -07:00
utimes.c
xattr.c fs: fix acl translation 2022-04-19 10:19:02 -07:00