linux/fs
Ye Bin c0f1db7380 jbd2: fix soft lockup in journal_finish_inode_data_buffers()
[ Upstream commit 6c02757c93 ]

The following soft lockup was observed during an I/O test:
WARN: soft lockup - CPU#45 stuck for 11s! [jbd2/dm-2-8:4170]
CPU: 45 PID: 4170 Comm: jbd2/dm-2-8 Kdump: loaded Tainted: G  OE
Call trace:
 dump_backtrace+0x0/0x1a0
 show_stack+0x24/0x30
 dump_stack+0xb0/0x100
 watchdog_timer_fn+0x254/0x3f8
 __hrtimer_run_queues+0x11c/0x380
 hrtimer_interrupt+0xfc/0x2f8
 arch_timer_handler_phys+0x38/0x58
 handle_percpu_devid_irq+0x90/0x248
 generic_handle_irq+0x3c/0x58
 __handle_domain_irq+0x68/0xc0
 gic_handle_irq+0x90/0x320
 el1_irq+0xcc/0x180
 queued_spin_lock_slowpath+0x1d8/0x320
 jbd2_journal_commit_transaction+0x10f4/0x1c78 [jbd2]
 kjournald2+0xec/0x2f0 [jbd2]
 kthread+0x134/0x138
 ret_from_fork+0x10/0x18

Analysis of the vmcore shows the following:
(1) There are more than 5k jbd2_inode entries in 'commit_transaction->t_inode_list';
(2) The 855th jbd2_inode is currently being processed;
(3) The JBD2 task has the TIF_NEED_RESCHED flag set;
(4) There are no pages in the address_space of the jbd2_inodes around the 855th one;
(5) Some processes are dropping caches;
(6) The filesystem is mounted with the 'nodioread_nolock' option;
(7) The machine has 128 CPUs;

According to the vmcore, contention on the 'journal->j_list_lock' spin lock
is fierce, so journal_finish_inode_data_buffers() may make slow progress.
In theory, there is a scheduling point in filemap_fdatawait_range_keep_errors().
However, if the inode's address_space has no pages tagged with PAGECACHE_TAG_WRITEBACK,
cond_resched() is never called, which may lead to a soft lockup:
journal_finish_inode_data_buffers
  filemap_fdatawait_range_keep_errors
    __filemap_fdatawait_range
      while (index <= end)
        nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end, PAGECACHE_TAG_WRITEBACK);
        if (!nr_pages)
           break;    --> If 'nr_pages' is zero, the loop breaks and cond_resched() is never called
        for (i = 0; i < nr_pages; i++)
          wait_on_page_writeback(page);
        cond_resched();

To solve the above issue, add a scheduling point in journal_finish_inode_data_buffers().
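
The effect of the missing scheduling point can be illustrated with a small
userspace model. This is only a sketch, not the kernel patch: struct demo_inode,
demo_cond_resched(), demo_wait_range(), demo_finish_inodes() and yield_count are
hypothetical names, and sched_yield() merely stands in for the kernel's
cond_resched(). The model assumes one "page" is waited on per inner iteration.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-in for a jbd2_inode: tracks only pages under writeback. */
struct demo_inode {
	int pages_under_writeback;
};

/* Counts how many scheduling points were actually reached. */
static unsigned long yield_count;

/* Userspace stand-in for cond_resched(); an assumption, not the kernel API. */
static void demo_cond_resched(void)
{
	yield_count++;
	sched_yield();
}

/*
 * Same shape as the wait loop in the call chain above: when nothing is
 * under writeback the loop body never runs, so no scheduling point is hit.
 */
static void demo_wait_range(struct demo_inode *inode)
{
	while (inode->pages_under_writeback > 0) {
		inode->pages_under_writeback--;	/* "wait" for one page */
		demo_cond_resched();
	}
}

/*
 * Walks the inode list. With yield_per_inode set, a scheduling point is
 * taken once per inode regardless of whether any page needed waiting.
 */
static void demo_finish_inodes(struct demo_inode *inodes, size_t n,
			       int yield_per_inode)
{
	assert(inodes != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		demo_wait_range(&inodes[i]);
		if (yield_per_inode)
			demo_cond_resched();
	}
}
```

Calling demo_finish_inodes() on 5000 inodes with no pages under writeback
reaches no scheduling point when yield_per_inode is 0, and one per inode when
it is 1, which is the behaviour the fix relies on.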

Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20231211112544.3879780-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-25 14:37:37 -08:00
9p 9p: missing chunk of "fs/9p: Don't update file type when updating file attributes" 2022-06-22 14:13:12 +02:00
adfs
affs affs: initialize fsdata in affs_truncate() 2023-02-01 08:23:11 +01:00
afs afs: Fix overwriting of result of DNS query 2024-01-05 15:12:25 +01:00
autofs autofs: fix memory leak of waitqueues in autofs_catatonic_mode 2023-09-23 11:01:04 +02:00
befs
bfs bfs: don't use WARNING: string when it's just info. 2021-01-06 14:56:52 +01:00
btrfs btrfs: do not allow non subvolume root targets for snapshot 2024-01-05 15:12:26 +01:00
cachefiles fs/cachefiles: Remove wait_bit_key layout dependency 2021-03-30 14:32:07 +02:00
ceph ceph: fix type promotion bug on 32bit systems 2023-10-25 11:54:15 +02:00
cifs smb: client: fix OOB in smbCalcSize() 2024-01-05 15:12:29 +01:00
coda coda: Avoid partial allocation of sig_inputArgs 2023-03-11 16:39:51 +01:00
configfs Revert "configfs: fix a race in configfs_lookup()" 2023-09-21 09:45:15 +02:00
cramfs
crypto fscrypt: fix keyring memory leak on mount failure 2022-11-10 18:14:25 +01:00
debugfs debugfs: fix automount d_fsdata usage 2024-01-25 14:37:36 -08:00
devpts fsnotify: fix fsnotify hooks in pseudo filesystems 2022-02-01 17:25:39 +01:00
dlm dlm: fix plock lookup when using multiple lockspaces 2023-09-19 12:20:22 +02:00
ecryptfs Revert "ecryptfs: replace BUG_ON with error handling code" 2021-05-26 12:06:55 +02:00
efivarfs
efs
erofs erofs: ensure that the post-EOF tails are all zeroed 2023-09-19 12:20:02 +02:00
exfat exfat: support handle zero-size directory 2023-11-28 16:54:52 +00:00
exportfs
ext2 ext2: fix datatype of block number in ext2_xattr_set2() 2023-09-23 11:01:07 +02:00
ext4 ext4: prevent the normalized size from exceeding EXT_MAX_BLOCKS 2023-12-20 15:44:37 +01:00
f2fs f2fs: explicitly null-terminate the xattr list 2024-01-25 14:37:35 -08:00
fat fat: add ratelimit to fat*_ent_bread() 2022-06-09 10:20:58 +02:00
freevxfs
fscache fscache: Fix cookie key hashing 2021-09-18 13:40:15 +02:00
fuse fuse: dax: set fc->dax to NULL in fuse_dax_conn_free() 2023-12-20 15:44:30 +01:00
gfs2 gfs2: Silence "suspicious RCU usage in gfs2_permission" warning 2023-11-28 16:54:53 +00:00
hfs hfs: fix missing hfs_bnode_get() in __hfs_bnode_create 2023-03-11 16:39:55 +01:00
hfsplus fs: hfsplus: remove WARN_ON() from hfsplus_cat_{read,write}_inode() 2023-05-30 12:57:47 +01:00
hostfs hostfs: fix memory handling in follow_link() 2021-04-14 08:42:06 +02:00
hpfs
hugetlbfs hugetlbfs: fix null-ptr-deref in hugetlbfs_parse_param() 2023-01-14 10:16:20 +01:00
iomap xfs: use current->journal_info for detecting transaction recursion 2022-07-07 17:52:19 +02:00
isofs isofs: Fix out of bound access for corrupted isofs image 2021-11-12 14:58:33 +01:00
jbd2 jbd2: fix soft lockup in journal_finish_inode_data_buffers() 2024-01-25 14:37:37 -08:00
jffs2 jffs2: reduce stack usage in jffs2_build_xattr_subsystem() 2023-07-27 08:44:13 +02:00
jfs jfs: fix array-index-out-of-bounds in diAlloc 2023-11-28 16:54:51 +00:00
kernfs kernfs: fix missing kernfs_idr_lock to remove an ID from the IDR 2023-07-27 08:44:05 +02:00
lockd fs: lockd: avoid possible wrong NULL parameter 2023-09-19 12:20:15 +02:00
minix minix: fix bug when opening a file with O_DIRECT 2022-04-13 21:01:01 +02:00
nfs NFSv4.1: fix SP4_MACH_CRED protection for pnfs IO 2023-11-28 16:54:53 +00:00
nfs_common
nfsd nfsd: lock_rename() needs both directories to live on the same fs 2023-12-08 08:46:10 +01:00
nilfs2 nilfs2: prevent WARNING in nilfs_sufile_set_segment_usage() 2023-12-13 18:27:02 +01:00
nls fs/nls: make load_nls() take a const parameter 2023-09-19 12:20:04 +02:00
notify fanotify: disallow mount/sb marks on kernel internal pseudo fs 2023-07-27 08:44:15 +02:00
ntfs ntfs: check overflow when iterating ATTR_RECORDs 2022-11-25 17:45:57 +01:00
ocfs2 fs: ocfs2: namei: check return value of ocfs2_add_entry() 2023-09-19 12:20:09 +02:00
omfs
openpromfs
orangefs orangefs: Fix kmemleak in orangefs_{kernel,client}_debug_init() 2023-01-14 10:16:20 +01:00
overlayfs ima: detect changes to the backing overlay file 2023-11-28 16:54:57 +00:00
proc watchdog: move softlockup_panic back to early_param 2023-11-28 16:54:56 +00:00
pstore pstore/platform: Add check for kstrdup 2023-11-20 11:06:44 +01:00
qnx4 qnx4: work around gcc false positive warning bug 2021-09-30 10:11:08 +02:00
qnx6
quota quota: explicitly forbid quota files from being encrypted 2023-11-28 16:54:58 +00:00
ramfs shmem: use ramfs_kill_sb() for kill_sb method of ramfs-based tmpfs 2023-07-27 08:44:13 +02:00
reiserfs reiserfs: Check the return value from __getblk() 2023-09-19 12:20:06 +02:00
romfs
squashfs revert "squashfs: harden sanity check in squashfs_read_xattr_id_table" 2023-02-22 12:55:56 +01:00
sysfs
sysv fs/sysv: Null check to prevent null-ptr-deref bug 2023-08-11 11:57:53 +02:00
tracefs tracefs: Add missing lockdown check to tracefs_create_dir() 2023-09-23 11:01:10 +02:00
ubifs ubifs: Free memory for tmpfile name 2023-05-17 11:47:35 +02:00
udf udf: initialize newblock to 0 2023-09-19 12:20:23 +02:00
ufs
unicode
vboxsf vboxfs: fix broken legacy mount signature checking 2021-10-17 10:43:33 +02:00
verity fsverity: skip PKCS#7 parser when keyring is empty 2023-09-19 12:20:22 +02:00
xfs xfs: verify buffer contents when we skip log replay 2023-06-14 11:09:59 +02:00
zonefs zonefs: Fix error message in zonefs_file_dio_append() 2023-04-05 11:23:51 +02:00
aio.c aio: fix mremap after fork null-deref 2023-02-22 12:55:54 +01:00
anon_inodes.c
attr.c attr: block mode changes of symlinks 2023-09-23 11:01:09 +02:00
bad_inode.c
binfmt_aout.c
binfmt_elf_fdpic.c fs: binfmt_elf_efpic: fix personality for ELF-FDPIC 2023-10-10 21:53:35 +02:00
binfmt_elf.c fs/binfmt_elf: Fix memory leak in load_elf_binary() 2022-11-03 23:57:49 +09:00
binfmt_em86.c
binfmt_flat.c binfmt_flat: do not stop relocating GOT entries prematurely on riscv 2022-06-09 10:20:47 +02:00
binfmt_misc.c binfmt_misc: fix shift-out-of-bounds in check_special_flags 2023-01-14 10:16:13 +01:00
binfmt_script.c
block_dev.c block: Don't invalidate pagecache for invalid falloc modes 2024-01-15 18:48:03 +01:00
buffer.c mm: fs: initialize fsdata passed to write_begin/write_end interface 2022-11-25 17:45:56 +01:00
char_dev.c chardev: fix error handling in cdev_device_add() 2023-01-14 10:15:59 +01:00
compat_binfmt_elf.c
coredump.c coredump: Limit what can interrupt coredumps 2023-01-04 11:39:22 +01:00
d_path.c
dax.c dax: fix cache flush on PMD-mapped pages 2022-06-09 10:21:16 +02:00
dcache.c
dcookies.c
direct-io.c fs: direct-io: fix missing sdio->boundary 2021-04-14 08:41:58 +02:00
drop_caches.c
eventfd.c eventfd: prevent underflow for eventfd semaphores 2023-09-19 12:20:06 +02:00
eventpoll.c epoll: ep_autoremove_wake_function should use list_del_init_careful 2023-06-21 15:45:37 +02:00
exec.c exec: Copy oldsighand->action under spin-lock 2022-11-03 23:57:49 +09:00
fcntl.c fcntl: fix potential deadlocks for &fown_struct.lock 2022-10-30 09:41:18 +01:00
fhandle.c
file_table.c SUNRPC: Ensure we flush any closed sockets before xs_xprt_free() 2022-05-18 10:23:48 +02:00
file.c file: reinstate f_pos locking optimization for regular files 2023-08-11 11:57:53 +02:00
filesystems.c
fs_context.c fs: avoid empty option when generating legacy mount string 2023-07-27 08:44:13 +02:00
fs_parser.c
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c writeback: fix call of incorrect macro 2023-05-17 11:48:10 +02:00
fsopen.c
init.c
inode.c fs: add ctime accessors infrastructure 2023-12-08 08:46:15 +01:00
internal.h fs: Establish locking order for unrelated directories 2023-07-27 08:44:13 +02:00
ioctl.c fs: fix an infinite loop in iomap_fiemap 2022-05-25 09:17:54 +02:00
Kconfig tmpfs: disallow CONFIG_TMPFS_INODE64 on alpha 2021-02-17 11:02:21 +01:00
Kconfig.binfmt
kernel_read_file.c vfs: check fd has read access in kernel_read_file_from_fd() 2021-10-27 09:56:51 +02:00
libfs.c libfs: add DEFINE_SIMPLE_ATTRIBUTE_SIGNED for signed value 2023-01-14 10:15:19 +01:00
locks.c locks: fix KASAN: use-after-free in trace_event_raw_event_filelock_lock 2023-09-23 11:01:04 +02:00
Makefile io_uring: import 5.15-stable io_uring 2023-01-04 11:39:23 +01:00
mbcache.c mbcache: Avoid nesting of cache->c_list_lock under bit locks 2023-01-14 10:16:50 +01:00
mount.h
mpage.c
namei.c fs: Fix error checking for d_hash_and_lookup() 2023-09-19 12:20:06 +02:00
namespace.c fs: warn about impending deprecation of mandatory locks 2021-08-26 08:35:57 -04:00
no-block.c
nsfs.c
open.c open: make RESOLVE_CACHED correctly test for O_TMPFILE 2023-08-11 11:57:53 +02:00
pipe.c pipe: Fix missing lock in pipe_resize_ring() 2022-06-06 08:42:41 +02:00
pnode.c pnode: terminate at peers of source 2023-01-14 10:16:27 +01:00
pnode.h mount: fix mounting of detached mounts onto targets that reside on shared mounts 2021-03-17 17:06:13 +01:00
posix_acl.c
proc_namespace.c proc mountinfo: make splice available again 2020-12-30 11:54:02 +01:00
read_write.c vfs: fix copy_file_range() averts filesystem freeze protection 2022-12-19 12:27:30 +01:00
readdir.c readdir: make sure to verify directory entry for legacy interfaces too 2021-04-21 13:00:54 +02:00
remap_range.c fs/remap: constrain dedupe of EOF blocks 2022-07-21 21:20:01 +02:00
select.c select: Fix indefinitely sleeping task in poll_schedule_timeout() 2022-01-29 10:26:11 +01:00
seq_file.c seq_file: disallow extremely large seq buffer allocations 2021-07-20 16:05:59 +02:00
signalfd.c io_uring: disable polling pollfree files 2022-09-05 10:28:58 +02:00
splice.c Revert "fs: check FMODE_LSEEK to control internal pipe splicing" 2022-10-17 17:26:07 +02:00
stack.c
stat.c stat: fix inconsistency between struct stat and struct compat_stat 2022-04-27 13:53:54 +02:00
statfs.c statfs: enforce statfs[64] structure initialization 2023-05-30 12:57:55 +01:00
super.c fs: Protect reconfiguration of sb read-write from racing writes 2023-08-11 11:57:54 +02:00
sync.c vfs: make sync_filesystem return errors from ->sync_fs 2022-08-31 17:15:14 +02:00
timerfd.c
userfaultfd.c userfaultfd: open userfaultfds with O_RDONLY 2022-10-26 13:25:17 +02:00
utimes.c
xattr.c fs: don't audit the capability check in simple_xattr_list() 2023-01-14 10:15:16 +01:00