linux/fs
Ye Bin c8af7ad729 jbd2: fix soft lockup in journal_finish_inode_data_buffers()
[ Upstream commit 6c02757c93 ]

There's issue when do io test:
WARN: soft lockup - CPU#45 stuck for 11s! [jbd2/dm-2-8:4170]
CPU: 45 PID: 4170 Comm: jbd2/dm-2-8 Kdump: loaded Tainted: G  OE
Call trace:
 dump_backtrace+0x0/0x1a0
 show_stack+0x24/0x30
 dump_stack+0xb0/0x100
 watchdog_timer_fn+0x254/0x3f8
 __hrtimer_run_queues+0x11c/0x380
 hrtimer_interrupt+0xfc/0x2f8
 arch_timer_handler_phys+0x38/0x58
 handle_percpu_devid_irq+0x90/0x248
 generic_handle_irq+0x3c/0x58
 __handle_domain_irq+0x68/0xc0
 gic_handle_irq+0x90/0x320
 el1_irq+0xcc/0x180
 queued_spin_lock_slowpath+0x1d8/0x320
 jbd2_journal_commit_transaction+0x10f4/0x1c78 [jbd2]
 kjournald2+0xec/0x2f0 [jbd2]
 kthread+0x134/0x138
 ret_from_fork+0x10/0x18

Analyzed informations from vmcore as follows:
(1) There are about 5k+ jbd2_inode in 'commit_transaction->t_inode_list';
(2) Now is processing the 855th jbd2_inode;
(3) JBD2 task has TIF_NEED_RESCHED flag;
(4) There's no pags in address_space around the 855th jbd2_inode;
(5) There are some process is doing drop caches;
(6) Mounted with 'nodioread_nolock' option;
(7) 128 CPUs;

According to informations from vmcore we know 'journal->j_list_lock' spin lock
competition is fierce. So journal_finish_inode_data_buffers() maybe process
slowly. Theoretically, there is scheduling point in the filemap_fdatawait_range_keep_errors().
However, if inode's address_space has no pages which taged with PAGECACHE_TAG_WRITEBACK,
will not call cond_resched(). So may lead to soft lockup.
journal_finish_inode_data_buffers
  filemap_fdatawait_range_keep_errors
    __filemap_fdatawait_range
      while (index <= end)
        nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end, PAGECACHE_TAG_WRITEBACK);
        if (!nr_pages)
           break;    --> If 'nr_pages' is equal zero will break, then will not call cond_resched()
        for (i = 0; i < nr_pages; i++)
          wait_on_page_writeback(page);
        cond_resched();

To solve above issue, add scheduling point in the journal_finish_inode_data_buffers();

Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20231211112544.3879780-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-25 14:52:29 -08:00
..
9p 9p: v9fs_listxattr: fix %s null argument warning 2023-11-28 16:56:19 +00:00
adfs
affs affs: initialize fsdata in affs_truncate() 2023-02-01 08:27:06 +01:00
afs afs: Fix use-after-free due to get/remove race in volume tree 2024-01-05 15:13:30 +01:00
autofs autofs: fix memory leak of waitqueues in autofs_catatonic_mode 2023-09-23 11:09:54 +02:00
befs
bfs
btrfs btrfs: do not allow non subvolume root targets for snapshot 2023-12-20 15:17:41 +01:00
cachefiles fs: add is_idmapped_mnt() helper 2022-07-02 16:41:14 +02:00
ceph ceph: fix type promotion bug on 32bit systems 2023-10-19 23:05:36 +02:00
cifs smb: client: fix OOB in smbCalcSize() 2024-01-05 15:13:38 +01:00
coda coda: Avoid partial allocation of sig_inputArgs 2023-03-10 09:39:50 +01:00
configfs configfs: fix possible memory leak in configfs_create_dir() 2022-12-31 13:14:15 +01:00
cramfs
crypto fscrypt: fix keyring memory leak on mount failure 2022-11-10 18:15:37 +01:00
debugfs debugfs: fix automount d_fsdata usage 2024-01-25 14:52:27 -08:00
devpts fsnotify: fix fsnotify hooks in pseudo filesystems 2022-02-01 17:27:01 +01:00
dlm dlm: fix plock lookup when using multiple lockspaces 2023-09-19 12:22:52 +02:00
ecryptfs fs: add is_idmapped_mnt() helper 2022-07-02 16:41:14 +02:00
efivarfs
efs
erofs erofs: ensure that the post-EOF tails are all zeroed 2023-09-06 21:28:37 +01:00
exfat exfat: support handle zero-size directory 2023-11-28 16:56:19 +00:00
exportfs exportfs: support idmapped mounts 2022-06-09 10:23:32 +02:00
ext2 ext2: fix datatype of block number in ext2_xattr_set2() 2023-09-23 11:09:57 +02:00
ext4 ext4: prevent the normalized size from exceeding EXT_MAX_BLOCKS 2023-12-20 15:17:41 +01:00
f2fs f2fs: explicitly null-terminate the xattr list 2024-01-25 14:52:27 -08:00
fat fat: add ratelimit to fat*_ent_bread() 2022-06-09 10:22:42 +02:00
freevxfs
fscache fscache: Remove an unused static variable 2021-10-04 22:13:12 +01:00
fuse fuse: share lookup state between submount and its parent 2024-01-05 15:13:36 +01:00
gfs2 gfs2: Silence "suspicious RCU usage in gfs2_permission" warning 2023-11-28 16:56:22 +00:00
hfs hfs: fix missing hfs_bnode_get() in __hfs_bnode_create 2023-03-10 09:39:57 +01:00
hfsplus fs: hfsplus: remove WARN_ON() from hfsplus_cat_{read,write}_inode() 2023-05-24 17:36:43 +01:00
hostfs hostfs: support splice_write 2021-08-26 22:28:02 +02:00
hpfs
hugetlbfs hugetlbfs: fix null-ptr-deref in hugetlbfs_parse_param() 2022-12-31 13:14:44 +01:00
iomap iomap: update ki_pos a little later in iomap_dio_complete 2023-12-08 08:48:05 +01:00
isofs isofs: Fix out of bound access for corrupted isofs image 2021-11-12 15:05:50 +01:00
jbd2 jbd2: fix soft lockup in journal_finish_inode_data_buffers() 2024-01-25 14:52:29 -08:00
jffs2 jffs2: reduce stack usage in jffs2_build_xattr_subsystem() 2023-07-23 13:47:34 +02:00
jfs jfs: fix array-index-out-of-bounds in diAlloc 2023-11-28 16:56:18 +00:00
kernfs kernfs: fix missing kernfs_idr_lock to remove an ID from the IDR 2023-07-23 13:47:23 +02:00
ksmbd ksmbd: fix slab-out-of-bounds in smb_strndup_from_utf16() 2024-01-05 15:13:39 +01:00
lockd fs: lockd: avoid possible wrong NULL parameter 2023-09-19 12:22:43 +02:00
minix minix: fix bug when opening a file with O_DIRECT 2022-04-13 20:59:10 +02:00
netfs netfs: fix parameter of cleanup() 2021-12-29 12:28:59 +01:00
nfs NFSv4.1: fix SP4_MACH_CRED protection for pnfs IO 2023-11-28 16:56:21 +00:00
nfs_common nfs: Fix kerneldoc warning shown up by W=1 2021-10-04 22:02:17 +01:00
nfsd nfsd: fix file memleak on client_opens_release 2023-11-28 16:56:34 +00:00
nilfs2 nilfs2: prevent WARNING in nilfs_sufile_set_segment_usage() 2023-12-13 18:36:43 +01:00
nls fs/nls: make load_nls() take a const parameter 2023-09-19 12:22:27 +02:00
notify fanotify: disallow mount/sb marks on kernel internal pseudo fs 2023-07-23 13:47:36 +02:00
ntfs ntfs: check overflow when iterating ATTR_RECORDs 2022-11-26 09:24:52 +01:00
ntfs3 fs/ntfs3: Avoid possible memory leak 2023-11-08 17:26:46 +01:00
ocfs2 fs: ocfs2: namei: check return value of ocfs2_add_entry() 2023-09-19 12:22:34 +02:00
omfs
openpromfs
orangefs orangefs: Fix kmemleak in orangefs_{kernel,client}_debug_init() 2022-12-31 13:14:44 +01:00
overlayfs ima: detect changes to the backing overlay file 2023-11-28 16:56:29 +00:00
proc proc: sysctl: prevent aliased sysctls from getting passed to init 2023-12-03 07:31:24 +01:00
pstore pstore/platform: Add check for kstrdup 2023-11-20 11:08:13 +01:00
qnx4 qnx4: work around gcc false positive warning bug 2021-09-21 08:36:48 -07:00
qnx6
quota quota: explicitly forbid quota files from being encrypted 2023-11-28 16:56:31 +00:00
ramfs shmem: use ramfs_kill_sb() for kill_sb method of ramfs-based tmpfs 2023-07-23 13:47:33 +02:00
reiserfs reiserfs: Check the return value from __getblk() 2023-09-19 12:22:30 +02:00
romfs
smbfs_common cifs: Fix crash on unload of cifs_arc4.ko 2021-12-14 10:57:12 +01:00
squashfs revert "squashfs: harden sanity check in squashfs_read_xattr_id_table" 2023-02-22 12:57:07 +01:00
sysfs
sysv fs/sysv: Null check to prevent null-ptr-deref bug 2023-08-11 15:13:58 +02:00
tracefs tracefs: Add missing lockdown check to tracefs_create_dir() 2023-09-23 11:10:02 +02:00
ubifs ubifs: Fix memory leak in do_rename 2023-05-17 11:50:14 +02:00
udf udf: initialize newblock to 0 2023-09-19 12:22:53 +02:00
ufs
unicode
vboxsf vboxfs: fix broken legacy mount signature checking 2021-09-27 11:26:21 -07:00
verity fsverity: skip PKCS#7 parser when keyring is empty 2023-09-19 12:22:52 +02:00
xfs xfs: Fix unreferenced object reported by kmemleak in xfs_sysfs_init() 2023-11-28 16:56:26 +00:00
zonefs zonefs: Fix error message in zonefs_file_dio_append() 2023-04-05 11:25:01 +02:00
aio.c aio: fix mremap after fork null-deref 2023-02-22 12:57:05 +01:00
anon_inodes.c
attr.c attr: block mode changes of symlinks 2023-09-23 11:10:01 +02:00
bad_inode.c
binfmt_aout.c binfmt: a.out: Fix bogus semicolon 2021-09-05 10:15:05 -07:00
binfmt_elf_fdpic.c fs: binfmt_elf_efpic: fix personality for ELF-FDPIC 2023-10-06 13:18:24 +02:00
binfmt_elf.c fs/binfmt_elf: Fix memory leak in load_elf_binary() 2022-11-03 23:59:12 +09:00
binfmt_flat.c binfmt_flat: do not stop relocating GOT entries prematurely on riscv 2022-06-09 10:22:26 +02:00
binfmt_misc.c binfmt_misc: fix shift-out-of-bounds in check_special_flags 2022-12-31 13:14:39 +01:00
binfmt_script.c
buffer.c mm: fs: initialize fsdata passed to write_begin/write_end interface 2022-11-26 09:24:51 +01:00
char_dev.c chardev: fix error handling in cdev_device_add() 2022-12-31 13:14:30 +01:00
compat_binfmt_elf.c
coredump.c coredump: Use the vma snapshot in fill_files_note 2022-04-08 14:24:18 +02:00
d_path.c d_path: make 'prepend()' fill up the buffer exactly on overflow 2021-09-02 10:07:29 -07:00
dax.c fsdax: Fix infinite loop in dax_iomap_rw() 2022-09-28 11:11:56 +02:00
dcache.c
direct-io.c
drop_caches.c fs: drop_caches: fix skipping over shadow cache inodes 2021-09-03 09:58:10 -07:00
eventfd.c eventfd: prevent underflow for eventfd semaphores 2023-09-19 12:22:30 +02:00
eventpoll.c epoll: ep_autoremove_wake_function should use list_del_init_careful 2023-06-21 15:59:14 +02:00
exec.c exec: Copy oldsighand->action under spin-lock 2022-11-03 23:59:12 +09:00
fcntl.c Merge branch 'akpm' (patches from Andrew) 2021-09-03 10:08:28 -07:00
fhandle.c
file_table.c locks: fix TOCTOU race when granting write lease 2022-10-26 12:34:58 +02:00
file.c file: reinstate f_pos locking optimization for regular files 2023-08-11 15:13:58 +02:00
filesystems.c
fs_context.c fs: avoid empty option when generating legacy mount string 2023-07-23 13:47:34 +02:00
fs_parser.c namei: Standardize callers of filename_lookup() 2021-09-07 16:07:47 -04:00
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c writeback, cgroup: switch inodes with dirty timestamps to release dying cgwbs 2023-11-20 11:08:13 +01:00
fsopen.c
init.c
inode.c fs: add ctime accessors infrastructure 2023-12-08 08:48:04 +01:00
internal.h nfs: use vfs setgid helper 2023-08-30 16:18:19 +02:00
ioctl.c fs: fix an infinite loop in iomap_fiemap 2022-05-25 09:57:26 +02:00
Kconfig ksmbd: have a dependency on cifs ARC4 2024-01-05 15:13:36 +01:00
Kconfig.binfmt
kernel_read_file.c vfs: check fd has read access in kernel_read_file_from_fd() 2021-10-18 20:22:03 -10:00
libfs.c libfs: add DEFINE_SIMPLE_ATTRIBUTE_SIGNED for signed value 2022-12-31 13:14:03 +01:00
locks.c locks: fix KASAN: use-after-free in trace_event_raw_event_filelock_lock 2023-09-23 11:09:55 +02:00
Makefile io_uring: move to separate directory 2022-12-14 11:37:31 +01:00
mbcache.c mbcache: Avoid nesting of cache->c_list_lock under bit locks 2023-01-12 11:59:20 +01:00
mount.h
mpage.c
namei.c ksmbd: fix racy issue from using ->d_parent and ->d_name 2023-12-23 10:41:55 +01:00
namespace.c fs: drop peer group ids under namespace lock 2023-04-13 16:48:25 +02:00
no-block.c
nsfs.c
open.c open: make RESOLVE_CACHED correctly test for O_TMPFILE 2023-08-11 15:13:57 +02:00
pipe.c pipe: Fix missing lock in pipe_resize_ring() 2022-06-06 08:43:37 +02:00
pnode.c pnode: terminate at peers of source 2023-01-12 11:58:47 +01:00
pnode.h
posix_acl.c fs: fix acl translation 2022-07-02 16:41:17 +02:00
proc_namespace.c fs: add is_idmapped_mnt() helper 2022-07-02 16:41:14 +02:00
read_write.c vfs: fix copy_file_range() averts filesystem freeze protection 2022-12-19 12:36:39 +01:00
readdir.c
remap_range.c fs/remap: constrain dedupe of EOF blocks 2022-07-21 21:24:14 +02:00
select.c select: Fix indefinitely sleeping task in poll_schedule_timeout() 2022-01-29 10:58:25 +01:00
seq_file.c rxrpc: Fix locking issue 2022-07-12 16:35:08 +02:00
signalfd.c signalfd: use wake_up_pollfree() 2021-12-14 10:57:15 +01:00
splice.c Revert "fs: check FMODE_LSEEK to control internal pipe splicing" 2022-10-26 12:34:17 +02:00
stack.c
stat.c stat: fix inconsistency between struct stat and struct compat_stat 2022-04-27 14:38:57 +02:00
statfs.c statfs: enforce statfs[64] structure initialization 2023-05-24 17:36:54 +01:00
super.c fs: Protect reconfiguration of sb read-write from racing writes 2023-08-11 15:13:58 +02:00
sync.c vfs: make sync_filesystem return errors from ->sync_fs 2022-04-27 14:38:50 +02:00
timerfd.c
userfaultfd.c userfaultfd: open userfaultfds with O_RDONLY 2022-10-26 12:34:36 +02:00
utimes.c
xattr.c fs: don't audit the capability check in simple_xattr_list() 2022-12-31 13:14:01 +01:00