Go to file
Ye Bin c0f1db7380 jbd2: fix soft lockup in journal_finish_inode_data_buffers()
[ Upstream commit 6c02757c93 ]

There's issue when do io test:
WARN: soft lockup - CPU#45 stuck for 11s! [jbd2/dm-2-8:4170]
CPU: 45 PID: 4170 Comm: jbd2/dm-2-8 Kdump: loaded Tainted: G  OE
Call trace:
 dump_backtrace+0x0/0x1a0
 show_stack+0x24/0x30
 dump_stack+0xb0/0x100
 watchdog_timer_fn+0x254/0x3f8
 __hrtimer_run_queues+0x11c/0x380
 hrtimer_interrupt+0xfc/0x2f8
 arch_timer_handler_phys+0x38/0x58
 handle_percpu_devid_irq+0x90/0x248
 generic_handle_irq+0x3c/0x58
 __handle_domain_irq+0x68/0xc0
 gic_handle_irq+0x90/0x320
 el1_irq+0xcc/0x180
 queued_spin_lock_slowpath+0x1d8/0x320
 jbd2_journal_commit_transaction+0x10f4/0x1c78 [jbd2]
 kjournald2+0xec/0x2f0 [jbd2]
 kthread+0x134/0x138
 ret_from_fork+0x10/0x18

Analyzed informations from vmcore as follows:
(1) There are about 5k+ jbd2_inode in 'commit_transaction->t_inode_list';
(2) Now is processing the 855th jbd2_inode;
(3) JBD2 task has TIF_NEED_RESCHED flag;
(4) There's no pags in address_space around the 855th jbd2_inode;
(5) There are some process is doing drop caches;
(6) Mounted with 'nodioread_nolock' option;
(7) 128 CPUs;

According to informations from vmcore we know 'journal->j_list_lock' spin lock
competition is fierce. So journal_finish_inode_data_buffers() maybe process
slowly. Theoretically, there is scheduling point in the filemap_fdatawait_range_keep_errors().
However, if inode's address_space has no pages which taged with PAGECACHE_TAG_WRITEBACK,
will not call cond_resched(). So may lead to soft lockup.
journal_finish_inode_data_buffers
  filemap_fdatawait_range_keep_errors
    __filemap_fdatawait_range
      while (index <= end)
        nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end, PAGECACHE_TAG_WRITEBACK);
        if (!nr_pages)
           break;    --> If 'nr_pages' is equal zero will break, then will not call cond_resched()
        for (i = 0; i < nr_pages; i++)
          wait_on_page_writeback(page);
        cond_resched();

To solve above issue, add scheduling point in the journal_finish_inode_data_buffers();

Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20231211112544.3879780-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-25 14:37:37 -08:00
arch powerpc: update ppc_save_regs to save current r1 in pt_regs 2024-01-15 18:48:07 +01:00
block blk-throttle: fix lockdep warning of "cgroup_mutex or RCU read lock required!" 2023-12-20 15:44:31 +01:00
certs certs/blacklist_hashes.c: fix const confusion in certs blacklist 2022-06-22 14:13:17 +02:00
crypto crypto: pcrypt - Fix hungtask for PADATA_RESET 2023-11-28 16:54:51 +00:00
Documentation dt-bindings: nvmem: mxs-ocotp: Document fsl,ocotp 2024-01-05 15:12:28 +01:00
drivers drm/crtc: Fix uninit-value bug in drm_mode_setcrtc 2024-01-25 14:37:37 -08:00
fs jbd2: fix soft lockup in journal_finish_inode_data_buffers() 2024-01-25 14:37:37 -08:00
include ipv6: remove max_size check inline with ipv4 2024-01-15 18:48:07 +01:00
init x86/mm: Initialize text poking earlier 2023-08-08 19:57:39 +02:00
io_uring io_uring/af_unix: disable sending io_uring over sockets 2023-12-13 18:27:06 +01:00
ipc ipc/sem: Fix dangling sem_array access in semtimedop race 2022-12-08 11:24:00 +01:00
kernel tracing: Fix blocked reader of snapshot buffer 2024-01-05 15:12:31 +01:00
lib lib/vsprintf: Fix %pfwf when current node refcount == 0 2024-01-05 15:12:28 +01:00
LICENSES LICENSES/deprecated: add Zlib license text 2020-09-16 14:33:49 +02:00
mm mm: fix unmap_mapping_range high bits shift bug 2024-01-15 18:48:06 +01:00
net neighbour: Don't let neigh_forced_gc() disable preemption for long 2024-01-25 14:37:37 -08:00
samples samples/hw_breakpoint: fix building without module unloading 2023-09-23 11:01:09 +02:00
scripts sign-file: Fix incorrect return values check 2023-12-20 15:44:29 +01:00
security keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiry 2024-01-05 15:12:25 +01:00
sound ASoC: da7219: Support low DC impedance headset 2024-01-25 14:37:36 -08:00
tools tools headers UAPI: Sync linux/perf_event.h with the kernel sources 2023-12-13 18:27:06 +01:00
usr usr/include/Makefile: add linux/nfc.h to the compile-test coverage 2022-02-01 17:25:48 +01:00
virt KVM: fix memoryleak in kvm_init() 2023-04-05 11:23:43 +02:00
.clang-format RDMA 5.10 pull request 2020-10-17 11:18:18 -07:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore kbuild: generate Module.symvers only when vmlinux exists 2021-05-19 10:12:59 +02:00
.mailmap mailmap: add two more addresses of Uwe Kleine-König 2020-12-06 10:19:07 -08:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Move Jason Cooper to CREDITS 2020-11-30 10:20:34 +01:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS Remove DECnet support from kernel 2023-06-21 15:45:38 +02:00
Makefile Linux 5.10.208 2024-01-15 18:48:08 +01:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.