Go to file
Ye Bin 6c02757c93 jbd2: fix soft lockup in journal_finish_inode_data_buffers()
There's issue when do io test:
WARN: soft lockup - CPU#45 stuck for 11s! [jbd2/dm-2-8:4170]
CPU: 45 PID: 4170 Comm: jbd2/dm-2-8 Kdump: loaded Tainted: G  OE
Call trace:
 dump_backtrace+0x0/0x1a0
 show_stack+0x24/0x30
 dump_stack+0xb0/0x100
 watchdog_timer_fn+0x254/0x3f8
 __hrtimer_run_queues+0x11c/0x380
 hrtimer_interrupt+0xfc/0x2f8
 arch_timer_handler_phys+0x38/0x58
 handle_percpu_devid_irq+0x90/0x248
 generic_handle_irq+0x3c/0x58
 __handle_domain_irq+0x68/0xc0
 gic_handle_irq+0x90/0x320
 el1_irq+0xcc/0x180
 queued_spin_lock_slowpath+0x1d8/0x320
 jbd2_journal_commit_transaction+0x10f4/0x1c78 [jbd2]
 kjournald2+0xec/0x2f0 [jbd2]
 kthread+0x134/0x138
 ret_from_fork+0x10/0x18

Analyzed informations from vmcore as follows:
(1) There are about 5k+ jbd2_inode in 'commit_transaction->t_inode_list';
(2) Now is processing the 855th jbd2_inode;
(3) JBD2 task has TIF_NEED_RESCHED flag;
(4) There's no pags in address_space around the 855th jbd2_inode;
(5) There are some process is doing drop caches;
(6) Mounted with 'nodioread_nolock' option;
(7) 128 CPUs;

According to informations from vmcore we know 'journal->j_list_lock' spin lock
competition is fierce. So journal_finish_inode_data_buffers() maybe process
slowly. Theoretically, there is scheduling point in the filemap_fdatawait_range_keep_errors().
However, if inode's address_space has no pages which taged with PAGECACHE_TAG_WRITEBACK,
will not call cond_resched(). So may lead to soft lockup.
journal_finish_inode_data_buffers
  filemap_fdatawait_range_keep_errors
    __filemap_fdatawait_range
      while (index <= end)
        nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end, PAGECACHE_TAG_WRITEBACK);
        if (!nr_pages)
           break;    --> If 'nr_pages' is equal zero will break, then will not call cond_resched()
        for (i = 0; i < nr_pages; i++)
          wait_on_page_writeback(page);
        cond_resched();

To solve above issue, add scheduling point in the journal_finish_inode_data_buffers();

Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20231211112544.3879780-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-12-12 10:25:46 -05:00
arch parisc architecture fixes for kernel v6.7-rc3: 2023-11-26 09:59:39 -08:00
block vfs-6.7-rc3.fixes 2023-11-24 09:45:40 -08:00
certs This update includes the following changes: 2023-11-02 16:15:30 -10:00
crypto This push fixes a regression in ahash and hides the Kconfig sub-options for the jitter RNG. 2023-11-09 17:04:58 -08:00
Documentation USB / PHY / Thunderbolt fixes and ids for 6.7-rc3 2023-11-25 18:22:42 -08:00
drivers USB / PHY / Thunderbolt fixes and ids for 6.7-rc3 2023-11-25 18:22:42 -08:00
fs jbd2: fix soft lockup in journal_finish_inode_data_buffers() 2023-12-12 10:25:46 -05:00
include jbd2: increase the journal IO's priority 2023-11-30 23:27:39 -05:00
init As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
io_uring io_uring: fix off-by one bvec index 2023-11-20 15:21:38 -07:00
ipc Many singleton patches against the MM code. The patch series which are 2023-11-02 19:38:47 -10:00
kernel Fix lockdep block chain corruption resulting in KASAN warnings. 2023-11-26 08:30:11 -08:00
lib parisc architecture fixes for kernel v6.7-rc3: 2023-11-26 09:59:39 -08:00
LICENSES LICENSES: Add the copyleft-next-0.3.1 license 2022-11-08 15:44:01 +01:00
mm vfs-6.7-rc3.fixes 2023-11-24 09:45:40 -08:00
net tls: fix NULL deref on tls_sw_splice_eof() with empty record 2023-11-23 08:51:45 -08:00
rust Kbuild updates for v6.7 2023-11-04 08:07:19 -10:00
samples Landlock updates for v6.7-rc1 2023-11-03 09:28:53 -10:00
scripts scripts/checkstack.pl: match all stack sizes for s390 2023-11-22 15:06:23 +01:00
security + Features 2023-11-03 09:48:17 -10:00
sound sound fixes for 6.7-rc2 2023-11-17 09:05:31 -05:00
tools parisc architecture fixes for kernel v6.7-rc3: 2023-11-26 09:59:39 -08:00
usr arch: Remove Itanium (IA-64) architecture 2023-09-11 08:13:17 +00:00
virt ARM: 2023-09-07 13:52:20 -07:00
.clang-format iommu: Add for_each_group_device() 2023-05-23 08:15:51 +02:00
.cocciconfig
.get_maintainer.ignore get_maintainer: add Alan to .get_maintainer.ignore 2022-08-20 15:17:44 -07:00
.gitattributes .gitattributes: set diff driver for Rust source code files 2023-05-31 17:48:25 +02:00
.gitignore kbuild: rpm-pkg: generate kernel.spec in rpmbuild/SPECS/ 2023-10-03 20:49:09 +09:00
.mailmap As usual, lots of singleton and doubleton patches all over the tree and 2023-11-02 20:53:31 -10:00
.rustfmt.toml rust: add .rustfmt.toml 2022-09-28 09:02:20 +02:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS USB: Remove Wireless USB and UWB documentation 2023-08-09 14:17:32 +02:00
Kbuild Kbuild updates for v6.1 2022-10-10 12:00:45 -07:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS eventfs fixes: 2023-11-26 19:48:20 -08:00
Makefile Linux 6.7-rc3 2023-11-26 19:59:33 -08:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.