Go to file
Ye Bin c8af7ad729 jbd2: fix soft lockup in journal_finish_inode_data_buffers()
[ Upstream commit 6c02757c93 ]

There's issue when do io test:
WARN: soft lockup - CPU#45 stuck for 11s! [jbd2/dm-2-8:4170]
CPU: 45 PID: 4170 Comm: jbd2/dm-2-8 Kdump: loaded Tainted: G  OE
Call trace:
 dump_backtrace+0x0/0x1a0
 show_stack+0x24/0x30
 dump_stack+0xb0/0x100
 watchdog_timer_fn+0x254/0x3f8
 __hrtimer_run_queues+0x11c/0x380
 hrtimer_interrupt+0xfc/0x2f8
 arch_timer_handler_phys+0x38/0x58
 handle_percpu_devid_irq+0x90/0x248
 generic_handle_irq+0x3c/0x58
 __handle_domain_irq+0x68/0xc0
 gic_handle_irq+0x90/0x320
 el1_irq+0xcc/0x180
 queued_spin_lock_slowpath+0x1d8/0x320
 jbd2_journal_commit_transaction+0x10f4/0x1c78 [jbd2]
 kjournald2+0xec/0x2f0 [jbd2]
 kthread+0x134/0x138
 ret_from_fork+0x10/0x18

Analyzed informations from vmcore as follows:
(1) There are about 5k+ jbd2_inode in 'commit_transaction->t_inode_list';
(2) Now is processing the 855th jbd2_inode;
(3) JBD2 task has TIF_NEED_RESCHED flag;
(4) There's no pags in address_space around the 855th jbd2_inode;
(5) There are some process is doing drop caches;
(6) Mounted with 'nodioread_nolock' option;
(7) 128 CPUs;

According to informations from vmcore we know 'journal->j_list_lock' spin lock
competition is fierce. So journal_finish_inode_data_buffers() maybe process
slowly. Theoretically, there is scheduling point in the filemap_fdatawait_range_keep_errors().
However, if inode's address_space has no pages which taged with PAGECACHE_TAG_WRITEBACK,
will not call cond_resched(). So may lead to soft lockup.
journal_finish_inode_data_buffers
  filemap_fdatawait_range_keep_errors
    __filemap_fdatawait_range
      while (index <= end)
        nr_pages = pagevec_lookup_range_tag(&pvec, mapping, &index, end, PAGECACHE_TAG_WRITEBACK);
        if (!nr_pages)
           break;    --> If 'nr_pages' is equal zero will break, then will not call cond_resched()
        for (i = 0; i < nr_pages; i++)
          wait_on_page_writeback(page);
        cond_resched();

To solve above issue, add scheduling point in the journal_finish_inode_data_buffers();

Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20231211112544.3879780-1-yebin10@huawei.com
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-25 14:52:29 -08:00
arch x86/kprobes: fix incorrect return address calculation in kprobe_emulate_call_indirect 2024-01-15 18:51:22 +01:00
block block: Don't invalidate pagecache for invalid falloc modes 2024-01-15 18:51:07 +01:00
certs certs/blacklist_hashes.c: fix const confusion in certs blacklist 2022-06-22 14:22:01 +02:00
crypto crypto: pcrypt - Fix hungtask for PADATA_RESET 2023-11-28 16:56:18 +00:00
Documentation dt-bindings: nvmem: mxs-ocotp: Document fsl,ocotp 2024-01-05 15:13:34 +01:00
drivers platform/x86: intel-vbtn: Fix missing tablet-mode-switch events 2024-01-25 14:52:29 -08:00
fs jbd2: fix soft lockup in journal_finish_inode_data_buffers() 2024-01-25 14:52:29 -08:00
include kallsyms: Make module_kallsyms_on_each_symbol generally available 2024-01-15 18:51:26 +01:00
init proc: sysctl: prevent aliased sysctls from getting passed to init 2023-12-03 07:31:24 +01:00
io_uring io_uring/af_unix: disable sending io_uring over sockets 2023-12-13 18:36:46 +01:00
ipc ipc/sem: Fix dangling sem_array access in semtimedop race 2022-12-08 11:28:45 +01:00
kernel tracing/kprobes: Fix symbol counting logic by looking at modules as well 2024-01-15 18:51:26 +01:00
lib lib/vsprintf: Fix %pfwf when current node refcount == 0 2024-01-05 15:13:35 +01:00
LICENSES LICENSES/dual/CC-BY-4.0: Git rid of "smart quotes" 2021-07-15 06:31:24 -06:00
mm mm: fix unmap_mapping_range high bits shift bug 2024-01-15 18:51:23 +01:00
net neighbour: Don't let neigh_forced_gc() disable preemption for long 2024-01-25 14:52:29 -08:00
samples samples/hw_breakpoint: fix building without module unloading 2023-09-23 11:10:01 +02:00
scripts sign-file: Fix incorrect return values check 2023-12-20 15:17:37 +01:00
security keys, dns: Allow key types (eg. DNS) to be reclaimed immediately on expiry 2024-01-05 15:13:30 +01:00
sound ASoC: ops: add correct range check for limiting volume 2024-01-25 14:52:28 -08:00
tools perf inject: Fix GEN_ELF_TEXT_OFFSET for jit 2024-01-15 18:51:25 +01:00
usr usr/include/Makefile: add linux/nfc.h to the compile-test coverage 2022-02-01 17:27:15 +01:00
virt KVM: Grab a reference to KVM for VM and vCPU stats file descriptors 2023-08-03 10:22:40 +02:00
.clang-format clang-format: Update with the latest for_each macro list 2021-05-12 23:32:39 +02:00
.cocciconfig
.get_maintainer.ignore Opt out of scripts/get_maintainer.pl 2019-05-16 10:53:40 -07:00
.gitattributes .gitattributes: use 'dts' diff driver for dts files 2019-12-04 19:44:11 -08:00
.gitignore .gitignore: ignore only top-level modules.builtin 2021-05-02 00:43:35 +09:00
.mailmap mailmap: add Andrej Shadura 2021-10-18 20:22:03 -10:00
COPYING COPYING: state that all contributions really are covered by this file 2020-02-10 13:32:20 -08:00
CREDITS MAINTAINERS: Move Daniel Drake to credits 2021-09-21 08:34:58 +03:00
Kbuild kbuild: rename hostprogs-y/always to hostprogs/always-y 2020-02-04 01:53:07 +09:00
Kconfig kbuild: ensure full rebuild when the compiler is updated 2020-05-12 13:28:33 +09:00
MAINTAINERS iio: stx104: Move to addac subdirectory 2023-08-26 14:23:27 +02:00
Makefile Linux 5.15.147 2024-01-15 18:51:28 +01:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.