linux

korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-27 14:14:24 +08:00


Petr Pavlu c2274b908d ring-buffer: Fix a race between readers and resize checks

The reader code in rb_get_reader_page() swaps a new reader page into the
ring buffer by doing cmpxchg on old->list.prev->next to point it to the
new page. Following that, if the operation is successful,
old->list.next->prev gets updated too. This means the underlying
doubly-linked list is temporarily inconsistent, page->prev->next or
page->next->prev might not be equal back to page for some page in the
ring buffer.

The resize operation in ring_buffer_resize() can be invoked in parallel.
It calls rb_check_pages() which can detect the described inconsistency
and stop further tracing:

    [ 190.271762] ------------[ cut here ]------------
    [ 190.271771] WARNING: CPU: 1 PID: 6186 at kernel/trace/ring_buffer.c:1467 rb_check_pages.isra.0+0x6a/0xa0
    [ 190.271789] Modules linked in: [...]
    [ 190.271991] Unloaded tainted modules: intel_uncore_frequency(E):1 skx_edac(E):1
    [ 190.272002] CPU: 1 PID: 6186 Comm: cmd.sh Kdump: loaded Tainted: G E 6.9.0-rc6-default #5 158d3e1e6d0b091c34c3b96bfd99a1c58306d79f
    [ 190.272011] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.0-0-gd239552c-rebuilt.opensuse.org 04/01/2014
    [ 190.272015] RIP: 0010:rb_check_pages.isra.0+0x6a/0xa0
    [ 190.272023] Code: [...]
    [ 190.272028] RSP: 0018:ffff9c37463abb70 EFLAGS: 00010206
    [ 190.272034] RAX: ffff8eba04b6cb80 RBX: 0000000000000007 RCX: ffff8eba01f13d80
    [ 190.272038] RDX: ffff8eba01f130c0 RSI: ffff8eba04b6cd00 RDI: ffff8eba0004c700
    [ 190.272042] RBP: ffff8eba0004c700 R08: 0000000000010002 R09: 0000000000000000
    [ 190.272045] R10: 00000000ffff7f52 R11: ffff8eba7f600000 R12: ffff8eba0004c720
    [ 190.272049] R13: ffff8eba00223a00 R14: 0000000000000008 R15: ffff8eba067a8000
    [ 190.272053] FS: 00007f1bd64752c0(0000) GS:ffff8eba7f680000(0000) knlGS:0000000000000000
    [ 190.272057] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 190.272061] CR2: 00007f1bd6662590 CR3: 000000010291e001 CR4: 0000000000370ef0
    [ 190.272070] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    [ 190.272073] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
    [ 190.272077] Call Trace:
    [ 190.272098] <TASK>
    [ 190.272189] ring_buffer_resize+0x2ab/0x460
    [ 190.272199] __tracing_resize_ring_buffer.part.0+0x23/0xa0
    [ 190.272206] tracing_resize_ring_buffer+0x65/0x90
    [ 190.272216] tracing_entries_write+0x74/0xc0
    [ 190.272225] vfs_write+0xf5/0x420
    [ 190.272248] ksys_write+0x67/0xe0
    [ 190.272256] do_syscall_64+0x82/0x170
    [ 190.272363] entry_SYSCALL_64_after_hwframe+0x76/0x7e
    [ 190.272373] RIP: 0033:0x7f1bd657d263
    [ 190.272381] Code: [...]
    [ 190.272385] RSP: 002b:00007ffe72b643f8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
    [ 190.272391] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f1bd657d263
    [ 190.272395] RDX: 0000000000000002 RSI: 0000555a6eb538e0 RDI: 0000000000000001
    [ 190.272398] RBP: 0000555a6eb538e0 R08: 000000000000000a R09: 0000000000000000
    [ 190.272401] R10: 0000555a6eb55190 R11: 0000000000000246 R12: 00007f1bd6662500
    [ 190.272404] R13: 0000000000000002 R14: 00007f1bd6667c00 R15: 0000000000000002
    [ 190.272412] </TASK>
    [ 190.272414] ---[ end trace 0000000000000000 ]---

Note that ring_buffer_resize() calls rb_check_pages() only if the parent
trace_buffer has recording disabled. Recent commit `d78ab79270`
("tracing: Stop current tracer when resizing buffer") causes that it is
now always the case which makes it more likely to experience this issue.

The window to hit this race is nonetheless very small. To help
reproducing it, one can add a delay loop in rb_get_reader_page():

    ret = rb_head_page_replace(reader, cpu_buffer->reader_page);
    if (!ret)
        goto spin;
    for (unsigned i = 0; i < 1U << 26; i++)  /* inserted delay loop */
        __asm__ __volatile__ ("" : : : "memory");
    rb_list_head(reader->list.next)->prev = &cpu_buffer->reader_page->list;

.. and then run the following commands on the target system:

    echo 1 > /sys/kernel/tracing/events/sched/sched_switch/enable
    while true; do
        echo 16 > /sys/kernel/tracing/buffer_size_kb; sleep 0.1
        echo 8 > /sys/kernel/tracing/buffer_size_kb; sleep 0.1
    done &
    while true; do
        for i in /sys/kernel/tracing/per_cpu/*; do
            timeout 0.1 cat $i/trace_pipe; sleep 0.2
        done
    done

To fix the problem, make sure ring_buffer_resize() doesn't invoke
rb_check_pages() concurrently with a reader operating on the same
ring_buffer_per_cpu by taking its cpu_buffer->reader_lock.

Link: https://lore.kernel.org/linux-trace-kernel/20240517134008.24529-3-petr.pavlu@suse.com
Cc: stable@vger.kernel.org
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Fixes: `659f451ff2` ("ring-buffer: Add integrity check at end of iter read")
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
[ Fixed whitespace ]
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>

2024-05-21 19:03:35 -04:00
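
The inconsistency window described in the commit message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: `struct page`, `ring_init()`, `ring_is_consistent()`, `swap_step1()` and `swap_step2()` are hypothetical names, the cmpxchg and the flag bits handled by rb_list_head() are not modelled, and the two halves of the swap are split into separate functions so the intermediate state can be inspected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a ring-buffer page; not the kernel's buffer_page. */
struct page {
    struct page *next;
    struct page *prev;
};

/* Link n pages into a circular doubly-linked list. */
static void ring_init(struct page *pages, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        pages[i].next = &pages[(i + 1) % n];
        pages[i].prev = &pages[(i + n - 1) % n];
    }
}

/*
 * Integrity check in the spirit of rb_check_pages(): for every page,
 * both neighbours must point back at it.
 */
static bool ring_is_consistent(const struct page *head)
{
    const struct page *p = head;

    do {
        if (p->next->prev != p || p->prev->next != p)
            return false;
        p = p->next;
    } while (p != head);
    return true;
}

/* First half of the swap: old->prev->next is redirected to the new page. */
static void swap_step1(struct page *old, struct page *new_page)
{
    new_page->next = old->next;
    new_page->prev = old->prev;
    old->prev->next = new_page;
}

/* Second half of the swap: old->next->prev catches up. */
static void swap_step2(struct page *old, struct page *new_page)
{
    old->next->prev = new_page;
}
```

In this model, a check that runs between `swap_step1()` and `swap_step2()` sees a list whose forward and backward links disagree, which corresponds to the situation that triggers the warning above. The fix in the commit avoids it by having ring_buffer_resize() hold cpu_buffer->reader_lock around rb_check_pages(), so the check cannot run while a reader is partway through the swap.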
..
bpf	sysctl changes for v6.10-rc1	2024-05-17 17:31:24 -07:00
cgroup	cgroup/rstat: add cgroup_rstat_cpu_lock helpers and tracepoints	2024-05-14 09:43:17 -10:00
configs	hardening updates for 6.10-rc1	2024-05-13 14:14:05 -07:00
debug	kdb: Fix a potential buffer overflow in kdb_local()	2024-01-17 17:19:06 +00:00
dma	swiotlb: initialise restricted pool list_head when SWIOTLB_DYNAMIC=y	2024-05-02 14:57:04 +02:00
entry	entry: Respect changes to system call number by trace_sys_enter()	2024-03-12 13:23:32 +01:00
events	Probes updates for v6.10:	2024-05-17 18:29:30 -07:00
futex	printk: Change type of CONFIG_BASE_SMALL to bool	2024-05-06 17:39:09 +02:00
gcov	gcov: annotate struct gcov_iterator with __counted_by	2023-10-18 14:43:22 -07:00
irq	Updates for the interrupt subsystem:	2024-05-14 09:47:14 -07:00
kcsan	kcsan, compiler_types: Introduce __data_racy type qualifier	2024-05-07 11:39:50 -07:00
livepatch	livepatch: Rename KLP_* to KLP_TRANSITION_*	2024-05-09 15:48:01 +02:00
locking	locking/pvqspinlock: Use try_cmpxchg() in qspinlock_paravirt.h	2024-04-12 11:42:39 +02:00
module	mm/execmem, arch: convert remaining overrides of module_alloc to execmem	2024-05-14 00:31:43 -07:00
power	cgroup: Changes for v6.10	2024-05-15 17:06:08 -07:00
printk	sysctl changes for v6.10-rc1	2024-05-17 17:31:24 -07:00
rcu	Merge branches 'fixes.2024.04.15a', 'misc.2024.04.12a', 'rcu-sync-normal-improve.2024.04.15a', 'rcu-tasks.2024.04.15a' and 'rcutorture.2024.04.15a' into rcu-merge.2024.04.15a	2024-05-01 13:04:02 +02:00
sched	sysctl changes for v6.10-rc1	2024-05-17 17:31:24 -07:00
time	sysctl changes for v6.10-rc1	2024-05-17 17:31:24 -07:00
trace	ring-buffer: Fix a race between readers and resize checks	2024-05-21 19:03:35 -04:00
.gitignore
acct.c	kernel misc: Remove the now superfluous sentinel elements from ctl_table array	2024-04-24 09:43:53 +02:00
async.c	async: Use a dedicated unbound workqueue with raised min_active	2024-02-09 11:13:59 -10:00
audit_fsnotify.c
audit_tree.c	As usual, lots of singleton and doubleton patches all over the tree and	2023-11-02 20:53:31 -10:00
audit_watch.c	audit: don't WARN_ON_ONCE(!current->mm) in audit_exe_compare()	2023-11-14 17:34:27 -05:00
audit.c	audit: use KMEM_CACHE() instead of kmem_cache_create()	2024-01-25 10:12:22 -05:00
audit.h	audit: correct audit_filter_inodes() definition	2023-07-21 12:17:25 -04:00
auditfilter.c	audit: remove unnecessary assignment in audit_dupe_lsm_field()	2024-01-25 09:59:27 -05:00
auditsc.c	audit,io_uring: io_uring openat triggers audit reference count underflow	2023-10-13 18:34:46 +02:00
backtracetest.c	backtracetest: Convert from tasklet to BH workqueue	2024-02-05 13:22:34 -10:00
bounds.c	bounds: Use the right number of bits for power-of-two CONFIG_NR_CPUS	2024-04-29 08:29:29 -07:00
capability.c	lsm: constify the 'target' parameter in security_capget()	2023-08-08 16:48:47 -04:00
cfi.c
compat.c	sched_getaffinity: don't assume 'cpumask_size()' is fully initialized	2023-03-14 19:32:38 -07:00
configs.c
context_tracking.c	context_tracking: Make context_tracking_key __ro_after_init	2024-03-22 11:18:18 +01:00
cpu_pm.c	cpuidle, cpu_pm: Remove RCU fiddling from cpu_pm_{enter,exit}()	2023-01-13 11:48:15 +01:00
cpu.c	cgroup: Changes for v6.10	2024-05-15 17:06:08 -07:00
crash_core.c	crash: add a new kexec flag for hotplug support	2024-04-23 14:59:01 +10:00
crash_reserve.c	crash: use macro to add crashk_res into iomem early for specific arch	2024-03-26 11:14:12 -07:00
cred.c	cred: Use KMEM_CACHE() instead of kmem_cache_create()	2024-02-23 17:33:31 -05:00
delayacct.c	delayacct: Remove the now superfluous sentinel elements from ctl_table array	2024-04-24 09:43:54 +02:00
dma.c
elfcorehdr.c	crash: remove dependency of FA_DUMP on CRASH_DUMP	2024-02-23 17:48:22 -08:00
exec_domain.c
exit.c	kernel misc: Remove the now superfluous sentinel elements from ctl_table array	2024-04-24 09:43:53 +02:00
exit.h	exit: add internal include file with helpers	2023-09-21 12:03:50 -06:00
extable.c
fail_function.c	kernel/fail_function: fix memory leak with using debugfs_lookup()	2023-02-08 13:36:22 +01:00
fork.c	fork: defer linking file vma until vma is fully initialized	2024-04-16 15:39:51 -07:00
freezer.c	Linux 6.7-rc6	2023-12-23 15:52:13 +01:00
gen_kheaders.sh	Revert "kheaders: substituting --sort in archive creation"	2023-05-28 16:20:21 +09:00
groups.c	groups: Convert group_info.usage to refcount_t	2023-09-29 11:28:39 -07:00
hung_task.c	kernel misc: Remove the now superfluous sentinel elements from ctl_table array	2024-04-24 09:43:53 +02:00
iomem.c	kernel/iomem.c: remove __weak ioremap_cache helper	2023-08-21 13:37:28 -07:00
irq_work.c	trace: Add trace_ipi_send_cpu()	2023-03-24 11:01:29 +01:00
jump_label.c	jump_label,module: Don't alloc static_key_mod for __ro_after_init keys	2024-03-22 11:18:16 +01:00
kallsyms_internal.h
kallsyms_selftest.c	mm/vmalloc: remove vmap_area_list	2024-02-23 17:48:19 -08:00
kallsyms_selftest.h
kallsyms.c	kallsyms: Change func signature for cleanup_symbol_name()	2023-08-25 15:00:36 -07:00
kcmp.c	file: convert to SLAB_TYPESAFE_BY_RCU	2023-10-19 11:02:48 +02:00
Kconfig.freezer
Kconfig.hz
Kconfig.kexec	crash: clean up kdump related config items	2024-02-23 17:48:22 -08:00
Kconfig.locks
Kconfig.preempt
kcov.c	kcov: add prototypes for helper functions	2023-06-09 17:44:17 -07:00
kexec_core.c	kernel misc: Remove the now superfluous sentinel elements from ctl_table array	2024-04-24 09:43:53 +02:00
kexec_elf.c
kexec_file.c	crash: add a new kexec flag for hotplug support	2024-04-23 14:59:01 +10:00
kexec_internal.h	crash: remove dependency of FA_DUMP on CRASH_DUMP	2024-02-23 17:48:22 -08:00
kexec.c	crash: add a new kexec flag for hotplug support	2024-04-23 14:59:01 +10:00
kheaders.c	kheaders: Use array declaration instead of char	2023-03-24 20:10:59 -07:00
kprobes.c	Probes updates for v6.10:	2024-05-17 18:29:30 -07:00
ksyms_common.c	kallsyms: make kallsyms_show_value() as generic function	2023-06-08 12:27:20 -07:00
ksysfs.c	Driver core changes for 6.9-rc1	2024-03-21 13:34:15 -07:00
kthread.c	kunit: Handle test faults	2024-05-06 14:22:02 -06:00
latencytop.c	kernel misc: Remove the now superfluous sentinel elements from ctl_table array	2024-04-24 09:43:53 +02:00
Makefile	crash: split crash dumping code out from kexec_core.c	2024-02-23 17:48:22 -08:00
module_signature.c
notifier.c	notifiers: add tracepoints to the notifiers infrastructure	2023-04-08 13:45:38 -07:00
nsproxy.c	pidfd: add pidfs	2024-03-01 12:23:37 +01:00
numa.c	kernel/numa.c: Move logging out of numa.h	2023-12-20 19:26:30 -05:00
padata.c	padata: Disable BH when taking works lock on MT path	2024-04-12 15:07:51 +08:00
panic.c	kernel misc: Remove the now superfluous sentinel elements from ctl_table array	2024-04-24 09:43:53 +02:00
params.c	params: Fix multi-line comment style	2023-12-01 09:51:44 -08:00
pid_namespace.c	kernel misc: Remove the now superfluous sentinel elements from ctl_table array	2024-04-24 09:43:53 +02:00
pid_sysctl.h	kernel misc: Remove the now superfluous sentinel elements from ctl_table array	2024-04-24 09:43:53 +02:00
pid.c	pidfs: remove config option	2024-03-13 12:53:53 -07:00
profile.c	profiling: Remove create_prof_cpu_mask().	2024-04-27 11:17:48 -07:00
ptrace.c	ptrace_attach: shift send(SIGSTOP) into ptrace_set_stopped()	2024-02-22 15:38:52 -08:00
range.c
reboot.c	kernel misc: Remove the now superfluous sentinel elements from ctl_table array	2024-04-24 09:43:53 +02:00
regset.c
relay.c	kernel: relay: remove relay_file_splice_read dead code, doesn't work	2023-12-29 12:22:27 -08:00
resource_kunit.c
resource.c	Quite a lot of kexec work this time around. Many singleton patches in	2024-01-09 11:46:20 -08:00
rseq.c	rseq: Extend struct rseq with per-memory-map concurrency ID	2022-12-27 12:52:12 +01:00
scftorture.c	scftorture: Pause testing after memory-allocation failure	2023-07-14 15:02:57 -07:00
scs.c
seccomp.c	sysctl changes for v6.10-rc1	2024-05-17 17:31:24 -07:00
signal.c	kernel misc: Remove the now superfluous sentinel elements from ctl_table array	2024-04-24 09:43:53 +02:00
smp.c	CSD lock commits for v6.7	2023-10-30 17:56:53 -10:00
smpboot.c	kthread: add kthread_stop_put	2023-10-04 10:41:57 -07:00
smpboot.h
softirq.c	softirq: Fix suspicious RCU usage in __do_softirq()	2024-04-29 05:03:51 +02:00
stackleak.c	sysctl changes for v6.10-rc1	2024-05-17 17:31:24 -07:00
stacktrace.c	stacktrace: fix kernel-doc typo	2023-12-29 12:22:29 -08:00
static_call_inline.c
static_call.c
stop_machine.c
sys_ni.c	lsm/stable-6.8 PR 20240105	2024-01-09 12:57:46 -08:00
sys.c	powerpc/dexcr: Add DEXCR prctl interface	2024-05-06 22:04:31 +10:00
sysctl-test.c
sysctl.c	kernel misc: Remove the now superfluous sentinel elements from ctl_table array	2024-04-24 09:43:53 +02:00
task_work.c	task_work: add kerneldoc annotation for 'data' argument	2023-09-19 13:21:32 -07:00
taskstats.c	taskstats: fill_stats_for_tgid: use for_each_thread()	2023-10-04 10:41:57 -07:00
torture.c	torture: Print out torture module parameters	2023-09-24 17:24:01 +02:00
tracepoint.c	tracepoint: Allow livepatch module add trace event	2023-02-18 14:34:36 -05:00
tsacct.c
ucount.c	sysctl changes for v6.10-rc1	2024-05-17 17:31:24 -07:00
uid16.c
uid16.h
umh.c	umh: Remove the now superfluous sentinel elements from ctl_table array	2024-04-24 09:43:53 +02:00
up.c	smp: Change function signatures to use call_single_data_t	2023-09-13 14:59:24 +02:00
user_namespace.c	user_namespace: remove unnecessary NULL values from kbuf	2024-02-22 15:38:52 -08:00
user-return-notifier.c
user.c	printk: Change type of CONFIG_BASE_SMALL to bool	2024-05-06 17:39:09 +02:00
usermode_driver.c
utsname_sysctl.c	kernel misc: Remove the now superfluous sentinel elements from ctl_table array	2024-04-24 09:43:53 +02:00
utsname.c
vhost_task.c	vhost: Fix worker hangs due to missed wake up calls	2023-06-08 15:43:09 -04:00
vmcore_info.c	mm: turn folio_test_hugetlb into a PageType	2024-04-24 19:34:26 -07:00
watch_queue.c	watch_queue: fix kcalloc() arguments order	2023-12-21 13:17:54 +01:00
watchdog_buddy.c	watchdog/hardlockup: move SMP barriers from common code to buddy code	2023-06-19 16:25:28 -07:00
watchdog_perf.c	watchdog/perf: add a weak function for an arch to detect if perf can use NMIs	2023-06-09 17:44:21 -07:00
watchdog.c	sysctl changes for v6.10-rc1	2024-05-17 17:31:24 -07:00
workqueue_internal.h	workqueue: Drop the special locking rule for worker->flags and worker_pool->flags	2023-08-07 15:57:22 -10:00
workqueue.c	Merge branch 'for-6.10' into test-merge-for-6.10	2024-05-15 11:40:33 -10:00