linux/kernel
Hao Jia aeca695a19 sched/core: Avoid obvious double update_rq_clock warning
[ Upstream commit 2679a83731 ]

When we use raw_spin_rq_lock() to acquire the rq lock and have to
update the rq clock while holding the lock, the kernel may issue
a WARN_DOUBLE_CLOCK warning.

Since we directly use raw_spin_rq_lock() to acquire rq lock instead of
rq_lock(), there is no corresponding change to rq->clock_update_flags.
In particular, we have obtained the rq lock of other CPUs, the
rq->clock_update_flags of this CPU may be RQCF_UPDATED at this time, and
then calling update_rq_clock() will trigger the WARN_DOUBLE_CLOCK warning.

So we need to clear RQCF_UPDATED of rq->clock_update_flags to avoid
the WARN_DOUBLE_CLOCK warning.

For the sched_rt_period_timer() and migrate_task_rq_dl() cases
we simply replace raw_spin_rq_lock()/raw_spin_rq_unlock() with
rq_lock()/rq_unlock().

For the {pull,push}_{rt,dl}_task() cases, we add the
double_rq_clock_clear_update() function to clear RQCF_UPDATED of
rq->clock_update_flags, and call double_rq_clock_clear_update()
before double_lock_balance()/double_rq_lock() returns to avoid the
WARN_DOUBLE_CLOCK warning.

Some call trace reports:
Call Trace 1:
 <IRQ>
 sched_rt_period_timer+0x10f/0x3a0
 ? enqueue_top_rt_rq+0x110/0x110
 __hrtimer_run_queues+0x1a9/0x490
 hrtimer_interrupt+0x10b/0x240
 __sysvec_apic_timer_interrupt+0x8a/0x250
 sysvec_apic_timer_interrupt+0x9a/0xd0
 </IRQ>
 <TASK>
 asm_sysvec_apic_timer_interrupt+0x12/0x20

Call Trace 2:
 <TASK>
 activate_task+0x8b/0x110
 push_rt_task.part.108+0x241/0x2c0
 push_rt_tasks+0x15/0x30
 finish_task_switch+0xaa/0x2e0
 ? __switch_to+0x134/0x420
 __schedule+0x343/0x8e0
 ? hrtimer_start_range_ns+0x101/0x340
 schedule+0x4e/0xb0
 do_nanosleep+0x8e/0x160
 hrtimer_nanosleep+0x89/0x120
 ? hrtimer_init_sleeper+0x90/0x90
 __x64_sys_nanosleep+0x96/0xd0
 do_syscall_64+0x34/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Call Trace 3:
 <TASK>
 deactivate_task+0x93/0xe0
 pull_rt_task+0x33e/0x400
 balance_rt+0x7e/0x90
 __schedule+0x62f/0x8e0
 do_task_dead+0x3f/0x50
 do_exit+0x7b8/0xbb0
 do_group_exit+0x2d/0x90
 get_signal+0x9df/0x9e0
 ? preempt_count_add+0x56/0xa0
 ? __remove_hrtimer+0x35/0x70
 arch_do_signal_or_restart+0x36/0x720
 ? nanosleep_copyout+0x39/0x50
 ? do_nanosleep+0x131/0x160
 ? audit_filter_inodes+0xf5/0x120
 exit_to_user_mode_prepare+0x10f/0x1e0
 syscall_exit_to_user_mode+0x17/0x30
 do_syscall_64+0x40/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Call Trace 4:
 update_rq_clock+0x128/0x1a0
 migrate_task_rq_dl+0xec/0x310
 set_task_cpu+0x84/0x1e4
 try_to_wake_up+0x1d8/0x5c0
 wake_up_process+0x1c/0x30
 hrtimer_wakeup+0x24/0x3c
 __hrtimer_run_queues+0x114/0x270
 hrtimer_interrupt+0xe8/0x244
 arch_timer_handler_phys+0x30/0x50
 handle_percpu_devid_irq+0x88/0x140
 generic_handle_domain_irq+0x40/0x60
 gic_handle_irq+0x48/0xe0
 call_on_irq_stack+0x2c/0x60
 do_interrupt_handler+0x80/0x84

Steps to reproduce:
1. Enable CONFIG_SCHED_DEBUG when compiling the kernel
2. echo 1 > /sys/kernel/debug/clear_warn_once
   echo "WARN_DOUBLE_CLOCK" > /sys/kernel/debug/sched/features
   echo "NO_RT_PUSH_IPI" > /sys/kernel/debug/sched/features
3. Run some rt/dl tasks that periodically work and sleep, e.g.
Create 2*n rt or dl (90% running) tasks via rt-app (on a system
with n CPUs), and Dietmar Eggemann reports Call Trace 4 when running
on PREEMPT_RT kernel.

Signed-off-by: Hao Jia <jiahao.os@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Link: https://lore.kernel.org/r/20220430085843.62939-2-jiahao.os@bytedance.com
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-09 10:22:36 +02:00
..
bpf bpf: Check PTR_TO_MEM | MEM_RDONLY in check_helper_mem_access 2022-06-06 08:43:42 +02:00
cgroup cgroup/cpuset: Remove cpus_allowed/mems_allowed setup in cpuset_init_smp() 2022-05-18 10:26:56 +02:00
configs drivers/char: remove /dev/kmem for good 2021-05-07 00:26:34 -07:00
debug lockdown: also lock down previous kgdb use 2022-05-25 09:57:37 +02:00
dma dma-mapping: remove bogus test for pfn_valid from dma_map_resource 2022-04-27 14:38:50 +02:00
entry signal: Replace force_fatal_sig with force_exit_sig when in doubt 2021-11-25 09:49:07 +01:00
events perf: Fix sys_perf_event_open() race against self 2022-05-25 09:57:27 +02:00
gcov Kconfig: Introduce ARCH_WANTS_NO_INSTR and CC_HAS_NO_PROFILE_FN_ATTR 2021-06-22 11:07:18 -07:00
irq random: remove unused irq_flags argument from add_interrupt_randomness() 2022-05-30 09:29:00 +02:00
kcsan LKMM updates: 2021-09-02 13:00:15 -07:00
livepatch livepatch: Fix build failure on 32 bits processors 2022-04-08 14:23:29 +02:00
locking locking/lockdep: Iterate lock_classes directly when reading lockdep files 2022-04-08 14:23:57 +02:00
power PM: suspend: fix return value of __setup handler 2022-04-08 14:23:07 +02:00
printk printk: fix return value of printk.devkmsg __setup handler 2022-04-08 14:23:19 +02:00
rcu rcu: Make TASKS_RUDE_RCU select IRQ_WORK 2022-06-09 10:22:32 +02:00
sched sched/core: Avoid obvious double update_rq_clock warning 2022-06-09 10:22:36 +02:00
time timekeeping: Add raw clock fallback for random_get_entropy() 2022-05-30 09:29:13 +02:00
trace bpf: Add MEM_RDONLY for helper args that are pointers to rdonly mem. 2022-05-01 17:22:26 +02:00
.gitignore .gitignore: prefix local generated files with a slash 2021-05-02 00:43:35 +09:00
acct.c kernel/acct.c: use dedicated helper to access rlimit values 2021-09-08 11:50:26 -07:00
async.c Revert "module, async: async_synchronize_full() on module init iff async is used" 2022-02-23 12:03:07 +01:00
audit_fsnotify.c
audit_tree.c audit: move put_tree() to avoid trim_trees refcount underflow and UAF 2021-08-24 18:52:36 -04:00
audit_watch.c
audit.c audit: improve audit queue handling when "audit=1" on cmdline 2022-02-08 18:34:03 +01:00
audit.h audit: log AUDIT_TIME_* records only from rules 2022-04-08 14:23:06 +02:00
auditfilter.c
auditsc.c audit: log AUDIT_TIME_* records only from rules 2022-04-08 14:23:06 +02:00
backtracetest.c
bounds.c
capability.c
cfi.c cfi: Use rcu_read_{un}lock_sched_notrace 2021-08-11 13:11:12 -07:00
compat.c arch: remove compat_alloc_user_space 2021-09-08 15:32:35 -07:00
configs.c
context_tracking.c
cpu_pm.c PM: cpu: Make notifier chain use a raw_spinlock_t 2021-08-16 18:55:32 +02:00
cpu.c random: clear fast pool, crng, and batches in cpuhp bring up 2022-05-30 09:29:09 +02:00
crash_core.c kernel/crash_core: suppress unknown crashkernel parameter warning 2021-12-29 12:28:49 +01:00
crash_dump.c
cred.c ucounts: Base set_cred_ucounts changes on the real user 2022-02-23 12:03:20 +01:00
delayacct.c delayacct: Add sysctl to enable at runtime 2021-05-12 11:43:25 +02:00
dma.c
exec_domain.c
exit.c io_uring: remove files pointer in cancellation functions 2021-08-23 13:10:37 -06:00
extable.c
fail_function.c
fork.c sched: Fix yet more sched_fork() races 2022-03-08 19:12:49 +01:00
freezer.c sched: Add get_current_state() 2021-06-18 11:43:08 +02:00
futex.c futex: Remove unused variable 'vpid' in futex_proxy_trylock_atomic() 2021-09-03 23:00:22 +02:00
gen_kheaders.sh kbuild: clean up ${quiet} checks in shell scripts 2021-05-27 04:01:50 +09:00
groups.c
hung_task.c Merge branch 'akpm' (patches from Andrew) 2021-07-02 12:08:10 -07:00
iomem.c
irq_work.c irq_work: Make irq_work_queue() NMI-safe again 2021-06-10 10:00:08 +02:00
jump_label.c jump_label: Fix jump_label_text_reserved() vs __init 2021-07-05 10:46:20 +02:00
kallsyms.c module: add printk formats to add module build ID to stacktraces 2021-07-08 11:48:22 -07:00
kcmp.c
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking/rwlock: Provide RT variant 2021-08-17 17:50:51 +02:00
Kconfig.preempt sched/core: Disable CONFIG_SCHED_CORE by default 2021-06-28 22:43:05 +02:00
kcov.c
kexec_core.c Merge branch 'rework/printk_safe-removal' into for-linus 2021-08-30 16:36:10 +02:00
kexec_elf.c
kexec_file.c kernel: kexec_file: fix error return code of kexec_calculate_store_digests() 2021-05-07 00:26:32 -07:00
kexec_internal.h
kexec.c kexec: avoid compat_alloc_user_space 2021-09-08 15:32:34 -07:00
kheaders.c
kmod.c modules: add CONFIG_MODPROBE_PATH 2021-05-07 00:26:33 -07:00
kprobes.c kprobes: Limit max data_size of the kretprobe instances 2021-12-08 09:04:41 +01:00
ksysfs.c
kthread.c Merge branch 'akpm' (patches from Andrew) 2021-06-29 17:29:11 -07:00
latencytop.c
Makefile static_call: Don't make __static_call_return0 static 2022-04-13 20:59:28 +02:00
module_signature.c
module_signing.c
module-internal.h
module.c Revert "module, async: async_synchronize_full() on module init iff async is used" 2022-02-23 12:03:07 +01:00
notifier.c notifier: Remove atomic_notifier_call_chain_robust() 2021-08-16 18:55:32 +02:00
nsproxy.c memcg: enable accounting for new namesapces and struct nsproxy 2021-09-03 09:58:12 -07:00
padata.c padata: Remove repeated verbose license text 2021-08-27 16:30:18 +08:00
panic.c Merge branch 'rework/printk_safe-removal' into for-linus 2021-08-30 16:36:10 +02:00
params.c params: lift param_set_uint_minmax to common code 2021-08-16 14:42:22 +02:00
pid_namespace.c memcg: enable accounting for new namesapces and struct nsproxy 2021-09-03 09:58:12 -07:00
pid.c kernel/pid.c: implement additional checks upon pidfd_create() parameters 2021-08-10 12:53:07 +02:00
profile.c profiling: fix shift-out-of-bounds bugs 2021-09-08 11:50:26 -07:00
ptrace.c ptrace: Reimplement PTRACE_KILL by always sending SIGKILL 2022-06-09 10:22:29 +02:00
range.c
reboot.c reboot: Add hardware protection power-off 2021-06-21 13:08:36 +01:00
regset.c
relay.c
resource_kunit.c
resource.c kernel/resource: fix kfree() of bootmem memory again 2022-04-08 14:23:43 +02:00
rseq.c rseq: Remove broken uapi field layout on 32-bit little endian 2022-04-08 14:23:10 +02:00
scftorture.c scftorture: Avoid NULL pointer exception on early exit 2021-07-27 11:39:30 -07:00
scs.c scs: Release kasan vmalloc poison in scs_free process 2021-11-18 19:16:29 +01:00
seccomp.c seccomp: Invalidate seccomp mode to catch death failures 2022-02-16 12:56:38 +01:00
signal.c signal: In get_signal test for signal_group_exit every time through the loop 2022-03-08 19:12:34 +01:00
smp.c smp: Fix offline cpu check in flush_smp_call_function_queue() 2022-04-20 09:34:21 +02:00
smpboot.c smpboot: Replace deprecated CPU-hotplug functions. 2021-08-10 14:57:42 +02:00
smpboot.h
softirq.c genirq: Change force_irqthreads to a static key 2021-08-10 22:50:07 +02:00
stackleak.c gcc-plugins/stackleak: Use noinstr in favor of notrace 2022-02-23 12:03:07 +01:00
stacktrace.c stacktrace: move filter_irq_stacks() to kernel/stacktrace.c 2022-04-13 20:59:28 +02:00
static_call_inline.c static_call: Don't make __static_call_return0 static 2022-04-13 20:59:28 +02:00
static_call.c static_call: Don't make __static_call_return0 static 2022-04-13 20:59:28 +02:00
stop_machine.c
sys_ni.c compat: remove some compat entry points 2021-09-08 15:32:35 -07:00
sys.c ucounts: Move RLIMIT_NPROC handling after set_user 2022-02-23 12:03:20 +01:00
sysctl-test.c kernel/sysctl-test: Remove some casts which are no-longer required 2021-06-23 16:41:24 -06:00
sysctl.c x86/speculation: Include unprivileged eBPF status in Spectre v2 mitigation reporting 2022-03-11 12:22:31 +01:00
task_work.c kasan: record task_work_add() call stack 2021-04-30 11:20:42 -07:00
taskstats.c
test_kprobes.c
torture.c torture: Replace deprecated CPU-hotplug functions. 2021-08-10 10:48:07 -07:00
tracepoint.c tracepoint: Fix kerneldoc comments 2021-08-16 11:39:51 -04:00
tsacct.c taskstats: Cleanup the use of task->exit_code 2022-01-27 11:05:35 +01:00
ucount.c ucounts: Handle wrapping in is_ucounts_overlimit 2022-02-23 12:03:20 +01:00
uid16.c
uid16.h
umh.c kernel/umh.c: fix some spelling mistakes 2021-05-07 00:26:34 -07:00
up.c A set of locking related fixes and updates: 2021-05-09 13:07:03 -07:00
user_namespace.c ucounts: Fix systemd LimitNPROC with private users regression 2022-03-08 19:12:42 +01:00
user-return-notifier.c
user.c fs/epoll: use a per-cpu counter for user's watches count 2021-09-08 11:50:27 -07:00
usermode_driver.c Merge branch 'work.namei' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2021-07-03 11:41:14 -07:00
utsname_sysctl.c
utsname.c
watch_queue.c watch_queue: Free the page array when watch_queue is dismantled 2022-04-08 14:24:11 +02:00
watchdog_hld.c
watchdog.c kernel: watchdog: modify the explanation related to watchdog thread 2021-06-29 10:53:46 -07:00
workqueue_internal.h workqueue: Assign a color to barrier work items 2021-08-17 07:49:10 -10:00
workqueue.c workqueue: Fix unbind_workers() VS wq_worker_running() race 2022-01-16 09:12:41 +01:00