linux/kernel
Thomas Gleixner 9c388a5ed1 watchdog/harclockup/perf: Revert a33d44843d ("watchdog/hardlockup/perf: Simplify deferred event destroy")
Guenter reported a crash in the watchdog/perf code, which is caused by
cleanup() and enable() running concurrently. The reason for this is:

The watchdog functions are serialized via the watchdog_mutex and cpu
hotplug locking, but the enable of the perf based watchdog happens in
context of the unpark callback of the smpboot thread. But that unpark
function is not synchronous inside the locking. The unparking of the thread
just wakes it up and leaves so there is no guarantee when the thread is
executing.

If it starts running _before_ the cleanup happened then it will create a
event and overwrite the dead event pointer. The new event is then cleaned
up because the event is marked dead.

    lock(watchdog_mutex);
    lockup_detector_reconfigure();
        cpus_read_lock();
	stop();
	   park()
	update();
	start();
	   unpark()
	cpus_read_unlock();		thread runs()
					  overwrite dead event ptr
	cleanup();
	  free new event, which is active inside perf....
    unlock(watchdog_mutex);

The park side is safe as that actually waits for the thread to reach
parked state.

Commit a33d44843d removed the protection against this kind of scenario
under the stupid assumption that the hotplug serialization and the
watchdog_mutex cover everything. 

Bring it back.

Reverts: a33d44843d ("watchdog/hardlockup/perf: Simplify deferred event destroy")
Reported-and-tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Thomas Feels-stupid Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Don Zickus <dzickus@redhat.com>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1710312145190.1942@nanos
2017-11-01 21:18:39 +01:00
..
bpf bpf: rename sk_actions to align with bpf infrastructure 2017-10-29 11:18:48 +09:00
cgroup Revert "net: defer call to cgroup_sk_alloc()" 2017-10-10 20:24:29 -07:00
configs ANDROID: binder: add hwbinder,vndbinder to BINDER_DEVICES. 2017-08-22 18:43:23 -07:00
debug sched/headers: Prepare for new header dependencies before moving code to <linux/sched/debug.h> 2017-03-02 08:42:34 +01:00
events perf/core: Fix cgroup time when scheduling descendants 2017-10-10 10:06:55 +02:00
gcov gcov: support GCC 7.1 2017-05-12 15:57:15 -07:00
irq irqchip updates for 4.14-rc5 2017-10-16 10:26:46 +02:00
livepatch livepatch: unpatch all klp_objects if klp_module_coming fails 2017-10-11 15:38:46 +02:00
locking locking/lockdep: Fix stacktrace mess 2017-10-10 10:04:28 +02:00
power PM / s2idle: Invoke the ->wake() platform callback earlier 2017-09-29 01:26:13 +02:00
printk Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/pmladek/printk 2017-09-07 21:00:52 -07:00
rcu doc: Fix various RCU docbook comment-header problems 2017-10-19 22:26:11 -04:00
sched membarrier: Provide register expedited private command 2017-10-19 22:13:40 -04:00
time drivers/pps: aesthetic tweaks to PPS-related content 2017-09-08 18:26:51 -07:00
trace Two updates. 2017-10-04 08:34:01 -07:00
.gitignore
acct.c fs: make the buf argument to __kernel_write a void pointer 2017-09-04 19:05:15 -04:00
async.c async: Adjust system_state checks 2017-05-23 10:01:37 +02:00
audit_fsnotify.c Merge branch 'fsnotify' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2017-05-03 11:05:15 -07:00
audit_tree.c Merge branch 'fsnotify' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs 2017-05-03 11:05:15 -07:00
audit_watch.c audit/stable-4.13 PR 20170816 2017-08-16 16:48:34 -07:00
audit.c audit: update the function comments 2017-09-05 09:46:59 -04:00
audit.h ipc: mqueue: Replace timespec with timespec64 2017-09-03 20:21:24 -04:00
auditfilter.c audit: kernel generated netlink traffic should have a portid of 0 2017-05-02 10:16:05 -04:00
auditsc.c Merge branch 'work.ipc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-09-14 17:37:26 -07:00
backtracetest.c
bounds.c
capability.c
compat.c semtimedop(): move compat to native 2017-07-15 20:46:47 -04:00
configs.c
context_tracking.c
cpu_pm.c PM / CPU: replace raw_notifier with atomic_notifier 2017-07-31 13:09:49 +02:00
cpu.c cpu/hotplug: Reset node state after operation 2017-10-21 16:11:30 +02:00
crash_core.c kdump: protect vmcoreinfo data under the crash memory 2017-07-12 16:26:00 -07:00
crash_dump.c
cred.c doc: ReSTify credentials.txt 2017-05-18 10:30:19 -06:00
delayacct.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
dma.c
elfcore.c
exec_domain.c
exit.c waitid(): Avoid unbalanced user_access_end() on access_ok() error 2017-10-20 15:32:54 -04:00
extable.c extable: Enable RCU if it is not watching in kernel_text_address() 2017-09-23 16:50:20 -04:00
fork.c kmemleak: clear stale pointers from task stacks 2017-10-13 16:18:33 -07:00
freezer.c
futex_compat.c
futex.c futex: Fix more put_pi_state() vs. exit_pi_state_list() races 2017-11-01 09:05:00 +01:00
groups.c kernel/groups.c: use sort library function 2017-07-10 16:32:34 -07:00
hung_task.c kernel/hung_task.c: defer showing held locks 2017-05-08 17:15:10 -07:00
irq_work.c
jump_label.c jump_label: Provide hotplug context variants 2017-08-10 12:28:59 +02:00
kallsyms.c kernel/kallsyms.c: replace all_var with IS_ENABLED(CONFIG_KALLSYMS_ALL) 2017-07-10 16:32:34 -07:00
kcmp.c kernel/kcmp.c: drop branch leftover typo 2017-10-03 17:54:25 -07:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks
Kconfig.preempt
kcov.c kcov: support compat processes 2017-09-08 18:26:51 -07:00
kexec_core.c x86/mm, kexec: Allow kexec to be used with SME 2017-07-18 11:38:04 +02:00
kexec_file.c kexec_file: adjust declaration of kexec_purgatory 2017-07-12 16:26:02 -07:00
kexec_internal.h kexec_file: adjust declaration of kexec_purgatory 2017-07-12 16:26:02 -07:00
kexec.c kdump: protect vmcoreinfo data under the crash memory 2017-07-12 16:26:00 -07:00
kmod.c kmod: move #ifdef CONFIG_MODULES wrapper to Makefile 2017-09-08 18:26:51 -07:00
kprobes.c kprobes: Ensure that jprobe probepoints are at function entry 2017-07-08 11:05:35 +02:00
ksysfs.c kexec: move vmcoreinfo out of the kernel's .bss section 2017-07-12 16:25:59 -07:00
kthread.c kernel/kthread.c: kthread_worker: don't hog the cpu 2017-08-31 16:33:15 -07:00
latencytop.c sched/headers: Prepare to move sched_info_on() and force_schedstat_enabled() from <linux/sched.h> to <linux/sched/stat.h> 2017-03-02 08:42:39 +01:00
Makefile kmod: move #ifdef CONFIG_MODULES wrapper to Makefile 2017-09-08 18:26:51 -07:00
memremap.c memremap: add scheduling point to devm_memremap_pages 2017-10-03 17:54:25 -07:00
module_signing.c
module-internal.h
module.c module: fix ddebug_remove_module() 2017-07-25 15:08:32 +02:00
notifier.c kernel/notifier.c: simplify expression 2017-02-24 17:46:56 -08:00
nsproxy.c perf: Add PERF_RECORD_NAMESPACES to include namespaces related info 2017-03-13 15:57:41 -03:00
padata.c padata: Avoid nested calls to cpus_read_lock() in pcrypt_init_padata() 2017-05-26 10:10:37 +02:00
panic.c locking/refcounts, x86/asm: Implement fast refcount overflow protection 2017-08-17 10:40:26 +02:00
params.c kernel/params.c: improve STANDARD_PARAM_DEF readability 2017-10-03 17:54:26 -07:00
pid_namespace.c userns,pidns: Verify the userns for new pid namespaces 2017-07-20 07:43:58 -05:00
pid.c pids: make task_tgid_nr_ns() safe 2017-08-21 12:47:31 -07:00
profile.c sched/headers: Prepare to move sched_info_on() and force_schedstat_enabled() from <linux/sched.h> to <linux/sched/stat.h> 2017-03-02 08:42:39 +01:00
ptrace.c signal: Remove kernel interal si_code magic 2017-07-24 14:30:28 -05:00
range.c
reboot.c
relay.c Merge branch 'work.splice' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2017-05-02 11:38:06 -07:00
resource.c
seccomp.c seccomp: make function __get_seccomp_filter static 2017-10-10 11:45:29 -07:00
signal.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2017-09-11 18:34:47 -07:00
smp.c treewide: make "nr_cpu_ids" unsigned 2017-09-08 18:26:48 -07:00
smpboot.c watchdog/core, powerpc: Lock cpus across reconfiguration 2017-10-04 10:53:54 +02:00
smpboot.h
softirq.c sched/core: Remove 'task' parameter and rename tsk_restore_flags() to current_restore_flags() 2017-04-11 09:06:32 +02:00
stacktrace.c stacktrace/x86: add function for detecting reliable stack traces 2017-03-08 09:18:02 +01:00
stop_machine.c stop_machine: Provide stop_machine_cpuslocked() 2017-05-26 10:10:36 +02:00
sys_ni.c
sys.c prctl: Allow local CAP_SYS_ADMIN changing exe_file 2017-07-20 07:46:07 -05:00
sysctl_binary.c fs: fix kernel_write prototype 2017-09-04 19:05:15 -04:00
sysctl.c Merge branch 'core-watchdog-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2017-10-06 08:36:41 -07:00
task_work.c task_work: Replace spin_unlock_wait() with lock/unlock pair 2017-07-25 10:08:58 -07:00
taskstats.c taskstats: add e/u/stime for TGID command 2017-05-08 17:15:12 -07:00
test_kprobes.c
torture.c torture: Fix typo suppressing CPU-hotplug statistics 2017-07-25 13:04:45 -07:00
tracepoint.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/task.h> 2017-03-02 08:42:35 +01:00
tsacct.c sched/headers: Prepare to move cputime functionality from <linux/sched.h> into <linux/sched/cputime.h> 2017-03-02 08:42:39 +01:00
ucount.c ucount: Remove the atomicity from ucount->count 2017-03-06 15:26:37 -06:00
uid16.c sched/headers: Prepare to remove <linux/cred.h> inclusion from <linux/sched.h> 2017-03-02 08:42:31 +01:00
umh.c kmod: split out umh code into its own file 2017-09-08 18:26:50 -07:00
up.c smp: Avoid using two cache lines for struct call_single_data 2017-08-29 15:14:38 +02:00
user_namespace.c userns,pidns: Verify the userns for new pid namespaces 2017-07-20 07:43:58 -05:00
user-return-notifier.c
user.c sched/headers: Prepare for new header dependencies before moving code to <linux/sched/user.h> 2017-03-02 08:42:29 +01:00
utsname_sysctl.c sched/headers: Remove <linux/rwsem.h> from <linux/sched.h> 2017-03-03 01:45:36 +01:00
utsname.c sched/headers: Prepare to move the task_lock()/unlock() APIs to <linux/sched/task.h> 2017-03-02 08:42:38 +01:00
watchdog_hld.c watchdog/harclockup/perf: Revert a33d44843d ("watchdog/hardlockup/perf: Simplify deferred event destroy") 2017-11-01 21:18:39 +01:00
watchdog.c watchdog/core: Put softlockup_threads_initialized under ifdef guard 2017-10-04 11:30:50 +02:00
workqueue_internal.h
workqueue.c workqueue: replace pool->manager_arb mutex with a flag 2017-10-10 07:13:57 -07:00