linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-27 22:53:55 +08:00


Tejun Heo 76bb5ab8f6 cpuset: break kernfs active protection in cpuset_write_resmask()

Writing to either "cpuset.cpus" or "cpuset.mems" file flushes cpuset_hotplug_work so that cpu or memory hotunplug doesn't end up migrating tasks off a cpuset after new resources are added to it.

As cpuset_hotplug_work calls into cgroup core via cgroup_transfer_tasks(), this flushing adds the dependency to cgroup core locking from cpuset_write_resmask(). This used to be okay because cgroup interface files were protected by a different mutex; however, 8353da1f91 ("cgroup: remove cgroup_tree_mutex") simplified the cgroup core locking and this dependency became a deadlock hazard - cgroup file removal performed under cgroup core lock tries to drain on-going file operation which is trying to flush cpuset_hotplug_work blocked on the same cgroup core lock.

The locking simplification was done because kernfs added a much easier way to deal with circular dependencies involving kernfs active protection. Let's use the same strategy in cpuset and break active protection in cpuset_write_resmask(). While it isn't the prettiest, this is a very rare, likely unique, situation which also goes away on the unified hierarchy.

The commands to trigger the deadlock warning without the patch and the lockdep output follow.

 localhost:/ # mount -t cgroup -o cpuset xxx /cpuset
 localhost:/ # mkdir /cpuset/tmp
 localhost:/ # echo 1 > /cpuset/tmp/cpuset.cpus
 localhost:/ # echo 0 > cpuset/tmp/cpuset.mems
 localhost:/ # echo $$ > /cpuset/tmp/tasks
 localhost:/ # echo 0 > /sys/devices/system/cpu/cpu1/online

 ======================================================
 [ INFO: possible circular locking dependency detected ]
 3.16.0-rc1-0.1-default+ #7 Not tainted
 -------------------------------------------------------
 kworker/1:0/32649 is trying to acquire lock:
  (cgroup_mutex){+.+.+.}, at: [<ffffffff8110e3d7>] cgroup_transfer_tasks+0x37/0x150

 but task is already holding lock:
  (cpuset_hotplug_work){+.+...}, at: [<ffffffff81085412>] process_one_work+0x192/0x520

 which lock already depends on the new lock.

 the existing dependency chain (in reverse order) is:

 -> #2 (cpuset_hotplug_work){+.+...}:
 ...
 -> #1 (s_active#175){++++.+}:
 ...
 -> #0 (cgroup_mutex){+.+.+.}:
 ...

 other info that might help us debug this:

 Chain exists of:
   cgroup_mutex --> s_active#175 --> cpuset_hotplug_work

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(cpuset_hotplug_work);
                                lock(s_active#175);
                                lock(cpuset_hotplug_work);
   lock(cgroup_mutex);

  *** DEADLOCK ***

 2 locks held by kworker/1:0/32649:
  #0: ("events"){.+.+.+}, at: [<ffffffff81085412>] process_one_work+0x192/0x520
  #1: (cpuset_hotplug_work){+.+...}, at: [<ffffffff81085412>] process_one_work+0x192/0x520

 stack backtrace:
 CPU: 1 PID: 32649 Comm: kworker/1:0 Not tainted 3.16.0-rc1-0.1-default+ #7
 ...
 Call Trace:
  [<ffffffff815a5f78>] dump_stack+0x72/0x8a
  [<ffffffff810c263f>] print_circular_bug+0x10f/0x120
  [<ffffffff810c481e>] check_prev_add+0x43e/0x4b0
  [<ffffffff810c4ee6>] validate_chain+0x656/0x7c0
  [<ffffffff810c53d2>] __lock_acquire+0x382/0x660
  [<ffffffff810c57a9>] lock_acquire+0xf9/0x170
  [<ffffffff815aa13f>] mutex_lock_nested+0x6f/0x380
  [<ffffffff8110e3d7>] cgroup_transfer_tasks+0x37/0x150
  [<ffffffff811129c0>] hotplug_update_tasks_insane+0x110/0x1d0
  [<ffffffff81112bbd>] cpuset_hotplug_update_tasks+0x13d/0x180
  [<ffffffff811148ec>] cpuset_hotplug_workfn+0x18c/0x630
  [<ffffffff810854d4>] process_one_work+0x254/0x520
  [<ffffffff810875dd>] worker_thread+0x13d/0x3d0
  [<ffffffff8108e0c8>] kthread+0xf8/0x100
  [<ffffffff815acaec>] ret_from_fork+0x7c/0xb0

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Li Zefan <lizefan@huawei.com>
Tested-by: Li Zefan <lizefan@huawei.com>

2014-07-01 16:42:28 -04:00
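
The circular dependency described in the commit message can be sketched as a small userspace model. This is only an illustration, not lockdep's actual algorithm and not the kernel patch itself: the names `find_cycle`, `BEFORE_PATCH`, and `AFTER_PATCH` are hypothetical, `s_active` stands in for `s_active#175` from the report, and the "after" graph assumes that breaking active protection means the flush of cpuset_hotplug_work no longer happens while the kernfs active reference is held.

```python
from typing import Dict, List, Optional


def find_cycle(edges: Dict[str, List[str]], start: str) -> Optional[List[str]]:
    """Return a path start -> ... -> start if one exists, else None.

    Uses an iterative depth-first search over the dependency graph.
    """
    stack = [(start, [start])]
    seen = set()
    while stack:
        node, path = stack.pop()
        for nxt in edges.get(node, []):
            if nxt == start:
                return path + [start]
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, path + [nxt]))
    return None


# "A": ["B"] means B is acquired (or waited on) while A is held.
BEFORE_PATCH = {
    # File removal under the cgroup core lock drains on-going file operations.
    "cgroup_mutex": ["s_active"],
    # The write handler flushes the hotplug work under active protection.
    "s_active": ["cpuset_hotplug_work"],
    # The hotplug work calls cgroup_transfer_tasks(), which takes cgroup_mutex.
    "cpuset_hotplug_work": ["cgroup_mutex"],
}

# Assumption: with active protection broken before the flush, the
# s_active -> cpuset_hotplug_work edge is no longer present.
AFTER_PATCH = {
    "cgroup_mutex": ["s_active"],
    "s_active": [],
    "cpuset_hotplug_work": ["cgroup_mutex"],
}

if __name__ == "__main__":
    print(find_cycle(BEFORE_PATCH, "cgroup_mutex"))
    print(find_cycle(AFTER_PATCH, "cgroup_mutex"))
```

In this model the "before" graph contains the same three-lock loop that the lockdep chain reports, and removing the single edge from the active reference to the work item is enough to make the graph acyclic.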
debug	kernel/printk: use symbolic defines for console loglevels	2014-06-04 16:54:17 -07:00
events	Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2014-06-12 19:18:49 -07:00
gcov	gcov: add support for GCC 4.9	2014-06-10 15:34:46 -07:00
irq	genirq: Improve documentation to match current implementation	2014-05-27 10:16:44 +02:00
locking	Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2014-06-12 18:48:15 -07:00
power	Merge branch 'pm-sleep'	2014-06-12 13:43:08 +02:00
printk	kernel/printk: use symbolic defines for console loglevels	2014-06-04 16:54:17 -07:00
rcu	Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next	2014-06-03 12:57:53 -07:00
sched	Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2014-06-12 19:42:15 -07:00
time	Merge branch 'akpm' (patchbomb from Andrew) into next	2014-06-04 16:55:13 -07:00
trace	One bug fix that goes back to 3.10. Accessing a non existent buffer	2014-06-12 21:07:25 -07:00
.gitignore	Ignore generated file kernel/x509_certificate_list	2013-12-10 18:21:34 +00:00
acct.c	ipc, kernel: clear whitespace	2014-06-06 16:08:14 -07:00
async.c
audit_tree.c	inotify: Fix reporting of cookies for inotify events	2014-02-18 11:17:17 +01:00
audit_watch.c	inotify: Fix reporting of cookies for inotify events	2014-02-18 11:17:17 +01:00
audit.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next	2014-06-12 14:27:40 -07:00
audit.h	audit: Use struct net not pid_t to remember the network namespce to reply in	2014-03-20 10:10:53 -04:00
auditfilter.c	Merge git://git.infradead.org/users/eparis/audit	2014-04-12 12:38:53 -07:00
auditsc.c	auditsc: audit_krule mask accesses need bounds checking	2014-06-10 08:44:40 -07:00
backtracetest.c	kernel/backtracetest.c: replace no level printk by pr_info()	2014-06-04 16:54:14 -07:00
bounds.c	mm: do not allocate page->ptl dynamically, if spinlock_t fits to long	2013-12-20 12:25:45 -08:00
capability.c	fs,userns: Change inode_capable to capable_wrt_inode_uidgid	2014-06-10 13:57:22 -07:00
cgroup_freezer.c	cgroup: remove css_parent()	2014-05-16 13:22:48 -04:00
cgroup.c	cgroup: fix a race between cgroup_mount() and cgroup_kill_sb()	2014-06-30 10:16:26 -04:00
compat.c	kernel/compat.c: use sizeof() instead of sizeof	2014-06-04 16:54:19 -07:00
configs.c
context_tracking.c	asmlinkage: Add explicit __visible to drivers/, lib/, kernel/*	2014-05-05 16:07:46 -07:00
cpu_pm.c
cpu.c	More ACPI and power management updates for 3.16-rc1	2014-06-12 13:14:19 -07:00
cpuset.c	cpuset: break kernfs active protection in cpuset_write_resmask()	2014-07-01 16:42:28 -04:00
crash_dump.c
cred.c
delayacct.c	kernel/delayacct.c: remove redundant checking in __delayacct_add_tsk()	2013-11-13 12:09:12 +09:00
dma.c
elfcore.c	switch elf_core_write_extra_phdrs() to dump_emit()	2013-11-09 00:16:23 -05:00
exec_domain.c	kernel/exec_domain.c: code clean-up	2014-06-04 16:54:15 -07:00
exit.c	signals: mv {dis,}allow_signal() from sched.h/exit.c to signal.[ch]	2014-06-06 16:08:11 -07:00
extable.c	asmlinkage: Make main_extable_sort_needed visible	2014-02-13 18:13:22 -08:00
fork.c	ptrace: fix fork event messages across pid namespaces	2014-06-06 16:08:11 -07:00
freezer.c	libata, freezer: avoid block device removal while system is frozen	2013-12-19 13:50:32 -05:00
futex_compat.c	compat: Get rid of (get\|put)_compat_time(val\|spec)	2014-02-02 14:09:12 -08:00
futex.c	Merge branch 'next' (accumulated 3.16 merge window patches) into master	2014-06-08 11:31:16 -07:00
groups.c	kernel/groups.c: remove return value of set_groups	2014-04-03 16:21:05 -07:00
hrtimer.c	Merge branch 'perf/urgent' into perf/core, to resolve conflict and to prepare for new patches	2014-06-06 07:55:06 +02:00
hung_task.c	kernel/hung_task.c: convert simple_strtoul to kstrtouint	2014-06-04 16:54:15 -07:00
irq_work.c	perf/x86: Warn to early_printk() in case irq_work is too slow	2014-02-21 21:49:07 +01:00
itimer.c
jump_label.c
kallsyms.c	kernel: use macros from compiler.h instead of __attribute__((...))	2014-04-07 16:36:11 -07:00
kcmp.c
Kconfig.freezer
Kconfig.hz	kernel: remove CONFIG_USE_GENERIC_SMP_HELPERS	2013-11-15 09:32:22 +09:00
Kconfig.locks	locking/rwlocks: Introduce 'qrwlocks' - fair, queued rwlocks	2014-06-06 07:58:28 +02:00
Kconfig.preempt
kexec.c	kernel/kexec.c: convert printk to pr_foo()	2014-06-06 16:08:12 -07:00
kmod.c	signals: change wait_for_helper() to use kernel_sigaction()	2014-06-06 16:08:12 -07:00
kprobes.c	kprobes: Show blacklist entries via debugfs	2014-04-24 10:26:41 +02:00
ksysfs.c	kobject: Make support for uevent_helper optional.	2014-04-25 12:00:49 -07:00
kthread.c	kthread: fix return value of kthread_create() upon SIGKILL.	2014-06-04 16:53:51 -07:00
latencytop.c	kernel/latencytop.c: convert seq_printf to seq_puts	2014-06-04 16:54:15 -07:00
Makefile	Merge branch 'x86-asmlinkage-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip	2014-03-31 14:13:25 -07:00
module_signing.c
module-internal.h
module.c	Most of this is cleaning up various driver sysfs permissions so we can	2014-06-11 16:09:14 -07:00
notifier.c	kprobes, notifier: Use NOKPROBE_SYMBOL macro in notifier	2014-04-24 10:26:39 +02:00
nsproxy.c
padata.c	padata: Fix wrong usage of rcu_dereference()	2013-12-05 21:28:42 +08:00
panic.c	kernel/panic.c: add "crash_kexec_post_notifiers" option for kdump after panic_notifers	2014-06-06 16:08:12 -07:00
params.c	param: hand arguments after -- straight to init	2014-04-28 11:48:34 +09:30
pid_namespace.c	pid_namespace: pidns_get() should check task_active_pid_ns() != NULL	2014-04-02 16:20:21 -07:00
pid.c
posix-cpu-timers.c	posix-timers: Convert abuses of BUG_ON to WARN_ON	2013-12-09 16:56:29 +01:00
posix-timers.c
profile.c	kernel/profile.c: use static const char instead of static char	2014-06-06 16:08:13 -07:00
ptrace.c	kernel/compat: convert to COMPAT_SYSCALL_DEFINE	2014-03-06 15:35:10 +01:00
range.c
reboot.c	kernel/reboot.c: convert simple_strtoul to kstrtoint	2014-06-04 16:54:15 -07:00
relay.c	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	2014-04-12 14:49:50 -07:00
res_counter.c	kernel/res_counter.c: replace simple_strtoull by kstrtoull	2014-06-04 16:54:15 -07:00
resource.c	resources: Clarify sanity check message	2014-05-23 10:47:21 -06:00
seccomp.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next	2014-06-12 14:27:40 -07:00
signal.c	signals: introduce kernel_sigaction()	2014-06-06 16:08:12 -07:00
smp.c	smp: print more useful debug info upon receiving IPI on an offline CPU	2014-06-06 16:08:12 -07:00
smpboot.c
smpboot.h
softirq.c	Merge branch 'rcu/next' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu	2014-05-22 11:36:10 +02:00
stacktrace.c
stop_machine.c	kernel/stop_machine.c: kernel-doc warning fix	2014-06-04 16:54:15 -07:00
sys_ni.c	sys_sgetmask/sys_ssetmask: add CONFIG_SGETMASK_SYSCALL	2014-06-04 16:54:14 -07:00
sys.c	sched: Consolidate open coded implementations of nice level frobbing into nice_to_rlimit() and rlimit_to_nice()	2014-05-22 11:16:36 +02:00
sysctl_binary.c	kernel/sysctl_binary.c: use scnprintf() instead of snprintf()	2013-11-13 12:09:33 +09:00
sysctl.c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next	2014-06-12 14:27:40 -07:00
system_certificates.S	KEYS: correct alignment of system_certificate_list content in assembly file	2013-12-10 18:25:28 +00:00
system_keyring.c	KEYS: correct alignment of system_certificate_list content in assembly file	2013-12-10 18:25:28 +00:00
task_work.c
taskstats.c	genetlink: only pass array to genl_register_family_with_ops()	2013-11-19 16:39:05 -05:00
test_kprobes.c
time.c
timeconst.bc
timer.c	timer: Prevent overflow in apply_slack	2014-04-30 13:46:17 +02:00
torture.c	torture: Remove __init from torture_init_begin/end	2014-05-14 09:46:30 -07:00
tracepoint.c	kernel/tracepoint.c: kernel-doc fixes	2014-06-04 16:54:15 -07:00
tsacct.c
uid16.c
up.c	smp: Rename __smp_call_function_single() to smp_call_function_single_async()	2014-02-24 14:47:15 -08:00
user_namespace.c	kernel/user_namespace.c: kernel-doc/checkpatch fixes	2014-06-06 16:08:13 -07:00
user-return-notifier.c
user.c	kernel/user.c: drop unused field 'files' from user_struct	2014-06-04 16:54:16 -07:00
utsname_sysctl.c	sysctl: convert use of typedef ctl_table to struct ctl_table	2014-06-06 16:08:16 -07:00
utsname.c
watchdog.c	kernel/watchdog.c:touch_softlockup_watchdog(): use raw_cpu_write()	2014-04-18 16:40:08 -07:00
workqueue_internal.h	workqueue: rename manager_mutex to attach_mutex	2014-05-20 10:59:32 -04:00
workqueue.c	Merge branch 'for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq	2014-06-09 14:56:49 -07:00