linux/arch/powerpc/platforms/pseries
Li Zhong 42dbfc8649 powerpc/pseries: Protect remove_memory() with device hotplug lock
While testing memory hot-remove, I found the following deadlock:

Process #1141 is drmgr, trying to remove some memory, i.e. memory499.
It holds the memory_hotplug_mutex, and blocks when trying to remove file
"online" under dir memory499, in kernfs_drain(), at
        wait_event(root->deactivate_waitq,
                   atomic_read(&kn->active) == KN_DEACTIVATED_BIAS);

Process #1120 is trying to online memory499 by
   echo 1 > memory499/online

In .kernfs_fop_write, it uses kernfs_get_active() to increase
&kn->active, thus blocking process #1141. Process #1120 is itself
blocked later when trying to acquire memory_hotplug_mutex, which is
held by process #1141.

The backtraces of both processes are shown below (process #1120 first,
then process #1141):

[<c000000001b18600>] 0xc000000001b18600
[<c000000000015044>] .__switch_to+0x144/0x200
[<c000000000263ca4>] .online_pages+0x74/0x7b0
[<c00000000055b40c>] .memory_subsys_online+0x9c/0x150
[<c00000000053cbe8>] .device_online+0xb8/0x120
[<c00000000053cd04>] .online_store+0xb4/0xc0
[<c000000000538ce4>] .dev_attr_store+0x64/0xa0
[<c00000000030f4ec>] .sysfs_kf_write+0x7c/0xb0
[<c00000000030e574>] .kernfs_fop_write+0x154/0x1e0
[<c000000000268450>] .vfs_write+0xe0/0x260
[<c000000000269144>] .SyS_write+0x64/0x110
[<c000000000009ffc>] syscall_exit+0x0/0x7c

[<c000000001b18600>] 0xc000000001b18600
[<c000000000015044>] .__switch_to+0x144/0x200
[<c00000000030be14>] .__kernfs_remove+0x204/0x300
[<c00000000030d428>] .kernfs_remove_by_name_ns+0x68/0xf0
[<c00000000030fb38>] .sysfs_remove_file_ns+0x38/0x60
[<c000000000539354>] .device_remove_attrs+0x54/0xc0
[<c000000000539fd8>] .device_del+0x158/0x250
[<c00000000053a104>] .device_unregister+0x34/0xa0
[<c00000000055bc14>] .unregister_memory_section+0x164/0x170
[<c00000000024ee18>] .__remove_pages+0x108/0x4c0
[<c00000000004b590>] .arch_remove_memory+0x60/0xc0
[<c00000000026446c>] .remove_memory+0x8c/0xe0
[<c00000000007f9f4>] .pseries_remove_memblock+0xd4/0x160
[<c00000000007fcfc>] .pseries_memory_notifier+0x27c/0x290
[<c0000000008ae6cc>] .notifier_call_chain+0x8c/0x100
[<c0000000000d858c>] .__blocking_notifier_call_chain+0x6c/0xe0
[<c00000000071ddec>] .of_property_notify+0x7c/0xc0
[<c00000000071ed3c>] .of_update_property+0x3c/0x1b0
[<c0000000000756cc>] .ofdt_write+0x3dc/0x740
[<c0000000002f60fc>] .proc_reg_write+0xac/0x110
[<c000000000268450>] .vfs_write+0xe0/0x260
[<c000000000269144>] .SyS_write+0x64/0x110
[<c000000000009ffc>] syscall_exit+0x0/0x7c

This patch uses lock_device_hotplug() to protect the remove_memory()
call in pseries_remove_memblock(), as required by the comment above
remove_memory():

 * NOTE: The caller must call lock_device_hotplug() to serialize hotplug
 * and online/offline operations before this call, as required by
 * try_offline_node().
 */
void __ref remove_memory(int nid, u64 start, u64 size)

With this lock held, the other process (#1120 above) trying to online
the memory block will retry the system call when calling
lock_device_hotplug_sysfs(), and will finally get a "No such device"
error.
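
The behaviour described above can be illustrated with a minimal userspace
sketch. It is not the kernel code: hotplug_lock, memblock_present,
remover_begin(), remover_finish() and try_online() are hypothetical
stand-ins, and the syscall restart is modelled here as returning -EAGAIN.
The point is only that the online path never sleeps on the lock, so it
cannot keep the kernfs node active while the remover waits for it to drain.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins, not kernel symbols. */
static atomic_flag hotplug_lock = ATOMIC_FLAG_INIT; /* plays the device hotplug lock */
static bool memblock_present = true;                /* plays memory499 */
static bool memblock_online = false;

/* Remover side (drmgr): take the lock before tearing the block down. */
static void remover_begin(void)
{
    while (atomic_flag_test_and_set(&hotplug_lock))
        ; /* spin until the lock is free */
}

/* Remover side: the block is gone, then the lock is released. */
static void remover_finish(void)
{
    memblock_present = false;
    memblock_online = false;
    atomic_flag_clear(&hotplug_lock);
}

/*
 * Online side (echo 1 > memory499/online): only try the lock. On
 * contention, back out with -EAGAIN (assumed stand-in for restarting
 * the system call) instead of blocking while holding a reference.
 */
static int try_online(void)
{
    int ret = 0;

    if (atomic_flag_test_and_set(&hotplug_lock))
        return -EAGAIN;

    if (!memblock_present)
        ret = -ENODEV;          /* the block was removed meanwhile */
    else
        memblock_online = true;

    atomic_flag_clear(&hotplug_lock);
    return ret;
}
```

Calling remover_begin(), then try_online(), then remover_finish(), then
try_online() again walks through the sequence in this commit message: the
first online attempt backs out, and the retry reports that the device no
longer exists.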

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
2014-04-28 16:32:14 +10:00
..
cmm.c powerpc: Delete non-required instances of include <linux/init.h> 2014-01-15 13:46:44 +11:00
dlpar.c of: Make device nodes kobjects so they show up in sysfs 2014-03-11 20:48:26 +00:00
dtl.c powerpc: Delete non-required instances of include <linux/init.h> 2014-01-15 13:46:44 +11:00
eeh_pseries.c powerpc/eeh: Cleanup on eeh_subsystem_enabled 2014-02-17 11:19:39 +11:00
event_sources.c of/irq: simplify args to irq_create_of_mapping 2013-10-24 11:42:57 +01:00
firmware.c powerpc/pseries: Update CPU maps when device tree is updated 2013-04-26 16:08:23 +10:00
hotplug-cpu.c powerpc: Fix Oops in rtas_stop_self() 2014-04-28 13:08:47 +10:00
hotplug-memory.c powerpc/pseries: Protect remove_memory() with device hotplug lock 2014-04-28 16:32:14 +10:00
hvCall_inst.c new helper: file_inode(file) 2013-02-22 23:31:31 -05:00
hvCall.S powerpc: Merge STK_REG/PARAM/FRAMESIZE 2012-07-10 19:18:03 +10:00
hvconsole.c pseries: Move plpar_wrapper.h to powerpc common include/asm location. 2013-08-27 14:43:05 +10:00
hvcserver.c powerpc/pseries/hvcserver: Fix strncpy buffer limit in location code 2013-03-05 16:56:27 +11:00
io_event_irq.c powerpc/le: Enable RTAS events support 2014-04-07 10:33:12 +10:00
iommu.c Revert "pseries/iommu: Remove DDW on kexec" 2014-01-15 13:46:45 +11:00
Kconfig powerpc/perf: Add kconfig option for hypervisor provided counters 2014-03-24 09:48:32 +11:00
kexec.c pseries: Move plpar_wrapper.h to powerpc common include/asm location. 2013-08-27 14:43:05 +10:00
lpar.c powerpc/mm: Use HPTE constants when updating hpte bits 2013-12-09 11:40:27 +11:00
lparcfg.c powerpc/pseries: Fix endian issues in /proc/ppc64/lparcfg 2013-12-13 15:48:35 +11:00
Makefile powerpc/pseries/cpuidle: Move processor_idle.c to drivers/cpuidle. 2014-01-29 17:02:22 +11:00
mobility.c powerpc/pseries: Device tree should only be updated once after suspend/migrate 2014-03-07 15:54:49 +11:00
msi.c powerpc/pseries: Fix endian issues in MSI code 2013-12-13 15:48:38 +11:00
nvram.c powerpc: Convert last uses of __FUNCTION__ to __func__ 2014-04-09 12:53:32 +10:00
offline_states.h powerpc/smp: soft-replugged CPUs must go back to start_secondary 2011-04-01 15:37:09 +11:00
pci_dlpar.c powerpc/PCI: Use list_for_each_entry() for bus traversal 2014-02-14 11:20:51 -07:00
pci.c powerpc/pseries: Add Gen3 definitions for PCIE link speed 2014-02-17 11:19:35 +11:00
power.c [POWERPC] Fix warning in pseries/power.c 2008-02-20 13:33:37 +11:00
pseries_energy.c powerpc: Fix a number of sparse warnings 2013-08-14 11:50:24 +10:00
pseries.h powerpc/pseries: Make dlpar_configure_connector parent node aware 2013-08-27 14:45:14 +10:00
ras.c powerpc/le: Enable RTAS events support 2014-04-07 10:33:12 +10:00
reconfig.c of: Make device nodes kobjects so they show up in sysfs 2014-03-11 20:48:26 +00:00
rng.c powerpc/pseries: Fix SMP=n build of rng.c 2013-11-21 10:33:45 +11:00
scanlog.c ppc: Clean up scanlog 2013-05-01 17:29:45 -04:00
setup.c Merge branch 'linus' into sched/core 2014-02-21 21:37:09 +01:00
smp.c powerpc/pseries: Do not start secondaries in Open Firmware 2013-09-25 14:19:00 +10:00
suspend.c powerpc/pseries: Expose in kernel device tree update to drmgr 2014-03-07 15:54:50 +11:00