nvmet: fix ns enable/disable possible hang

[ Upstream commit f97914e35f ]

When disabling an nvmet namespace, there is a period where the
subsys->lock is released, as the ns disable waits for backend IO to
complete, and the ns percpu ref to be properly killed. The original
intent was to avoid taking the subsystem lock for a prolonged period as
other processes may need to acquire it (for example new incoming
connections).

However, it opens up a window where another process may come in and
enable the ns, (re)initializing the ns percpu_ref, causing the disable
sequence to hang.

Solve this by taking the global nvmet_config_sem over the entire configfs
enable/disable sequence.

Fixes: a07b4970f4 ("nvmet: add a generic NVMe target")
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Signed-off-by: Keith Busch <kbusch@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
Sagi Grimberg 2024-05-21 23:20:28 +03:00 committed by Greg Kroah-Hartman
parent 36989c6825
commit ca3b4293dc

@@ -538,10 +538,18 @@ static ssize_t nvmet_ns_enable_store(struct config_item *item,
 	if (kstrtobool(page, &enable))
 		return -EINVAL;
 
+	/*
+	 * take a global nvmet_config_sem because the disable routine has a
+	 * window where it releases the subsys-lock, giving a chance to
+	 * a parallel enable to concurrently execute causing the disable to
+	 * have a misaccounting of the ns percpu_ref.
+	 */
+	down_write(&nvmet_config_sem);
 	if (enable)
 		ret = nvmet_ns_enable(ns);
 	else
 		nvmet_ns_disable(ns);
+	up_write(&nvmet_config_sem);
 
 	return ret ? ret : count;
 }