btrfs: add proper safety check before resuming dev-replace

The device replace is paused by unmount or read only remount, and
resumed on next mount or write remount.

The exclusive status should be checked properly as it's a global
invariant and we must not allow 2 operations run. In this case, the
balance can be also paused and resumed under same conditions. It's
always checked first so dev-replace could see the EXCL_OP already taken,
BUT, the ioctl would never let start both at the same time.

Replace the WARN_ON with message and return 0, indicating no error as
this is purely theoretical and the user will be informed. Resolving that
manually should be possible by waiting for the other operation to finish
or cancel the paused state.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This commit is contained in:
David Sterba 2018-03-20 19:51:04 +01:00
parent a17c95df4c
commit 010a47bde9

View File

@ -896,7 +896,17 @@ int btrfs_resume_dev_replace_async(struct btrfs_fs_info *fs_info)
}
btrfs_dev_replace_write_unlock(dev_replace);
WARN_ON(test_and_set_bit(BTRFS_FS_EXCL_OP, &fs_info->flags));
/*
* This could collide with a paused balance, but the exclusive op logic
* should never allow both to start and pause. We don't want to allow
* dev-replace to start anyway.
*/
if (test_and_set_bit(BTRFS_FS_EXCL_OP, &fs_info->flags)) {
btrfs_info(fs_info,
"cannot resume dev-replace, other exclusive operation running");
return 0;
}
task = kthread_run(btrfs_dev_replace_kthread, fs_info, "btrfs-devrepl");
return PTR_ERR_OR_ZERO(task);
}