linux/drivers/md
Nikos Tsironis 65fc7c3704 dm snapshot: Don't sleep holding the snapshot lock
When completing a pending exception, pending_complete() waits for all
conflicting reads to drain, before inserting the final, completed
exception. Conflicting reads are snapshot reads redirected to the
origin, because the relevant chunk is not remapped to the COW device the
moment we receive the read.

The completed exception must be inserted into the exception table after
all conflicting reads drain to ensure snapshot reads don't return
corrupted data. This is required because inserting the completed
exception into the exception table signals that the relevant chunk is
remapped and both origin writes and snapshot merging will now overwrite
the chunk in origin.

This wait is done holding the snapshot lock to ensure that
pending_complete() doesn't starve if new snapshot reads keep coming for
this chunk.

In preparation for the next commit, where we use a spinlock instead of a
mutex to protect the exception tables, we remove the need for holding
the lock while waiting for conflicting reads to drain.

We achieve this in two steps:

1. pending_complete() inserts the completed exception before waiting for
   conflicting reads to drain and removes the pending exception after
   all conflicting reads drain.

   This ensures that new snapshot reads will be redirected to the COW
   device, instead of the origin, and thus pending_complete() will not
   starve. Moreover, we use the existence of both a completed and
   a pending exception to signify that the COW is done but there are
   conflicting reads in flight.

2. In __origin_write() we check first if there is a pending exception
   and then if there is a completed exception. If there is a pending
   exception any submitted BIO is delayed on the pe->origin_bios list and
   DM_MAPIO_SUBMITTED is returned. This ensures that neither writes to the
   origin nor snapshot merging can overwrite the origin chunk, until all
   conflicting reads drain, and thus snapshot reads will not return
   corrupted data.

Summarizing, we now have the following possible combinations of pending
and completed exceptions for a chunk, along with their meaning:

A. No exceptions exist: The chunk has not been remapped yet.
B. Only a pending exception exists: The chunk is currently being copied
   to the COW device.
C. Both a pending and a completed exception exist: COW for this chunk
   has completed but there are snapshot reads in flight which had been
   redirected to the origin before the chunk was remapped.
D. Only the completed exception exists: COW has been completed and there
   are no conflicting reads in flight.

Co-developed-by: Ilias Tsitsimpis <iliastsi@arrikto.com>
Signed-off-by: Nikos Tsironis <ntsironis@arrikto.com>
Acked-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-04-18 16:18:27 -04:00
..
bcache block: allow bio_for_each_segment_all() to iterate over multi-page bvec 2019-02-15 08:40:11 -07:00
persistent-data dm block manager: remove redundant unlikely annotation 2019-03-05 14:53:48 -05:00
dm-bio-prison-v1.c dm: adjust structure members to improve alignment 2018-06-08 11:53:14 -04:00
dm-bio-prison-v1.h
dm-bio-prison-v2.c dm: adjust structure members to improve alignment 2018-06-08 11:53:14 -04:00
dm-bio-prison-v2.h
dm-bio-record.h block: replace bi_bdev with a gendisk pointer and partitions index 2017-08-23 12:49:55 -06:00
dm-bufio.c Merge branch 'akpm' (patches from Andrew) 2018-12-28 16:55:46 -08:00
dm-builtin.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
dm-cache-background-tracker.c dm cache background tracker: fix sparse warning 2018-04-30 15:40:40 -04:00
dm-cache-background-tracker.h
dm-cache-block-types.h
dm-cache-metadata.c dm cache metadata: Fix loading discard bitset 2019-04-18 16:18:25 -04:00
dm-cache-metadata.h
dm-cache-policy-internal.h
dm-cache-policy-smq.c dm: remove unnecessary unlikely() around WARN_ON_ONCE() 2018-10-16 14:34:59 -04:00
dm-cache-policy.c
dm-cache-policy.h
dm-cache-target.c dm cache: add support for discard passdown to the origin device 2019-03-05 14:53:52 -05:00
dm-core.h dm: disable DISCARD if the underlying storage no longer supports it 2019-04-04 15:33:59 -04:00
dm-crypt.c dm crypt: fix endianness annotations around org_sector_of_dmreq 2019-04-18 16:16:01 -04:00
dm-delay.c dm: Check for device sector overflow if CONFIG_LBDAF is not set 2018-12-18 09:02:26 -05:00
dm-era-target.c dm: allow targets to return output from messages they are sent 2018-04-03 15:04:10 -04:00
dm-exception-store.c
dm-exception-store.h
dm-flakey.c dm flakey: Properly corrupt multi-page bios. 2018-12-18 09:02:27 -05:00
dm-init.c dm init: fix const confusion for dm_allowed_targets array 2019-04-01 16:16:37 -04:00
dm-integrity.c dm integrity: fix deadlock with overlapping I/O 2019-04-05 18:49:08 -04:00
dm-io.c dm: Use kzalloc for all structs with embedded biosets/mempools 2018-06-05 08:47:43 -06:00
dm-ioctl.c dm: add support to directly boot to a mapped device 2019-03-05 14:53:50 -05:00
dm-kcopyd.c dm kcopyd: Fix bug causing workqueue stalls 2018-12-18 09:02:26 -05:00
dm-linear.c dm: Check for device sector overflow if CONFIG_LBDAF is not set 2018-12-18 09:02:26 -05:00
dm-log-userspace-base.c dm: convert to bioset_init()/mempool_init() 2018-05-30 15:33:32 -06:00
dm-log-userspace-transfer.c
dm-log-userspace-transfer.h
dm-log-writes.c dax: Introduce a ->copy_to_iter dax operation 2018-05-22 23:18:31 -07:00
dm-log.c
dm-mpath.c dm mpath: only flush workqueue when needed 2018-12-18 09:02:25 -05:00
dm-mpath.h
dm-path-selector.c
dm-path-selector.h
dm-queue-length.c dm mpath selector: more evenly distribute ties 2018-01-29 13:44:58 -05:00
dm-raid1.c dm: Check for device sector overflow if CONFIG_LBDAF is not set 2018-12-18 09:02:26 -05:00
dm-raid.c dm: eliminate 'split_discard_bios' flag from DM target interface 2019-02-20 23:24:55 -05:00
dm-region-hash.c - Error path bug fix for overflow tests (Dan) 2018-06-12 18:28:00 -07:00
dm-round-robin.c
dm-rq.c dm: disable DISCARD if the underlying storage no longer supports it 2019-04-04 15:33:59 -04:00
dm-rq.h dm: remove unused _rq_tio_cache and _rq_cache 2019-03-05 14:48:50 -05:00
dm-service-time.c dm mpath selector: more evenly distribute ties 2018-01-29 13:44:58 -05:00
dm-snap-persistent.c dm bufio: move dm-bufio.h to include/linux/ 2018-04-03 15:04:23 -04:00
dm-snap-transient.c
dm-snap.c dm snapshot: Don't sleep holding the snapshot lock 2019-04-18 16:18:27 -04:00
dm-stats.c mm: convert totalram_pages and totalhigh_pages variables to atomic 2018-12-28 12:11:47 -08:00
dm-stats.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
dm-stripe.c dax: Introduce a ->copy_to_iter dax operation 2018-05-22 23:18:31 -07:00
dm-switch.c dm switch: use struct_size() in kzalloc() 2019-03-05 14:48:51 -05:00
dm-sysfs.c dm: remove legacy request-based IO path 2018-10-11 11:36:09 -04:00
dm-table.c dm table: propagate BDI_CAP_STABLE_WRITES to fix sporadic checksum errors 2019-04-01 16:26:02 -04:00
dm-target.c dm: remove unused macro DM_MOD_NAME_SIZE 2018-04-03 15:04:15 -04:00
dm-thin-metadata.c dm thin: fix passdown_double_checking_shared_status() 2019-01-15 16:10:41 -05:00
dm-thin-metadata.h dm thin: fix passdown_double_checking_shared_status() 2019-01-15 16:10:41 -05:00
dm-thin.c dm thin: add sanity checks to thin-pool and external snapshot creation 2019-03-05 14:53:49 -05:00
dm-uevent.c
dm-uevent.h
dm-unstripe.c dm: Check for device sector overflow if CONFIG_LBDAF is not set 2018-12-18 09:02:26 -05:00
dm-verity-fec.c dm verity fec: remove redundant unlikely annotation 2019-03-05 14:53:48 -05:00
dm-verity-fec.h dm: convert to bioset_init()/mempool_init() 2018-05-30 15:33:32 -06:00
dm-verity-target.c dm verity: log the hash algorithm implementation 2018-12-18 09:02:27 -05:00
dm-verity.h dm verity: add 'check_at_most_once' option to only validate hashes once 2018-04-03 15:04:29 -04:00
dm-writecache.c dm writecache: fix typo in name for writeback_wq 2019-03-05 14:53:51 -05:00
dm-zero.c
dm-zoned-metadata.c dm zoned: Fix zone report handling 2019-04-18 16:17:58 -04:00
dm-zoned-reclaim.c dm kcopyd: return void from dm_kcopyd_copy() 2018-07-31 17:33:21 -04:00
dm-zoned-target.c dm zoned: Silence a static checker warning 2019-04-18 16:16:01 -04:00
dm-zoned.h
dm.c dm: disable DISCARD if the underlying storage no longer supports it 2019-04-04 15:33:59 -04:00
dm.h dm: remove legacy request-based IO path 2018-10-11 11:36:09 -04:00
Kconfig dm: add support to directly boot to a mapped device 2019-03-05 14:53:50 -05:00
Makefile dm: add support to directly boot to a mapped device 2019-03-05 14:53:50 -05:00
md-bitmap.c md/bitmap: use mddev_suspend/resume instead of ->quiesce() 2018-10-10 11:03:34 -07:00
md-bitmap.h md: Avoid namespace collision with bitmap API 2018-08-01 15:49:39 -07:00
md-cluster.c md-cluster: remove suspend_info 2018-10-18 09:41:25 -07:00
md-cluster.h md-cluster: introduce resync_info_get interface for sanity check 2018-10-18 09:36:35 -07:00
md-faulty.c md: convert to bioset_init()/mempool_init() 2018-05-30 15:33:32 -06:00
md-linear.c md-linear: use struct_size() in kzalloc() 2019-02-04 10:37:11 -08:00
md-linear.h Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-11-14 16:07:26 -08:00
md-multipath.c treewide: kzalloc() -> kcalloc() 2018-06-12 16:19:22 -07:00
md-multipath.h md: convert to bioset_init()/mempool_init() 2018-05-30 15:33:32 -06:00
md.c md: Make bio_alloc_mddev use bio_alloc_bioset 2019-01-14 06:31:56 -07:00
md.h md-cluster/raid10: support add disk under grow mode 2018-10-18 09:34:56 -07:00
raid0.c blkcg: remove bio->bi_css and instead use bio->bi_blkg 2018-12-07 22:26:37 -07:00
raid0.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
raid1-10.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
raid1.c for-5.1/block-20190302 2019-03-08 14:12:17 -08:00
raid1.h md: convert to bioset_init()/mempool_init() 2018-05-30 15:33:32 -06:00
raid5-cache.c md/raid5: fix 'out of memory' during raid cache recovery 2019-01-28 11:44:40 -08:00
raid5-log.h raid5: set write hint for PPL 2019-03-12 10:15:18 -07:00
raid5-ppl.c for-5.1/block-post-20190315 2019-03-16 12:36:39 -07:00
raid5.c for-5.1/block-post-20190315 2019-03-16 12:36:39 -07:00
raid5.h md: convert to kvmalloc 2019-03-12 10:04:02 -07:00
raid10.c md: Fix failed allocation of md_register_thread 2019-03-12 10:15:18 -07:00
raid10.h md: convert to bioset_init()/mempool_init() 2018-05-30 15:33:32 -06:00