linux/fs/btrfs
Filipe Manana f13e01b89d btrfs: ensure fast fsync waits for ordered extents after a write failure
If a write path in COW mode fails, either before submitting a bio for the
new extents or an actual IO error happens, we can end up allowing a fast
fsync to log file extent items that point to unwritten extents.

This is because dropping the extent maps happens when completing ordered
extents, at btrfs_finish_one_ordered(), and the completion of an ordered
extent is executed in a work queue.

This can result in a fast fsync to start logging file extent items based
on existing extent maps before the ordered extents complete, therefore
resulting in a log that has file extent items that point to unwritten
extents, resulting in a corrupt file if a crash happens after and the log
tree is replayed the next time the fs is mounted.

This can happen for both direct IO writes and buffered writes.

For example consider a direct IO write, in COW mode, that fails at
btrfs_dio_submit_io() because btrfs_extract_ordered_extent() returned an
error:

1) We call btrfs_finish_ordered_extent() with the 'uptodate' parameter
   set to false, meaning an error happened;

2) That results in marking the ordered extent with the BTRFS_ORDERED_IOERR
   flag;

3) btrfs_finish_ordered_extent() queues the completion of the ordered
   extent - so that btrfs_finish_one_ordered() will be executed later in
   a work queue. That function will drop extent maps in the range when
   it's executed, since the extent maps point to unwritten locations
   (signaled by the BTRFS_ORDERED_IOERR flag);

4) After calling btrfs_finish_ordered_extent() we keep going down the
   write path and unlock the inode;

5) After that a fast fsync starts and locks the inode;

6) Before the work queue executes btrfs_finish_one_ordered(), the fsync
   task sees the extent maps that point to the unwritten locations and
   logs file extent items based on them - it does not know they are
   unwritten, and the fast fsync path does not wait for ordered extents
   to complete, which is an intentional behaviour in order to reduce
   latency.

For the buffered write case, here's one example:

1) A fast fsync begins, and it starts by flushing delalloc and waiting for
   the writeback to complete by calling filemap_fdatawait_range();

2) Flushing the dellaloc created a new extent map X;

3) During the writeback some IO error happened, and at the end io callback
   (end_bbio_data_write()) we call btrfs_finish_ordered_extent(), which
   sets the BTRFS_ORDERED_IOERR flag in the ordered extent and queues its
   completion;

4) After queuing the ordered extent completion, the end io callback clears
   the writeback flag from all pages (or folios), and from that moment the
   fast fsync can proceed;

5) The fast fsync proceeds sees extent map X and logs a file extent item
   based on extent map X, resulting in a log that points to an unwritten
   data extent - because the ordered extent completion hasn't run yet, it
   happens only after the logging.

To fix this make btrfs_finish_ordered_extent() set the inode flag
BTRFS_INODE_NEEDS_FULL_SYNC in case an error happened for a COW write,
so that a fast fsync will wait for ordered extent completion.

Note that this issues of using extent maps that point to unwritten
locations can not happen for reads, because in read paths we start by
locking the extent range and wait for any ordered extents in the range
to complete before looking for extent maps.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-05-28 16:35:12 +02:00
..
tests btrfs: use btrfs_is_testing() everywhere 2024-05-07 21:31:07 +02:00
accessors.c btrfs: remove unused included headers 2024-03-04 16:24:46 +01:00
accessors.h btrfs: move balance args conversion helpers to volumes.c 2024-03-04 16:24:52 +01:00
acl.c btrfs: remove unused included headers 2024-03-04 16:24:46 +01:00
acl.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
async-thread.c btrfs: remove unused included headers 2024-03-04 16:24:46 +01:00
async-thread.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
backref.c btrfs: change root->root_key.objectid to btrfs_root_id() 2024-05-07 21:31:06 +02:00
backref.h btrfs: uninline some static inline helpers from backref.h 2024-03-04 16:24:53 +01:00
bio.c btrfs: introduce offload_csum_mode to tweak checksum offloading behavior 2024-03-04 16:24:52 +01:00
bio.h btrfs: add forward declarations and headers, part 2 2024-03-04 16:24:49 +01:00
block-group.c btrfs: zoned: don't skip block groups with 100% zone unusable 2024-03-26 16:42:39 +01:00
block-group.h btrfs: mark btrfs_put_caching_control() static 2024-03-05 17:13:23 +01:00
block-rsv.c btrfs: change root->root_key.objectid to btrfs_root_id() 2024-05-07 21:31:06 +02:00
block-rsv.h btrfs: add forward declarations and headers, part 2 2024-03-04 16:24:49 +01:00
btrfs_inode.h btrfs: ensure fast fsync waits for ordered extents after a write failure 2024-05-28 16:35:12 +02:00
compression.c btrfs: change root->root_key.objectid to btrfs_root_id() 2024-05-07 21:31:06 +02:00
compression.h btrfs: compression: migrate compression/decompression paths to folios 2024-05-07 21:31:02 +02:00
ctree.c btrfs: change root->root_key.objectid to btrfs_root_id() 2024-05-07 21:31:06 +02:00
ctree.h btrfs: add forward declarations and headers, part 3 2024-03-04 16:24:49 +01:00
defrag.c btrfs: change root->root_key.objectid to btrfs_root_id() 2024-05-07 21:31:06 +02:00
defrag.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
delalloc-space.c btrfs: remove unused included headers 2024-03-04 16:24:46 +01:00
delalloc-space.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
delayed-inode.c btrfs: change root->root_key.objectid to btrfs_root_id() 2024-05-07 21:31:06 +02:00
delayed-inode.h btrfs: uninline btrfs_init_delayed_root() 2024-03-04 16:24:53 +01:00
delayed-ref.c btrfs: replace btrfs_delayed_*_ref with btrfs_*_ref 2024-05-07 21:31:05 +02:00
delayed-ref.h btrfs: replace btrfs_delayed_*_ref with btrfs_*_ref 2024-05-07 21:31:05 +02:00
dev-replace.c for-6.9-tag 2024-03-12 12:28:34 -07:00
dev-replace.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
dir-item.c btrfs: abort transaction on generation mismatch when marking eb as dirty 2023-10-12 16:44:07 +02:00
dir-item.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
discard.c btrfs: unexport btrfs_run_discard_work and make it static 2023-06-19 13:59:25 +02:00
discard.h btrfs: unexport btrfs_run_discard_work and make it static 2023-06-19 13:59:25 +02:00
disk-io.c btrfs: count super block write errors in device instead of tracking folio error state 2024-05-07 21:31:11 +02:00
disk-io.h btrfs: add forward declarations and headers, part 2 2024-03-04 16:24:49 +01:00
export.c btrfs: change root->root_key.objectid to btrfs_root_id() 2024-05-07 21:31:06 +02:00
export.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
extent_io.c btrfs: count super block write errors in device instead of tracking folio error state 2024-05-07 21:31:11 +02:00
extent_io.h btrfs: add a cached state to extent_clear_unlock_delalloc 2024-05-07 21:31:10 +02:00
extent_map.c btrfs: add tracepoints for extent map shrinker events 2024-05-07 21:31:07 +02:00
extent_map.h btrfs: add extra comments on extent_map members 2024-05-07 21:31:08 +02:00
extent-io-tree.c btrfs: rename err to ret in convert_extent_bit() 2024-05-07 21:31:01 +02:00
extent-io-tree.h btrfs: add forward declarations and headers, part 2 2024-03-04 16:24:49 +01:00
extent-tree.c btrfs: fix end of tree detection when searching for data extent ref 2024-05-15 17:57:39 +02:00
extent-tree.h btrfs: add forward declarations and headers, part 3 2024-03-04 16:24:49 +01:00
file-item.c btrfs: simplify the inline extent map creation 2024-05-07 21:31:08 +02:00
file-item.h btrfs: remove search_commit parameter from btrfs_lookup_csums_list() 2024-05-07 21:31:03 +02:00
file.c btrfs: ensure fast fsync waits for ordered extents after a write failure 2024-05-28 16:35:12 +02:00
file.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
free-space-cache.c btrfs: remove SLAB_MEM_SPREAD flag use 2024-03-05 17:13:23 +01:00
free-space-cache.h btrfs: add forward declarations and headers, part 2 2024-03-04 16:24:49 +01:00
free-space-tree.c btrfs: move transaction abort to the error site btrfs_rebuild_free_space_tree() 2024-03-04 16:24:48 +01:00
free-space-tree.h btrfs: add forward declarations and headers, part 2 2024-03-04 16:24:49 +01:00
fs.c btrfs: sysfs: update fs features directory asynchronously 2023-02-13 17:50:35 +01:00
fs.h btrfs: remove duplicate included header from fs.h 2024-05-07 21:31:10 +02:00
inode-item.c btrfs: change root->root_key.objectid to btrfs_root_id() 2024-05-07 21:31:06 +02:00
inode-item.h btrfs: add forward declarations and headers, part 2 2024-03-04 16:24:49 +01:00
inode.c btrfs: add a cached state to extent_clear_unlock_delalloc 2024-05-07 21:31:10 +02:00
ioctl.c btrfs: change root->root_key.objectid to btrfs_root_id() 2024-05-07 21:31:06 +02:00
ioctl.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
Kconfig btrfs: check-integrity: remove CONFIG_BTRFS_FS_CHECK_INTEGRITY option 2023-10-12 16:44:05 +02:00
locking.c btrfs: change root->root_key.objectid to btrfs_root_id() 2024-05-07 21:31:06 +02:00
locking.h btrfs: locking: rename __btrfs_tree_lock() and __btrfs_tree_read_lock() 2024-05-07 21:31:00 +02:00
lru_cache.c btrfs: fix typos found by codespell 2023-12-15 23:00:04 +01:00
lru_cache.h btrfs: open code trivial btrfs_lru_cache_size() 2024-03-04 16:24:53 +01:00
lzo.c btrfs: compression: migrate compression/decompression paths to folios 2024-05-07 21:31:02 +02:00
Makefile btrfs: add support for inserting raid stripe extents 2023-10-12 16:44:09 +02:00
messages.c btrfs: remove colon from messages with state 2024-04-18 01:46:35 +02:00
messages.h btrfs: constify fs_info parameter in __btrfs_panic() 2023-12-15 20:27:02 +01:00
misc.h btrfs: add forward declarations and headers, part 2 2024-03-04 16:24:49 +01:00
ordered-data.c btrfs: ensure fast fsync waits for ordered extents after a write failure 2024-05-28 16:35:12 +02:00
ordered-data.h btrfs: handle errors in btrfs_reloc_clone_csums properly 2024-05-07 21:31:09 +02:00
orphan.c btrfs: remove unused included headers 2024-03-04 16:24:46 +01:00
orphan.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
print-tree.c btrfs: new inline ref storing owning subvol of data extents 2023-10-12 16:44:11 +02:00
print-tree.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
props.c btrfs: change root->root_key.objectid to btrfs_root_id() 2024-05-07 21:31:06 +02:00
props.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
qgroup.c btrfs: qgroup: fix qgroup id collision across mounts 2024-05-15 17:57:09 +02:00
qgroup.h btrfs: qgroup: validate btrfs_qgroup_inherit parameter 2024-03-05 17:13:24 +01:00
raid56.c btrfs: raid56: extra debugging for raid6 syndrome generation 2024-03-04 16:24:52 +01:00
raid56.h btrfs: add forward declarations and headers, part 2 2024-03-04 16:24:49 +01:00
raid-stripe-tree.c btrfs: remove unused included headers 2024-03-04 16:24:46 +01:00
raid-stripe-tree.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
rcu-string.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
ref-verify.c btrfs: rename btrfs_data_ref->ino to ->objectid 2024-05-07 21:31:05 +02:00
ref-verify.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
reflink.c btrfs: change root->root_key.objectid to btrfs_root_id() 2024-05-07 21:31:06 +02:00
reflink.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
relocation.c btrfs: handle errors in btrfs_reloc_clone_csums properly 2024-05-07 21:31:09 +02:00
relocation.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
root-tree.c btrfs: change root->root_key.objectid to btrfs_root_id() 2024-05-07 21:31:06 +02:00
root-tree.h btrfs: qgroup: fix qgroup prealloc rsv leak in subvolume operations 2024-04-02 19:18:23 +02:00
scrub.c btrfs: scrub: initialize ret in scrub_simple_mirror() to fix compilation warning 2024-05-15 17:57:32 +02:00
scrub.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
send.c btrfs: change root->root_key.objectid to btrfs_root_id() 2024-05-07 21:31:06 +02:00
send.h btrfs: add forward declarations and headers, part 2 2024-03-04 16:24:49 +01:00
space-info.c btrfs: remove unused included headers 2024-03-04 16:24:46 +01:00
space-info.h btrfs: add forward declarations and headers, part 2 2024-03-04 16:24:49 +01:00
subpage.c btrfs: subpage: make writer lock utilize bitmap 2024-03-05 17:13:23 +01:00
subpage.h btrfs: subpage: make reader lock utilize bitmap 2024-03-05 17:13:23 +01:00
super.c btrfs: re-introduce 'norecovery' mount option 2024-05-21 15:27:17 +02:00
super.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
sysfs.c btrfs: use btrfs_is_testing() everywhere 2024-05-07 21:31:07 +02:00
sysfs.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
transaction.c btrfs: rename werr and err to ret in __btrfs_wait_marked_extents() 2024-05-07 21:31:08 +02:00
transaction.h btrfs: remove no longer used btrfs_transaction_in_commit() 2024-03-04 16:24:52 +01:00
tree-checker.c btrfs: use btrfs_is_testing() everywhere 2024-05-07 21:31:07 +02:00
tree-checker.h btrfs: make sure that WRITTEN is set on all metadata blocks 2024-05-02 22:11:13 +02:00
tree-log.c btrfs: pass the extent map tree's inode to clear_em_logging() 2024-05-07 21:31:06 +02:00
tree-log.h btrfs: uninline some static inline helpers from tree-log.h 2024-03-04 16:24:53 +01:00
tree-mod-log.c btrfs: change root->root_key.objectid to btrfs_root_id() 2024-05-07 21:31:06 +02:00
tree-mod-log.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
ulist.c btrfs: remove unused included headers 2024-03-04 16:24:46 +01:00
ulist.h btrfs: add forward declarations and headers, part 2 2024-03-04 16:24:49 +01:00
uuid-tree.c btrfs: unify handling of return values of btrfs_insert_empty_items() 2024-03-04 16:24:48 +01:00
uuid-tree.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
verity.c btrfs: remove unused included headers 2024-03-04 16:24:46 +01:00
verity.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
volumes.c btrfs: remove no longer used btrfs_clone_chunk_map() 2024-05-07 21:31:03 +02:00
volumes.h btrfs: count super block write errors in device instead of tracking folio error state 2024-05-07 21:31:11 +02:00
xattr.c btrfs: rename err to ret in btrfs_initxattrs() 2024-05-07 21:31:00 +02:00
xattr.h btrfs: add forward declarations and headers, part 1 2024-03-04 16:24:49 +01:00
zlib.c btrfs: compression: migrate compression/decompression paths to folios 2024-05-07 21:31:02 +02:00
zoned.c btrfs: zoned: fix use-after-free due to race with dev replace 2024-05-15 17:57:25 +02:00
zoned.h btrfs: add forward declarations and headers, part 2 2024-03-04 16:24:49 +01:00
zstd.c btrfs: compression: migrate compression/decompression paths to folios 2024-05-07 21:31:02 +02:00