linux/fs/btrfs
Boris Burkov e7db9e5c6b btrfs: fix encoded write i_size corruption with no-holes
We have observed a btrfs filesystem corruption on workloads using
no-holes and encoded writes via send stream v2. The symptom is that a
file appears to be truncated to the end of its last aligned extent, even
though the final unaligned extent and even the file extent and otherwise
correctly updated inode item have been written.

So if we were writing out a 1MiB+X file via 8 128K extents and one
extent of length X, i_size would be set to 1MiB, but the ninth extent,
nbyte, etc. would all appear correct otherwise.

The source of the race is a narrow (one line of code) window in which a
no-holes fs has read in an updated i_size, but has not yet set a shared
disk_i_size variable to write. Therefore, if two ordered extents run in
parallel (par for the course for receive workloads), the following
sequence can play out: (following "threads" a bit loosely, since there
are callbacks involved for endio but extra threads aren't needed to
cause the issue)

  ENC-WR1 (second to last)                                         ENC-WR2 (last)
  -------                                                          -------
  btrfs_do_encoded_write
    set i_size = 1M
    submit bio B1 ending at 1M
  endio B1
  btrfs_inode_safe_disk_i_size_write
    local i_size = 1M
    falls off a cliff for some reason
							      btrfs_do_encoded_write
								set i_size = 1M+X
								submit bio B2 ending at 1M+X
							      endio B2
							      btrfs_inode_safe_disk_i_size_write
								local i_size = 1M+X
								disk_i_size = 1M+X
    disk_i_size = 1M
							      btrfs_delayed_update_inode
    btrfs_delayed_update_inode

And the delayed inode ends up filled with nbytes=1M+X and isize=1M, and
writes respect i_size and present a corrupted file missing its last
extents.

Fix this by holding the inode lock in the no-holes case so that a thread
can't sneak in a write to disk_i_size that gets overwritten with an out
of date i_size.

Fixes: 41a2ee75aa ("btrfs: introduce per-inode file extent tree")
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-05-02 14:21:00 +02:00
..
tests btrfs: replace map_lookup->stripe_len by BTRFS_STRIPE_LEN 2023-04-17 18:01:14 +02:00
accessors.c btrfs: add eb to btrfs_node_key_ptr_offset 2022-12-05 18:00:58 +01:00
accessors.h btrfs: add stack helpers for a few btrfs items 2022-12-05 18:00:58 +01:00
acl.c fs: port acl to mnt_idmap 2023-01-19 09:24:28 +01:00
acl.h fs: port ->set_acl() to pass mnt_idmap 2023-01-19 09:24:27 +01:00
async-thread.c btrfs: simplify WQ_HIGHPRI handling in struct btrfs_workqueue 2022-05-16 17:03:15 +02:00
async-thread.h btrfs: remove unused typedefs get_extent_t and btrfs_work_func_t 2022-07-25 17:45:36 +02:00
backref.c btrfs: ignore fiemap path cache when there are multiple paths for a node 2023-03-29 01:16:23 +02:00
backref.h btrfs: send: skip resolution of our own backref when finding clone source 2022-12-05 18:00:50 +01:00
bio.c btrfs: introduce a new helper to submit write bio for repair 2023-04-17 18:01:23 +02:00
bio.h btrfs: introduce a new helper to submit write bio for repair 2023-04-17 18:01:23 +02:00
block-group.c btrfs: scrub: remove the old scrub recheck code 2023-04-17 18:01:24 +02:00
block-group.h btrfs: scrub: remove the old scrub recheck code 2023-04-17 18:01:24 +02:00
block-rsv.c btrfs: calculate the right space for delayed refs when updating global reserve 2023-04-17 18:01:20 +02:00
block-rsv.h btrfs: simplify variables in btrfs_block_rsv_refill() 2023-04-17 18:01:19 +02:00
btrfs_inode.h btrfs: avoid iterating over all indexes when logging directory 2023-04-17 19:52:19 +02:00
check-integrity.c btrfs: use btrfs_dev_name() helper to handle missing devices better 2022-12-05 18:00:57 +01:00
check-integrity.h btrfs: check-integrity: split submit_bio from btrfsic checking 2022-05-16 17:03:12 +02:00
compression.c btrfs: introduce btrfs_bio::fs_info member 2023-04-17 18:01:23 +02:00
compression.h btrfs: move kthread_associate_blkcg out of btrfs_submit_compressed_write 2023-04-17 18:01:22 +02:00
ctree.c btrfs: print extent buffers when sibling keys check fails 2023-04-28 16:36:39 +02:00
ctree.h btrfs: open code btrfs_bin_search() 2023-04-17 18:01:15 +02:00
defrag.c btrfs: remove the wait argument to btrfs_start_ordered_extent 2023-02-13 17:50:34 +01:00
defrag.h btrfs: move defrag related prototypes to their own header 2022-12-05 18:00:46 +01:00
delalloc-space.c btrfs: count extents before taking inode's spinlock when reserving metadata 2023-04-17 18:01:19 +02:00
delalloc-space.h btrfs: move delalloc space related prototypes to delalloc-space.h 2022-12-05 18:00:44 +01:00
delayed-inode.c btrfs: handle btrfs_del_item errors in __btrfs_update_delayed_inode 2023-03-06 19:28:19 +01:00
delayed-inode.h btrfs: extend btrfs_dir_item type to store encryption status 2022-12-05 18:00:43 +01:00
delayed-ref.c btrfs: add helper to calculate space for delayed references 2023-04-17 18:01:19 +02:00
delayed-ref.h btrfs: add helper to calculate space for delayed references 2023-04-17 18:01:19 +02:00
dev-replace.c btrfs: use btrfs_dev_name() helper to handle missing devices better 2022-12-05 18:00:57 +01:00
dev-replace.h btrfs: move dev-replace prototypes into dev-replace.h 2022-12-05 18:00:47 +01:00
dir-item.c btrfs: move dir-item prototypes into dir-item.h 2022-12-05 18:00:46 +01:00
dir-item.h btrfs: move dir-item prototypes into dir-item.h 2022-12-05 18:00:46 +01:00
discard.c btrfs: reinterpret async discard iops_limit=0 as no delay 2023-04-17 19:52:19 +02:00
discard.h btrfs: cleanup btrfs_discard_update_discardable usage 2020-12-08 15:54:02 +01:00
disk-io.c btrfs: use test_and_clear_bit() in wait_dev_flush() 2023-04-17 18:01:20 +02:00
disk-io.h btrfs: rename btrfs_clean_tree_block to btrfs_clear_buffer_dirty 2023-02-15 19:38:53 +01:00
export.c btrfs: move super_block specific helpers into super.h 2022-12-05 18:00:47 +01:00
export.h btrfs: simplify generation check in btrfs_get_dentry 2022-12-05 18:00:41 +01:00
extent_io.c btrfs: introduce btrfs_bio::fs_info member 2023-04-17 18:01:23 +02:00
extent_io.h btrfs: combine btrfs_clear_buffer_dirty and clear_extent_buffer_dirty 2023-02-15 19:38:54 +01:00
extent_map.c btrfs: fix extent map logging bit not cleared for split maps after dropping range 2023-03-06 19:28:19 +01:00
extent_map.h btrfs: remove no longer used btrfs_next_extent_map() 2022-12-05 18:00:56 +01:00
extent-io-tree.c btrfs: fix spelling mistakes found using codespell 2023-02-15 19:38:50 +01:00
extent-io-tree.h btrfs: remove the io_failure_record infrastructure 2023-02-15 19:38:51 +01:00
extent-tree.c btrfs: remove obsolete delayed ref throttling logic when truncating items 2023-04-17 18:01:19 +02:00
extent-tree.h btrfs: introduce size class to block group allocator 2023-02-13 17:50:34 +01:00
file-item.c btrfs: fix encoded write i_size corruption with no-holes 2023-05-02 14:21:00 +02:00
file-item.h btrfs: scrub: introduce helper to find and fill sector info for a scrub_stripe 2023-04-17 18:01:23 +02:00
file.c btrfs: remove the wait argument to btrfs_start_ordered_extent 2023-02-13 17:50:34 +01:00
file.h btrfs: use cached state when looking for delalloc ranges with fiemap 2022-12-05 18:00:56 +01:00
free-space-cache.c btrfs: zoned: count fresh BG region as zone unusable 2023-03-15 20:51:07 +01:00
free-space-cache.h btrfs: convert discard stat defs to enum 2022-12-05 18:00:45 +01:00
free-space-tree.c btrfs: rename btrfs_clean_tree_block to btrfs_clear_buffer_dirty 2023-02-15 19:38:53 +01:00
free-space-tree.h
fs.c btrfs: sysfs: update fs features directory asynchronously 2023-02-13 17:50:35 +01:00
fs.h btrfs: scrub: remove scrub_parity structure 2023-04-17 18:01:24 +02:00
inode-item.c btrfs: remove obsolete delayed ref throttling logic when truncating items 2023-04-17 18:01:19 +02:00
inode-item.h btrfs: use struct fscrypt_str instead of struct qstr 2022-12-05 18:00:43 +01:00
inode.c btrfs: introduce btrfs_bio::fs_info member 2023-04-17 18:01:23 +02:00
ioctl.c btrfs: fix assertion of exclop condition when starting balance 2023-04-28 16:36:27 +02:00
ioctl.h fs: port ->fileattr_set() to pass mnt_idmap 2023-01-19 09:24:27 +01:00
Kconfig block: make blkcg_punt_bio_submit optional 2023-04-17 18:01:22 +02:00
locking.c btrfs: locking: use atomic for DREW lock writers 2023-04-17 18:01:17 +02:00
locking.h btrfs: locking: use atomic for DREW lock writers 2023-04-17 18:01:17 +02:00
lru_cache.c btrfs: send: cache utimes operations for directories if possible 2023-02-15 19:38:50 +01:00
lru_cache.h btrfs: remove btrfs_lru_cache_is_full() inline function 2023-04-17 18:01:18 +02:00
lzo.c btrfs: move zero filling of compressed read bios into common code 2023-04-17 18:01:17 +02:00
Makefile btrfs: send: genericize the backref cache to allow it to be reused 2023-02-13 17:50:35 +01:00
messages.c btrfs: mark btrfs_assertfail() __noreturn 2023-04-17 19:52:19 +02:00
messages.h btrfs: mark btrfs_assertfail() __noreturn 2023-04-17 19:52:19 +02:00
misc.h btrfs: simplify percent calculation helpers, rename div_factor 2022-12-05 18:00:48 +01:00
ordered-data.c btrfs: fold btrfs_clone_ordered_extent into btrfs_split_ordered_extent 2023-04-17 18:01:21 +02:00
ordered-data.h btrfs: sink parameter len to btrfs_split_ordered_extent 2023-04-17 18:01:21 +02:00
orphan.c btrfs: move orphan prototypes into orphan.h 2022-12-05 18:00:47 +01:00
orphan.h btrfs: move orphan prototypes into orphan.h 2022-12-05 18:00:47 +01:00
print-tree.c btrfs: move struct btrfs_tree_parent_check out of disk-io.h 2022-12-05 18:00:57 +01:00
print-tree.h btrfs: print the actual offset in btrfs_root_name 2021-01-07 17:25:05 +01:00
props.c btrfs: move super_block specific helpers into super.h 2022-12-05 18:00:47 +01:00
props.h btrfs: make module init/exit match their sequence 2022-12-05 18:00:40 +01:00
qgroup.c btrfs: fix race between quota disable and quota assign ioctls 2023-03-28 00:46:53 +02:00
qgroup.h btrfs: sink gfp_t parameter to btrfs_qgroup_trace_extent 2022-12-05 18:00:43 +01:00
raid56.c btrfs: remove unused raid56 functions which were dedicated for scrub 2023-04-17 19:52:18 +02:00
raid56.h btrfs: remove unused raid56 functions which were dedicated for scrub 2023-04-17 19:52:18 +02:00
rcu-string.h btrfs: replace strncpy() with strscpy() 2022-12-05 18:00:59 +01:00
ref-verify.c btrfs: move accessor helpers into accessors.h 2022-12-05 18:00:42 +01:00
ref-verify.h
reflink.c btrfs: pass btrfs_inode to btrfs_inode_unlock 2022-12-05 18:00:53 +01:00
reflink.h
relocation.c btrfs: open code btrfs_bin_search() 2023-04-17 18:01:15 +02:00
relocation.h btrfs: move relocation prototypes into relocation.h 2022-12-05 18:00:47 +01:00
root-tree.c btrfs: move orphan prototypes into orphan.h 2022-12-05 18:00:47 +01:00
root-tree.h btrfs: move root tree prototypes to their own header 2022-12-05 18:00:44 +01:00
scrub.c btrfs: dev-replace: error out if we have unrepaired metadata error during 2023-04-17 19:52:19 +02:00
scrub.h btrfs: scrub: remove scrub_bio structure 2023-04-17 18:01:24 +02:00
send.c btrfs: fix uninitialized variable warnings 2023-04-17 19:52:19 +02:00
send.h btrfs: send add define for v2 buffer size 2022-12-05 18:00:41 +01:00
space-info.c btrfs: add helper to calculate space for delayed references 2023-04-17 18:01:19 +02:00
space-info.h btrfs: update documentation for BTRFS_RESERVE_FLUSH_EVICT flush method 2023-04-17 18:01:18 +02:00
subpage.c btrfs: move the printk helpers out of ctree.h 2022-12-05 18:00:41 +01:00
subpage.h btrfs: make nodesize >= PAGE_SIZE case to reuse the non-subpage routine 2022-05-16 17:03:11 +02:00
super.c btrfs: properly reject clear_cache and v1 cache for block-group-tree 2023-04-28 16:36:45 +02:00
super.h btrfs: move super_block specific helpers into super.h 2022-12-05 18:00:47 +01:00
sysfs.c btrfs: sysfs: relax bg_reclaim_threshold for debugging purposes 2023-04-17 18:01:18 +02:00
sysfs.h btrfs: sysfs: update fs features directory asynchronously 2023-02-13 17:50:35 +01:00
transaction.c btrfs: correctly calculate delayed ref bytes when starting transaction 2023-04-17 18:01:22 +02:00
transaction.h btrfs: move btrfs_abort_transaction to transaction.c 2023-02-13 17:50:33 +01:00
tree-checker.c btrfs: reduce div64 calls by limiting the number of stripes of a chunk to u32 2023-04-17 18:01:14 +02:00
tree-checker.h btrfs: move struct btrfs_tree_parent_check out of disk-io.h 2022-12-05 18:00:57 +01:00
tree-log.c btrfs: use log root when iterating over index keys when logging directory 2023-04-17 19:52:19 +02:00
tree-log.h btrfs: use a negative value for BTRFS_LOG_FORCE_COMMIT 2023-02-13 17:50:34 +01:00
tree-mod-log.c btrfs: add eb to btrfs_node_key_ptr_offset 2022-12-05 18:00:58 +01:00
tree-mod-log.h btrfs: fix SPDX comment in tree-mod-log.h 2022-12-05 18:00:48 +01:00
ulist.c btrfs: constify ulist parameter of ulist_next() 2022-12-05 18:00:50 +01:00
ulist.h btrfs: constify ulist parameter of ulist_next() 2022-12-05 18:00:50 +01:00
uuid-tree.c btrfs: move uuid tree prototypes to uuid-tree.h 2022-12-05 18:00:46 +01:00
uuid-tree.h btrfs: move uuid tree prototypes to uuid-tree.h 2022-12-05 18:00:46 +01:00
verity.c fsverity: pass pos and size to ->write_merkle_tree_block 2023-01-01 15:46:48 -08:00
verity.h btrfs: move verity prototypes into verity.h 2022-12-05 18:00:47 +01:00
volumes.c btrfs: fix leak of source device allocation state after device replace 2023-04-28 16:36:31 +02:00
volumes.h btrfs: introduce a new helper to submit write bio for repair 2023-04-17 18:01:23 +02:00
xattr.c fs: port xattr to mnt_idmap 2023-01-19 09:24:28 +01:00
xattr.h
zlib.c btrfs: move zero filling of compressed read bios into common code 2023-04-17 18:01:17 +02:00
zoned.c btrfs: zoned: fix wrong use of bitops API in btrfs_ensure_empty_zones 2023-04-28 17:17:25 +02:00
zoned.h btrfs: pass a btrfs_bio to btrfs_use_append 2023-02-15 19:38:55 +01:00
zstd.c btrfs: move zero filling of compressed read bios into common code 2023-04-17 18:01:17 +02:00