2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2025-01-09 14:14:00 +08:00
linux-next/fs/btrfs
Filipe Manana 008c6753f7 Btrfs: fix missing data checksums after a ranged fsync (msync)
Recently we got a massive simplification for fsync, where for the fast
path we no longer log new extents while their respective ordered extents
are still running.

However that simplification introduced a subtle regression for the case
where we use a ranged fsync (msync). Consider the following example:

               CPU 0                                    CPU 1

                                            mmap write to range [2Mb, 4Mb[
  mmap write to range [512Kb, 1Mb[
  msync range [512K, 1Mb[
    --> triggers fast fsync
        (BTRFS_INODE_NEEDS_FULL_SYNC
         not set)
    --> creates extent map A for this
        range and adds it to list of
        modified extents
    --> starts ordered extent A for
        this range
    --> waits for it to complete

                                            writeback triggered for range
                                            [2Mb, 4Mb[
                                              --> create extent map B and
                                                  adds it to the list of
                                                  modified extents
                                              --> creates ordered extent B

    --> start looking for and logging
        modified extents
    --> logs extent maps A and B
    --> finds checksums for extent A
        in the csum tree, but not for
        extent B
  fsync (msync) finishes

                                              --> ordered extent B
                                                  finishes and its
                                                  checksums are added
                                                  to the csum tree

                                <power cut>

After replaying the log, we have the extent covering the range [2Mb, 4Mb[
but do not have the data checksum items covering that file range.

This happens because at the very beginning of an fsync (btrfs_sync_file())
we start and wait for IO in the given range [512Kb, 1Mb[ and therefore
wait for any ordered extents in that range to complete before we start
logging the extents. However if right before we start logging the extent
in our range [512Kb, 1Mb[, writeback is started for any other dirty range,
such as the range [2Mb, 4Mb[ due to memory pressure or a concurrent fsync
or msync (btrfs_sync_file() starts writeback before acquiring the inode's
lock), an ordered extent is created for that other range and a new extent
map is created to represent that range and added to the inode's list of
modified extents.

That means that we will see that other extent in that list when collecting
extents for logging (done at btrfs_log_changed_extents()) and log the
extent before the respective ordered extent finishes - namely before the
checksum items are added to the checksums tree, which is where
log_extent_csums() looks for the checksums, therefore making us log an
extent without logging its checksums. Before that massive simplification
of fsync, this wasn't a problem because besides looking for checkums in
the checksums tree, we also looked for them in any ordered extent still
running.

The consequence of data checksums missing for a file range is that users
attempting to read the affected file range will get -EIO errors and dmesg
reports the following:

 [10188.358136] BTRFS info (device sdc): no csum found for inode 297 start 57344
 [10188.359278] BTRFS warning (device sdc): csum failed root 5 ino 297 off 57344 csum 0x98f94189 expected csum 0x00000000 mirror 1

So fix this by skipping extents outside of our logging range at
btrfs_log_changed_extents() and leaving them on the list of modified
extents so that any subsequent ranged fsync may collect them if needed.
Also, if we find a hole extent outside of the range still log it, just
to prevent having gaps between extent items after replaying the log,
otherwise fsck will complain when we are not using the NO_HOLES feature
(fstest btrfs/056 triggers such case).

Fixes: e7175a6927 ("btrfs: remove the wait ordered logic in the log_one_extent path")
CC: stable@vger.kernel.org # 4.19+
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2018-11-06 16:41:40 +01:00
..
tests btrfs: tests: add separate stub for find_lock_delalloc_range 2018-10-15 17:23:34 +02:00
acl.c btrfs: remove unnecessary curly braces in btrfs_get_acl 2018-08-06 13:12:41 +02:00
async-thread.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
async-thread.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
backref.c Btrfs: preftree: use rb_first_cached 2018-10-15 17:23:33 +02:00
backref.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
btrfs_inode.h btrfs: Remove 'objectid' member from struct btrfs_root 2018-10-15 17:23:25 +02:00
check-integrity.c Btrfs: use args in the correct order for kcalloc in btrfsic_read_block 2018-10-15 17:23:30 +02:00
check-integrity.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
compression.c btrfs: remove unused pointer 'tree' in btrfs_submit_compressed_read 2018-10-15 17:23:28 +02:00
compression.h btrfs: compression: Add linux/sizes.h for compression.h 2018-05-29 18:13:00 +02:00
ctree.c Btrfs: fix deadlock when writing out free space caches 2018-10-17 17:46:24 +02:00
ctree.h btrfs: remove fs_info from btrfs_should_throttle_delayed_refs 2018-10-15 17:23:41 +02:00
dedupe.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
delayed-inode.c Btrfs: kill btrfs_clear_path_blocking 2018-10-15 17:23:38 +02:00
delayed-inode.h Btrfs: delayed-inode: use rb_first_cached for ins_root and del_root 2018-10-15 17:23:33 +02:00
delayed-ref.c btrfs: delayed-ref: extract find_first_ref_head from find_ref_head 2018-10-17 19:21:00 +02:00
delayed-ref.h btrfs: delayed-ref: pass delayed_refs directly to btrfs_delayed_ref_lock 2018-10-15 17:23:41 +02:00
dev-replace.c btrfs: dev-replace: remove pointless assert in write unlock 2018-10-15 17:23:38 +02:00
dev-replace.h btrfs: open code btrfs_after_dev_replace_commit 2018-10-15 17:23:37 +02:00
dir-item.c btrfs: Remove root parameter from btrfs_insert_dir_item 2018-10-15 17:23:25 +02:00
disk-io.c btrfs: fix pinned underflow after transaction aborted 2018-11-06 16:41:34 +01:00
disk-io.h btrfs: unify end_io callbacks of async_submit_bio 2018-08-06 13:12:55 +02:00
export.c btrfs: Remove 'objectid' member from struct btrfs_root 2018-10-15 17:23:25 +02:00
export.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
extent_io.c btrfs: tests: add separate stub for find_lock_delalloc_range 2018-10-15 17:23:34 +02:00
extent_io.h btrfs: tests: add separate stub for find_lock_delalloc_range 2018-10-15 17:23:34 +02:00
extent_map.c Btrfs: extent_map: use rb_first_cached 2018-10-15 17:23:33 +02:00
extent_map.h Btrfs: extent_map: use rb_first_cached 2018-10-15 17:23:33 +02:00
extent-tree.c btrfs: fix insert_reserved error handling 2018-10-19 12:20:03 +02:00
file-item.c btrfs: simplify pointer chasing of local fs_info variables 2018-08-06 13:12:43 +02:00
file.c btrfs: move the dio_sem higher up the callchain 2018-10-19 12:20:04 +02:00
free-space-cache.c Btrfs: fix use-after-free when dumping free space 2018-10-22 20:31:22 +02:00
free-space-cache.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
free-space-tree.c btrfs: Remove fs_info from btrfs_del_root 2018-08-06 13:13:00 +02:00
free-space-tree.h btrfs: Remove fs_info argument from add_to_free_space_tree 2018-05-28 18:07:36 +02:00
inode-item.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
inode-map.c btrfs: prune unused includes 2018-08-06 13:12:43 +02:00
inode-map.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
inode.c Btrfs: fix cur_offset in the error case for nocow 2018-11-06 16:41:04 +01:00
ioctl.c btrfs: Ensure btrfs_trim_fs can trim the whole filesystem 2018-10-15 17:23:32 +02:00
Kconfig btrfs: add SPDX header to Kconfig 2018-04-12 16:29:55 +02:00
locking.c btrfs: replace waitqueue_actvie with cond_wake_up 2018-05-28 18:23:09 +02:00
locking.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
lzo.c btrfs: lzo: Harden inline lzo compressed extent decompression 2018-05-30 16:46:43 +02:00
Makefile btrfs: Remove custom crc32c init code 2018-03-26 15:09:39 +02:00
math.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
ordered-data.c btrfs: prune unused includes 2018-08-06 13:12:43 +02:00
ordered-data.h btrfs: remove remaing full_sync logic from btrfs_sync_file 2018-08-06 13:12:31 +02:00
orphan.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
print-tree.c btrfs: annotate unlikely branches after V0 extent type removal 2018-08-06 13:12:41 +02:00
print-tree.h btrfs: print-tree: debugging output enhancement 2018-04-20 19:18:16 +02:00
props.c btrfs: property: Set incompat flag if lzo/zstd compression is set 2018-05-17 14:18:25 +02:00
props.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
qgroup.c btrfs: qgroup: move the qgroup->members check out from (!qgroup)'s else branch 2018-10-15 17:23:40 +02:00
qgroup.h btrfs: qgroup: Avoid calling qgroup functions if qgroup is not enabled 2018-10-15 17:23:40 +02:00
raid56.c btrfs: raid56: catch errors from full_stripe_write 2018-08-06 13:12:45 +02:00
raid56.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
rcu-string.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
reada.c btrfs: prune unused includes 2018-08-06 13:12:43 +02:00
ref-verify.c btrfs: Remove 'objectid' member from struct btrfs_root 2018-10-15 17:23:25 +02:00
ref-verify.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
relocation.c btrfs: relocation: Remove redundant tree level check 2018-10-15 17:23:40 +02:00
root-tree.c btrfs: Remove fs_info from btrfs_add_root_ref 2018-08-06 13:13:00 +02:00
scrub.c btrfs: open code btrfs_dev_replace_stats_inc 2018-10-15 17:23:37 +02:00
send.c Btrfs: unify error handling of btrfs_lookup_dir_item 2018-10-15 17:23:30 +02:00
send.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
struct-funcs.c btrfs: prune unused includes 2018-08-06 13:12:43 +02:00
super.c btrfs: Remove 'objectid' member from struct btrfs_root 2018-10-15 17:23:25 +02:00
sysfs.c btrfs: prune unused includes 2018-08-06 13:12:43 +02:00
sysfs.h btrfs: sysfs: Use enum/define value for feature array definitions 2018-05-28 18:23:39 +02:00
transaction.c btrfs: don't run delayed_iputs in commit 2018-10-19 12:20:03 +02:00
transaction.h btrfs: replace get_seconds with new 64bit time API 2018-08-06 13:12:29 +02:00
tree-checker.c btrfs: tree-checker: Check level for leaves and nodes 2018-10-15 17:23:37 +02:00
tree-checker.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
tree-defrag.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
tree-log.c Btrfs: fix missing data checksums after a ranged fsync (msync) 2018-11-06 16:41:40 +01:00
tree-log.h btrfs: change btrfs_pin_log_trans to return void 2018-10-15 17:23:27 +02:00
ulist.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
ulist.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
uuid-tree.c btrfs: Remove fs_info argument from btrfs_uuid_tree_rem 2018-05-30 16:46:53 +02:00
volumes.c btrfs: open code btrfs_dev_replace_clear_lock_blocking 2018-10-15 17:23:37 +02:00
volumes.h btrfs: Make btrfs_find_device_by_devspec return btrfs_device directly 2018-10-15 17:23:30 +02:00
xattr.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
xattr.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
zlib.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
zstd.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00