2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-21 03:33:59 +08:00
linux-next/fs/btrfs
Josef Bacik d98da49977 btrfs: save i_size to avoid double evaluation of i_size_read in compress_file_range
We hit a regression while rolling out 5.2 internally where we were
hitting the following panic

  kernel BUG at mm/page-writeback.c:2659!
  RIP: 0010:clear_page_dirty_for_io+0xe6/0x1f0
  Call Trace:
   __process_pages_contig+0x25a/0x350
   ? extent_clear_unlock_delalloc+0x43/0x70
   submit_compressed_extents+0x359/0x4d0
   normal_work_helper+0x15a/0x330
   process_one_work+0x1f5/0x3f0
   worker_thread+0x2d/0x3d0
   ? rescuer_thread+0x340/0x340
   kthread+0x111/0x130
   ? kthread_create_on_node+0x60/0x60
   ret_from_fork+0x1f/0x30

This is happening because the page is not locked when doing
clear_page_dirty_for_io.  Looking at the core dump it was because our
async_extent had a ram_size of 24576 but our async_chunk range only
spanned 20480, so we had a whole extra page in our ram_size for our
async_extent.

This happened because we try not to compress pages outside of our
i_size, however a cleanup patch changed us to do

actual_end = min_t(u64, i_size_read(inode), end + 1);

which is problematic because i_size_read() can evaluate to different
values in between checking and assigning.  So either an expanding
truncate or a fallocate could increase our i_size while we're doing
writeout and actual_end would end up being past the range we have
locked.

I confirmed this was what was happening by installing a debug kernel
that had

  actual_end = min_t(u64, i_size_read(inode), end + 1);
  if (actual_end > end + 1) {
	  printk(KERN_ERR "KABOOM\n");
	  actual_end = end + 1;
  }

and installing it onto 500 boxes of the tier that had been seeing the
problem regularly.  Last night I got my debug message and no panic,
confirming what I expected.

[ dsterba: the assembly confirms a tiny race window:

    mov    0x20(%rsp),%rax
    cmp    %rax,0x48(%r15)           # read
    movl   $0x0,0x18(%rsp)
    mov    %rax,%r12
    mov    %r14,%rax
    cmovbe 0x48(%r15),%r12           # eval

  Where r15 is inode and 0x48 is offset of i_size.

  The original fix was to revert 62b3762271 that would do an
  intermediate assignment and this would also avoid the doulble
  evaluation but is not future-proof, should the compiler merge the
  stores and call i_size_read anyway.

  There's a patch adding READ_ONCE to i_size_read but that's not being
  applied at the moment and we need to fix the bug. Instead, emulate
  READ_ONCE by two barrier()s that's what effectively happens. The
  assembly confirms single evaluation:

    mov    0x48(%rbp),%rax          # read once
    mov    0x20(%rsp),%rcx
    mov    $0x20,%edx
    cmp    %rax,%rcx
    cmovbe %rcx,%rax
    mov    %rax,(%rsp)
    mov    %rax,%rcx
    mov    %r14,%rax

  Where 0x48(%rbp) is inode->i_size stored to %eax.
]

Fixes: 62b3762271 ("btrfs: Remove isize local variable in compress_file_range")
CC: stable@vger.kernel.org # v5.1+
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ changelog updated ]
Signed-off-by: David Sterba <dsterba@suse.com>
2019-11-04 21:41:49 +01:00
..
tests Btrfs: fix selftests failure due to uninitialized i_mode in test inodes 2019-09-24 14:45:02 +02:00
acl.c btrfs: cleanup btrfs_setxattr_trans and drop transaction parameter 2019-04-29 19:02:44 +02:00
async-thread.c btrfs: async-thread: convert defines to enums 2019-09-09 14:59:03 +02:00
async-thread.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
backref.c Btrfs: fix deadlock between fiemap and transaction commits 2019-07-30 18:25:12 +02:00
backref.h btrfs: fiemap: preallocate ulists for btrfs_check_shared 2019-07-01 13:34:53 +02:00
block-group.c btrfs: block-group: Fix a memory leak due to missing btrfs_put_block_group() 2019-10-11 21:27:51 +02:00
block-group.h btrfs: move struct io_ctl to free-space-cache.h 2019-09-09 14:59:15 +02:00
block-rsv.c btrfs: use btrfs_try_granting_tickets in update_global_rsv 2019-09-09 14:59:19 +02:00
block-rsv.h btrfs: migrate the global_block_rsv helpers to block-rsv.c 2019-07-02 12:30:55 +02:00
btrfs_inode.h btrfs: remove assumption about csum type form btrfs_print_data_csum_error() 2019-07-01 13:35:02 +02:00
check-integrity.c btrfs: reduce stack usage for btrfsic_process_written_block 2019-09-09 14:58:58 +02:00
check-integrity.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
compression.c btrfs: move cond_wake_up functions out of ctree 2019-09-09 14:59:15 +02:00
compression.h btrfs: compression: replace set_level callbacks by a common helper 2019-09-09 14:59:11 +02:00
ctree.c btrfs: Don't assign retval of btrfs_try_tree_write_lock/btrfs_tree_read_lock_atomic 2019-09-09 14:59:20 +02:00
ctree.h btrfs: qgroup: Always free PREALLOC META reserve in btrfs_delalloc_release_extents() 2019-10-15 18:50:07 +02:00
delalloc-space.c Btrfs: fix qgroup double free after failure to reserve metadata for delalloc 2019-10-17 20:13:44 +02:00
delalloc-space.h btrfs: migrate the delalloc space stuff to it's own home 2019-07-04 17:26:17 +02:00
delayed-inode.c btrfs: move cond_wake_up functions out of ctree 2019-09-09 14:59:15 +02:00
delayed-inode.h Btrfs: delayed-inode: use rb_first_cached for ins_root and del_root 2018-10-15 17:23:33 +02:00
delayed-ref.c btrfs: rename btrfs_space_info_add_old_bytes 2019-09-09 14:59:18 +02:00
delayed-ref.h btrfs: migrate the delayed refs rsv code 2019-07-04 17:26:17 +02:00
dev-replace.c btrfs: move cond_wake_up functions out of ctree 2019-09-09 14:59:15 +02:00
dev-replace.h btrfs: get fs_info from trans in btrfs_run_dev_replace 2019-04-29 19:02:43 +02:00
dir-item.c btrfs: remove unused parameter fs_info from btrfs_extend_item 2019-04-29 19:02:50 +02:00
disk-io.c btrfs: don't needlessly create extent-refs kernel thread 2019-10-15 15:43:29 +02:00
disk-io.h btrfs: Make reada_tree_block_flagged private 2019-09-09 14:59:11 +02:00
export.c btrfs: Remove 'objectid' member from struct btrfs_root 2018-10-15 17:23:25 +02:00
export.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
extent_io.c Btrfs: fix missing error return if writeback for extent buffer never started 2019-09-24 14:45:23 +02:00
extent_io.h btrfs: Remove delalloc_end argument from extent_clear_unlock_delalloc 2019-09-09 14:58:59 +02:00
extent_map.c btrfs: assert extent map tree lock in add_extent_mapping 2019-09-09 14:59:00 +02:00
extent_map.h btrfs: Remove impossible condition from mergable_maps 2019-02-25 14:13:21 +01:00
extent-tree.c btrfs: refactor the ticket wakeup code 2019-09-09 14:59:18 +02:00
file-item.c btrfs: directly call into crypto framework for checksumming 2019-07-01 13:35:02 +02:00
file.c Btrfs: check for the full sync flag while holding the inode lock during fsync 2019-10-17 20:36:02 +02:00
free-space-cache.c btrfs: stop clearing EXTENT_DIRTY in inode I/O tree 2019-09-09 14:59:17 +02:00
free-space-cache.h btrfs: move struct io_ctl to free-space-cache.h 2019-09-09 14:59:15 +02:00
free-space-tree.c btrfs: move basic block_group definitions to their own header 2019-09-09 14:59:03 +02:00
free-space-tree.h btrfs: move basic block_group definitions to their own header 2019-09-09 14:59:03 +02:00
inode-item.c btrfs: Make btrfs_find_name_in_ext_backref return struct btrfs_inode_extref 2019-09-09 14:59:16 +02:00
inode-map.c btrfs: qgroup: Always free PREALLOC META reserve in btrfs_delalloc_release_extents() 2019-10-15 18:50:07 +02:00
inode-map.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
inode.c btrfs: save i_size to avoid double evaluation of i_size_read in compress_file_range 2019-11-04 21:41:49 +01:00
ioctl.c btrfs: qgroup: Always free PREALLOC META reserve in btrfs_delalloc_release_extents() 2019-10-15 18:50:07 +02:00
Kconfig btrfs: Fix build error while LIBCRC32C is module 2019-07-17 17:03:30 +02:00
locking.c btrfs: move cond_wake_up functions out of ctree 2019-09-09 14:59:15 +02:00
locking.h btrfs: Remove unused locking functions 2019-09-09 14:58:59 +02:00
lzo.c btrfs: compression: replace set_level callbacks by a common helper 2019-09-09 14:59:11 +02:00
Makefile btrfs: migrate the block group lookup code 2019-09-09 14:59:04 +02:00
misc.h btrfs: move math functions to misc.h 2019-09-09 14:59:15 +02:00
ordered-data.c btrfs: move cond_wake_up functions out of ctree 2019-09-09 14:59:15 +02:00
ordered-data.h btrfs: don't assume ordered sums to be 4 bytes 2019-07-01 13:35:00 +02:00
orphan.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
print-tree.c btrfs: switch extent_buffer write_locks from atomic to int 2019-07-02 12:30:47 +02:00
print-tree.h btrfs: print-tree: debugging output enhancement 2018-04-20 19:18:16 +02:00
props.c btrfs: rename the btrfs_calc_*_metadata_size helpers 2019-09-09 14:59:13 +02:00
props.h btrfs: delete unused function btrfs_set_prop_trans 2019-04-29 19:02:54 +02:00
qgroup.c btrfs: tracepoints: Fix wrong parameter order for qgroup events 2019-10-17 14:09:31 +02:00
qgroup.h btrfs: qgroup: Move reserved data accounting from btrfs_delayed_ref_head to btrfs_qgroup_extent_record 2019-02-25 14:13:39 +01:00
raid56.c btrfs: move private raid56 definitions from ctree.h 2019-09-09 14:59:15 +02:00
raid56.h btrfs: constify map parameter for nr_parity_stripes and nr_data_stripes 2019-07-01 13:34:58 +02:00
rcu-string.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
reada.c btrfs: Make reada_tree_block_flagged private 2019-09-09 14:59:11 +02:00
ref-verify.c btrfs: fix uninitialized ret in ref-verify 2019-10-03 15:00:56 +02:00
ref-verify.h btrfs: ref-verify: Use btrfs_ref to refactor btrfs_ref_tree_mod() 2019-04-29 19:02:49 +02:00
relocation.c btrfs: qgroup: Always free PREALLOC META reserve in btrfs_delalloc_release_extents() 2019-10-15 18:50:07 +02:00
root-tree.c btrfs: rename the btrfs_calc_*_metadata_size helpers 2019-09-09 14:59:13 +02:00
scrub.c btrfs: move basic block_group definitions to their own header 2019-09-09 14:59:03 +02:00
send.c btrfs: silence maybe-uninitialized warning in clone_range 2019-10-08 13:14:55 +02:00
send.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
space-info.c Btrfs: fix race leading to metadata space leak after task received signal 2019-10-25 19:11:34 +02:00
space-info.h btrfs: rename btrfs_space_info_add_old_bytes 2019-09-09 14:59:18 +02:00
struct-funcs.c btrfs: tie extent buffer and it's token together 2019-09-09 14:59:16 +02:00
super.c btrfs: move sysfs declarations out of ctree.h 2019-09-09 14:59:06 +02:00
sysfs.c btrfs: sysfs: move helper macros to sysfs.c 2019-09-09 14:59:08 +02:00
sysfs.h btrfs: sysfs: move helper macros to sysfs.c 2019-09-09 14:59:08 +02:00
transaction.c btrfs: move cond_wake_up functions out of ctree 2019-09-09 14:59:15 +02:00
transaction.h Btrfs: fix deadlock between fiemap and transaction commits 2019-07-30 18:25:12 +02:00
tree-checker.c btrfs: tree-checker: Fix wrong check on max devid 2019-10-25 19:11:34 +02:00
tree-checker.h btrfs: get fs_info from eb in btrfs_check_chunk_valid 2019-04-29 19:02:39 +02:00
tree-defrag.c btrfs: open code now trivial btrfs_set_lock_blocking 2019-02-25 14:13:27 +01:00
tree-log.c btrfs: fix incorrect updating of log root tree 2019-10-01 18:41:02 +02:00
tree-log.h btrfs: get fs_info from trans in btrfs_set_log_full_commit 2019-04-29 19:02:41 +02:00
ulist.c btrfs: replace GPL boilerplate by SPDX -- sources 2018-04-12 16:29:51 +02:00
ulist.h btrfs: replace GPL boilerplate by SPDX -- headers 2018-04-12 16:29:46 +02:00
uuid-tree.c btrfs: remove unused parameter fs_info from btrfs_extend_item 2019-04-29 19:02:50 +02:00
volumes.c btrfs: Consider system chunk array size for new SYSTEM chunks 2019-10-25 19:11:34 +02:00
volumes.h btrfs: reset device stat using btrfs_dev_stat_set 2019-09-09 14:59:06 +02:00
xattr.c Btrfs: fix failure to persist compression property xattr deletion on fsync 2019-06-17 16:37:17 +02:00
xattr.h btrfs: cleanup btrfs_setxattr_trans and drop transaction parameter 2019-04-29 19:02:44 +02:00
zlib.c btrfs: compression: replace set_level callbacks by a common helper 2019-09-09 14:59:11 +02:00
zstd.c btrfs: move cond_wake_up functions out of ctree 2019-09-09 14:59:15 +02:00