linux/fs
Qu Wenruo d16c702fe4 btrfs: ctree: check key order before merging tree blocks
[BUG]
With a crafted image, btrfs can panic at btrfs_del_csums():

  kernel BUG at fs/btrfs/ctree.c:3188!
  invalid opcode: 0000 [#1] SMP PTI
  CPU: 0 PID: 1156 Comm: btrfs-transacti Not tainted 5.0.0-rc8+ #9
  RIP: 0010:btrfs_set_item_key_safe+0x16c/0x180
  RSP: 0018:ffff976141257ab8 EFLAGS: 00010202
  RAX: 0000000000000001 RBX: ffff898a6b890930 RCX: 0000000004b70000
  RDX: 0000000000000000 RSI: ffff976141257bae RDI: ffff976141257acf
  RBP: ffff976141257b10 R08: 0000000000001000 R09: ffff9761412579a8
  R10: 0000000000000000 R11: 0000000000000000 R12: ffff976141257abe
  R13: 0000000000000003 R14: ffff898a6a8be578 R15: ffff976141257bae
  FS: 0000000000000000(0000) GS:ffff898a77a00000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00007f779d9cd624 CR3: 000000022b2b4006 CR4: 00000000000206f0
  Call Trace:
  truncate_one_csum+0xac/0xf0
  btrfs_del_csums+0x24f/0x3a0
  __btrfs_free_extent.isra.72+0x5a7/0xbe0
  __btrfs_run_delayed_refs+0x539/0x1120
  btrfs_run_delayed_refs+0xdb/0x1b0
  btrfs_commit_transaction+0x52/0x950
  ? start_transaction+0x94/0x450
  transaction_kthread+0x163/0x190
  kthread+0x105/0x140
  ? btrfs_cleanup_transaction+0x560/0x560
  ? kthread_destroy_worker+0x50/0x50
  ret_from_fork+0x35/0x40
  Modules linked in:
  ---[ end trace 93bf9db00e6c374e ]---

[CAUSE]
This crafted image has a tricky key order corruption:

  checksum tree key (CSUM_TREE ROOT_ITEM 0)
  node 29741056 level 1 items 14 free 107 generation 19 owner CSUM_TREE
          ...
          key (EXTENT_CSUM EXTENT_CSUM 73785344) block 29757440 gen 19
          key (EXTENT_CSUM EXTENT_CSUM 77594624) block 29753344 gen 19
          ...

  leaf 29757440 items 5 free space 150 generation 19 owner CSUM_TREE
          item 0 key (EXTENT_CSUM EXTENT_CSUM 73785344) itemoff 2323 itemsize 1672
                  range start 73785344 end 75497472 length 1712128
          item 1 key (EXTENT_CSUM EXTENT_CSUM 75497472) itemoff 2319 itemsize 4
                  range start 75497472 end 75501568 length 4096
          item 2 key (EXTENT_CSUM EXTENT_CSUM 75501568) itemoff 579 itemsize 1740
                  range start 75501568 end 77283328 length 1781760
          item 3 key (EXTENT_CSUM EXTENT_CSUM 77283328) itemoff 575 itemsize 4
                  range start 77283328 end 77287424 length 4096
          item 4 key (EXTENT_CSUM EXTENT_CSUM 4120596480) itemoff 275 itemsize 300 <<<
                  range start 4120596480 end 4120903680 length 307200
  leaf 29753344 items 3 free space 1936 generation 19 owner CSUM_TREE
          item 0 key (18446744073457893366 EXTENT_CSUM 77594624) itemoff 2323 itemsize 1672
                  range start 77594624 end 79306752 length 1712128
          ...

Note the item 4 key of leaf 29757440, which is obviously too large, and
even larger than the first key of the next leaf.

However it still follows the key order in that tree block, thus tree
checker is unable to detect it at read time, since tree checker can only
work inside one leaf, thus such complex corruption can't be detected in
advance.

[FIX]
The next time to detect such problem is at tree block merge time,
which is in push_node_left(), balance_node_right(), push_leaf_left() or
push_leaf_right().

Now we check if the key order of the right-most key of the left node is
larger than the left-most key of the right node.

By this we don't need to call the full tree-checker, while still keeping
the key order correct as key order in each node is already checked by
tree checker thus we only need to check the above two slots.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=202833
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2020-10-07 12:12:14 +02:00
..
9p treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
adfs treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
affs affs: fix basic permission bits to actually work 2020-08-31 12:20:31 +02:00
afs Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2020-09-03 18:50:48 -07:00
autofs autofs: use __kernel_write() for the autofs pipe writing 2020-09-29 17:18:34 -07:00
befs block: move struct block_device to blk_types.h 2020-06-24 09:16:02 -06:00
bfs
btrfs btrfs: ctree: check key order before merging tree blocks 2020-10-07 12:12:14 +02:00
cachefiles cachefiles: switch to kernel_write 2020-07-08 08:27:56 +02:00
ceph We have an inode number handling change, prompted by s390x which is 2020-08-28 10:33:04 -07:00
cifs cifs: fix DFS mount with cifsacl/modefromsid 2020-09-06 23:59:53 -05:00
coda
configfs treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
cramfs
crypto mm, treewide: rename kzfree() to kfree_sensitive() 2020-08-07 11:33:22 -07:00
debugfs debugfs: Fix module state check condition 2020-09-04 18:12:52 +02:00
devpts
dlm treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
ecryptfs mm, treewide: rename kzfree() to kfree_sensitive() 2020-08-07 11:33:22 -07:00
efivarfs efi/efivars: Expose RT service availability via efivars abstraction 2020-07-09 10:14:29 +03:00
efs block: move struct block_device to blk_types.h 2020-06-24 09:16:02 -06:00
erofs treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
exfat exfat: retain 'VolumeFlags' properly 2020-08-12 08:31:13 +09:00
exportfs
ext2 ext2: don't update mtime on COW faults 2020-09-05 10:00:05 -07:00
ext4 \n 2020-08-28 10:57:14 -07:00
f2fs f2fs: Return EOF on unaligned end of file DIO read 2020-09-08 20:31:33 -07:00
fat fat: fix fat_ra_init() for data clusters == 0 2020-08-12 10:58:01 -07:00
freevxfs
fscache Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-06-03 16:27:18 -07:00
fuse fuse: fix the ->direct_IO() treatment of iov_iter 2020-09-17 17:26:56 -04:00
gfs2 Fix memory leak on filesystem withdraw 2020-08-28 10:41:00 -07:00
hfs block: move block-related definitions out of fs.h 2020-06-24 09:16:02 -06:00
hfsplus treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
hostfs
hpfs hpfs: fix warning due to superfluous semicolon 2020-06-06 10:08:17 -07:00
hugetlbfs hugetlbfs: prevent filesystem stacking of hugetlbfs 2020-08-12 10:57:56 -07:00
iomap treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
isofs Remove uninitialized_var() macro for v5.9-rc1 2020-08-04 13:49:43 -07:00
jbd2 jbd2: clean up checksum verification in do_one_pass() 2020-08-19 12:04:35 -04:00
jffs2 treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
jfs block: move struct block_device to blk_types.h 2020-06-24 09:16:02 -06:00
kernfs fsnotify: pass dir and inode arguments to fsnotify() 2020-07-27 23:15:48 +02:00
lockd
minix fs/minix: remove expected error message in block_to_path() 2020-08-12 10:58:00 -07:00
nfs pNFS/flexfiles: Be consistent about mirror index types 2020-09-18 09:25:33 -04:00
nfs_common treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
nfsd Fixes: 2020-08-25 18:01:36 -07:00
nilfs2 treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
nls treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
notify treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
ntfs ntfs: fix ntfs_test_inode and ntfs_init_locked_inode function type 2020-08-07 11:33:21 -07:00
ocfs2 treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
omfs treewide: Remove uninitialized_var() usage 2020-07-16 12:35:15 -07:00
openpromfs
orangefs orangefs: remove unnecessary assignment to variable ret 2020-08-04 15:01:58 -04:00
overlayfs Remove uninitialized_var() macro for v5.9-rc1 2020-08-04 13:49:43 -07:00
proc mm, oom: make the calculation of oom badness more accurate 2020-08-12 10:57:56 -07:00
pstore treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
qnx4
qnx6 fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
quota treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
ramfs
reiserfs \n 2020-08-06 19:28:26 -07:00
romfs romfs: fix uninitialized memory leak in romfs_dev_read() 2020-08-21 09:52:53 -07:00
squashfs squashfs: avoid bio_alloc() failure with 1Mbyte blocks 2020-08-21 09:52:53 -07:00
sysfs RDMA 5.8 merge window pull request 2020-06-05 14:05:57 -07:00
sysv
tracefs
ubifs treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
udf treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
ufs treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
unicode
vboxsf Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-09-22 15:08:41 -07:00
verity fs-verity: use smp_load_acquire() for ->i_verity_info 2020-07-21 16:02:41 -07:00
xfs Fixes (2) for 5.9: 2020-09-05 10:04:53 -07:00
zonefs zonefs: add zone-capacity support 2020-08-11 17:42:24 +09:00
aio.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
anon_inodes.c
attr.c
bad_inode.c fs: move the fiemap definitions out of fs.h 2020-06-03 23:16:55 -04:00
binfmt_aout.c
binfmt_elf_fdpic.c Merge branch 'work.fdpic' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 13:29:39 -07:00
binfmt_elf.c kill elf_fpxregs_t 2020-07-27 14:29:23 -04:00
binfmt_em86.c Merge branch 'akpm' (patches from Andrew) 2020-06-04 19:18:29 -07:00
binfmt_flat.c binfmt_flat: revert "binfmt_flat: don't offset the data start" 2020-08-24 08:49:13 +10:00
binfmt_misc.c Merge branch 'akpm' (patches from Andrew) 2020-06-04 19:18:29 -07:00
binfmt_script.c Merge branch 'akpm' (patches from Andrew) 2020-06-04 19:18:29 -07:00
block_dev.c for-5.9/io_uring-20200802 2020-08-03 13:01:22 -07:00
buffer.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
char_dev.c
compat_binfmt_elf.c Split the old READ_IMPLIES_EXEC workaround from executable PT_GNU_STACK 2020-06-05 13:45:21 -07:00
compat.c
coredump.c coredump: add %f for executable filename 2020-08-12 10:58:01 -07:00
d_path.c
dax.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
dcache.c vfs: Use sequence counter with associated spinlock 2020-07-29 16:14:27 +02:00
dcookies.c
direct-io.c block: remove the bd_queue field from struct block_device 2020-07-01 08:08:20 -06:00
drop_caches.c
eventfd.c
eventpoll.c ep_create_wakeup_source(): dentry name can change under you... 2020-09-24 19:41:58 -04:00
exec.c mm/gup: remove task_struct pointer for all gup code 2020-08-12 10:58:04 -07:00
fcntl.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
fhandle.c
file_table.c Revert "fs: Do not check if there is a fsnotify watcher on pseudo inodes" 2020-06-29 09:40:55 -07:00
file.c Merge branch 'hch.init_path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 09:40:34 -07:00
filesystems.c
fs_context.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
fs_parser.c
fs_pin.c
fs_struct.c vfs: Use sequence counter with associated spinlock 2020-07-29 16:14:27 +02:00
fs_types.c
fs-writeback.c fs/fs-writeback.c: adjust dirtytime_interval_handler definition to match prototype 2020-09-19 13:13:39 -07:00
fsopen.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
init.c init: add an init_dup helper 2020-08-04 21:02:38 -04:00
inode.c AFS Changes 2020-06-05 16:26:36 -07:00
internal.h Merge branch 'hch.init_path' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 09:40:34 -07:00
io_uring.c io_uring-5.9-2020-10-02 2020-10-02 14:38:10 -07:00
io-wq.c io-wq: fix hang after cancelling pending hashed work 2020-08-23 11:38:50 -06:00
io-wq.h io_uring/io-wq: move RLIMIT_FSIZE to io-wq 2020-07-24 13:00:44 -06:00
ioctl.c fs: remove ksys_ioctl 2020-07-31 08:16:01 +02:00
Kconfig tmpfs: support 64-bit inums per-sb 2020-08-07 11:33:24 -07:00
Kconfig.binfmt treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
libfs.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
locks.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
Makefile init: add an init_mount helper 2020-07-31 08:17:51 +02:00
mbcache.c
mount.h
mpage.c fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
namei.c exec: restore EACCES of S_ISDIR execve() 2020-08-14 19:56:56 -07:00
namespace.c Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-08-07 21:03:25 -07:00
no-block.c
nsfs.c
open.c exec: move S_ISREG() check earlier 2020-08-12 10:58:01 -07:00
pipe.c pipe: remove pipe_wait() and fix wakeup race with splice 2020-10-01 19:14:36 -07:00
pnode.c
pnode.h
posix_acl.c vfs: clean up posix_acl_permission() logic aroudn MAY_NOT_BLOCK 2020-06-08 11:04:19 -07:00
proc_namespace.c Merge branch 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2020-06-04 13:54:34 -07:00
read_write.c autofs: use __kernel_write() for the autofs pipe writing 2020-09-29 17:18:34 -07:00
readdir.c fs: remove ksys_getdents64 2020-07-31 08:16:00 +02:00
select.c pselect6() and friends: take handling the combined 6th/7th args into helper 2020-05-29 19:10:42 -04:00
seq_file.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
signalfd.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
splice.c pipe: remove pipe_wait() and fix wakeup race with splice 2020-10-01 19:14:36 -07:00
stack.c
stat.c New code for 5.8: 2020-06-02 19:45:12 -07:00
statfs.c
super.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-06-10 16:09:11 -07:00
sync.c overlayfs update for 5.8 2020-06-09 15:40:50 -07:00
timerfd.c
userfaultfd.c A set of locking fixes and updates: 2020-08-10 19:07:44 -07:00
utimes.c fs: expose utimes_common 2020-07-31 08:16:01 +02:00
xattr.c xattr: add a function to check if a namespace is supported 2020-07-13 17:27:03 -04:00