The function write_and_map_eb() is quite abused as a way to write any
generic buffer back to disk.
But we have a more suitable function already, write_data_to_disk().
This patch would remove the abused write_data_to_disk() calls, and
convert the only three valid call sites to write_data_to_disk() instead.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The in-kernel version of read_tree_block adds some extra sanity checks
to make sure we don't return blocks that don't match what we expect.
This includes the owning root, the level, and the expected first key.
We don't actually do these checks in btrfs-progs, however kernel code
we're going to sync will expect this calling convention, so update it to
match the in-kernel code and then update all the callers.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch syncs file-item.h into btrfs-progs. This carries with it an
API change for btrfs_del_csums, which takes a root argument in the
kernel, so all callsites have been updated accordingly.
I didn't sync file-item.c because it carries with it a bunch of bio
related helpers which are difficult to adapt to the kernel.
Additionally there's a few helpers in the local copy of file-item.c that
aren't in the kernel that are required for different tools.
This requires more cleanups in both the kernel and progs in order to
sync file-item.c, so for now just do file-item.h in order to pull things
out of ctree.h.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
The following command will crash:
$ btrfs-corrupt-block --value 4308598784 --root 5 --inode 256 --file-extent 0 \
-f disk_bytenr ~/test.img
[CAUSE]
The backtrace is at the following code:
case 'r':
root_objectid = arg_strtou64(optarg);
break;
And @optarg is NULL.
The root cause is, for short option "-r" it indeed requires an argument.
But unfortunately for the longer version, it goes:
{ "root", no_argument, NULL, 'r'},
Thus it gave @optarg as NULL if we go the longer option and crash.
[FIX]
Just fix the argument requirement for "--root" option.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We're using btrfs_item_nr_offset(leaf, 0) to get the start of the leaf
data in the kernel, we don't have btrfs_leaf_data. Replace all
occurrences of btrfs_leaf_data() with btrfs_item_nr_offset(leaf, 0) in
order to make syncing accessors.[ch] easier. ctree.c will be synced
later, so this is simply an intermediate step.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The (unsigned long long) type casts can be dropped, printf understands
%llu and u64 and does not warn. In cases where the type is not u64 keep
the cast.
Signed-off-by: David Sterba <dsterba@suse.com>
The radix-tree is not used in userspace code. In kernel it's for
tracking unpersisted and in-memory structures and has been replaced by
the xarray.
Signed-off-by: David Sterba <dsterba@suse.com>
The preferred order:
- system headers
- standard headers
- libraries
- kernel library
- kernel shared
- common headers
- other tools
- own headers
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
If using btrfs-corrupt-block to corrupt the generation of a tree
block (in my example, it's csum root), it will cause csum mismatch other
than the expected transid mismatch:
# ./btrfs-corrupt-block --metadata-block 30474240 -f generation \
/dev/test/scratch1
# btrfs check /dev/test/scratch1
Opening filesystem to check...
checksum verify failed on 30474240 wanted 0xb3e8059a found 0xb4a4b45c
checksum verify failed on 30474240 wanted 0xb3e8059a found 0xb4a4b45c
checksum verify failed on 30474240 wanted 0xb3e8059a found 0xb4a4b45c
Csum didn't match
ERROR: could not setup csum tree
ERROR: cannot open file system
[CAUSE]
Inside the switch branch BTRFS_METADATA_BLOCK_GENERATION in
corrupt_metadata_block(), we just set the generation and trigger
write_and_map_eb().
However write_and_map_eb() doesn't re-generate the checksum by itself,
thus we make the victim tree block to have a stale checksum.
[FIX]
Just call csum_tree_block_size() before write_and_map_eb().
Now the corrupted fs have the expected corruption pattern now:
# btrfs check /dev/test/scratch1
Opening filesystem to check...
parent transid verify failed on 30474240 wanted 7 found 11814770867473404344
parent transid verify failed on 30474240 wanted 7 found 11814770867473404344
parent transid verify failed on 30474240 wanted 7 found 11814770867473404344
Ignoring transid failure
...
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The help text is out of sync with many options, lacking the long
options, required arguments or mistakenly requiring arguments when the
value is read from another one.
Signed-off-by: David Sterba <dsterba@suse.com>
To corrupt holes/prealloc/inline extents, we need to mess with
extent data items. This patch makes it possible to modify
disk_bytenr with a specific value (useful for hole corruptions)
and to modify the type field (useful for prealloc corruptions)
Reviewed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
btrfs-corrupt-block already has a mix of generic and specific corruption
options, but currently lacks the capacity for totally arbitrary
corruption in item data.
There is already a flag for corruption size (bytes/-b), so add a flag
for an offset and a value to memset the item with. Exercise the new
flags with a new variant for -I (item) corruption. Look up the item as
before, but instead of corrupting a field in the item struct, corrupt an
offset/size in the item data.
The motivating example for this is that in testing fsverity with btrfs,
we need to corrupt the generated Merkle tree--metadata item data which
is an opaque blob to btrfs.
Reviewed-by: Sweet Tea Dorminy <sweettea-kernel@dorminy.me>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
Add constant for initial value to avoid unexpected clashes with user
defined getopt values and shift the common size getopt values.
Signed-off-by: David Sterba <dsterba@suse.com>
Those two members are a shortcut for non-RAID56 profiles.
But we should not use such shortcut, and move all our logical address
read/write to the unified read_data_from_disk()/write_data_to_disk().
With previous refactors, now we're safe to remove them.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The function read_extent_from_disk() is only a wrapper to read tree
block.
And read_extent_data() is just a while loop to eliminate short read
caused by stripe boundary.
In fact, a lot of call sites of read_extent_data() are either reading
metadata (thus no possible short read) or doing extra loop by
themselves.
This patch will replace those two functions with read_data_from_disk(),
making it the only entrance for data/metadata read.
And update read_data_from_disk() to return the read bytes, so caller can
do a simple while loop.
For the few callers of read_extent_data(), open-code a small while loop
for them.
This will allow later RAID56 read repair using P/Q much easier.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There are two call sites using write_extent_to_disk() directly:
- debug_corrupt_block() in btrfs-corrupt-block.c
- corrupt_keys() in btrfs-corrupt-block.c
The problem of write_extent_to_disk() is, it can only handle plain
profiles (All profiles except P/Q stripes of RAID56).
Calling it directly can corrupted RAID56 P/Q, and in the future we're
going to remove eb::fd/eb::dev_bytes, so remove such call sites with
write_and_map_eb().
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Now that all callers are using the _nr variations we can simply rename
these helpers to btrfs_item_##member/btrfs_set_item_##member and change
the actual item SETGET funcs to raw_item_##member/set_raw_item_##member
and then change all callers to drop the _nr part.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We have a lot of the following patterns
item = btrfs_item_nr(nr);
btrfs_set_item_*(eb, item, val);
btrfs_set_item_*(eb, btrfs_item_nr(nr), val);
in a lot of places in our code. Instead add _nr variations of these
helpers and convert all of the users to this new helper.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When we switch to multiple global trees we'll need to access the
appropriate extent root depending on the block group or possibly root.
To handle this, use a helper in most places and then the actual root in
places where it is required. We will whittle down the direct accessors
with future patches, but this does the bulk of the preparatory work.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
With extent tree v2 we will have per-block group checksums, so add a
helper to access the csum root and rename the fs_info csum_root to
_csum_root to catch all the places that are accessing it directly.
Convert everybody to use the helper except for internal things.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Just like kernel commit 22b6331d9617 ("btrfs: store precalculated
csum_size in fs_info"), we can cache csum_size and csum_type in
btrfs_fs_info.
Furthermore, there is already a 32 bits hole in btrfs_fs_info, and we
can fit csum_type and csum_size into the hole without increase the size
of btrfs_fs_info.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Long options are always preferred and in case there's a long list of
single letter options it's the best practice to keep the options sane.
Signed-off-by: David Sterba <dsterba@suse.com>
While doing the extent tree v2 stuff I noticed that fsck doesn't detect
an invalid ->used value on the block group item in the normal mode. To
build a test case for this I need the ability to corrupt block group
items. This allows us to corrupt the various fields of a block group.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
All callers of write_and_map_eb(), except btrfs-corrupt-block, have
handled error, but inside write_and_map_eb() itself, the only error
handling is BUG_ON().
This patch will kill all the BUG_ON()s inside write_and_map_eb(), and
enhance the the caller in btrfs-corrupt-block() to handle the error.
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
As progs' transaction/CoW logic evolved over the years the metadata block
corruption code failed to do so. It's currently impossible to corrupt
the generation because the CoW logic will not only set it to the value
of the currently running transaction (__btrfs_cow_block) but the
current code will ASSERT due to the following check in __btrfs_cow_block:
WARN_ON(!(buf->flags & EXTENT_BAD_TRANSID) &&
btrfs_header_generation(buf) > trans->transid);
Fix this by making the generation corruption code directly write
the modified block, outside of the transaction mechanism. At the same
time move the old code into BTRFS_METADATA_BLOCK_SHIFT_ITEMS handling
case, essentially leaving it unchanged.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add the checksum type to csum_tree_block_size(), __csum_tree_block_size()
and verify_tree_block_csum_silent().
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Since commit 04be0e4b19 ("btrfs-progs: corrupt-block: Correctly
handle -r when passing -I") the 'r' switch is used with both -I and -d
options. So remove the wrong clarificatoin that -r is used only with -d
option.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The error message about the unsatisfied argument count is scrolled away
by the full usage string dump. This is not considered a good usability
practice.
This commit switches all direct usage -> return patterns, where the
argument check has no other constraint, eg. dependency on an option.
Signed-off-by: David Sterba <dsterba@suse.com>
Similar to the changes where strerror(errno) was converted, continue
with the remaining cases where the argument was stored in another
variable.
The savings in object size are about 4500 bytes:
$ size btrfs.old btrfs.new
text data bss dec hex filename
805055 24248 19748 849051 cf49b btrfs.old
804527 24248 19748 848523 cf28b btrfs.new
Signed-off-by: David Sterba <dsterba@suse.com>
It's not needed, since we can obtain a reference to fs_info from the
passed transaction handle. This is needed by delayed refs code.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently the -D option is essentially defunct since it's the root,
where we are going to corrupt a dir item is always set to the tree
root. Fix this by passing the root from the "-r" option. Aditionally
convert the interface for this option to the new format. So if one
wants to corrupt a dir item in the default fs tree, they should now
invoke:
btrfs-corrupt-block -r 5 -D <objectid,DIR_ITEM,offset> -f name
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently if we want to delete an item we need to invoke corrupt block
like so:
btrfs-corrupt-block [-r <numeric id of root>] -K <objectid,type,offset> -d
Instead, this patch converts the format to:
btrfs-corrupt-block [-r <numeric id of root>] -d <objectid,type,offset>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently the -K option supports corrupting items only in the default
root (which is the root tree). This makes it impossible to test the
free-space recovery (or any other) code for that matter. Fix it by
using the root corresponding to the one passed in -r (if any).
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>