"fi usage" shows the warning "RAID5/6 numbers will be incorrect" when
running without root privilege even if raid5/6 is not used. What
happens is it cannot get the per device profile usage info, so change
the message more appropriately.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
[ adjust message ]
Signed-off-by: David Sterba <dsterba@suse.com>
First this function does the '!!' trick to turn its value into a bool,
yet the return type is int. Second, the kernel counterpart also returns
bool.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
All real consumers of that variable have inlined the checks since they
are simple enough. So just remove it.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
It's no longer used so just remove it.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
It's no longer used in that function so can be dropped altogether.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
It's no longer used so just remove it.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
It's no longer use and can be dropped.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
It's no longer used in the function so just remove it
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We can derive this argument from root->fs_info and also make it local
to the sole switch case it's used.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This parameter is no longer used in this function, so drop it
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is a boolean parameter which signals whether the fs has the
EXTENDED_IREF feature flag toggled or not. Since a reference to fs_info
can be obtained there is no need to pollute the interface.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Verify that if we have an otherwise clean filesystem, containging
collided DIR_ITEM, btrfs check lowmem's mode can correctly handle those
and not produce any false positives.
This if fixed by commit titled:
"btrfs-progs: check: fix DIR_ITEM checking in lowmem"
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When checking the validity of a DIR_ITEM item the index variable is
explicitly set to -1 so that the index check in find_inode_ref() is
ignored. This is necessary due to possible name collisions in DIR_ITEMS
(i.e. multiple files with the same crc32c for their names, resulting in
the identical key->offset). Currently the code is broken due to index
being set to -1 at the beginning of check_dir_item and subsequently
reset to the index found in find_inode_ref. On subsequent iterations of
the while loop in check_dir_items we are going to erroneously perform
index checking, since index is not -1 but whatever the index value of
the last handled INODE_REF ITEM.
Without this patch check in lowmem mode produces the following false
warnings positivess:
ERROR: root 5 INODE REF[1991456, 256] name 5ab4e28b~~~~~~~~QBXUT2GBJDRT5CB6ZWAJVDHK filetype 1 missing
But the items for this name are in fact correct:
------------------SNIP------------------------
item 13 key (256 DIR_ITEM 4227063046) itemoff 2945 itemsize 140
location key (220890 INODE_ITEM 0) type FILE
transid 2278 data_len 0 name_len 40
name: 5ab4e1bb~~~~~~~~65KNTAWVJ5VD4SV2R3BKFCXL
location key (1991456 INODE_ITEM 0) type FILE
transid 2285 data_len 0 name_len 40
name: 5ab4e28b~~~~~~~~QBXUT2GBJDRT5CB6ZWAJVDHK
item 21 key (1991456 INODE_REF 256) itemoff 2104 itemsize 50
index 1934146 namelen 40 name: 5ab4e28b~~~~~~~~QBXUT2GBJDRT5CB6ZWAJVDHK
-------------------SNIP-------------------------
Fix this by moving the code setting index at the beginning of the while
loop. This ensure that for each item in DIR_ITEM we have index set
correctly.
Fixes: 564901eac7 ("btrfs-progs: check: introduce print_dir_item_err()")
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since the test case uses run_mustfail(), which is pretty easy pass due to
other unexpected problems, so here an extra run_check() is added to
ensure we don't only report qgroup error, but also fix it without
problem.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Current btrfs-check will check qgroup consistency, but even when it
finds something wrong, the return value is still 0.
Fix it by allowing report_qgroups() to return int to indicate qgroup
mismatch, and also add extra logic to return no error if qgroup repair
is successful.
Without this patch, fstests can't detect qgroup corruption by its fsck
alone.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Originally commit 2681e00f00 ("btrfs-progs: check for matchingi
free space in cache") added the account_super_bytes function to prevent
false negative when running btrfs check. Turns out this function is
really copied exclude_super_stripes, excluding the calls to
exclude_super_stripes. Later commit e4797df6a9 ("btrfs-progs: check
the free space tree in btrfsck") introduced proper version of
exclude_super_stripes. Instead of duplicating the function, just remove
account_super_bytes and use exclude_super_stripes instead of the former.
This also has the benefit of bringing the userspace code a bit closer
to the kernel counterpart.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since commit aaf2dac5ef ("btrfs-progs: qgroup: split update_qgroup to
reduce arguments") cause qgroup show to output the wrong qgroup
parent-child relationship, in addition to fixing the problem, a test case
is needed to prevent the similar problem in the future.
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The QGROUP_RELATION item is very special, it always exists in pairs
(with objectid and offset exchanged). Its objectid and offset are the
ids of a pair of parent and child qgroups, respectively. The larger one
is parent and the smaller one is child. After the following commit, the
order of the parameters is wrong and causes qgroup show to output the
wrong qgroup parent-child relationship.
Fixes: aaf2dac5ef ("btrfs-progs: qgroup: split update_qgroup to reduce arguments")
Issue: #129
Signed-off-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Normally corrupted leaf should be caught by csum check, but sometimes
corrupted item pointers (out of leaf range) can still pass csum check.
In fact, our fsck/005 test case image has such corrupted item pointer
and btrfs check can surprisingly fix it.
Anyway, make print-tree to skip such item and remaining slots to avoid
segfault.
Please note that, in btrfs-progs we can't put such check into
check_tree_block() nor do kernel level comprehensive check as under
certain case, btrfs-progs can handle or even repair it.
A strict check_tree_block() or backporting kernel btrfs_check_leaf()
could break such test cases and reduce the utility of btrfs-progs.
Issue: #128
Reported-by: Hubert Kario <hubert@kario.pl>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This function is always called with objectid set to 0. So remove the
parameter and statically set the ->objectid to 0 when allocating a new
extent.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 17793e3e6a ("Btrfs-progs: extend the extent cache for the
device extent") extended the cache extent APIs to support objectid to
distinguish between phyisical extents with same dimensions but on
different devices. However, it seems that this particular function is
not used to allocate a device extent with accompanying objectid.
Instead such extents as processed solely by insert_device_extent_record.
Remove the unused code as a result.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This parameter was introduced with the original implementation of the
function but has never really been used, so just remove it.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This new image misses one extent which leads lowmem mode to allocate new
chunks in repair.
Rename original image to no_extent_bad_dev.img.
Because of its bad used bytes, it should let lowmem mode
exclude blocks in repair.
Due to problems of btrfs-image, choose xz as compression tool.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Remove @trans in check_chunks_and_extents_lowmem().
After this patch, lowmem repair works again.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce repair_block_accounting() which calls
btrfs_fix_block_accounting() to repair block group accouting.
Replace btrfs_fix_block_accounting() with the new function.
Note: This patch and next patches cause error in lowmem repair like:
"Error: Commit_root already set when starting transaction".
Such error will disappear after removing @trans finished.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Remove parameters @trans of delete_extent_item() and walk_down_tree_v2().
Note: This patch and next patches cause error in lowmem repair like:
"Error: Commit_root already set when starting transaction".
Such error will disappear after removing @trans finished.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch removes parameter @trans of repair_tree_back_ref().
It calls try_avoid_extents_overwrite() and starts a transaction by
itself.
Note: This patch and next patches cause error in lowmem repair like:
"Error: Commit_root already set when starting transaction".
Such error will disappear after removing @trans finished.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch removes parameter @trans of check_leaf_items().
Note: This patch and next patches cause error in lowmem repair like:
"Error: Commit_root already set when starting transaction".
Such error will disappear after removing @trans finished.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch removes parameter @trans of repair_extent_item().
It calls try_avoid_extents_overwrite() and starts a transaction by
itself.
Note: This patch and next patches cause error in lowmem repair like:
"Error: Commit_root already set when starting transaction".
Such error will disappear after removing @trans finished.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch removes parameter @trans of repair_chunk_item().
It calls try_avoid_extents_overwrite() and starts a transaction by
itself.
Note: This patch and next patches cause error in lowmem repair like:
"Error: Commit_root already set when starting transaction".
Such error will disappear after removing @trans finished.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This patch removes the parameter @trans of delete_extent_item().
It calls try_avoid_extents_overwrite() and starts a transaction
by itself.
Note: This patch and next patches cause error in lowmem repair like:
"Error: Commit_root already set when starting transaction".
Such error will disappear after removing @trans finished.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since extents can be avoided to be overwrittten by excluding or chunk
allocation. It's not necessary to do all repairs in one transaction.
This patch removes parameter @trans of repair_extent_data_item().
repair_extent_data_item() calls try_avoid_extents_overwrite()
and starts a transaction by itself.
Note: This patch and next patches cause error in lowmem repair like:
"Error: Commit_root already set when starting transaction".
Such error will disappear after removing @trans finished.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
If options '--init-extent-tree' and '--mode=lowmem' are both
input, all metadata blocks will be traversed twice.
First one is done by pin_metadata_blocks() in reinit_extent_tree().
Second one is in check_chunks_and_extents_v2().
Excluding instead of pinning metadata blocks before reinitializing th
extent tree in lowmem can save some time.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Declare a global u64 variable last_allocated_chunk records start of
the last chunk allocated by lowmem repair.
Although global variable is not nice, it simplifies the code a lot.
avoid_extents_overwrite() prefers to allocate a new chunk first.
If it failed because of no space or wrong used bytes (like in
fsck-tests/004), then it tries to exclude metadata blocks which costs
a lot of time in large filesystems.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce create_chunk_and_block_block_group() to allocate a new chunk
and corresponding block group.
The new function force_cow_in_new_chunk() firstly allocates new chunk
and records its start.
Then it modifies all metadata block groups cached and full.
Finally it marks the new block group uncached and unfree.
In the next COW, extents states will be updated automatically by
cache_block_group().
New function try_to_force_cow_in_new_chunk() will try to mark block
groups full, allocate a new chunk and records the start.
If the last allocated chunk is almost full, a new chunk will be
allocated.
Suggested-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Excluding or pining all metadata blocks is time-inefficient for large
filesystems. Here is another way to mark all metadata block groups full
and allocate new chunks for COW. Then new reserved extents never
overwrite extents.
Introduce modify_block_groups_cache() to modify all blocks groups
cache state and set all extents in block groups unfree in free space
cache.
mark/clear_block_groups_full() are wrappers of above function.
Suggested-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit d17d6663c99c ("btrfs-progs: lowmem check: Fix regression which
screws up extent allocator") removes pin_metadata_blocks() from lowmem
repair. So we have to find another way to exclude extents which should
be occupied by existing tree blocks.
Modify pin_down_tree_blocks() and rename it to traverse_tree_blocks
for sharing code with new function exclude_metadata_blocks().
* exclude_metadata_blocks() traverses and marks extents of all tree
blocks dirty in fs_info->excluded_extents.
* cleanup_excluded_extents() is responsible for cleanup.
Export them to mode-common.h since they will be used both in original
and lowmem modes.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Move pin_down_tree_blocks from main.c to mode-common.c for
further patches.
And export pin_metadata_blocks to mode-common.h.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
No matter --check-data-csum is passed or not, btrfs check will always
output message like "checking csum".
This message could be a little confusing, change it according to
--check-data-csum option.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since data csum mismatch is not a fatal error compared to fs/extent
trees, continue check.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When --check-data-csum option found csum mismatch, btrfs check still
return 0.
Fix it so log-replay could automatically pause when it finds csum error.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Original --check-data-csum option will skip the extra copy if the first
copy matches csum.
Since offline scrub (with recoverability report) is still out-of-tree, at
least enhance --check-data-csum option to handle multiple copies.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The variable @eb is assigned to leaf in fs_tree before insertion of
backref. It will cause wrong parent of new inserted backref.
Set @parent at beginning to fix the problem.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In lowmem mode with '--repair', check_chunks_and_extents_v2 will fix
accounting in block groups and clear the error bit BG_ACCOUNTING_ERROR.
However, return value of check_btrfs_root() doesn't contain error bits.
If extent tree is on error, lowmem repair always prints error and
returns nonzero value even the filesystem is fine after repair.
Introduce FATAL_ERROR for lowmem mode to represent negative return
values since negative and positive can't be mixed in the bit operations.
Then let check_btrfs_root() return error bits.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In repair_extent_data_item(), path is not released if some errors occurs
which causes extent buffer leak.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Function btrfs_delete_one_dir_name() will check if the dir_item is the
last content of the item, and delete the whole item if needed.
However if @name_len of one dir_item/dir_index is corrupted and larger
than the item size, the function will still try to treat it as partly
remove, which will screw up the whole leaf.
This patch will enhance the item deletion check, to cover corrupted name
len, so in that case we just delete the whole item.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
verify_dir_item() is called in btrfs_match_dir_item_name() to ensure we
won't search beyond item boundary and does extra filetype check.
However in the following call chain, such extra filetype check can cause
problems:
1) btrfs_add_link()
|- check_dir_conflict()
|- btrfs_lookup_dir_index()
|- btrfs_match_dir_item_name()
And if we have an offending dir index whose filetype is invalid,
btrfs_match_dir_item_name() will return NULL, meaning no match dir
index is found.
So btrfs_add_link() will still try to insert a dir index, which may
have same key->offset and leading to duplicated dir index.
2) btrfs_unlink()
|- btrfs_lookup_dir_index()
|- btrfs_lookup_dir_index()
|- btrfs_match_dir_item_name()
For the same offending dir index with invalid filetype, this will
return NULL, and btrfs_unlink() will just consider there is no
existing dir_index and do nothing.
Leave an orphan and invalid dir_index hanging there forever.
The patch removes the extra filetype check, as "btrfs check" can already
handle invalid filetype correctly for both modes.
And this makes "btrfs check --repair --mode=lowmem" to delete the
offending dir index to repair it correctly.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For repair_ternary_lowmem() used in lowmem mode, if it found 1 of
DIR_INDEX/DIR_ITEM/INODE_REF missing, it will try to insert correct
link.
However for case like invalid type in DIR_INDEX, we should delete the
corrupted DIR_INDEX first before inserting the correct link.
This patch will remove the corrupted link before re-inserting.
This should solve the duplicated DIR_INDEX problem in old lowmem mode
repair.
Reported-by: Sebastian Andrzej Siewior <sebastian@breakpoint.cc>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>