To prepare for reuniting devices separated by an incomplete fsid
change, consolidate and track the per-device changing_fsid flag
in struct btrfs_fs_devices.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
If btrfstune is executed on a filesystem that contains a missing device,
the command will now fail.
It is ok to fail regardless of which btrfstune option is used, as all
filesystem devices should be available.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Maintain the btrfs_fs_devices::missing counter to track the number of
missing devices, similar to what the kernel does.
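For illustration only, a minimal sketch of keeping such a counter in sync
with a per-device flag. The sketch_ types and the helper are hypothetical
stand-ins, not the actual btrfs-progs structures or code:

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-ins for the real structures. */
struct sketch_device {
	bool missing;
};

struct sketch_fs_devices {
	unsigned int num_devices;
	unsigned int missing;	/* number of devices currently missing */
};

/* Update the per-filesystem counter only when the device state changes. */
static void sketch_set_missing(struct sketch_fs_devices *fs_devices,
			       struct sketch_device *dev, bool missing)
{
	if (dev->missing == missing)
		return;
	dev->missing = missing;
	if (missing)
		fs_devices->missing++;
	else
		fs_devices->missing--;
}
```

Routing every state change through one helper keeps the counter from
drifting when the same device is reported missing twice.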
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Previous commit "btrfs-progs: dump-super: print actual metadata_uuid
value" changed the value of the super_block::metadata_uuid to be printed
as it is, without tweaking it depending on the METADATA_UUID flag.
Apply a similar tweak in the common helper functions used to read the
metadata_uuid so that the test cases still succeed.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The function btrfs_print_superblock() prints all members of the
superblock as they are, except for the superblock::metadata_uuid.
If the METADATA_UUID flag is unset, it prints the fsid instead of
zero as in the superblock::metadata_uuid.
Perhaps this was done to match the kernel's
btrfs_fs_devices::metadata_uuid value, which is also set to fsid if the
METADATA_UUID flag is unset.
However, the actual superblock::metadata_uuid is always zero if the
METADATA_UUID flag is unset. Note that the kernel never alters the
superblock::metadata_uuid value.
Printing fsid instead of zero in dump-super is confusing because we
generally expect dump-super to print the superblock values in raw
format, without modification.
Fix this by printing the actual metadata_uuid value instead of fsid.
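The two views of the value can be sketched as follows. This is a
simplified illustration: struct sketch_super and both helpers are
hypothetical, and a bool stands in for the METADATA_UUID flag:

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_UUID_SIZE 16

/* Hypothetical subset of the on-disk super block. */
struct sketch_super {
	uint8_t fsid[SKETCH_UUID_SIZE];
	uint8_t metadata_uuid[SKETCH_UUID_SIZE];
	bool has_metadata_uuid;	/* stands in for the METADATA_UUID flag */
};

/* Raw view: the field exactly as stored, zero when the flag is unset. */
static const uint8_t *sketch_raw_metadata_uuid(const struct sketch_super *sb)
{
	return sb->metadata_uuid;
}

/* Effective view: fall back to fsid when the flag is unset. */
static const uint8_t *sketch_effective_metadata_uuid(const struct sketch_super *sb)
{
	return sb->has_metadata_uuid ? sb->metadata_uuid : sb->fsid;
}
```

A dump tool would print the raw view, while code that needs the UUID
used by metadata blocks would use the effective one.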
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Test case misc/046-seed-multi-mount would always fail with the following
error:
    [TEST]   misc-tests.sh
    [TEST/misc]   046-seed-multi-mount
    unexpected success: writable file despite read-only mount
    test failed for case 046-seed-multi-mount
[CAUSE]
Although mounting a seed device is indeed read-only, sprouting it with a
new device always makes it read-write.
This behavior has been there for a long time, thus expecting a new
behavior (not changing the read-only flag) is a little weird.
[FIX]
Instead of doing the write check after the sprout, do it before the
sprout.
This looks more correct, and does not rely on a kernel behavior
change (if we decide to go down that path).
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
When I was testing misc/058, the fs still had around 7GiB of free space,
but during that test case the btrfs kernel module reported write failures
and even git commands failed inside that fs.
And obviously the test case failed.
[CAUSE]
It turns out that the test case itself requires 6GiB (4 data
disks) + 1.5GiB x 2 (the two replace targets), thus 9GiB of
free space.
My partition is not that large, so the test failed.
[FIX]
In fact, we really don't need that much space at all.
The test verifies that two consecutive replace operations can be started
and enqueued. The sleep of 1 second is not strictly necessary as the
first command should start the replace right away.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[PROBLEM]
Since we have migrated to default v2 cache, the test case
misc/030-missing-device-image is no longer executed:
    [TEST/misc]   030-missing-device-image
    [NOTRUN] unable to create v1 space cache
[CAUSE]
The test case itself is trying its best to cover all paths, including
the data extent read path.
Thus the test case is requiring v1 cache, as that's the only way to
cover the data read path.
[FIX]
Just remove the v1 space cache requirement. It's still better to run the
test even if it only exercises the metadata read path.
The good news is, after commit 3ff9d35257 ("btrfs-progs: use
read_data_from_disk() to replace read_extent_from_disk() and replace
read_extent_data()"), all data/metadata read paths are unified.
The only difference is the verification part.
Thus even if we don't fully exercise the data read path, we don't lose
much coverage anyway.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
During my test runs of mkfs-tests, 005-long-device-name-for-ssd failed
with the following error messages:
====== RUN CHECK dmsetup remove btrfs-test-with-very-long-name-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQPGc
device-mapper: remove ioctl on btrfs-test-with-very-long-name-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQPGc failed: Device or resource busy
Command failed.
failed: dmsetup remove btrfs-test-with-very-long-name-AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAQPGc
test failed for case 005-long-device-name-for-ssd
[CAUSE]
There seems to be a race between "btrfs inspect dump-super" and the
dmsetup removal.
[FIX]
Add a "udevadm settle" before removing the dm devices.
Also since we're here, use the same "udevadm settle" instead of the
manual sleep to wait for the new dm device to show up.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Check for each test directory if the utilities requested by
check_global_prereq can be found on the system.
Signed-off-by: David Sterba <dsterba@suse.com>
Some files left after cleanup are still reported by git. Add them to
.gitignore.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Testing with the fstests config option POST_MKFS_CMD="btrfstune -m"
reported failure, as shown below:
./check btrfs/003
[111.635618] BTRFS: device fsid a6599a65-8b6d-4156-bb55-0a3a2f0eae9d devid 1 transid 6 /dev/sdb2 scanned by systemd-udevd (1117)
[111.642199] BTRFS: device fsid a6599a65-8b6d-4156-bb55-0a3a2f0eae9d devid 2 transid 6 /dev/sdb3 scanned by systemd-udevd (1114)
[111.660882] BTRFS: device fsid a6599a65-8b6d-4156-bb55-0a3a2f0eae9d devid 3 transid 6 /dev/sdb5 scanned by systemd-udevd (1116)
[111.672623] BTRFS: device fsid a6599a65-8b6d-4156-bb55-0a3a2f0eae9d devid 4 transid 6 /dev/sdb6 scanned by systemd-udevd (993)
[111.701301] BTRFS: device fsid a6599a65-8b6d-4156-bb55-0a3a2f0eae9d devid 6 transid 6 /dev/sdb8 scanned by systemd-udevd (1080)
[111.706513] BTRFS: device fsid a6599a65-8b6d-4156-bb55-0a3a2f0eae9d devid 5 transid 6 /dev/sdb7 scanned by systemd-udevd (1117)
[111.716532] BTRFS: device fsid a6599a65-8b6d-4156-bb55-0a3a2f0eae9d devid 7 transid 6 /dev/sdb9 scanned by systemd-udevd (1114)
[111.721253] BTRFS: device fsid a6599a65-8b6d-4156-bb55-0a3a2f0eae9d devid 8 transid 6 /dev/sdb10 scanned by mkfs.btrfs (1504)
[112.405186] BTRFS: device fsid 1b3bacbf-14db-49c9-a3ef-547998aacc4e devid 4 transid 8 /dev/sdb6 scanned by systemd-udevd (1117)
[112.422104] BTRFS: device fsid 1b3bacbf-14db-49c9-a3ef-547998aacc4e devid 6 transid 8 /dev/sdb8 scanned by systemd-udevd (1523)
[112.448355] BTRFS: device fsid 1b3bacbf-14db-49c9-a3ef-547998aacc4e devid 1 transid 8 /dev/sdb2 scanned by systemd-udevd (1115)
[112.456126] BTRFS error: device /dev/sdb3 belongs to fsid 1b3bacbf-14db-49c9-a3ef-547998aacc4e, and the fs is already mounted
[112.461299] BTRFS error: device /dev/sdb7 belongs to fsid 1b3bacbf-14db-49c9-a3ef-547998aacc4e, and the fs is already mounted
[112.465690] BTRFS info (device sdb2): using crc32c (crc32c-generic) checksum algorithm
[112.468758] BTRFS info (device sdb2): using free space tree
[112.471318] BTRFS error: device /dev/sdb9 belongs to fsid 1b3bacbf-14db-49c9-a3ef-547998aacc4e, and the fs is already mounted
[112.475962] BTRFS error: device /dev/sdb10 belongs to fsid 1b3bacbf-14db-49c9-a3ef-547998aacc4e, and the fs is already mounted
[112.481934] BTRFS error: device /dev/sdb5 belongs to fsid 1b3bacbf-14db-49c9-a3ef-547998aacc4e, and the fs is already mounted
[112.494614] BTRFS error (device sdb2): devid 2 uuid 99a57db7-2ef6-4240-a700-07ee7e64fb36 is missing
[112.497834] BTRFS error (device sdb2): failed to read chunk tree: -2
[112.507705] BTRFS error (device sdb2): open_ctree failed
The original fsid created by mkfs was a6599a65-8b6d-4156-bb55-0a3a2f0eae9d,
and the fsid created by the btrfstune -m option was
1b3bacbf-14db-49c9-a3ef-547998aacc4e.
During mount (after btrfstune -m), only 3 out of 8 devices were scanned
by systemd, while the rest were still being discovered. Consequently, the
mount command raced to find the missing devices. Since the mount command
in the kernel sets the flag fsdevices::opened, any further alloc_device()
calls were blocked, resulting in the error "the fs is already mounted".
It is a good idea to register all devices after changing the fsid, as
the previous registrations are stale once the fsid has changed.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
- use proper json array filtering
- use long parameter names
- add comments
- add cleaner for coverage runs
- keep last run for branches with badges at README
Signed-off-by: David Sterba <dsterba@suse.com>
Make the column names more descriptive. PNumber dates from the time when
there was only physical sort. Make the type/profile more explicit so
that it can later be used for filtering. The 'Age' reflects the current
allocation strategy to always pick a higher number, but this could become
confusing: it's really the number when sorted by logical offset.
Signed-off-by: David Sterba <dsterba@suse.com>
Now we can use newlines in option descriptions to make nicer lists for
options:
    --sort MODE  sort by a column ascending (default: pstart),
                 MODE can be one of:
                 pstart   - physical offset, grouped by device
                 lstart   - logical offset
                 usage    - by chunk usage (implies --usage)
                 length_p - by chunk length, secondary by physical offset
                 length_l - by chunk length, secondary by logical offset
Signed-off-by: David Sterba <dsterba@suse.com>
A newline character in the option description text breaks the line and
indents the following text properly. This can be used for lists or
paragraphs.
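The idea can be sketched as below. This is not the actual help formatter:
sketch_print_desc() and its fixed indent parameter are assumptions made
for illustration:

```c
#include <stdio.h>

/*
 * Print "desc" to "out". After every newline that is followed by more
 * text, emit "indent" spaces so the continuation line aligns with the
 * description column. Returns the number of continuation lines started.
 */
static int sketch_print_desc(FILE *out, const char *desc, int indent)
{
	int breaks = 0;

	for (const char *p = desc; *p; p++) {
		fputc(*p, out);
		if (*p == '\n' && p[1] != '\0') {
			fprintf(out, "%*s", indent, "");
			breaks++;
		}
	}
	fputc('\n', out);
	return breaks;
}
```

A caller would first print the option name padded to the description
column and then pass that column width as the indent.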
Signed-off-by: David Sterba <dsterba@suse.com>
Add another sorting key 'usage' to sort chunks by usage, ascending. It
also implies the --usage parameter so that the column is shown. This
ignores devid, so chunks from all devices are mixed.
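A comparator for such a sort might look like the sketch below. It assumes
usage is the ratio of used bytes to chunk length. The sketch_chunk
structure and function names are hypothetical:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-chunk record; the real structure has more fields. */
struct sketch_chunk {
	uint64_t devid;
	uint64_t length;
	uint64_t used;
};

/* Compare by used/length, ascending; devid is deliberately ignored. */
static int sketch_cmp_usage(const void *a, const void *b)
{
	const struct sketch_chunk *ca = a;
	const struct sketch_chunk *cb = b;
	double ua = ca->length ? (double)ca->used / ca->length : 0.0;
	double ub = cb->length ? (double)cb->used / cb->length : 0.0;

	if (ua < ub)
		return -1;
	if (ua > ub)
		return 1;
	return 0;
}

static void sketch_sort_by_usage(struct sketch_chunk *chunks, size_t n)
{
	qsort(chunks, n, sizeof(*chunks), sketch_cmp_usage);
}
```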
Signed-off-by: David Sterba <dsterba@suse.com>
Add a convenience script for adding new injection cookies and filter for
actual testing.
- ./inject-error new - return new unique cookie
- ./inject-error file tune/change-csum.c - show all cookies in the given
  file (takes regexp)
Example usage:
    for i in $(./inject-error file tune/change-csum.c | awk '{print $1}') ; do
        echo "Inject $i"
        export INJECT="$i"
        rm img
        cp --reflink img.filled img
        ./btrfstune --csum blake2 img
        btrfs check img
    done
Where 'img' is a filesystem with sample files and directories.
Signed-off-by: David Sterba <dsterba@suse.com>
To be able to test errors at specific locations, add a simple way to
check for a condition in code that is controlled from user space by the
environment variable INJECT. For now a single value is accepted.
Use like:
    if (inject_error(0x1234)) {
            do_something();
            return -ERROR;
    }
This is enabled in debugging builds by default (make D=1) and can be
enabled on demand too (make EXTRA_CFLAGS=-DINJECT).
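One possible shape of such a check is sketched below. This is an
assumption about how the cookie could be matched, not the actual
implementation. The sketch_ names are hypothetical:

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: does the INJECT string "str" select "cookie"? */
static bool sketch_cookie_matches(const char *str, unsigned long cookie)
{
	char *end;
	unsigned long val;

	if (!str || !*str)
		return false;
	/* Base 0 accepts both 0x1234 and 4660. */
	val = strtoul(str, &end, 0);
	if (*end != '\0')
		return false;
	return val == cookie;
}

/* Assumed shape of the check: compare against the environment variable. */
static bool sketch_inject_error(unsigned long cookie)
{
	return sketch_cookie_matches(getenv("INJECT"), cookie);
}
```

Keeping the parsing separate from getenv() makes the matching logic easy
to exercise without touching the environment.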
Signed-off-by: David Sterba <dsterba@suse.com>
Enqueuing allows an operation to wait until the current one
finishes. This usually means that it's waiting for another one, but in
case of replace there's a check that does not allow the enqueuing to
take place, as reported.
Move the enqueuing before that check.
Issue: #645
Signed-off-by: David Sterba <dsterba@suse.com>
Needed to work:
- install the GitHub app Codecov, accept permissions
- copy the token from codecov.io to repository secrets
- allow actions permissions to run either verified marketplace creators
  or list codecov/codecov-action@*
- set up the repository as active on codecov.io, watch results e.g. at
  https://app.codecov.io/gh/kdave/btrfs-progs/tree/coverage-test/
Signed-off-by: David Sterba <dsterba@suse.com>
With 'make D=gcov' the files are built with gcov support. After running
the workload, the results can be viewed by 'gcov file.c' or by
lcov+genhtml.
Signed-off-by: David Sterba <dsterba@suse.com>
There's a report that btrfs-find-root does not work as built-in tool in
btrfs.box, while it's advertised in the help:
    $ ./btrfs.box help --box
    Standalone tools built-in in the busybox style:
    - mkfs.btrfs
    - btrfs-image
    - btrfs-convert
    - btrfstune
    - btrfs-find-root
Add the support as it might be a useful tool sometimes. In the future the
command should be moved to e.g. inspect-internal or rescue.
Issue: #648
Signed-off-by: David Sterba <dsterba@suse.com>
In release tests, also check all the backends. Use Tumbleweed as it's
known to work there and provides all supported library versions.
Signed-off-by: David Sterba <dsterba@suse.com>
The function check_where_mounted() scans the system for all other btrfs
devices, which is necessary for its operation. However, in certain
cases, leaving the devices in the scanned state is undesirable. Introduce
the 'noscan' argument to make devices unscanned before returning.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
To prepare for handling devices given on the command line, factor out
btrfs_scan_argv_devices().
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Drop the devid argument, it can be fetched from the disk_super argument.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The local variable open_ctree_flags only carries flags that finally end
up in the locally declared struct variable oca, as oca.flags. Just use
oca.flags directly.
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The struct open_ctree_flags currently holds arguments for
open_ctree_fs_info(). It can be confusing when mixed with a local variable
named open_ctree_flags, as below in the function cmd_inspect_dump_tree():
    cmd_inspect_dump_tree()
    ::
        struct open_ctree_flags ocf = { 0 };
    ::
        unsigned open_ctree_flags;
So rename struct open_ctree_flags to struct open_ctree_args.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The variable 'is_btrfs' is declared as an integer but should be a boolean
instead.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The test misc/058 does not properly filter out the output because of -s,
which only ignores non-existent files and was left over from previous
changes. We need to use -q.
Signed-off-by: David Sterba <dsterba@suse.com>
On aarch64 systems with glibc 2.28, several btrfs-progs test cases are
failing because the command 'btrfs inspect dump-super -a <dev>' reports
an error when it attempts to read beyond the disk/file-image size.
$ btrfs inspect dump-super -a /dev/vdb12
<snap>
ERROR: Failed to read the superblock on /dev/vdb12 at 274877906944
And btrfs/184 also fails, as it uses the -s 2 option to dump the last
superblock.
$ ./check btrfs/184
FSTYP -- btrfs
PLATFORM -- Linux/aarch64 a4k 6.4.0-rc7+ #7 SMP PREEMPT Sat Jun 24 02:47:24 EDT 2023
MKFS_OPTIONS -- /dev/vdb2
MOUNT_OPTIONS -- /dev/vdb2 /mnt/scratch
btrfs/184 1s ... [failed, exit status 1]- output mismatch (see /Volumes/ws/xfstests-dev/results//btrfs/184.out.bad)
--- tests/btrfs/184.out 2020-03-03 00:26:40.172081468 -0500
+++ /Volumes/ws/xfstests-dev/results//btrfs/184.out.bad 2023-06-24 05:54:40.868210737 -0400
@@ -1,2 +1,3 @@
QA output created by 184
-Silence is golden
+Deleted dev superblocks not scratched
+(see /Volumes/ws/xfstests-dev/results//btrfs/184.full for details)
...
(Run 'diff -u /Volumes/ws/xfstests-dev/tests/btrfs/184.out /Volumes/ws/xfstests-dev/results//btrfs/184.out.bad' to see the entire diff)
Ran: btrfs/184
Failures: btrfs/184
Failed 1 of 1 tests
This is because `pread()` behaves differently on aarch64 and sets
`errno = 2` instead of the usual `errno = 0`.
To fix this, check if the sb offset is beyond the device size or regular
file size and skip the corresponding sbread().
Also, move the putchar('\n') that follows a successful call to
load_and_dump_sb() into load_and_dump_sb() itself.
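The size check can be sketched as follows. The superblock copies live at
64KiB, 64MiB and 256GiB (274877906944, the offset in the error above).
The sketch_ helpers are hypothetical, and requiring the whole 4096-byte
copy to fit is an assumption rather than the exact condition used by the
fix:

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SUPER_MIRROR_MAX	3
#define SKETCH_SUPER_INFO_SIZE	4096ULL

/* Offsets of the superblock copies: 64KiB, 64MiB, 256GiB. */
static uint64_t sketch_sb_offset(int mirror)
{
	if (mirror == 0)
		return 64ULL * 1024;
	return (16ULL * 1024) << (12 * mirror);
}

/* True if the whole superblock copy lies within "size" bytes. */
static bool sketch_sb_fits(int mirror, uint64_t size)
{
	return sketch_sb_offset(mirror) + SKETCH_SUPER_INFO_SIZE <= size;
}
```

A caller looping over the mirrors would skip the read for every mirror
where sketch_sb_fits() returns false, so it never depends on the errno
value of a read past the end of the device.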
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In cmd_inspect_dump_super(), nothing much happens at the label 'out'
other than returning ret.
Before the goto to that label, inside the for loop, we perform close(fd).
However, moving the close(fd) to 'out' as well is not a good idea because
close(fd) doesn't make sense outside the for loop.
Instead, simply return 1 directly rather than setting ret = 1 and
returning it later. Drop both the 'out' label and ret.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add more error information to help debugging:
    $ ./btrfs inspect-internal dump-super -Ffa /dev/vdb10
Before:
    ERROR: failed to read the superblock on /dev/vdb10 at 274877906944
After:
    ERROR: failed to read the superblock on /dev/vdb10 at 274877906944 read 0/4096 bytes
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Fix failures caused by the lack of ACL support in btrfs. For example:
$ make test
::
[TEST/misc] 057-btrfstune-free-space-tree
failed: setfacl -m u:root:x /Volumes/ws/btrfs-progs/tests/mnt/acls/acls.1
test failed for case 057-btrfstune-free-space-tree
make: *** [Makefile:493: test-misc] Error 1
Similar failures occurred in the test cases convert/001-ext2-basic,
convert/003-ext4-basic, convert/005-delete-all-rollback, and
convert/006-large-hole-extent.
Resolve it by adding a check for ACL support using the
check_kernel_support_acl() helper function. It gracefully handles the case
when ACL support is not compiled in by calling _not_run().
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Some test cases are failing when ACL support is not compiled in.
Instead, they should be marked as 'not_run'.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>