Minor refactoring, check arguments and print parameters. This can be
used to stress btree allocation by changing run_size.
Signed-off-by: David Sterba <dsterba@suse.com>
There's an old test for btree code that could be used in the testsuite.
Still needs some polishing and review what tests make sense.
Signed-off-by: David Sterba <dsterba@suse.com>
Previously the XFS-specific code was commented out so we don't need the
headers for building fsstress, this changes how the utility behaves
compared to the one in fstests, e.g. randomness or additional open/close
operations. Add enough code from xfsprogs to make it compile and enable
the #if0-ed code.
Signed-off-by: David Sterba <dsterba@suse.com>
In experimental build, read global '--param zone-size=SIZE' and use it
as emulated zone size. This is for testing only, will be promoted to a
proper option in the future.
Signed-off-by: David Sterba <dsterba@suse.com>
./btrfs --param key=value command ...
./btrfs --param key command ...
To pass various tuning data for testing and debugging, undocumented
for regular users.
To add support add reading of the parameter value after option parsing
bconf_param_value("key") and convert to what you need.
Signed-off-by: David Sterba <dsterba@suse.com>
Add a document describing the layout and functionality of the newly
introduced RAID Stripe Tree.
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Due to refactoring in 88c25674c7 ("btrfs-progs: convert device info
to struct array") the variable tracking number of devices was not
updated and led to an error.
$ btrfs device usage /path
ERROR: unexpected number of devices: 0 != 1
...
Issue: #697
Signed-off-by: David Sterba <dsterba@suse.com>
Issue #622 reported a case where ntfs2btrfs can generate out-of-order
inline backref items, which can lead to kernel transaction abort, but
not detected by btrfs-check.
This patch would add such image, whose extent tree looks like this for
the only data extent:
item 0 key (13631488 EXTENT_ITEM 4096) itemoff 16172 itemsize 111
refs 3 gen 7 flags DATA
(178 0xdfb591fa80d95ea) extent data backref root FS_TREE objectid 257 offset 0 count 1
(178 0xdfb591fbbf5f519) extent data backref root FS_TREE objectid 258 offset 0 count 1
(178 0xdfb591f49f9f8e7) extent data backref root FS_TREE objectid 259 offset 0 count 1
While the original good base image has the following backrefs for the
same data extent:
item 0 key (13631488 EXTENT_ITEM 4096) itemoff 16172 itemsize 111
refs 3 gen 7 flags DATA
(178 0xdfb591fbbf5f519) extent data backref root FS_TREE objectid 258 offset 0 count 1
(178 0xdfb591fa80d95ea) extent data backref root FS_TREE objectid 257 offset 0 count 1
(178 0xdfb591f49f9f8e7) extent data backref root FS_TREE objectid 259 offset 0 count 1
Notice the sequence (the 2nd number in the round brackets) should be
descending. (Meanwhile type should be ascending, but it's way harder to
create.)
For now we don't have a way to fix it, but at least we should detect it.
Issue: #622
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 6cf11f3e38 ("btrfs-progs: check: check order of inline extent
refs") added the ability to detect out-of-order inline extent backref
items.
Meanwhile there is no such ability in lowmem mode, this patch would
introduce such ability to lowmem mode.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 6cf11f3e38 ("btrfs-progs: check: check order of inline extent
refs") fixes a problem that btrfs check never properly verify the
sequence of inline references.
It's not obvious because by default kernel handles EXTENT_DATA_REF_KEY
using its own hash, resulting some seemingly out-of-order result:
item 0 key (13631488 EXTENT_ITEM 4096) itemoff 16143 itemsize 140
refs 4 gen 7 flags DATA
extent data backref root FS_TREE objectid 258 offset 0 count 1
extent data backref root FS_TREE objectid 257 offset 0 count 1
extent data backref root FS_TREE objectid 260 offset 0 count 1
extent data backref root FS_TREE objectid 259 offset 0 count 1
By a quick glance, no one can see the above inline backref items are in
any order.
To make such sequence more obvious, let dump-tree to output a new prefix
to indicate the type and the internal sequence number:
For above case, the new output would look like this:
item 0 key (13631488 EXTENT_ITEM 4096) itemoff 16143 itemsize 140
refs 4 gen 7 flags DATA
(178 0xdfb591fbbf5f519) extent data backref root FS_TREE objectid 258 offset 0 count 1
(178 0xdfb591fa80d95ea) extent data backref root FS_TREE objectid 257 offset 0 count 1
(178 0xdfb591f9c0534ff) extent data backref root FS_TREE objectid 260 offset 0 count 1
(178 0xdfb591f49f9f8e7) extent data backref root FS_TREE objectid 259 offset 0 count 1
Although still not that obvious, it should show the inline data backrefs
has descending sequence number.
For the type part, it's anti-instinctive in ascending order, which is
not that easy to produce.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 963188943f ("btrfs-progs: make
btrfs_super_block::log_root_transid deprecated") deprecated
log_root_transid and broke build of sb-mod. As an unused member of sb we
don't need to edit it, so remove it from the list.
Signed-off-by: David Sterba <dsterba@suse.com>
The kernel seems to order inline extent items in a particular way:
forward by sub-type, then reverse by hash. Having these out of order can
cause a volume to go readonly when deleting an inode.
See https://github.com/maharmstone/ntfs2btrfs/issues/51
With additional comments from the pull request:
- lookup_inline_extent_backref() is skipping the remaining backref item
if data/metadata item is smaller (either through the data hash, or
metadata parent/ref_root) than the target range
- the fix could be still missing in lowmem mode
- image could be created according this comment
https://github.com/maharmstone/ntfs2btrfs/issues/51#issuecomment-1500781204
- due to late merge, the squota newly added key
BTRFS_EXTENT_OWNER_REF_KEY was not part of the patch and the value of
'hash' needs to be verified
Pull-request: #622
Signed-off-by: Mark Harmstone <mark@harmstone.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There were some files left after running 'make clean-all'. Reorganize
the target commands and group them by type of files so it's easier to
see what's cleaned and where to add new files.
Signed-off-by: David Sterba <dsterba@suse.com>
Add new option -p to 'subvolume create' so it behaves like 'mkdir -p'
and create all missing path components before the subvolume.
Issue: #429
Signed-off-by: Sidong Yang <realwakka@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We need to validate the device uuid the same way as the fsid:
$ ./mkfs.btrfs --device-uuid 18eabcf0-6766-4fbf-b366-71b4ae725b2- img
btrfs-progs v6.5.2
See https://btrfs.readthedocs.io for more information.
ERROR: could not parse device UUID: 18eabcf0-6766-4fbf-b366-71b4ae725b2-
Signed-off-by: David Sterba <dsterba@suse.com>
Print the device uuid in the summary in case it's specified on the
command line, not always as it would be confusing and is not usually
needed. Can be found in 'btrfs inspect-internal dump-super' as
device_item.uuid .
Signed-off-by: David Sterba <dsterba@suse.com>
Add option --device-uuid that will set the device uuid item in super
block.
This is useful for creating a filesystem with a specific device uuid,
namely for testing.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The commit ("btrfs-progs: allow duplicate fsid for single device
filesystems") lets the duplicate fsid used for a new mkfs document this.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The initial patch used -q for enabling simple quota but this is not
right, -q is reserved for --quiet and we want the long options first.
Signed-off-by: David Sterba <dsterba@suse.com>
The new test case would:
- Prepare two loopback devices
One is for storing the source directory (on a btrfs).
This is to ensure we can set xattr for the directory, as filesystems
like tmpfs (mostly utilized by mktemp) is not supporting xattr.
The other one is the real target fs where we call
"mkfs.btrfs --rootdir" on.
- Create the source directory with the following contents:
* rootdir inode attributs:
# mode (750)
# uid (1000)
# gid (1000)
# xattr (user.rootdir)
* one regular file, with attributes:
# xattr (user.foorbar)
- Execute "mkfs.btrfs --rootdir" and mount the new fs
- Verify the above attributes
The target fs should have the same attributes, especially for the
rootdir inode.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
When using "mkfs.btrfs" with "--rootdir" option, the top level inode
(rootdir) will not get the same xattr from the source dir:
mkdir -p source_dir/
touch source_dir/file
setfattr -n user.rootdir_xattr source_dir/
setfattr -n user.regular_xattr source_dir/file
mkfs.btrfs -f --rootdir source_dir $dev
mount $dev $mnt
getfattr $mnt
# Nothing <<<
getfattr $mnt/file
# file: $mnt/file
user.regular_xattr <<<
[CAUSE]
In function traverse_directory(), we only call add_xattr_item() for all
the child inodes, not really for the rootdir inode itself, leading to
the missing xattr items.
Not only xattr, in fact we also miss the uid/gid/timestamps/mode for the
rootdir inode.
[FIX]
Extract a dedicated function, copy_rootdir_inode(), to handle every
needed attributes for the rootdir inode, including:
- xattr
- uid
- gid
- mode
- timestamps
Issue: #688
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
For running scrubs, with v6.3 and newer btrfs-progs, it can report
incorrect "Total to scrub":
Scrub resumed: Mon Oct 9 11:28:33 2023
Status: running
Duration: 0:44:36
Time left: 0:00:00
ETA: Mon Oct 9 11:51:38 2023
Total to scrub: 625.49GiB
Bytes scrubbed: 625.49GiB (100.00%)
Rate: 239.35MiB/s
Error summary: no errors found
[CAUSE]
Commit c88ac0170b ("btrfs-progs: scrub: unify the output numbers for
"Total to scrub"") changed the output method for "Total to scrub", but
that value is only suitable for finished scrubs.
For running scrubs, if we use the currently scrubbed values, it would
lead to the above problem.
The real scrubbed bytes is only reliable for finished scrubs, not for
running/canceled/interrupted ones.
[FIX]
Change print_scrub_dev() to do extra checks, and only for finished
scrubs to use the scrubbed bytes.
Otherwise fall back to the device's bytes_used.
Issue: #690
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
When running mkfs.btrfs with --rootdir on a block device, and the source
directory contains a sparse file, whose size is larger than the block
size, then mkfs.btrfs would fail:
# lsblk /dev/test/test
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
test-test 253:0 0 10G 0 lvm
# mkdir -p /tmp/output
# truncate -s 20G /tmp/output/file
# mkfs.btrfs -f --rootdir /tmp/output /dev/test/test
# sudo mkfs.btrfs -f /dev/test/scratch1 --rootdir /tmp/output/
btrfs-progs v6.3.3
See https://btrfs.readthedocs.io for more information.
ERROR: unable to zero the output file
[CAUSE]
Mkfs.btrfs would try to zero out the target file according to the total
size of the directory.
However the directory size is calculated using the file size, not the
real bytes taken by the file, thus for such sparse file with holes only,
it would still take 20G.
Then we would use that 20G size to zero out the target file, but if the
target file is a block device, we would fail as we can not enlarge a block
device.
[FIX]
When zeroing the file, we only enlarge it if the target is a regular
file.
Otherwise we warn about the size and continue.
Please note that, since "mkfs.btrfs --rootdir" doesn't handle sparse
file any differently from regular file, above case would still fail due
to ENOSPC, as will write zeros into the target file inside the fs.
Proper handling for sparse files would need a new series of patch to
address.
Issue: #653
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently, write_dev_supers() compares the superblock location vs the size
of the device to check if it can write the superblock. This is not correct
for a zoned device, whose superblock location is different than a regular
device.
Introduce check_sb_location() to check if the superblock zone exists for
the zoned case.
Running btrfs check can fail on a certain zoned device setup (e.g,
zone size = 128MB, device size = 16GB).
From generic/330:
yes | btrfs check --repair --force /dev/nullb1
[1/7] checking root items
Fixed 0 roots.
[2/7] checking extents
ERROR: zoned: failed to read zone info of 4096 and 4097: Invalid argument
ERROR: failed to write super block for devid 1: write error: Input/output error
failed to write new super block err -5
failed to repair damaged filesystem, aborting
This happens because write_dev_supers() is comparing the original
superblock location vs the device size to check if it can write out a
superblock copy or not.
For the above example, since the first copy location (64MB) < device size
(16GB), it tries to write out the copy. But, the copy must be written into
zone 4096 (512G / zone size (128M) = 4096), which is out of the device.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Introduce sb_bytenr_to_sb_zone(), which converts the original superblock
location to the zone number of superblock log writing.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The convert tests take a long time and are not necessary for quick test
rounds under the convenience target 'make test'. We still run the tests
in CI.
Signed-off-by: David Sterba <dsterba@suse.com>
The kernel patches for RST and squota are queued for 6.7, we need to be
able to test the features so it's not necessary to hide the mkfs support
under experimental build. The kernel may still need debug build to
enable mount.
Signed-off-by: David Sterba <dsterba@suse.com>
While we like to have the descriptive names also add short aliases that
we also use for reference in changelogs and documentation.
$ mkfs.btrfs -O list-all
Filesystem features available:
mixed-bg - mixed data and metadata block groups (compat=2.6.37, safe=2.6.37)
quota - quota support (qgroups) (compat=3.4)
extref - increased hardlink limit per file to 65536 (compat=3.7, safe=3.12, default=3.12)
raid56 - raid56 extended format (compat=3.9)
skinny-metadata - reduced-size metadata extent refs (compat=3.10, safe=3.18, default=3.18)
no-holes - no explicit hole extents for files (compat=3.14, safe=4.0, default=5.15)
fst - free-space-tree alias
free-space-tree - free space tree (space_cache=v2) (compat=4.5, safe=4.9, default=5.15)
raid1c34 - RAID1 with 3 or 4 copies (compat=5.5)
zoned - support zoned devices (compat=5.12)
extent-tree-v2 - new extent tree format (compat=5.15)
bgt - block-group-tree alias
block-group-tree - block group tree to reduce mount time (compat=6.1)
rst - raid-stripe-tree alias
raid-stripe-tree - raid stripe tree (compat=6.7)
squota - squota support (simple accounting qgroups) (compat=6.7)
Signed-off-by: David Sterba <dsterba@suse.com>
The list 'mkfs -O list-all' should be sorted by version of kernel
support (compat) so it's clear what's new.
Signed-off-by: David Sterba <dsterba@suse.com>
Unlike kernel, in btrfs-progs btrfs_start_transaction() never checks if
there is enough metadata space.
This can lead to very dangerous situation where there is no metadata
space left at all, deadlocking future tree operations.
This patch introduces a very basic version of metadata/system free space
check by:
- Check if there is enough metadata/system space left
If there is enough, go as usual.
- If there is not enough space left, try allocating a new chunk
- Recheck if the new space can meet our demand
If not, return ERR_PTR(-ENOSPC).
Otherwise, allocate a new trans handle to the caller.
This is possible thanks to the simplified transaction model in
btrfs-progs:
- We don't allow joining a transaction
This means we don't need to handle complex cases like data ordered
extents, which need to reserve space first, then join the current
transaction and use the reserved blocks.
- We don't allow multiple transaction handles for one transaction
Since btrfs-progs is single threaded, we always start a transaction
and then commit it.
However there is a feature that must be an exception for the new
metadata/system free space check:
- btrfs check --init-extent-tree
As all the meta/system free space check is based on the space info,
which is loaded from block group items.
Thus when rebuilding extent tree, we can no longer have an accurate
view, thus we have to disable the feature for the whole execution if
we're rebuilding the extent tree.
For now, there is no regression exposed during the self tests, but I
really hope this can be an extra safety net to prevent causing ENOSPC
deadlock in btrfs-progs.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Create subvolumes and delete every N-th to generate output where we can
also see how the stale qgroups are placed (at the end).
Issue: #687
Signed-off-by: David Sterba <dsterba@suse.com>
This patch fixes a bug that could occur when comparing paths in showing
qgroups list. Old code doesn't check it and the bug occurs when there is
stale qgroup and its path is null. Check whether it is null and return
without comparing paths.
Issue: #687
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Sidong Yang <realwakka@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>