On large (blockcount > 32bit) filesystems reading directly
super_block->s_blocks_count is not sufficient as the block count is held
in 2 separate 32 bit variables. Instead always use the provided
ext2fs_blocks_count to read the value. This can result in assertion
failure, when the block count is only held in the high 32 bits, in this
case s_block_counts would be zero, which would result in
btrfs_convert_context::block_count/total_bytes to also be 0 and hit an
assertion failure:
convert/main.c:1162: do_convert: Assertion `cctx.total_bytes != 0` failed, value 0
btrfs-convert(+0xffb0)[0x557defdabfb0]
btrfs-convert(main+0x6c5)[0x557defdaa125]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xea)[0x7f66e1f8bd0a]
btrfs-convert(_start+0x2a)[0x557defdab52a]
Aborted
What's worse it can also result in btrfs-convert mistakenly thinking
that a filesystem is smaller than it actually is (ignoring the top 32 bits).
Link: https://lore.kernel.org/linux-btrfs/023b5ca9-0610-231b-fc4e-a72fe1377a5a@jansson.tech/
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add constant for initial value to avoid unexpected clashes with user
defined getopt values and shift the common size getopt values.
Signed-off-by: David Sterba <dsterba@suse.com>
Creating a simple directory structure leads to the following error:
$ btrfs check
Checking filesystem on test.img
UUID: 8f2292ad-c80e-4ab4-8a72-29aa3a83002c
[1/7] checking root items
[2/7] checking extents
[3/7] checking free space cache
[4/7] checking fs roots
unresolved ref dir 260 index 0 namelen 2 name .. filetype 0 errors 3, no dir item, no dir index
ERROR: errors found in fs roots
found 101085184 bytes used, error(s) found
total csum bytes: 98460
total tree bytes: 262144
total fs tree bytes: 49152
total extent tree bytes: 16384
btree space waste bytes: 151864
file data blocks allocated: 167931904
referenced 167931904
The self-reference should exist for the toplevel directory, where the
parent directory points to itself.
Issue: #453
Author: tyan0
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
When running some tests, I notice that my debug build of btrfs-convert
is throwing out garbage for target fs label:
$ ./btrfs-convert ~/test.img
btrfs-convert from btrfs-progs v5.17
Source filesystem:
Type: ext2
Label:
Blocksize: 4096
UUID: 29d159a8-cb46-41d3-8089-3c5c65e4afae
Target filesystem:
Label: @pcwU <<< Garbage here
Blocksize: 4096
Nodesize: 16384
UUID: 682bf5f2-8cb1-4390-b9ac-6883cd87ed39
Checksum: crc32c
...
[CAUSE]
The fslabel[] array is just not initialized, thus it can contain
garbage.
[FIX]
Initialize fslabel[] array to all zero.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Now that all callers are using the _nr variations we can simply rename
these helpers to btrfs_item_##member/btrfs_set_item_##member and change
the actual item SETGET funcs to raw_item_##member/set_raw_item_##member
and then change all callers to drop the _nr part.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We have a lot of the following patterns
item = btrfs_item_nr(nr);
btrfs_set_item_*(eb, item, val);
btrfs_set_item_*(eb, btrfs_item_nr(nr), val);
in a lot of places in our code. Instead add _nr variations of these
helpers and convert all of the users to this new helper.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The mkfs_config can hold the BTRFS_LEAF_DATA_SIZE, so calculate this at
config creation time and then use that value throughout convert instead
of calling __BTRFS_LEAF_DATA_SIZE.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When we switch to multiple global trees we'll need to access the
appropriate extent root depending on the block group or possibly root.
To handle this, use a helper in most places and then the actual root in
places where it is required. We will whittle down the direct accessors
with future patches, but this does the bulk of the preparatory work.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We have this helper sitting in extent-tree.c, but it's a repair
function. I'm going to need to make changes to this for extent-tree-v2
and would rather this live outside of the code we need to share with the
kernel.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We are going to need to start looking up the csum root based on the
bytenr with extent tree v2. To that end stop passing the root to the
csum related functions so that can be done in the helper functions
themselves.
There's an unrelated deletion of a function prototype that no longer
exists.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There are a lot of call sites where we use the following code snippet:
u8 super_block_data[BTRFS_SUPER_INFO_SIZE];
struct btrfs_super_block *sb;
u64 ret;
sb = (struct btrfs_super_block *)super_block_data;
The reason for this is, structure btrfs_super_block was smaller than
BTRFS_SUPER_INFO_SIZE.
Thus for anything with csum involved, we have to use a proper 4K buffer.
Since the recent unification of sizeof(struct btrfs_super_block), we no
longer need such workaround, and can use struct btrfs_super_block
directly to do any operation.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Several extent_buffer initializations miss fs_info initialization. This
is OK before the following patch ("btrfs-progs: use direct-io for zoned
device") as eb->fs_info is not always necessary. But, after that patch,
we will use fs_info to determine it is zoned or not and that causes
segfault in such cases.
Properly set fs_info when initializing extent_buffers to fix the issue.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Relax the condition about a unique uuid for convert, only print a
warning. In case we copy the uuid, it's expected that at the time the
conversion starts the uuid is not unique as it sill exists on the source
filesystem.
In case user sets the uuid manually but it's still the same one as on
the source filesystem we should also allow that, so it warns in this
case as well.
Update the test so it creates a block device where the uuid would be
also cached by blkid and lets the non-unique check succeed.
Issue: #404
Signed-off-by: David Sterba <dsterba@suse.com>
There are various parsing helpers scattered everywhere, unify them to
one file and start with helpers already in utils.c.
Signed-off-by: David Sterba <dsterba@suse.com>
Add new option --uuid to convert with the following modes:
- 'copy' -- copy the UUID from the source filesystem
- 'new' -- (default) generate new UUID
- UUID -- a valid UUID that will be set on btrfs
Based on patch from Florian
https://lore.kernel.org/linux-btrfs/1357486331-4615-2-git-send-email-falbrechtskirchinger@gmail.com/
and ported to contemporary codebase.
Issue: #391
Signed-off-by: David Sterba <dsterba@suse.com>
There's a group of functions that are related to opening filesystem in
various modes, this can be moved to a separate file.
Signed-off-by: David Sterba <dsterba@suse.com>
Decrease dependency on system headers, remove where they're not needed
or became stale after code moved. The path-utils.h encapsulate path
operations so include linux/limits.h here, that's where PATH_MAX is
defined.
Signed-off-by: David Sterba <dsterba@suse.com>
For passing authentication keys to the checksumming functions we need a
container for the key.
Pass in a btrfs_fs_info to btrfs_csum_data() so we can use the fs_info
as a container for the authentication key.
Note this is not always possible for all callers of btrfs_csum_data() so
we're just passing in NULL for now
Functions calling btrfs_csum_data() with a NULL fs_info argument are
currently not supported in the context of an authenticated file system.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit b3df561fbf ("btrfs-progs: convert: copy extra timespec on
ext4") has introduced the ability to convert extended inode time
precision on ext4, but this breaks builds on older distros, where ext4
does not have the nsec time precision.
Commit c615287cc0 ("btrfs-progs: a bunch of typo fixes") tried to fix
that by testing the availability of the EXT4_EPOCH_MASK macro, but the
test is not complete.
This patch aims at fixing the macro test, and changes the
name of the associated HAVE_ macro, since the logic is reverted.
This fixes#353 when ext4 has nsec time precision. Note that the test
convert/019-ext4-copy-timestamps fails when ext4 does not have the nsec
time precision and needs to check for the support.
Issue: #353
Signed-off-by: Pierre Labastie <pierre.labastie@neuf.fr>
Signed-off-by: David Sterba <dsterba@suse.com>
As Chris reports: This ext4 file system has 'needs_recovery' feature set, and
if mounted rw, log replay happens. But btrfs-convert doesn't check for it and
converts anyway. It probably shouldn't.
# debugfs -R stats /dev/loop0
debugfs 1.45.6 (20-Mar-2020)
Filesystem volume name: <none>
Last mounted on: /mnt/0
Filesystem UUID: d3e3862e-f892-4ab7-ae91-84eb4be4a3ef
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal ext_attr resize_inode dir_index
filetype needs_recovery extent 64bit flex_bg
sparse_super large_file huge_file dir_nlink
extra_isize metadata_csum
Filesystem flags: signed_directory_hash
Default mount options: user_xattr acl
Filesystem state: clean
Errors behavior: Continue
...
Then 'btrfs-convert' proceeds, while 'e2fsck -fvn /dev/loop1' finds some
problems and wants to fix them.
Add a check for the 'needs_recovery' incompat bit set and don't convert
the filesystem.
Issue: #348
Signed-off-by: David Sterba <dsterba@suse.com>
In 5.10 the convert gained support for extended inode time precision,
but this is not available on older distros and breaks build. Add a
configure-time check for the EXT4_EPOCH_MASK macro and add a stub in
case it's not detected.
This means that the 64bit timestamps will not be transferred from the
original filesystem in such environment, at least a warning is printed.
Issue: #344
Signed-off-by: David Sterba <dsterba@suse.com>
The libmount dependency has been added in commit 61ecaff036
("btrfs-progs: build: add libmount dependency"), and static build got
broken. There are functions that do basically the same thing and also
share the name, which in turn fails at link time.
ld: /../lib64/libmount.a(libcommon_la-canonicalize.o): in function `canonicalize_dm_name':
util-linux-2.34/lib/canonicalize.c:58: multiple definition of `canonicalize_dm_name';
common/path-utils.static.o:btrfs-progs/common/path-utils.c:286: first defined here
In case the collision can be resolved by renaming, it's done
(canonicalize_path and parse_size). There are 2 symbols from selinux
that are substituted by a weak aliases during the static build.
There's one new warning due to use of getgrnam_r in libmount that
depends on dynamic linking and may not work properly with static build.
We're not using the related functions directly or indirectly, so it
should be safe to ignore the warnings.
ld: ../lib64/libmount.a(la-utils.o): in function `mnt_get_gid':
util-linux-2.34/libmount/src/utils.c:625: warning: Using 'getgrnam_r' in statically linked applications
+requires at runtime the shared libraries from the glibc version used for linking
Issue: #333
Signed-off-by: David Sterba <dsterba@suse.com>
Currently btrfs-convert only copies ext2 inode timestamps
i_[cma]time from ext4, while filling 0 to nsec and crtime fields.
This change copies nsec and crtime by parsing i_[cma]time_extra fields.
Author: Jiachen YANG <farseerfc@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
btrfs-convert currently can't handle more fragmented block groups when
converting ext4 because the minimum size of a data chunk is 32MiB.
When converting an ext4 fs with more fragmented block group with the disk
almost full, we can end up hitting a ENOSPC problem [1] since smaller
block groups (10MiB for example) end up being extended to 32MiB, leaving
the free space tree smaller when converting it to btrfs.
This patch adds error messages telling which needed bytes couldn't be
allocated from the free space tree and shows the largest portion available:
create btrfs filesystem:
blocksize: 4096
nodesize: 16384
features: extref, skinny-metadata (default)
checksum: crc32c
free space report:
total: 1073741824
free: 39124992 (3.64%)
ERROR: failed to reserve 33554432 bytes for metadata chunk, largest available: 33488896 bytes
ERROR: unable to create initial ctree: No space left on device
Issue: #251
Reviewed-by: Neal Gompa <ngompa13@gmail.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Now if an ENOSPC error happened, the free space report would help user
to determine if it's a real ENOSPC or a bug in convert.
The reported free space is the calculated free space, which doesn't
include super block space, nor merged data chunks.
The free space is always smaller than the reported available space of
the original fs, as we need extra padding space for used space to avoid
too fragmented data chunks.
The output would be:
$ ./btrfs-convert /dev/sda
create btrfs filesystem:
blocksize: 4096
nodesize: 16384
features: extref, skinny-metadata (default)
checksum: crc32c
free space report:
total: 10737418240
free: 0 (0.00%)
ERROR: unable to create initial ctree: No space left on device
WARNING: an error occurred during conversion, the original filesystem is not modified
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ put total, free to separate lines ]
Signed-off-by: David Sterba <dsterba@suse.com>
The original fs is not touched until we migrate the super blocks.
Under most error cases, we fail before that thus the original fs is
still safe.
So change the error message according the stages we failed to reflect
that.
Signed-off-by: Qu Wenruo <wqu@suse.com>
[ adjust wording of messages ]
Signed-off-by: David Sterba <dsterba@suse.com>
This patch will enhance the error handling of ext2_copy_inodes by:
- Return more meaningful error number
Instead of -1 (-EPERM), now return -EIO for ext2 calls error, and
proper error number from btrfs calls.
- Commit transaction if ext2fs_open_inode_scan() failed
- Call ext2fs_close_inode_scan() on error
- Hunt down the BUG_ON()s
- Add error messages for transaction related calls
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit "btrfs-progs: convert: prevent 32bit overflow for
cctx->total_bytes" added an assert to ensure that cctxx.total_bytes did
not overflow, but this ASSERT calls assert_trace, which expects a long
value.
By converting the u64 to long overflows in a 32bit machine, leading the
assert_trace to be triggered since cctx.total_bytes turns to zero.
Fix this problem by comparing the cctx.total_bytes with zero when
calling ASSERT.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
When convert is called on a 64GiB ext4 fs, it fails like this:
$ btrfs-convert /dev/loop0p1
create btrfs filesystem:
blocksize: 4096
nodesize: 16384
features: extref, skinny-metadata (default)
checksum: crc32c
creating ext2 image file
ERROR: missing data block for bytenr 1048576
ERROR: failed to create ext2_saved/image: -2
WARNING: an error occurred during conversion, filesystem is partially created but not finalized and not mountable
Btrfs-convert also corrupts the source fs:
$ LANG=C e2fsck /dev/loop0p1 -f
e2fsck 1.45.6 (20-Mar-2020)
Resize inode not valid. Recreate<y>? yes
Pass 1: Checking inodes, blocks, and sizes
Deleted inode 3681 has zero dtime. Fix<y>? yes
Inodes that were part of a corrupted orphan linked list found. Fix<y>? yes
Inode 3744 was part of the orphaned inode list. FIXED.
Deleted inode 3745 has zero dtime. Fix<y>? yes
Inode 3747 has INLINE_DATA_FL flag on filesystem without inline data support.
Clear<y>? yes
...
[CAUSE]
After some debugging, the first strange behavior is, the value of
cctx->total_bytes is 0 in ext2_open_fs().
It turns out that, the value assign for cctx->total_bytes could lead to
bit overflow for the unsigned int value.
And that 0 cctx->total_bytes leads to various problems for later free
space calculation.
For example, in calculate_available_space(), we use cctx->total_bytes to
ensure we won't create a data chunk beyond device end:
cue_len = min(cctx->total_bytes - cur_off, cur_len);
If that cur_offset is also 0, we will create a cache_extent with 0 size,
which could cause a lot of problems for cache tree search.
[FIX]
Do manual casting for the multiply operation, so we could got a real u64
result. The fix will be applied to all supported fses (ext* and
reiserfs).
Reported-by: Christian Zangl <coralllama@gmail.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
The following script could lead to corrupted btrfs fs after
btrfs-convert:
fallocate -l 1G test.img
mkfs.ext4 test.img
mount test.img $mnt
fallocate -l 200m $mnt/file1
fallocate -l 200m $mnt/file2
fallocate -l 200m $mnt/file3
fallocate -l 200m $mnt/file4
fallocate -l 205m $mnt/file1
fallocate -l 205m $mnt/file2
fallocate -l 205m $mnt/file3
fallocate -l 205m $mnt/file4
umount $mnt
btrfs-convert test.img
The result btrfs will have a device extent beyond its boundary:
pening filesystem to check...
Checking filesystem on test.img
UUID: bbcd7399-fd5b-41a7-81ae-d48bc6935e43
[1/7] checking root items
[2/7] checking extents
ERROR: dev extent devid 1 physical offset 993198080 len 85786624 is beyond device boundary 1073741824
ERROR: errors found in extent allocation tree or chunk allocation
[3/7] checking free space cache
[4/7] checking fs roots
[5/7] checking only csums items (without verifying data)
[6/7] checking root refs
[7/7] checking quota groups skipped (not enabled on this FS)
found 913960960 bytes used, error(s) found
total csum bytes: 891500
total tree bytes: 1064960
total fs tree bytes: 49152
total extent tree bytes: 16384
btree space waste bytes: 144885
file data blocks allocated: 2129063936
referenced 1772728320
[CAUSE]
Btrfs-convert first collect all used blocks in the original fs, then
slightly enlarge the used blocks range as new btrfs data chunks.
However the enlarge part has a problem, that it doesn't take the device
boundary into consideration.
Thus it caused device extents and data chunks to go beyond device
boundary.
[FIX]
Just to extra check before inserting data chunks into
btrfs_convert_context::data_chunk.
Reported-by: Jiachen YANG <farseerfc@gmail.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[WARNING]
When compiling btrfs-progs, there is one warning from convert ext2 code:
convert/source-ext2.c: In function 'ext2_open_fs':
convert/source-ext2.c:91:44: warning: pointer targets in passing argument 1 of 'strndup' differ in signedness [-Wpointer-sign]
91 | cctx->volume_name = strndup(ext2_fs->super->s_volume_name, 16);
| ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
| |
| __u8 * {aka unsigned char *}
In file included from ./kerncompat.h:25,
from convert/source-ext2.c:19:
/usr/include/string.h:175:35: note: expected 'const char *' but argument is of type '__u8 *' {aka 'unsigned char *'}
175 | extern char *strndup (const char *__string, size_t __n)
| ~~~~~~~~~~~~^~~~~~~~
The toolchain involved is:
- GCC 10.1.0
- e2fsprogs 1.45.6
[CAUSE]
Obviously, in the offending e2fsprogs, the volume label is using u8,
which is unsigned char, not char.
/*078*/ __u8 s_volume_name[EXT2_LABEL_LEN]; /* volume name, no NUL? */
[FIX]
Just do a forced conversion to suppress the warning is enough.
I don't think we need to apply -Wnopointer-sign yet.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Make the features structures more generic to allow mkfs-time and
mount-time sets to be defined.
This provides base for later mkfs support of mount-time features like
quotas.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This would sync the code between kernel and btrfs-progs, and save at
least 1 byte for each btrfs_block_group_cache.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
On aarch64 with pagesize 64k, btrfs-convert of ext4 is successful,
but it won't mount because we don't yet support subpage blocksize, ie.
when page size and sectorsize don't match.
BTRFS error (device vda): sectorsize 4096 not supported yet, only support 65536
So in this case during convert provide a warning but let the conversion
proceed.
Example:
WARNING: Blocksize 4096 is not equal to the pagesize 65536,
converted filesystem won't mount on this system.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Convert is always set to true so there's no point in having it as a
function parameter or using it as a predicate inside
btrfs_alloc_data_chunk. Remove it and all relevant code which would
have never been executed. No semantics changes.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
It's always set to BTRFS_BLOCK_GROUP_DATA so sink it into the function.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
With the introduction of xxhash64 to btrfs-progs we created a crypto/
directory for all the hashes used in btrfs (although no
cryptographically secure hash is there yet).
Move the crc32c implementation from kernel-lib/ to crypto/ as well so we
have all hashes consolidated.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Adding this table will make extending btrfs-progs with new checksum types
easier.
Also add accessor functions to access the table fields.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Add an option to mkfs to specify which checksum algorithm will be used
for the filesystem. Currently only crc32c is supported.
The option name is -c, presumably one of the comonly used options so it
gets the lowercase option.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Update the checksumming API to be able to cope with more checksum types
than just CRC32C. The finalization call is merged into btrfs_csum_data.
There are some fixme's and asserts added that need to be resolved.
Co-developed-by: David Sterba <dsterba@suse.com>
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
In preparation to supporting new checksum algorithm pass the checksum type
to btrfs_csum_data/btrfs_csum_final, this allows us to encapsulate any
differences in processing into the respective functions
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Pass pointer to a generic buffer instead of fixed size that crc32c
currently uses.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Add the checksum type to csum_tree_block_size(), __csum_tree_block_size()
and verify_tree_block_csum_silent().
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Add checksum type to the definition structure for a new filesystem, this
will be used in following patches.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Pass in a btrfs_mkfs_config to write_temp_extent_buffer(), this is
needed so we can grab the checksum type for checksum buffer verification
in later patches.
Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
Signed-off-by: David Sterba <dsterba@suse.com>
Build several standalone tools into one binary and switch the function
by name (symlink or hardlink).
* btrfs
* mkfs.btrfs
* btrfs-image
* btrfs-convert
* btrfstune
The static target is also supported. The name of resulting boxed
binaries is btrfs.box and btrfs.box.static . All the binaries can be
built at the same time without prior configuration.
text data bss dec hex filename
822454 27000 19724 869178 d433a btrfs
927314 28816 20812 976942 ee82e btrfs.box
2067745 58004 44736 2170485 211e75 btrfs.static
2627198 61724 83800 2772722 2a4ef2 btrfs.box.static
File sizes:
857496 btrfs
968536 btrfs.box
2141400 btrfs.static
2704472 btrfs.box.static
Standalone utilities:
512504 btrfs-convert
495960 btrfs-image
471224 btrfstune
491864 mkfs.btrfs
1747720 btrfs-convert.static
1411416 btrfs-image.static
1304256 btrfstune.static
1361696 mkfs.btrfs.static
So the shared 900K binary saves ~2M, or ~5.7M for static build.
Signed-off-by: David Sterba <dsterba@suse.cz>
Create directory for all sources that can be used by anything that's not
rellated to a relevant kernel part, all common functions, helpers,
utilities that do not fit any other specific category.
The traditional location would be probably lib/ with all things that are
statically linked to the main binaries, but we have libbtrfs and
libbtrfsutil so this would be confusing.
Signed-off-by: David Sterba <dsterba@suse.com>
Although moderm hardware is fast enough and crc32c calculation is not a
hotspot, doing such optimization won't hurt anyway.
Issue: #175
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In convert we use trans->block_reserved >= 4096 as a threshold to commit
transaction, where block_reserved is the number of new tree blocks
allocated inside a transaction.
The problem is, we still have a hidden bug in delayed ref implementation
in btrfs-progs, when we have a large enough transaction, delayed ref may
failed to find certain tree blocks in extent tree and cause transaction
abort.
This fix will workaround it by committing transaction at a much lower
threshold.
The old 4096 means 4096 new tree blocks, when using default (16K)
nodesize, it's 64M, which can contain over 12k inlined data extent or
csum for around 60G, or over 800K file extents.
The new threshold will limit the size of new tree blocks to 2M, aligning
with the chunk preallocator threshold, and reducing the possibility to
hit that delayed ref bug.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add support for a new metadata_uuid field. This is just a preparatory
commit which switches all users of the fsid field for metdata comparison
purposes to utilize the new field. This more or less mirrors the
kernel patch, additionally:
* Update 'btrfs inspect-internal dump-super' to account for the new
field. This involes introducing the 'metadata_uuid' line to the
output and updating the logic for comparing the fs uuid to the
dev_item uuid.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Similar to the changes where strerror(errno) was converted, continue
with the remaining cases where the argument was stored in another
variable.
The savings in object size are about 4500 bytes:
$ size btrfs.old btrfs.new
text data bss dec hex filename
805055 24248 19748 849051 cf49b btrfs.old
804527 24248 19748 848523 cf28b btrfs.new
Signed-off-by: David Sterba <dsterba@suse.com>
When convert failed, the error messsage would look like:
create btrfs filesystem:
blocksize: 4096
nodesize: 16384
features: extref, skinny-metadata (default)
creating ext2 image file
ERROR: failed to create ext2_saved/image: -1
WARNING: an error occurred during conversion, filesystem is partially
created but not finalized and not mountable
We can only know something wrong happened during "ext2_saved/image" file
creation, but unable to know what exactly went wrong.
This patch will add the following error messages for create_image() and
its callee:
1) Sanity test error
2) Csum calculation error
3) Free ino number allocation error
4) Inode creation error
5) Inode mode change error
6) Inode link error
With all these error messages, we should be pretty easy to locate the
error without extra debugging.
Reported-by: Serhat Sevki Dincer <jfcgauss@gmail.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When pread64() returns value smaller than expected, it normally means
EIO, so just return -EIO to replace the intermediate number. So when IO
fails, we should be able to get more meaningful error number of than
EPERM.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We reuse the task_position enum and task_ctx struct of the original progress
indicator, adding more values and fields for our needs.
Then add hooks in all steps of the check to properly record progress.
Here's how the output looks like on a 22 Tb 5-disk RAID1 FS:
Opening filesystem to check...
Checking filesystem on /dev/mapper/luks-ST10000VN0004-XXXXXXXX
UUID: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
[1/7] checking extents (0:20:21 elapsed, 950958 items checked)
[2/7] checking root items (0:01:29 elapsed, 15121 items checked)
[3/7] checking free space cache (0:00:11 elapsed, 4928 items checked)
[4/7] checking fs roots (0:51:31 elapsed, 600892 items checked)
[5/7] checking csums (0:14:35 elapsed, 754522 items checked)
[6/7] checking root refs (0:00:00 elapsed, 232 items checked)
[7/7] checking quota groups skipped (not enabled on this FS)
found 5286458060800 bytes used, no error found
Signed-off-by: Stéphane Lesimple <stephane_btrfs@lesimple.fr>
Signed-off-by: David Sterba <dsterba@suse.com>
It's always set to extent_root and the function already takes a
transaction handle where fs_info could be referenced and in turn
the extent_tree.
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Reviewed-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 324d4c1857 (btrfs-progs: convert: Add larger device support)
introduced new dependencies on the 64-bit API provided by e2fsprogs.
That API was introduced in v1.42 (along with bigalloc).
This patch maps the following to their equivalents in e2fsprogs < 1.42.
- ext2fs_get_block_bitmap_range2
- ext2fs_inode_data_blocks2
- ext2fs_read_ext_attr2
Since we need to detect and define EXT2_FLAG_64BITS for compatibilty
anyway, it makes sense to use that to detect the older e2fsprogs instead
of defining a new flag ourselves.
Signed-off-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The old flag OPEN_CTREE_FS_PARTIAL is in fact quite easy to be confused
with OPEN_CTREE_PARTIAL, which allow btrfs-progs to open damaged
filesystem (like corrupted extent/csum tree).
However OPEN_CTREE_FS_PARTIAL, unlike its name, is just allowing
btrfs-progs to open fs with temporary superblocks (which only has 6
basic trees on SINGLE meta/sys chunks).
The usage of FS_PARTIAL is really confusing here.
So rename OPEN_CTREE_FS_PARTIAL to OPEN_CTREE_TEMPORARY_SUPER, and add
extra comment for its behavior.
Also rename BTRFS_MAGIC_PARTIAL to BTRFS_MAGIC_TEMPORARY to keep the
naming consistent.
And with above comment, the usage of FS_PARTIAL in dump-tree is
obviously incorrect, fix it.
Fixes: 8698a2b9ba ("btrfs-progs: Allow inspect dump-tree to show specified tree block even some tree roots are corrupted")
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Instead of using the internal struct extent_io_tree, use struct fs_info.
This does not only unify the interface between kernel and btrfs-progs,
but also makes later btrfs_print_tree() use fewer parameters.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Lu Fengqi <lufq.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The kernel code no longer has BTRFS_CRC32_SIZE and only uses
btrfs_csum_sizes[]. So, update the progs code as well.
Suggested-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Tomohiro Misono <misono.tomohiro@jp.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[Bug]
On btrfs converted from ext*, one user reported the following kernel
warning:
------------[ cut here ]------------
BTRFS: Transaction aborted (error -95)
WARNING: CPU: 0 PID: 324 at fs/btrfs/inode.c:3042 btrfs_finish_ordered_io+0x7ab/0x850 [btrfs]
Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
RIP: 0010:btrfs_finish_ordered_io+0x7ab/0x850 [btrfs]
...
Call Trace:
normal_work_helper+0x39/0x370 [btrfs]
process_one_work+0x1ce/0x410
worker_thread+0x2b/0x3d0
? process_one_work+0x410/0x410
kthread+0x113/0x130
? kthread_create_on_node+0x70/0x70
? do_syscall_64+0x74/0x190
? SyS_exit_group+0x10/0x10
ret_from_fork+0x35/0x40
---[ end trace c8ed62ff6a525901 ]---
BTRFS: error (device dm-2) in
btrfs_finish_ordered_io:3042: errno=-95 unknown
BTRFS info (device dm-2): forced readonly
BTRFS error (device dm-2): pending csums is 6447104
[Cause]
The call trace and the unique return value points to
__btrfs_drop_extents(), when we tries to drop pages of an inline extent,
we will trigger such -EOPNOTSUPP.
However kernel has limitation on the size of inline file extent
(sector size for ram size and sector size - 1 for on-disk size),
btrfs-convert doesn't have the same limitation, resulting much larger
file extent.
The lack of correct inline extent size check dates back to 2008 when
btrfs-convert is added into btrfs-progs.
[Fix]
Fix the inline extent creation condition, not only using
BTRFS_MAX_INLINE_DATA_SIZE(), which is only the maximum size of inline
data according to nodesize, but also limit it against sector size.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
In latest e2fsprogs (1.44.0) definition of ext2_ext_attr_entry has
removed member e_value_block, as currently ext* doesn't support it set
anyway.
So remove such check so that we can pass compile.
Issue: #110
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=199071
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Exposed by convert-test with D=asan.
Unlike btrfs, ext2fs_close() still leaves its ext2_filsys parameter
filled with allocated pointers.
It needs ext2fs_free() to free those pointers.
Issue: #92
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Do a cleanup. Also make it consistent with kernel. Use fs_info instead
of root for BTRFS_MAX_INLINE_DATA_SIZE, since maybe in some situation we
do not know root, but just know fs_info.
Change macro to inline function to be consistent with kernel. And
change the function body to match kernel.
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Do a cleanup. Also make it consistent with kernel. Use fs_info instead
of root for BTRFS_LEAF_DATA_SIZE, since maybe in some situation we do
not know root, but just know fs_info.
Signed-off-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
@chunk_objectid of btrfs_make_block_group() function is always fixed to
BTRFS_FIRST_FREE_OBJECTID, so there is no need to pass it as parameter
explicitly.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
As btrfs is specific to Linux, %m can be used instead of strerror(errno)
in format strings. This has some size reduction benefits for embedded
systems.
glibc, musl, and uclibc-ng all support %m as a modifier to printf.
A quick glance at the BIONIC libc source indicates that it has
support for %m as well. BSDs and Windows do not but I do believe
them to be beyond the scope of btrfs-progs.
Compiled sizes on Ubuntu 16.04:
Before:
3916512 btrfs
233688 libbtrfs.so.0.1
4899 bcp
2367672 btrfs-convert
2208488 btrfs-corrupt-block
13302 btrfs-debugfs
2152160 btrfs-debug-tree
2136024 btrfs-find-root
2287592 btrfs-image
2144600 btrfs-map-logical
2130760 btrfs-select-super
2152608 btrfstune
2131760 btrfs-zero-log
2277752 mkfs.btrfs
9166 show-blocks
After:
3908744 btrfs
233256 libbtrfs.so.0.1
4899 bcp
2366560 btrfs-convert
2207432 btrfs-corrupt-block
13302 btrfs-debugfs
2151104 btrfs-debug-tree
2134968 btrfs-find-root
2281864 btrfs-image
2143536 btrfs-map-logical
2129704 btrfs-select-super
2151552 btrfstune
2130696 btrfs-zero-log
2276272 mkfs.btrfs
9166 show-blocks
Total savings: 23928 (24 kilo)bytes
Signed-off-by: Rosen Penev <rosenp@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit 1170ac3079 ("btrfs-progs: convert: Introduce function to check if
convert image is able to be rolled back") reworked rollback check
condition, by checking 1:1 mapping of each file extent.
The idea itself has nothing wrong, but error handler is not implemented
correctly, which over writes the return value and always try to rollback
the fs even it fails to pass the check.
Fix it by correctly return the error before rollback the fs.
Fixes: 1170ac3079 ("btrfs-progs: convert: Introduce function to check if convert image is able to be rolled back")
Reported-by: Jeff Mahoney <jeffm@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Build with musl libc needs the sys/types.h header for the dev_t type,
since this header is not included indirectly. This fixes the following
build failure:
In file included from convert/source-fs.c:23:0:
./convert/source-fs.h:112:1: error: unknown type name ‘dev_t’
dev_t decode_dev(u32 dev);
^~~~~
convert/source-fs.c:31:1: error: unknown type name ‘dev_t’
dev_t decode_dev(u32 dev)
^~~~~
Signed-off-by: Baruch Siach <baruch@tkos.co.il>
Signed-off-by: David Sterba <dsterba@suse.com>
For rollback, we only needs to open the fs to check if it meets the
condition to rollback. And this RW read makes us failed to rollback
btrfs with v2 space cache.
In fact, we don't even start a transaction during rollback.
So open the fs RO for rollback, to avoid v2 space cache problem.
Reported-by: Gu Jinxiang <gujx@cn.fujitsu.com>
Reviewed-by: Gu JinXiang <gujx@cn.fujitsu.com>
Tested-by: Gu JinXiang <gujx@cn.fujitsu.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
For code reuse, btrfs_insert_dir_item() now calls
inserts_with_overflow() even if the dir_item existed.
Add a parameter @ignore_existed to btrfs_add_link().
If @ignore_existed is not zero, btrfs_add_link() continues to do link.
Signed-off-by: Su Yue <suy.fnst@cn.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
A convert parameter is added as a flag to indicate if btrfs_mksubvol()
is used for btrfs-convert. The change cascades down to the callchain.
Signed-off-by: Yingyi Luo <yingyil@google.com>
Signed-off-by: David Sterba <dsterba@suse.com>
link_subvol() is moved to inode.c and renamed as btrfs_mksubvol().
The change cascades down to the callchain.
Signed-off-by: Yingyi Luo <yingyil@google.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Currently transaction bugs out insided btrfs_start_transaction in case
of error, we want to lift the error handling to the callers. This patch
adds the BUG_ON anywhere it's been missing so far. This is not the best
way of course. Transforming BUG_ON to a proper error handling highly
depends on the caller and should be dealt with case by case.
Signed-off-by: David Sterba <dsterba@suse.com>