These are the printk helpers from the kernel. There were a few
modifications, the hi-lights are
- We do not have fs_info::fs_state, so that needed to be removed.
- We do not have discard.h sync'ed yet, so that dependency was dropped.
- Anything related to struct super_block was commented out.
- The transaction abort had to be modified to fit with the current
btrfs-progs code.
- Added a btrfs_no_printk() helper to common/messages.* so that the
print statements still worked.
- The 32bit limit checkers are not needed so are behind __KERNEL__
Additionally there were kerncompat.h changes that needed to be made to
handle the dependencies properly. Those are easier to spot.
Any function that needed to be modified has a MODIFIED tag in the
comment section with a list of things that were changed.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We want to keep this file locally as we want to be uptodate with
upstream, so we can build btrfs-progs regardless of which kernel is
currently installed. Sync this with the upstream version and put it in
kernel-shared/uapi to maintain some semblance of where this file comes
from.
There are some changes that need to be synced back to kernel. A local
definition of static_assert is used to avoid compilation problems on gcc
(< 9) due to mandatory 2nd parameter.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
These helpers all do variations on the same thing, so add a helper to
just do the printf part, and a macro to handle the special prefix and
postfix, and then make the helpers just use the macro and new helper.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We have different helpers for warning_on and error_on(), which take the
condition and then do the printf. However we can just check the
condition in the macro and call the normal warning or error helper, so
clean this usage up and delete the unneeded message helpers.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Everybody who calls print_trace wraps it around this check, move the
check instead to print_trace and remove the check from all the callers.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
While syncing messages.[ch] I had to back out the ASSERT() code in
kerncompat.h, which means we now rely on the kernel code for ASSERT().
In order to maintain some semblance of separation introduce UASSERT()
and use that in all the purely userspace code.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The block-group-tree used to be under experimental flag in -R but now
that we've deprecated -R it does not make sense to leave
block-group-tree there for compatibility, this has never been exposed to
users.
Signed-off-by: David Sterba <dsterba@suse.com>
The feedback from the community on block group tree is very positive,
the only complain is, end users need to recompile btrfs-progs with
experimental features to enjoy the new feature.
So let's move it out of experimental features and let more people enjoy
faster mount speed.
Also change the option of btrfstune, from `-b` to
`--enable-block-group-tree` to avoid short option.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The option -R|--runtime-features was introduced to support features that
don't result in a full incompat flag change, thus things like
free-space-tree and quota features are put here.
But to end users, such separation of features is not helpful and can be
sometimes confusing.
Thus we're already migrating those runtime features into -O|--features
option under experimental builds.
I believe this is the proper time to move those runtime features into
-O|--features option, and mark the -R|--runtime-features option
deprecated.
For now we still keep the old option as for compatibility purposes.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There's a report that a static build fails when there's a static version
of libudev:
/usr/lib/gcc/x86_64-pc-linux-gnu/12.2.0/../../../../x86_64-pc-linux-gnu/bin/ld: /usr/lib/libudev.a(path-util.o): in function `path_is_mount_point':
path-util.c:(.text+0xbc0): multiple definition of `path_is_mount_point'; common/path-utils.o:path-utils.c:(.text+0x290): first defined here
There's a helper path_is_mount_point in libudev too but not exported so
dynamic library is fine, unlike static build. The static build of
libudev is not common but we can support that with a simple rename.
Issue: #611
Signed-off-by: David Sterba <dsterba@suse.com>
On a 32bit host the split qgroupid is wrong due to the way the numbers
are passed to the formatter as variable length arguments. The level is
u16, promoted to int and then parsed as u64. This means that the values
are shifted and some stack data are printed instead.
Example error messages from yast2-bootloader:
SystemCmd.cc(addLine):569 Adding Line 7 " "qgroupid": "21474836480/23885859321282560","
The value 21474836480 = 0x5000000 is 0x5 shifted by 32 bits,
23885859321282560 is 0x54dc1000000000 and shifting by 32 does not
lead to a valid value which should be 0 in this case.
Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1209136
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Currently cli/009 test case failed with different exit number:
====== RUN CHECK /home/adam/btrfs-progs/btrfstune --help
usage: btrfstune [options] device
[...]
failed: /home/adam/btrfs-progs/btrfstune --help
test failed for case 009-btrfstune
[CAUSE]
In tune/main.c, we have the following call on usage():
static void print_usage(int ret)
{
usage(&tune_cmd);
exit(ret);
}
However usage() itself would always call exit(1):
void usage(const struct cmd_struct *cmd)
{
usage_command_usagestr(cmd->usagestr, NULL, 0, true, true);
exit(1);
}
This makes prevents any caller of usage() to modify its exit number.
[FIX]
Add a new argument @error for print_usage(), so we can properly return 0
for -h/--help usage.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add CPU_FLAG_NONE that's first in the list and represents a CPU without
any acceleration features. Can be used for reference/fallback
implementation.
Signed-off-by: David Sterba <dsterba@suse.com>
Add support for run-time detection of CPU features on x86_64 to allow
selection of accelerated implementations of hash algorithms.
When possible use the compiler builtin (works on gcc and clang).
The SHA extensions can't be detected by __builtin_cpu_supports and the
__cpuid/__cpuidex macros are not consistently provided in all supported
gcc and clang versions. Copy the __cpuidex and call it manually for the
SHA extensions. Complete list https://en.wikipedia.org/wiki/CPUID .
Signed-off-by: David Sterba <dsterba@suse.com>
[FALSE ALERT]
Unlike gcc, clang doesn't really understand the comments, thus it's
reportings tons of fall through related errors:
cmds/reflink.c:124:3: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
case 'r':
^
cmds/reflink.c:124:3: note: insert '__attribute__((fallthrough));' to silence this warning
case 'r':
^
__attribute__((fallthrough));
cmds/reflink.c:124:3: note: insert 'break;' to avoid fall-through
case 'r':
^
break;
[CAUSE]
Although gcc is fine with /* fallthrough */ comments, clang is not.
[FIX]
So just introduce a fallthrough macro to handle the situation properly,
and use that macro instead.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Use the configured widths and print padding directly instead of the
embedded printf format and fixed width strings.
Signed-off-by: David Sterba <dsterba@suse.com>
To make option formatting a bit easier so the spacing is unified add
macros and formatting helpers.
Usage in the help text:
OPTLINE("-o value", "description")
Internally the option and description are delimiters by chars that are
not part of normal text, the formatter separates that and uses fixed
with for output. The description text can be of any length, multi-line
text should still end up as one token (i.e. newline without ',' between).
Signed-off-by: David Sterba <dsterba@suse.com>
Add formatter type 'str' where the string must be escaped, e.g. paths or
internal data. Otherwise plain %s can be printed if it's known that
there are no special characters.
Signed-off-by: David Sterba <dsterba@suse.com>
Track how many rows belong to a header (names and separators) so they
can be printed or cleared separately. The rest is body that could be
cleared and filled repeatedly. The ranged API allows to print any number
of rows if the table is filled partially.
Signed-off-by: David Sterba <dsterba@suse.com>
Cleanups are for integer types, prototypes and comments.
New functionality: spacing can be set after table allocation as
->spacing, now able to print 1 or 2 spaces between columns.
Signed-off-by: David Sterba <dsterba@suse.com>
This patch copies in compression.h from the kernel. This is relatively
straightforward, we just have to drop the compression types definition
from ctree.h, and update the image to use BTRFS_NR_COMPRESS_TYPES
instead of BTRFS_COMPRESS_LAST, and add a few things to kerncompat.h to
make everything build smoothly.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This doesn't really belong with the ioctl definitions, and when we sync
the ioctl definitions with the kernel this helper will go away, so
adjust this now.
The ioctl.h is a public API header but the helper is not used in any 3rd
party tool so we can safely move it.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We're trying to get a U32 for the attributes, but the kernel sends a U64
(which is correct as we store attributes in a u64 flags field of the
inode). This makes anyone trying to receive a v2 send stream to fail with:
ERROR: invalid size for attribute, expected = 4, got = 8
We actually recently got such a report of someone using send stream v2 and
getting such failure. See the Link tag below.
Link: https://lore.kernel.org/linux-btrfs/6cb11fa5-c60d-e65b-0295-301a694e66ad@inbox.ru/
Fixes: 7a6fb356dc ("btrfs-progs: receive: process setflags ioctl commands")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We should warn that there's an experimental feature used. Add a helper
with optional description. Should be used only if such feature is used
and not always.
Signed-off-by: David Sterba <dsterba@suse.com>
The -O and -R help texts say that compatible version for
block-group-tree is 6.0 but it's in fact 6.1.
Issue: #523
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Even with chunk_objectid bug fixed, mkfs.btrfs can still caused stack
overflow when enabling extent-tree-v2 feature (need experimental
features enabled):
# ./mkfs.btrfs -f -O extent-tree-v2 ~/test.img
btrfs-progs v5.19.1
See http://btrfs.wiki.kernel.org for more information.
ERROR: superblock magic doesn't match
NOTE: several default settings have changed in version 5.15, please make sure
this does not affect your deployments:
- DUP for metadata (-m dup)
- enabled no-holes (-O no-holes)
- enabled free-space-tree (-R free-space-tree)
Label: (null)
UUID: 205c61e7-f58e-4e8f-9dc2-38724f5c554b
Node size: 16384
Sector size: 4096
Filesystem size: 512.00MiB
Block group profiles:
Data: single 8.00MiB
Metadata: DUP 32.00MiB
System: DUP 8.00MiB
SSD detected: no
Zoned device: no
=================================================================
[... Skip full ASAN output ...]
==65655==ABORTING
[CAUSE]
For experimental build, we have unified feature output, but the old
buffer size is only 64 bytes, which is too small to cover the new full
feature string:
extref, skinny-metadata, no-holes, free-space-tree, block-group-tree, extent-tree-v2
Above feature string is already 84 bytes, over the 64 on-stack memory
size.
This can also be proved by the ASAN output:
==65655==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffc4e03b1d0 at pc 0x7ff0fc05fafe bp 0x7ffc4e03ac60 sp 0x7ffc4e03a408
WRITE of size 17 at 0x7ffc4e03b1d0 thread T0
#0 0x7ff0fc05fafd in __interceptor_strcat /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:377
#1 0x55cdb7b06ca5 in parse_features_to_string common/fsfeatures.c:316
#2 0x55cdb7b06ce1 in btrfs_parse_fs_features_to_string common/fsfeatures.c:324
#3 0x55cdb7a37226 in main mkfs/main.c:1783
#4 0x7ff0fbe3c28f (/usr/lib/libc.so.6+0x2328f)
#5 0x7ff0fbe3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
#6 0x55cdb7a2cb34 in _start ../sysdeps/x86_64/start.S:115
[FIX]
Introduce a new macro, BTRFS_FEATURE_STRING_BUF_SIZE, along with a new
sanity check helper, btrfs_assert_feature_buf_size().
The problem is I can not find a build time method to verify
BTRFS_FEATURE_STRING_BUF_SIZE is large enough to contain all feature
names, thus have to go the runtime function to do the BUG_ON() to verify
the macro size.
Now the minimal buffer size for experimental build is 138 bytes, just
bump it to 160 for future expansion.
And if further features go beyond that number, mkfs.btrfs/btrfs-convert
will immediately crash at that BUG_ON(), so we can definitely detect it.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Tested-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Commit "btrfs-progs: prepare merging compat feature lists" tries to
merged "-O" and "-R" options, as they don't correctly represents
btrfs features.
But that commit caused the following bug during mkfs for experimental
build:
$ mkfs.btrfs -f -O block-group-tree /dev/nvme0n1
btrfs-progs v5.19.1
See http://btrfs.wiki.kernel.org for more information.
ERROR: superblock magic doesn't match
ERROR: illegal nodesize 16384 (not equal to 4096 for mixed block group)
[CAUSE]
Currently btrfs_parse_fs_features() will return a u64, and reuse the
same u64 for both incompat and compat RO flags for experimental branch.
This can easily leads to conflicts, as
BTRFS_FEATURE_INCOMPAT_MIXED_BLOCK_GROUP and
BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE both share the same bit
(1 << 2).
Thus for above case, mkfs.btrfs believe it has set MIXED_BLOCK_GROUP
feature, but what we really want is BLOCK_GROUP_TREE.
[FIX]
Instead of incorrectly re-using the same bits in btrfs_feature, split
the old flags into 3 flags:
- incompat_flag
- compat_ro_flag
- runtime_flag
The first two flags are easy to understand, the corresponding flag of
each feature.
The last runtime_flag is to compensate features which doesn't have any
on-disk flag set, like QUOTA and LIST_ALL.
And since we're no longer using a single u64 as features, we have to
introduce a new structure, btrfs_mkfs_features, to contain above 3
flags.
This also mean, things like default mkfs features must be converted to
use the new structure, thus those old macros are all converted to
const static structures:
- BTRFS_MKFS_DEFAULT_FEATURES + BTRFS_MKFS_DEFAULT_RUNTIME_FEATURES
-> btrfs_mkfs_default_features
- BTRFS_CONVERT_ALLOWED_FEATURES -> btrfs_convert_allowed_features
And since we're using a structure, it's not longer as easy to implement
a disallowed mask.
Thus functions with @mask_disallowed are all changed to using
an @allowed structure pointer (which can be NULL).
Finally if we have experimental features enabled, all features can be
specified by -O options, and we can output a unified feature list,
instead of the old split ones.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add similar helper to pr_verbose that prints on stderr, for commands
that need to print to stderr based on the set verbosity level.
Signed-off-by: David Sterba <dsterba@suse.com>
Factor out the level check so we can add helper for stderr as some
commands don't/can't print to stdout.
Signed-off-by: David Sterba <dsterba@suse.com>
There are messages that are supposed to be printed by default and now
use the LOG_ALWAYS level, but that's a negative level and was meant as a
workaround for commands that must really print the message.
The default log level should be 1 and can be adjusted by the -q or -v
global commands.
Signed-off-by: David Sterba <dsterba@suse.com>
There's a group of helpers to read device size, the btrfs_device_size
should be one of them. Rename it and so minor cleanup.
Signed-off-by: David Sterba <dsterba@suse.com>
Switch the remaining use of assert() as it lacks the verbose assert that
we have for ASSERT (but otherwise is equivalent).
Signed-off-by: David Sterba <dsterba@suse.com>
There are several generic errors that repeat the same message. Define a
template for such messages, with optional text.
Signed-off-by: David Sterba <dsterba@suse.com>
Add more granularity to verbose levels and describe when they should be
used. Lots of pr_verbose still hardcode the value or compare level to
bconf.verbose but the individual messages have to be revisited
separately.
Signed-off-by: David Sterba <dsterba@suse.com>
Rename MUST_LOG Use a prefix LOG_ so we can add more levels, use it
where it was hardcoded as argument to pr_verbose.
Signed-off-by: David Sterba <dsterba@suse.com>
In a few occasions there's an internal report, make a common helper so
the prefix message is not necessary and the stack trace can be printed
if enabled.
Signed-off-by: David Sterba <dsterba@suse.com>
Process an enable_verity cmd by running the enable verity ioctl on the
file. Since enabling verity denies write access to the file, it is
important that we don't have any open write file descriptors.
This also revs the send stream format to version 3 with no format
changes besides the new commands and attributes. This version is not
finalized and commands may change, also this needs to be synchronized
with any kernel changes.
Note: the build is conditional on the header linux/fsverity.h
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
The block group tree doesn't yet have full bi-directional conversion
support from btrfstune, and it seems we may want one or two release
cycles to rule out some extra bugs before really releasing the progs
support.
This patch will hide the block group tree feature behind experimental
flag for the following tools:
- btrfstune
"-b" option to convert to bg tree.
- mkfs.btrfs
hide "block-group-tree" feature from both -O (the new default position
for all features) and -R (the old, soon to be deprecated one).
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Lots of code still uses fprintf(stderr, "...") that should be the
error() helper. The kernel-shared code is left out of the conversion for
now.
Signed-off-by: David Sterba <dsterba@suse.com>
The tool IWYU (include what you use) suggests to remove and add some
includes. This is only partial to avoid accidental build breakage, the
includes are entangled and will have to be cleaned in the future again.
Signed-off-by: David Sterba <dsterba@suse.com>
The features are split to -O and -R but it does not make much sense from
user POV, there are different levels of compatibility but it does not
need to be selected that way. Merge the tables into one but hide it
behind experimental build until the conversion is complete.
Signed-off-by: David Sterba <dsterba@suse.com>
All files include the <btrfsutil.h> which could be confused with the
system-wide installation. Drop the -I path from build and use full path
for any libbtrfsutil headers.
Signed-off-by: David Sterba <dsterba@suse.com>
The preferred order:
- system headers
- standard headers
- libraries
- kernel library
- kernel shared
- common headers
- other tools
- own headers
Signed-off-by: David Sterba <dsterba@suse.com>
The function csum_tree_block() is not really utilized by anyone, all
current callers just use csum_tree_block_size().
Furthermore there is a stale definition in common/utils.h which is using
the old "struct btrfs_root" as the first argument, while we have already
migrated to "struct btrfs_fs_info".
So just unexport csum_tree_block() and remove the stale definition.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The raw number of the features in the list of 'mkfs.btrfs -O list-all'
and for -R is not that useful, it's an implementation detail or can be
put to documentation.
Now looks like:
Filesystem features available:
mixed-bg - mixed data and metadata block groups (compat=2.6.37, safe=2.6.37)
extref - increased hardlink limit per file to 65536 (compat=3.7, safe=3.12, default=3.12)
raid56 - raid56 extended format (compat=3.9)
...
Signed-off-by: David Sterba <dsterba@suse.com>
Block group tree feature is completely a standalone feature, and it has
been over 5 years before the initial introduction to solve the long
mount time.
I don't really want to waste another 5 years waiting for a feature which
may or may not work, but definitely not properly reviewed for its
preparation patches.
So this patch will separate the block group tree feature into a
standalone compat RO feature.
There is a catch, in mkfs create_block_group_tree(), current
tree-checker only accepts block group item with valid chunk_objectid,
but the existing code from extent-tree-v2 didn't properly initialize it.
This patch will also fix above mentioned problem so kernel can mount it
correctly.
Now mkfs/fsck should be able to handle the fs with block group tree.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This file includes linux/fs.h which includes linux/mount.h and with
glibc 2.36 linux/mount.h and glibc mount.h are not compatible [1]
therefore try to avoid including both headers
[1] https://sourceware.org/glibc/wiki/Release/2.36
Signed-off-by: Khem Raj <raj.khem@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The initial proposal for file attributes was built on simply doing
SETFLAGS but this builds on an old and non-extensible interface that has
no direct mapping for all inode flags. There's a unified interface
fileattr that covers file attributes and xflags, it should be possible
to add new bits.
On the protocol level the value is copied as-is in the original inode
but this does not provide enough information how to apply the bits on
the receiving side. Eg. IMMUTABLE flag prevents any changes to the file
and has to be handled manually.
The receiving side does not apply the bits yet, only parses it from the
stream.
Signed-off-by: David Sterba <dsterba@suse.com>
Add constant for initial value to avoid unexpected clashes with user
defined getopt values and shift the common size getopt values.
Signed-off-by: David Sterba <dsterba@suse.com>
In send stream v2, send can emit a command for setting inode flags via
the setflags ioctl. Pass the flags attribute through to the ioctl call
in receive.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
Send stream v2 can emit fallocate commands, so receive must support them
as well. The implementation simply passes along the arguments to the
syscall. Note that mode is encoded as a u32 in send stream but fallocate
takes an int, so there is a unsigned->signed conversion there.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
Add a new btrfs_send_op and support for both dumping and proper receive
processing which does actual encoded writes.
Encoded writes are only allowed on a file descriptor opened with an
extra flag that allows encoded writes, so we also add support for this
flag when opening or reusing a file for writing.
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
The new format privileges the BTRFS_SEND_A_DATA attribute by
guaranteeing it will always be the last attribute in any command that
needs it, and by implicitly encoding the data length as the difference
between the total command length in the command header and the sizes of
the rest of the attributes (and of course the tlv_type identifying the
DATA attribute). To parse the new stream, we must read the tlv_type and
if it is not DATA, we proceed normally, but if it is DATA, we don't
parse a tlv_len but simply compute the length.
In addition, we add some bounds checking when parsing each chunk of
data, as well as for the tlv_len itself.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
In send stream v2, write commands can now be an arbitrary size. For that
reason, we can no longer allocate a fixed array in sctx for read_cmd.
Instead, read_cmd dynamically allocates sctx->read_buf. To avoid
needless reallocations, we reuse read_buf between read_cmd calls by also
keeping track of the size of the allocated buffer in sctx->read_buf_sz.
We do the first allocation of the old default size at the start of
processing the stream, and we only reallocate if we encounter a command
that needs a larger buffer.
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
An encoded extent can be up to 128K in length, which exceeds the largest
value expressible by the current send stream format's 16 bit tlv_len
field. Since encoded writes cannot be split into multiple writes by
btrfs send, the send stream format must change to accommodate encoded
writes.
Supporting this changed format requires retooling how we store the
commands we have processed. We currently store pointers to the struct
btrfs_tlv_headers in the command buffer. This is not sufficient to
represent the new BTRFS_SEND_A_DATA format. Instead, parse the attribute
headers and store them in a new struct btrfs_send_attribute which has a
32bit length field. This is transparent to users of the various TLV_GET
macros.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
Headers that are only exported and not used for build do not need the
BTRFS_FLAT_INCLUDES switch (between local and installed headers). Now
that there are local copies of the shared headers drop the respective
part from local headers.
Signed-off-by: David Sterba <dsterba@suse.com>
Kernel commit efc0e69c2fea ("btrfs: introduce exclusive operation
BALANCE_PAUSED state") allows to start a device add when there's a
paused balance, eg. to let the balance finish when there's not enough
chunk space. Add the support for that, though this needs an updated
kernel to export the 'balance paused' in sysfs.
Signed-off-by: David Sterba <dsterba@suse.com>
We need to make sure we process the block group root, and mark its
blocks as used for the free space tree checking.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The fuzz-test/003 was infinite looping when I reworked the code to
re-calculate the used bytes for the superblock. This is because fsck
wasn't properly fixing the bad extent before my change, it just happened
to error out nicely, whereas my change made it so we go the wrong bytes
used count and just infinite looped trying to fix the problem.
Fix this by sanity checking the extent when we try to re-calculate the
bytes_used. This makes us no longer infinite loop so we can get through
the fuzz tests.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We want to enable developers to test the extent tree v2 features as they
are added, add the ability to mkfs an extent tree v2 fs if we have
experimental enabled.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We could have multiple extent roots, so add a helper to mark all the
used space in the FS based on any extent roots we find, and then use
this extent io tree to fixup the block group accounting.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
When we switch to multiple global trees we'll need to access the
appropriate extent root depending on the block group or possibly root.
To handle this, use a helper in most places and then the actual root in
places where it is required. We will whittle down the direct accessors
with future patches, but this does the bulk of the preparatory work.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
btrfs_mark_used_tree_blocks skips the reloc roots for some reason, which
causes problems because these blocks are in use, and we use this helper
to determine if the block accounting is correct with extent tree v2.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This is going to be used for the extent tree v2 stuff more commonly, so
move it out so that it is accessible from everywhere that we need it.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We have this helper sitting in extent-tree.c, but it's a repair
function. I'm going to need to make changes to this for extent-tree-v2
and would rather this live outside of the code we need to share with the
kernel.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
mkfs.btrfs v5.15 outputs a message even if the disk is a HDD without
TRIM/DISCARD support:
Performing full device TRIM /dev/sdc2 (326.03GiB) ...
[CAUSE]
mkfs.btrfs check TRIM/DISCARD support through the content of
queue/discard_granularity, but compare it against a wrong value.
When HDD without TRIM/DISCARD support, the content of
queue/discard_granularity is '0' '\n' '\0', rather than '0' '\0'.
[FIX]
- compare the value based on atoi() to provide more robustness
- delete unnecessary '\n' in pr_verbose()
Fixes: c50c448518 ("btrfs-progs: do sysfs detection of device discard capability")
Signed-off-by: Wang Yugui <wangyugui@e16-tech.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There are a lot of call sites where we use the following code snippet:
u8 super_block_data[BTRFS_SUPER_INFO_SIZE];
struct btrfs_super_block *sb;
u64 ret;
sb = (struct btrfs_super_block *)super_block_data;
The reason for this is, structure btrfs_super_block was smaller than
BTRFS_SUPER_INFO_SIZE.
Thus for anything with csum involved, we have to use a proper 4K buffer.
Since the recent unification of sizeof(struct btrfs_super_block), we no
longer need such workaround, and can use struct btrfs_super_block
directly to do any operation.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Since commit dad03fac3b ("btrfs-progs: switch btrfs_group_profile_str
to use raid table"), fstests/btrfs/023 and btrfs/151 will always fail.
The failure of btrfs/151 explains the reason pretty well:
btrfs/151 1s ... - output mismatch
--- tests/btrfs/151.out 2019-10-22 15:18:14.068965341 +0800
+++ ~/xfstests-dev/results//btrfs/151.out.bad 2021-11-02 17:13:43.879999994 +0800
@@ -1,2 +1,2 @@
QA output created by 151
-Data, RAID1
+Data, raid1
...
(Run 'diff -u ~/xfstests-dev/tests/btrfs/151.out ~/xfstests-dev/results//btrfs/151.out.bad' to see the entire diff)
[CAUSE]
Commit dad03fac3b ("btrfs-progs: switch btrfs_group_profile_str to use
raid table") will use btrfs_raid_array[index].raid_name, which is all
lower case.
[FIX]
There is no need to bring such output format change.
So here we split the btrfs_raid_attr::raid_name[] into upper_name[] and
lower_name[], and make upper and lower case helpers for callers to use.
Now there are several types of callers referring to lower_name and
upper_name:
- parse_bg_profile()
It uses strcasecmp(), either case would be fine.
- btrfs_group_profile_str()
Originally it uses upper case for all profiles except "single".
Now unified to upper case.
- sprint_profiles()
It uses lower case.
- bg_flags_to_str()
It uses upper case.
Reviewed-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Commit ("btrfs-progs: switch btrfs_group_profile_str to use raid table")
introduced a regression that raid profile of GlobalReserve will be
printed as 'unknown'.
$ btrfs filesystem df /mnt/test
Data, single: total=5.02TiB, used=4.98TiB
System, single: total=4.00MiB, used=624.00KiB
Metadata, single: total=11.01GiB, used=6.94GiB
GlobalReserve, unknown: total=512.00MiB, used=0.00B
Fix it by:
- take BTRFS_BLOCK_GROUP_RESERVED into account when masking the block
group flags
- update the define of BTRFS_BLOCK_GROUP_RESERVED too so it's same as in
kernel
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Wang Yugui <wangyugui@e16-tech.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Kernel emits inode number for all mkfile/mkdir/... commands but the
receive part does not pass it to the callbacks. At least document that
and read it from the stream in case we'd like to use it in the future.
Signed-off-by: David Sterba <dsterba@suse.com>
Use the raid table helper to avoid hard coding profiles for the given
number of devices in test_num_disk_vs_raid.
Signed-off-by: David Sterba <dsterba@suse.com>
Another duplication of the raid table, in this case missing the changes
to raid10 and raid0 minimum devices changed in a177ef7dd4
("btrfs-progs: mkfs: allow degenerate raid0/raid10").
Define and use a helper using the table value.
Signed-off-by: David Sterba <dsterba@suse.com>
Wrap pread with btrfs_pread as well.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Wrap pwrite with btrfs_pwrite(). It simply calls pwrite() on non-zoned
btrfs (opened without O_DIRECT). On zoned mode (opened with O_DIRECT),
it allocates an aligned bounce buffer, copies the contents and uses it
for direct-IO writing.
Writes in device_zero_blocks() and btrfs_wipe_existing_sb() are a little
tricky. We don't have fs_info on our hands, so use zinfo to determine it
is a zoned device or not.
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since we cannot create ext*/reiserfs on a zoned device, it is useless to
allow ZONED feature when converting a file system. Drop ZONED flag from
BTRFS_CONVERT_ALLOWED_FEATURES.
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Due to ambiguity of error values in the public API subvol_uuid_search,
there was second version added. There's a separate copy in libbtrfs and
we don't have to distinguish in the internal code. All callers check for
IS_ERR and NULL, so we can safely merge the helpers into one.
Signed-off-by: David Sterba <dsterba@suse.com>
After removing uuid search fallback code the structure has become
trivial and copies the fd that all callers have in their context.
Signed-off-by: David Sterba <dsterba@suse.com>
After the uuid search fallback code has been removed, the finit helper
has become empty and can be removed.
Signed-off-by: David Sterba <dsterba@suse.com>
There's a lot of code under BTRFS_COMPAT_SEND_NO_UUID_TREE to support
kernels < 3.12 that don't have uuid tree and the subvolume uuids are
searched in a slow way, building the uuid tree on the userspace side.
As the uuid tree is always created, the fallback code was not exercised
anyway due to 'uuid_tree_existed' check in subvol_uuid_search2.
Delete the code from the internal copy of send-utils. The support still
stays for libbtrfs and will be removed in the future.
Signed-off-by: David Sterba <dsterba@suse.com>
The btrfs_list_* functions come with some overhead and for simple path
resolution we can use btrfs_subvolid_resolve.
Signed-off-by: David Sterba <dsterba@suse.com>
We don't need to include this besides btrfs-list.c itself and
subvolume.c that does use the btrfs_list_* API.
Signed-off-by: David Sterba <dsterba@suse.com>