There's a report that 'btrfs balance start --enqueue' does not properly
wait when there are multiple instances started. The command does a busy
wait instead of timeouts.
Strace output:
0.000006 pselect6(5, NULL, NULL, [4], {tv_sec=60, tv_nsec=0}, NULL) = 1 (except [4], left {tv_sec=59, tv_nsec=999999716})
0.000008 pselect6(5, NULL, NULL, [4], {tv_sec=29, tv_nsec=999999000}, NULL) = 1 (except [4], left {tv_sec=29, tv_nsec=999998786})
After the first select there's almost the entire time left, the second
one starts right after it.
Polling/selecting sysfs files is possible under some conditions:
- the file descriptor must be reopened before each poll/select
- the whole buffer must be read too
With that in place it now works as expected. The remaining timeout logic
is slightly adjusted to wait at most 10 seconds so the pending jobs do
not wait too long if there's still a lot of time left from the first
select.
Issue: #746
Signed-off-by: David Sterba <dsterba@suse.com>
Be verbose about the potential compatibility problems with the
sectorsize and page size. Also print the page size on the overview.
Signed-off-by: David Sterba <dsterba@suse.com>
The symbol BTRFS_UPDATE_KERNEL seems to be unused since 2f55fd7019
("btrfs-progs: optimize btrfs_scan_lblkid() for multiple calls"), remove
it.
Signed-off-by: David Sterba <dsterba@suse.com>
The raid-stripe-tree can be enabled for convert, though it's still
considered incomplete and slightly experimental. Due to that the tests
need to be adjusted to check for support and skip mount eventually.
Possible remaining options to add: quota, squota
Issue: #694
Signed-off-by: David Sterba <dsterba@suse.com>
This patch introduces a new parser helper, parse_u64_with_suffix(),
which has a better error handling, following all the parse_*()
helpers to return non-zero value for errors.
This new helper is going to replace parse_size_from_string(), which
would directly call exit(1) to stop the whole program.
Furthermore most callers of parse_size_from_string() are expecting
exit(1) for error, so that they can skip the error handling.
For those call sites, introduce a wrapper, arg_strtou64_with_suffix(),
to do that. The only disadvantage is a little less detailed error
report for why the parse failed, but for most cases the generic error
string should be enough.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Both functions are just doing the same thing, the only difference is
only in error handling, as parse_u64() requires callers to handle it,
meanwhile arg_strtou64() would call exit(1).
This patch would convert arg_strtou64() to utilize parse_u64(), and use
the return value to output different error messages.
This also means the return value of parse_u64() would be more than just
0 or 1, but -EINVAL for invalid string (including no numeric string at
all, has any tailing characters, or minus value), and -ERANGE for
overflow.
The existing callers are only checking if the return value is 0, thus
not really affected.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
There is a report about failed btrfs-convert, which shows the following
error:
Create btrfs metadata
corrupt leaf: root=5 block=5001931145216 slot=1 ino=89911763, invalid previous key objectid, have 89911762 expect 89911763
leaf 5001931145216 items 336 free space 7 generation 90 owner FS_TREE
leaf 5001931145216 flags 0x1(WRITTEN) backref revision 1
fs uuid 8b69f018-37c3-4b30-b859-42ccfcbe2449
chunk uuid 448ce78c-ea41-49f6-99dc-46ad80b93da9
item 0 key (89911762 INODE_REF 3858733) itemoff 16222 itemsize 61
index 171 namelen 51 name: [FILENAME1]
item 1 key (89911763 INODE_REF 3858733) itemoff 16161 itemsize 61
index 103 namelen 51 name: [FILENAME2]
[CAUSE]
When iterating a directory, btrfs-convert would insert the DIR_ITEMs,
along with the INODE_REF of that inode.
This leads to above stray INODE_REFs, and trigger the tree-checker.
This can only happen for large fs, as for most cases we have all these
modified tree blocks cached, thus tree-checker won't be triggered.
But when the tree block cache is not hit, and we have to read from disk,
then such behavior can lead to above tree-checker error.
[FIX]
Insert a dummy INODE_ITEM for the INODE_REF first, the inode items would
be updated when iterating the child inode of the directory.
Issue: #731
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Recently we had a scrub use-after-free caused by unaligned chunk
length, although the fix was submitted, we may want to do extra checks
for a chunk's alignment.
This patch adds such check for the starting bytenr and length of a
chunk, to make sure they are properly aligned to 64K stripe boundary.
By default, the check only leads to a warning but is not treated as an
error, as we expect kernel to handle such unalignment without any
problem.
But if the new debug environmental variable,
BTRFS_PROGS_DEBUG_STRICT_CHUNK_ALIGNMENT, is specified, then we will
treat it as an error. So that we can detect unexpected chunks from
btrfs-progs, and fix them before reaching the end users.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
To be consistent with the rest of the code the sysfs helper should
return the -errno instead of passing -1 from various syscalls. Update
callers that relied on -1 as the invalid file descriptor.
Signed-off-by: David Sterba <dsterba@suse.com>
The enqueue option should let the user know that the expected operation
hasn't started yet and that it's waiting for another one. Although the
exclusive operations can take long, the two reason should be
distinguished.
Signed-off-by: David Sterba <dsterba@suse.com>
strtoull may return the boundary values, if the callers could expect
that and verify it then the errno must be reset before the call.
Signed-off-by: David Sterba <dsterba@suse.com>
GCC 14 introduces a new -Walloc-size included in -Wextra which gives:
```
common/utils.c:983:15: warning: allocation of insufficient size ‘1’ for type ‘struct config_param’ with size ‘32’ [-Walloc-size]
cmds/qgroup.c:1644:13: warning: allocation of insufficient size ‘1’ for type ‘struct btrfs_qgroup_inherit’ with size ‘72’ [-Walloc-size]
```
The calloc prototype is:
```
void *calloc(size_t nmemb, size_t size);
```
So, just swap the number of members and size arguments to match the prototype, as
we're initialising 1 struct of size `sizeof(struct ...)`. GCC then sees we're not
doing anything wrong.
Pull-request: #707
Signed-off-by: Sam James <sam@gentoo.org>
Signed-off-by: David Sterba <dsterba@suse.com>
There's a report that reading properties from a sound device the system
is stuck and then gets rebooted by watchdog. Reading from fifo files
gets stuck as well, although this would not trigger the watchdog.
The reason is that open() on fifo files is blocking until the other end
of the pipe is opened. For device nodes it's driver specific, most
device nodes fail right away:
$ btrfs prop get /dev/tty
ERROR: object is not a btrfs object: /dev/tty
In case of the sound device the consequences were fatal. We can fix that
by opening the path on non-blocking mode. This is only for reading the
fsid, the fd is closed right after the ioctl so the non-blocking mode
does not affect other operation.
The blocking mode must be used for block devices as e.g. loop devices
may not be finalized when the open() call returns and get_fsid fails.
The known problematic devices are character and fifos.
Issue: #699
Signed-off-by: David Sterba <dsterba@suse.com>
Some commands could be run in a dry-run mode, i.e. not doing any
write/change actions, only printing the steps and ignoring errors.
There are two possibilities where to put the option:
- as a global one: btrfs --dry-run subvolume delete /path
- local option: btrfs subvolume delete --dry-run /path
As we have several global options already, let's put it there, dry-run
should not be very common so the slight inconvenience of writing the
option out of order of command arguments should be acceptable.
Issue: #629
Signed-off-by: David Sterba <dsterba@suse.com>
./btrfs --param key=value command ...
./btrfs --param key command ...
To pass various tuning data for testing and debugging, undocumented
for regular users.
To add support add reading of the parameter value after option parsing
bconf_param_value("key") and convert to what you need.
Signed-off-by: David Sterba <dsterba@suse.com>
The kernel patches for RST and squota are queued for 6.7, we need to be
able to test the features so it's not necessary to hide the mkfs support
under experimental build. The kernel may still need debug build to
enable mount.
Signed-off-by: David Sterba <dsterba@suse.com>
While we like to have the descriptive names also add short aliases that
we also use for reference in changelogs and documentation.
$ mkfs.btrfs -O list-all
Filesystem features available:
mixed-bg - mixed data and metadata block groups (compat=2.6.37, safe=2.6.37)
quota - quota support (qgroups) (compat=3.4)
extref - increased hardlink limit per file to 65536 (compat=3.7, safe=3.12, default=3.12)
raid56 - raid56 extended format (compat=3.9)
skinny-metadata - reduced-size metadata extent refs (compat=3.10, safe=3.18, default=3.18)
no-holes - no explicit hole extents for files (compat=3.14, safe=4.0, default=5.15)
fst - free-space-tree alias
free-space-tree - free space tree (space_cache=v2) (compat=4.5, safe=4.9, default=5.15)
raid1c34 - RAID1 with 3 or 4 copies (compat=5.5)
zoned - support zoned devices (compat=5.12)
extent-tree-v2 - new extent tree format (compat=5.15)
bgt - block-group-tree alias
block-group-tree - block group tree to reduce mount time (compat=6.1)
rst - raid-stripe-tree alias
raid-stripe-tree - raid stripe tree (compat=6.7)
squota - squota support (simple accounting qgroups) (compat=6.7)
Signed-off-by: David Sterba <dsterba@suse.com>
The list 'mkfs -O list-all' should be sorted by version of kernel
support (compat) so it's clear what's new.
Signed-off-by: David Sterba <dsterba@suse.com>
The clear-cache functionality is shared by several commands:
- btrfs check
For --clear-cache and --clear-ino-cache.
- btrfstune
Mostly for block-group-tree feature conversion.
- btrfs-convert
To enable the now default v2 space cache.
Thus it's no longer proper to keep clear-cache.[ch] under check/
directory, move them to common/ directory.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The current implementation would introduce variable shadowing due to
both max() and min() are using the same __x and __y.
This may not be a big deal, but since kernel is already handling it
properly using __UNIQUE_ID() macro, and has more checks, we can
cross-port the kernel version to btrfs-progs.
There are some dependency needed, they are all small enough thus can be
put into the helper.
- __PASTE()
- __UNIQUE_ID()
- BUILD_BUG_ON_ZERO()
- __is_constexpr()
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This function and it's related functions only exist for the utilities
that populate existing file systems, and do not exist in the upstream
kernel. Move this function and the related function into it's own
common source file and out of the kernel-shared sources, and then update
all of the users to include the new location of this code.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add the ability to enable simple quotas from mkfs with '-O squota'
There is some complication around handling enable_gen while still
counting the root node of an fs. To handle this, employ a hack of doing
a no-op write on the root node to bump its generation up above that of
the qgroup enable generation, which results in counting it properly.
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
Allow for RAID levels 0, 1 and 10 on zoned devices if the RAID stripe tree
is used.
Signed-off-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We're getting more features and the string size limit will not be
sufficient, so extend enough that we won't have to care for some time.
Signed-off-by: David Sterba <dsterba@suse.com>
The accelerated crc32c needs to check for two CPU features, the crc32c
instructions is in SSE 4.2 and 'pclmulqdq' is a separate. There's still
old hardware used that does not have the PCLMUL instructions. Detect it
and make it the condition.
The pclmul is not supported on old compilers so also add a
configure-time detection and leave the SSE 4.2 only implementation as
the accelerated one if possible.
Issue: #676
Signed-off-by: David Sterba <dsterba@suse.com>
Aligning with the kernel's struct btrfs_fs_devices:fs_list, rename
btrfs_fs_devices::list to btrfs_fs_devices::fs_list.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Copy faster implementation of crc32c from linux kernel as of 6.5-rc7
(x86_64, arch/x86/crypto/crc32c-pcl-intel-asm_64.S). This needs
assembler build support, so detect target architecture so
cross-compilation still works.
Add a special CPU flag so the old and new implementations can be
benchmarked and verified separately.
Sample benchmark:
CPU flags: 0x1ff
CPU features: SSE2 SSSE3 SSE41 SSE42 SHA AVX AVX2 CRC32C_PCL
Block size: 4096
Iterations: 1000000
Implementation: builtin
Units: CPU cycles
NULL-NOP: cycles: 77177218, cycles/i 77
NULL-MEMCPY: cycles: 226313072, cycles/i 226, 62133.395 MiB/s
CRC32C-ref: cycles: 24418596066, cycles/i 24418, 575.859 MiB/s
CRC32C-NI: cycles: 1188335920, cycles/i 1188, 11833.073 MiB/s
CRC32C-PCL: cycles: 463193456, cycles/i 463, 30358.037 MiB/s
XXHASH: cycles: 851606646, cycles/i 851, 16511.916 MiB/s
SHA256-ref: cycles: 74476234956, cycles/i 74476, 188.808 MiB/s
SHA256-NI: cycles: 34198637428, cycles/i 34198, 411.177 MiB/s
BLAKE2-ref: cycles: 14761411664, cycles/i 14761, 952.597 MiB/s
BLAKE2-SSE2: cycles: 18101896796, cycles/i 18101, 776.807 MiB/s
BLAKE2-SSE41: cycles: 12599091062, cycles/i 12599, 1116.087 MiB/s
BLAKE2-AVX2: cycles: 9668247506, cycles/i 9668, 1454.418 MiB/s
The new implementation is about 2.5x faster.
Note: there new version does not work on musl because of linkage
problems (relocations in .rodata), so it's still using the old
implementation.
Signed-off-by: David Sterba <dsterba@suse.com>
In some places we want to read a single u64 value from a sysfs path, or
from fsid directory. Add helpers that do that in one go.
Signed-off-by: David Sterba <dsterba@suse.com>
The sysfs could use more convenience helpers so move the current code to
own file before adding more helpers.
Signed-off-by: David Sterba <dsterba@suse.com>
API for extensible array of pointers for covenience. A simple wrapper
around a (void *) array with length.
Signed-off-by: David Sterba <dsterba@suse.com>
This is a potentially breaking change to json output. An all zeros uuid
was printed as "-" but we can utilize native json type null for that.
Note the va_copy must be used as va_arg advances the pointer.
{
"nulluuid": null
}
Signed-off-by: David Sterba <dsterba@suse.com>
Make the timestamp format more descriptive what is actually printed. We
may need separate date or time in the future.
Signed-off-by: David Sterba <dsterba@suse.com>
The json spec allows numeric values and it's recommended to use them
instead of the stringified numbers. This is a potentially breaking change
if some tools relied on the string value.
As most formats we now have are '%llu' and it's convenient to just pass
it to vprintf, don't add a special type for ints. Any new int type must
be added to the list.
{
"number": 1234
}
Signed-off-by: David Sterba <dsterba@suse.com>
The 'str' type was added in ecbb6a7fcd ("btrfs-progs: add json
formatter for escaped string") but not documented. It should be used
e.g. for paths or strings from unknown origin.
Signed-off-by: David Sterba <dsterba@suse.com>
For null or boolean values the "..." quoting must not be done, add
support for that. This is detected internally for each printed value.
Signed-off-by: David Sterba <dsterba@suse.com>
A newline character in option description text will break line and then
indent the text properly, can be used for lists or paragraphs.
Signed-off-by: David Sterba <dsterba@suse.com>
To be able to test errors at specific locations, add a simple way to
check for a condition in code and controlled from user space environment
variable INJECT. For now a single value is accepted.
Use like:
if (inject_error(0x1234)) {
do_something();
return -ERROR;
}
This is enabled in debugging build by default (make D=1) and can be
enabled on demand too (make EXTRA_CFLAGS=-DINJECT).
Signed-off-by: David Sterba <dsterba@suse.com>
There's a report that btrfs-find-root does not work as built-in tool in
btrfs.box, while it's advertised in the help:
$ ./btrfs.box help --box
Standalone tools built-in in the busybox style:
- mkfs.btrfs
- btrfs-image
- btrfs-convert
- btrfstune
- btrfs-find-root
Add the support as it might be useful tool sometimes. In the future the
command should be moved to e.g. inspect-internal or rescue.
Issue: #648
Signed-off-by: David Sterba <dsterba@suse.com>
The function check_where_mounted() scans the system for all other btrfs
devices, which is necessary for its operation. However, in certain
cases, devices remaining in the scanned state is undesirable. Introduce
the 'noscan' argument to make devices unscanned before return.
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
To prepare for handling command line given devices factor out
btrfs_scan_argv_devices().
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The variable 'is_btrfs' is declared as an integer but should be a boolean
instead.
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The send.h for libbtrfs has been separated some time ago so we're now
free to keep up with kernel, 6.4-rc1.
Signed-off-by: David Sterba <dsterba@suse.com>
The following functions accept a buffer for write, which can be marked
as const:
- btrfs_pwrite()
- write_data_to_disk()
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
It's not a common practice to use the same io function for both read and
write (we have pread() and pwrite(), not pio()).
Furthermore the original function has the following problems:
- Not returning proper error number
If we had ioctl/stat errors we just return 0 with errno set.
Thus caller would treat it as a short read, not a proper error.
- Unnecessary @ret_rw
This is not that obvious if we have different handling for read and
write, but if we split them it's super obvious we can reuse @ret.
- No proper copy back for short read
- Unable to constify the @buf pointer for write operation
All those problems would be addressed in this patch.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The fixes involve the following changes:
- Unexport functions which are not utilized out of the file
* print_path_column()
* parse_reflink_range()
* btrfs_list_setup_print_column()
* device_get_partition_size_sysfs()
* max_zone_append_size()
- Include related headers before implementing the function
* change-uuid.c
* convert-bgt.c
* seed.h
- Add missing headers caused by the above header changes
* include <uuid/uuid.h> for tune/tune.h.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We won't actually use the async code in progs, however we call the
helpers and such all over the normal code, so sync this into btrfs-progs
to make syncing other parts of the kernel easier.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Glibc provides an interface to extend the printf formats but this is not
standardized and does not work on musl. The code brought from kernel
uses %pV for varargs and also has own implementation of printk.
As a workaround for musl expand the pV value to a string and then
simply print it. The details are hidden behind macros:
- DECLARE_PV(vaf)
- PV_ASSIGN(vaf, format, args)
- PV_FMT in printf string
- PV_VAL in arguments
Signed-off-by: David Sterba <dsterba@suse.com>
We use the struct va_format to do nested printk's internally with our
message handling. Add the appropriate user space code to make this work
properly so when we start copying this code into btrfs-progs we get the
proper messages.
Note: this breaks build on musl, printf.h is not available.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
These are the printk helpers from the kernel. There were a few
modifications, the hi-lights are
- We do not have fs_info::fs_state, so that needed to be removed.
- We do not have discard.h sync'ed yet, so that dependency was dropped.
- Anything related to struct super_block was commented out.
- The transaction abort had to be modified to fit with the current
btrfs-progs code.
- Added a btrfs_no_printk() helper to common/messages.* so that the
print statements still worked.
- The 32bit limit checkers are not needed so are behind __KERNEL__
Additionally there were kerncompat.h changes that needed to be made to
handle the dependencies properly. Those are easier to spot.
Any function that needed to be modified has a MODIFIED tag in the
comment section with a list of things that were changed.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We want to keep this file locally as we want to be uptodate with
upstream, so we can build btrfs-progs regardless of which kernel is
currently installed. Sync this with the upstream version and put it in
kernel-shared/uapi to maintain some semblance of where this file comes
from.
There are some changes that need to be synced back to kernel. A local
definition of static_assert is used to avoid compilation problems on gcc
(< 9) due to mandatory 2nd parameter.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
These helpers all do variations on the same thing, so add a helper to
just do the printf part, and a macro to handle the special prefix and
postfix, and then make the helpers just use the macro and new helper.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We have different helpers for warning_on and error_on(), which take the
condition and then do the printf. However we can just check the
condition in the macro and call the normal warning or error helper, so
clean this usage up and delete the unneeded message helpers.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Everybody who calls print_trace wraps it around this check, move the
check instead to print_trace and remove the check from all the callers.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
While syncing messages.[ch] I had to back out the ASSERT() code in
kerncompat.h, which means we now rely on the kernel code for ASSERT().
In order to maintain some semblance of separation introduce UASSERT()
and use that in all the purely userspace code.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The block-group-tree used to be under experimental flag in -R but now
that we've deprecated -R it does not make sense to leave
block-group-tree there for compatibility, this has never been exposed to
users.
Signed-off-by: David Sterba <dsterba@suse.com>
The feedback from the community on block group tree is very positive,
the only complain is, end users need to recompile btrfs-progs with
experimental features to enjoy the new feature.
So let's move it out of experimental features and let more people enjoy
faster mount speed.
Also change the option of btrfstune, from `-b` to
`--enable-block-group-tree` to avoid short option.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The option -R|--runtime-features was introduced to support features that
don't result in a full incompat flag change, thus things like
free-space-tree and quota features are put here.
But to end users, such separation of features is not helpful and can be
sometimes confusing.
Thus we're already migrating those runtime features into -O|--features
option under experimental builds.
I believe this is the proper time to move those runtime features into
-O|--features option, and mark the -R|--runtime-features option
deprecated.
For now we still keep the old option as for compatibility purposes.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
There's a report that a static build fails when there's a static version
of libudev:
/usr/lib/gcc/x86_64-pc-linux-gnu/12.2.0/../../../../x86_64-pc-linux-gnu/bin/ld: /usr/lib/libudev.a(path-util.o): in function `path_is_mount_point':
path-util.c:(.text+0xbc0): multiple definition of `path_is_mount_point'; common/path-utils.o:path-utils.c:(.text+0x290): first defined here
There's a helper path_is_mount_point in libudev too but not exported so
dynamic library is fine, unlike static build. The static build of
libudev is not common but we can support that with a simple rename.
Issue: #611
Signed-off-by: David Sterba <dsterba@suse.com>
On a 32bit host the split qgroupid is wrong due to the way the numbers
are passed to the formatter as variable length arguments. The level is
u16, promoted to int and then parsed as u64. This means that the values
are shifted and some stack data are printed instead.
Example error messages from yast2-bootloader:
SystemCmd.cc(addLine):569 Adding Line 7 " "qgroupid": "21474836480/23885859321282560","
The value 21474836480 = 0x5000000 is 0x5 shifted by 32 bits,
23885859321282560 is 0x54dc1000000000 and shifting by 32 does not
lead to a valid value which should be 0 in this case.
Bugzilla: https://bugzilla.suse.com/show_bug.cgi?id=1209136
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Currently cli/009 test case failed with different exit number:
====== RUN CHECK /home/adam/btrfs-progs/btrfstune --help
usage: btrfstune [options] device
[...]
failed: /home/adam/btrfs-progs/btrfstune --help
test failed for case 009-btrfstune
[CAUSE]
In tune/main.c, we have the following call on usage():
static void print_usage(int ret)
{
usage(&tune_cmd);
exit(ret);
}
However usage() itself would always call exit(1):
void usage(const struct cmd_struct *cmd)
{
usage_command_usagestr(cmd->usagestr, NULL, 0, true, true);
exit(1);
}
This makes prevents any caller of usage() to modify its exit number.
[FIX]
Add a new argument @error for print_usage(), so we can properly return 0
for -h/--help usage.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add CPU_FLAG_NONE that's first in the list and represents a CPU without
any acceleration features. Can be used for reference/fallback
implementation.
Signed-off-by: David Sterba <dsterba@suse.com>
Add support for run-time detection of CPU features on x86_64 to allow
selection of accelerated implementations of hash algorithms.
When possible use the compiler builtin (works on gcc and clang).
The SHA extensions can't be detected by __builtin_cpu_supports and the
__cpuid/__cpuidex macros are not consistently provided in all supported
gcc and clang versions. Copy the __cpuidex and call it manually for the
SHA extensions. Complete list https://en.wikipedia.org/wiki/CPUID .
Signed-off-by: David Sterba <dsterba@suse.com>
[FALSE ALERT]
Unlike gcc, clang doesn't really understand the comments, thus it's
reportings tons of fall through related errors:
cmds/reflink.c:124:3: warning: unannotated fall-through between switch labels [-Wimplicit-fallthrough]
case 'r':
^
cmds/reflink.c:124:3: note: insert '__attribute__((fallthrough));' to silence this warning
case 'r':
^
__attribute__((fallthrough));
cmds/reflink.c:124:3: note: insert 'break;' to avoid fall-through
case 'r':
^
break;
[CAUSE]
Although gcc is fine with /* fallthrough */ comments, clang is not.
[FIX]
So just introduce a fallthrough macro to handle the situation properly,
and use that macro instead.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Use the configured widths and print padding directly instead of the
embedded printf format and fixed width strings.
Signed-off-by: David Sterba <dsterba@suse.com>
To make option formatting a bit easier so the spacing is unified add
macros and formatting helpers.
Usage in the help text:
OPTLINE("-o value", "description")
Internally the option and description are delimiters by chars that are
not part of normal text, the formatter separates that and uses fixed
with for output. The description text can be of any length, multi-line
text should still end up as one token (i.e. newline without ',' between).
Signed-off-by: David Sterba <dsterba@suse.com>
Add formatter type 'str' where the string must be escaped, e.g. paths or
internal data. Otherwise plain %s can be printed if it's known that
there are no special characters.
Signed-off-by: David Sterba <dsterba@suse.com>
Track how many rows belong to a header (names and separators) so they
can be printed or cleared separately. The rest is body that could be
cleared and filled repeatedly. The ranged API allows to print any number
of rows if the table is filled partially.
Signed-off-by: David Sterba <dsterba@suse.com>
Cleanups are for integer types, prototypes and comments.
New functionality: spacing can be set after table allocation as
->spacing, now able to print 1 or 2 spaces between columns.
Signed-off-by: David Sterba <dsterba@suse.com>
This patch copies in compression.h from the kernel. This is relatively
straightforward, we just have to drop the compression types definition
from ctree.h, and update the image to use BTRFS_NR_COMPRESS_TYPES
instead of BTRFS_COMPRESS_LAST, and add a few things to kerncompat.h to
make everything build smoothly.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This doesn't really belong with the ioctl definitions, and when we sync
the ioctl definitions with the kernel this helper will go away, so
adjust this now.
The ioctl.h is a public API header but the helper is not used in any 3rd
party tool so we can safely move it.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We're trying to get a U32 for the attributes, but the kernel sends a U64
(which is correct as we store attributes in a u64 flags field of the
inode). This makes anyone trying to receive a v2 send stream to fail with:
ERROR: invalid size for attribute, expected = 4, got = 8
We actually recently got such a report of someone using send stream v2 and
getting such failure. See the Link tag below.
Link: https://lore.kernel.org/linux-btrfs/6cb11fa5-c60d-e65b-0295-301a694e66ad@inbox.ru/
Fixes: 7a6fb356dc ("btrfs-progs: receive: process setflags ioctl commands")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
We should warn that there's an experimental feature used. Add a helper
with optional description. Should be used only if such feature is used
and not always.
Signed-off-by: David Sterba <dsterba@suse.com>
The -O and -R help texts say that compatible version for
block-group-tree is 6.0 but it's in fact 6.1.
Issue: #523
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Even with chunk_objectid bug fixed, mkfs.btrfs can still caused stack
overflow when enabling extent-tree-v2 feature (need experimental
features enabled):
# ./mkfs.btrfs -f -O extent-tree-v2 ~/test.img
btrfs-progs v5.19.1
See http://btrfs.wiki.kernel.org for more information.
ERROR: superblock magic doesn't match
NOTE: several default settings have changed in version 5.15, please make sure
this does not affect your deployments:
- DUP for metadata (-m dup)
- enabled no-holes (-O no-holes)
- enabled free-space-tree (-R free-space-tree)
Label: (null)
UUID: 205c61e7-f58e-4e8f-9dc2-38724f5c554b
Node size: 16384
Sector size: 4096
Filesystem size: 512.00MiB
Block group profiles:
Data: single 8.00MiB
Metadata: DUP 32.00MiB
System: DUP 8.00MiB
SSD detected: no
Zoned device: no
=================================================================
[... Skip full ASAN output ...]
==65655==ABORTING
[CAUSE]
For experimental build, we have unified feature output, but the old
buffer size is only 64 bytes, which is too small to cover the new full
feature string:
extref, skinny-metadata, no-holes, free-space-tree, block-group-tree, extent-tree-v2
Above feature string is already 84 bytes, over the 64 on-stack memory
size.
This can also be proved by the ASAN output:
==65655==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7ffc4e03b1d0 at pc 0x7ff0fc05fafe bp 0x7ffc4e03ac60 sp 0x7ffc4e03a408
WRITE of size 17 at 0x7ffc4e03b1d0 thread T0
#0 0x7ff0fc05fafd in __interceptor_strcat /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:377
#1 0x55cdb7b06ca5 in parse_features_to_string common/fsfeatures.c:316
#2 0x55cdb7b06ce1 in btrfs_parse_fs_features_to_string common/fsfeatures.c:324
#3 0x55cdb7a37226 in main mkfs/main.c:1783
#4 0x7ff0fbe3c28f (/usr/lib/libc.so.6+0x2328f)
#5 0x7ff0fbe3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
#6 0x55cdb7a2cb34 in _start ../sysdeps/x86_64/start.S:115
[FIX]
Introduce a new macro, BTRFS_FEATURE_STRING_BUF_SIZE, along with a new
sanity check helper, btrfs_assert_feature_buf_size().
The problem is I can not find a build time method to verify
BTRFS_FEATURE_STRING_BUF_SIZE is large enough to contain all feature
names, thus have to go the runtime function to do the BUG_ON() to verify
the macro size.
Now the minimal buffer size for experimental build is 138 bytes, just
bump it to 160 for future expansion.
And if further features go beyond that number, mkfs.btrfs/btrfs-convert
will immediately crash at that BUG_ON(), so we can definitely detect it.
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Tested-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
[BUG]
Commit "btrfs-progs: prepare merging compat feature lists" tries to
merged "-O" and "-R" options, as they don't correctly represents
btrfs features.
But that commit caused the following bug during mkfs for experimental
build:
$ mkfs.btrfs -f -O block-group-tree /dev/nvme0n1
btrfs-progs v5.19.1
See http://btrfs.wiki.kernel.org for more information.
ERROR: superblock magic doesn't match
ERROR: illegal nodesize 16384 (not equal to 4096 for mixed block group)
[CAUSE]
Currently btrfs_parse_fs_features() will return a u64, and reuse the
same u64 for both incompat and compat RO flags for experimental branch.
This can easily leads to conflicts, as
BTRFS_FEATURE_INCOMPAT_MIXED_BLOCK_GROUP and
BTRFS_FEATURE_COMPAT_RO_BLOCK_GROUP_TREE both share the same bit
(1 << 2).
Thus for above case, mkfs.btrfs believe it has set MIXED_BLOCK_GROUP
feature, but what we really want is BLOCK_GROUP_TREE.
[FIX]
Instead of incorrectly re-using the same bits in btrfs_feature, split
the old flags into 3 flags:
- incompat_flag
- compat_ro_flag
- runtime_flag
The first two flags are easy to understand, the corresponding flag of
each feature.
The last runtime_flag is to compensate features which doesn't have any
on-disk flag set, like QUOTA and LIST_ALL.
And since we're no longer using a single u64 as features, we have to
introduce a new structure, btrfs_mkfs_features, to contain above 3
flags.
This also mean, things like default mkfs features must be converted to
use the new structure, thus those old macros are all converted to
const static structures:
- BTRFS_MKFS_DEFAULT_FEATURES + BTRFS_MKFS_DEFAULT_RUNTIME_FEATURES
-> btrfs_mkfs_default_features
- BTRFS_CONVERT_ALLOWED_FEATURES -> btrfs_convert_allowed_features
And since we're using a structure, it's not longer as easy to implement
a disallowed mask.
Thus functions with @mask_disallowed are all changed to using
an @allowed structure pointer (which can be NULL).
Finally if we have experimental features enabled, all features can be
specified by -O options, and we can output a unified feature list,
instead of the old split ones.
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Add similar helper to pr_verbose that prints on stderr, for commands
that need to print to stderr based on the set verbosity level.
Signed-off-by: David Sterba <dsterba@suse.com>
Factor out the level check so we can add helper for stderr as some
commands don't/can't print to stdout.
Signed-off-by: David Sterba <dsterba@suse.com>
There are messages that are supposed to be printed by default and now
use the LOG_ALWAYS level, but that's a negative level and was meant as a
workaround for commands that must really print the message.
The default log level should be 1 and can be adjusted by the -q or -v
global commands.
Signed-off-by: David Sterba <dsterba@suse.com>
There's a group of helpers to read device size, the btrfs_device_size
should be one of them. Rename it and so minor cleanup.
Signed-off-by: David Sterba <dsterba@suse.com>
Switch the remaining use of assert() as it lacks the verbose assert that
we have for ASSERT (but otherwise is equivalent).
Signed-off-by: David Sterba <dsterba@suse.com>