In order to make recovery.c identical with kernel, we need endianness
conversion macros (such as cpu_to_be32 and friends) defined in
e2fsprogs. This patch defines these macros and also fixes recovery.c
to use these. These macros are also needed for fast commit recovery
patches later in this series.
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The function calculate_summary_stats sets the global metadata of the
file system. Tune2fs had this function defined statically in
tune2fs.c. Fast commit replay needs this function to set global
metadata at the end of the replay phase. So, move this function to
libext2fs.
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In our benchmarking for PiB size filesystem, pass5 takes
10446s to finish and 99.5% of time takes on reading bitmaps.
It makes sense to reading bitmaps using multiple threads,
a quickly benchmark show 10446s to 626s with 64 threads.
[ This has all of many bug fixes for rw_bitmaps.c from the original
luster patch set collapsed into a single commit. In addition it has
the new ext2fs_rw_bitmaps() api proposed by Ted. ]
Signed-off-by: Wang Shilong <wshilong@ddn.com>
Signed-off-by: Saranya Muruganandam <saranyamohan@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add initial implementation support for the unix_io manager.
Applications which want to use threading should pass in
IO_FLAG_THREADS when opening the channel. Channels which support
threading (which as of this commit is unix_io and test_io if the
backing io_manager supports threading) will set the
CHANNEL_FLAGS_THREADS bit in io->flags. Library code or applications
can test if threading is enabled by checking this flag.
Applications using libext2fs can pass in EXT2_FLAG_THREADS to
ext2fs_open() or ext2fs_open2() to request threading support.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Support for pthreads can be forcibly disabled by passing
"--without-pthread" to the configure script.
The actual changes in this commit are in configure.ac and MCONFIG.in;
the other files were generated as a result of running aclocal,
autoconf, and autoheader on a Debian testing system.
Note: the autoconf-archive package must now be installed before
rerunning aclocal, to supply the AX_PTHREAD macro.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Currently, when constructing the <default> configuration pseudo-file using
the profile-to-c.awk script we will just pass the double quotes as they
appear in the mke2fs.conf.
This is problematic, because the resulting default_profile.c will either
fail to compile because of syntax error, or leave the resulting
configuration invalid.
It can be reproduced by adding the following line somewhere into
mke2fs.conf configuration and forcing mke2fs to use the <default>
configuration by specifying nonexistent mke2fs.conf
MKE2FS_CONFIG="nonexistent" ./misc/mke2fs -T ext4 /dev/device
default_mntopts = "acl,user_xattr"
^ this will fail to compile
default_mntopts = ""
^ this will result in invalid config file
Syntax error in mke2fs config file (<default>, line #4)
Unknown code prof 17
Fix it by escaping the double quotes with a backslash in
profile-to-c.awk script.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The support of setting (and reading) of passive translators from
GNU/Linux has been added to the Linux kernel by the commit [1].
The name index '10' has been reserved for GNU/Hurd.
Hurd passive translators are stored as a xattr value with name
"gnu.translator" [2].
If "gnu.translator" xattr value has been set before calling
mkfs.ext2, it will segfault since "gnu." is not present in
ea_names[].
$ setfattr -n gnu.translator -v "/hurd/exec\0" ${TARGET_DIR}/servers/exec
$ mkfs.ext2 -d ${TARGET_DIR} -o hurd -O ext_attr rootfs.ext2 "1G"
Adding "gnu." to ea_names[], allow to create ext2 filesystem
for GNU/Hurd with passive translator already set.
[1] https://git.savannah.gnu.org/cgit/hurd/hurd.git/commit/?id=a04c7bf83172faa7cb080fbe3b6c04a8415ca645
[2] https://lists.gnu.org/archive/html/bug-hurd/2016-08/msg00075.html
Signed-off-by: Romain Naour <romain.naour@gmail.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
It is possible to crash filefrag with a "Floating point exception" in
two different scenarios:
1. When fstat() returns a device ID set to 0
2. When FIGETBSZ ioctl returns a blocksize of 0
In both scenarios a divide-by-zero will occur in frag_report() because
variable blksize will be set to zero.
I've managed to trigger this crash with an old CephFS kernel client,
using xfstest generic/519. The first scenario has been fixed by kernel
commit 75c9627efb72 ("ceph: map snapid to anonymous bdev ID"). The
second scenario is also fixed with commit 8f97d1e99149 ("vfs: fix
FIGETBSZ ioctl on an overlayfs file").
However, it is desirable to handle these two scenarios gracefully by
checking these conditions explicitly.
Signed-off-by: Luis Henriques <lhenriques@suse.de>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
populate_fs do copy the xattrs for all files and directories, but the
root directory is skipped and as a result its extended attributes aren't
set. This is an issue when using mkfs to build a full system image that
can be used with SElinux in enforcing mode without making any runtime
fix at first boot.
This patch adds logic to set the root directory's extended attributes.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
These are the changes which are needed after running gettextize to
update to gettext 0.19.8 in the previous commit.
* Add support for maintainer mode (which doesn't do as much given
that gettext now has settings in Makevars which allows us to
suppress automatic updates of the po and gmo files)
* Add support to expand the '@' abbreviations in e2fsck/problem.c
and give an explanation of how they work for translators
* Add support for configure --enable-verbose-makecmds and default to
"kernel-style" quieter make output --- this makes it easier
to see warnings and errors by suppressing the distracting
details.
* Teach the makefile where to find the generated error table C files
in the build directory.
* Add make targets (e.g., all-static, fullcheck, coverage.txt) which
are required by the top-level Makefile.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This also removes the built-in "intl" directory as this is now
considered deprecated by the gettext package. This means that we
won't try to use an internal version of gettext if it's not installed
on the build system. We will simply disable NLS support in that case.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The logic for handling 64-bit structure elements was reversed, which
caused attempts to set fields like kbytes_written to fail:
% debugfs -w /tmp/foo.img
debugfs 1.45.6 (20-Mar-2020)
debugfs: set_super_value kbytes_written 1024
64-bit field kbytes_written has a second 64-bit field
defined; BUG?!?
https://github.com/tytso/e2fsprogs/issues/36
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In the case where mkdir -p is not thread-safe (for example, if the
build environment is using busybox's mkdir) the configure script will
fall back to the slow (but safe) install-sh script. In that case
MKDIR_P will be using a relative pathname; so we can't use speed
optimization of defining configure substitutions in MCONFIG.in, since
the substitution will be different depending on depth of the
subdirectory in the Makefile.in file.
https://github.com/tytso/e2fsprogs/issues/51
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
For 1k block file systems, resizing a file system larger than
1073610752 blocks will result in the size of the block group
descriptors to be so large that it will overlap with the backup
superblock in block group #1. This problem can be reproduced via:
mke2fs -t ext4 /tmp/foo.img 200M
resize2fs /tmp/foo.img 1T
e2fsck -f /tmp/foo.img
https://github.com/tytso/e2fsprogs/issues/50
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Comparing unsigned int with ULONG_MAX is always false.
Signed-off-by: Yi Kong <yikong@google.com>
Change-Id: Iae02aad1bcb271d3468828977be288ad04333821
From AOSP commit: 757a4d672dae1a15c57f5f0705ba90ed007da7e6
This is no longer a transitive include of android_filesystem_config.h
Bug: 149785767
Test: build
Change-Id: I954adf7879fa811eb2b290a0983c84d47ecae26c
From AOSP commit: 9fa6bed983d5ddd7226af647a0c4c0d922227be4
Use the letter 'x' to set/get dax attribute on a directory/file.
Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
While file has a large inode number (> 0xFFFFFFFF), mkfs.ext4 could
not parse hardlink correctly.
Prepare three hardlink files for mkfs.ext4
$ ls -il rootfs_ota/a rootfs_ota/boot/b rootfs_ota/boot/c
11026675846 -rw-r--r-- 3 hjia users 0 Jul 20 17:44 rootfs_ota/a
11026675846 -rw-r--r-- 3 hjia users 0 Jul 20 17:44 rootfs_ota/boot/b
11026675846 -rw-r--r-- 3 hjia users 0 Jul 20 17:44 rootfs_ota/boot/c
$ truncate -s 1M rootfs_ota.ext4
$ mkfs.ext4 -F -i 8192 rootfs_ota.ext4 -L otaroot -U fd5f8768-c779-4dc9-adde-165a3d863349 -d rootfs_ota
$ mkdir mnt && sudo mount rootfs_ota.ext4 mnt
$ ls -il mnt/a mnt/boot/b mnt/boot/c
12 -rw-r--r-- 1 hjia users 0 Jul 20 17:44 mnt/a
14 -rw-r--r-- 1 hjia users 0 Jul 20 17:44 mnt/boot/b
15 -rw-r--r-- 1 hjia users 0 Jul 20 17:44 mnt/boot/c
After applying this fix
$ ls -il mnt/a mnt/boot/b mnt/boot/c
12 -rw-r--r-- 3 hjia users 0 Jul 20 17:44 mnt/a
12 -rw-r--r-- 3 hjia users 0 Jul 20 17:44 mnt/boot/b
12 -rw-r--r-- 3 hjia users 0 Jul 20 17:44 mnt/boot/c
Since commit [382ed4a1 e2fsck: use proper types for variables][1]
applied, it used ext2_ino_t instead of ino_t for referencing inode
numbers, but the type of is_hardlink's `ino' should not be instead,
The ext2_ino_t is 32bit, if inode > 0xFFFFFFFF, its value will be
truncated.
Add a debug printf to show the value of inode, when it check for hardlink
files, it will always return false if inode > 0xFFFFFFFF
|--- a/misc/create_inode.c
|+++ b/misc/create_inode.c
|@@ -605,6 +605,7 @@ static int is_hardlink(struct hdlinks_s *hdlinks, dev_t dev, ext2_ino_t ino)
| {
| int i;
|
|+ printf("%s %d, %lX, %lX\n", __FUNCTION__, __LINE__, hdlinks->hdl[i].src_ino, ino);
| for (i = 0; i < hdlinks->count; i++) {
| if (hdlinks->hdl[i].src_dev == dev &&
| hdlinks->hdl[i].src_ino == ino)
Here is debug message:
is_hardlink 608, 2913DB886, 913DB886
The length of ext2_ino_t is 32bit (typedef __u32 __bitwise ext2_ino_t;),
and ino_t is 64bit on 64bit system (such as x86-64), recover `ino' to ino_t.
[1] https://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git/commit/?id=382ed4a1c2b60acb9db7631e86dda207bde6076e
Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we are creating filesystem on DAX capable device, warn if set block
size is incompatible with DAX to give admin some hint why DAX might not
be available.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Providing -S and a path to 'add_key' previously exhibited an
unintuitive behavior: instead of using the salt explicitly provided by
the user, e4crypt would use the salt obtained via
EXT4_IOC_GET_ENCRYPTION_PWSALT on the path. This was because
set_policy() was still called with NULL as salt.
With this change we now remember the explicitly provided salt (if any)
and use it as argument for set_policy().
Eventually
e4crypt add_key -S s:my-spicy-salt /foo
will now actually use 'my-spicy-salt' and not something else as salt
for the policy set on /foo.
Signed-off-by: Florian Schmaus <flo@geekplace.eu>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If tune2fs cannot perform the requested change, ensure that the MMP
block is reset to the unused state before exiting. Otherwise, the
filesystem will be left with mmp_seq = EXT4_MMP_SEQ_FSCK set, which
prevents it from being mounted afterward:
EXT4-fs warning (device dm-9): ext4_multi_mount_protect:311:
fsck is running on the filesystem
Add a test to try some failed tune2fs operations and verify that the
MMP block is left in a clean state afterward.
Lustre-bug-id: https://jira.whamcloud.com/browse/LU-13672
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
len argument in string_copy() is int, but it is used with malloc(),
strlen(), strncpy() and some callers use sizeof() to pass value in. So
it really ought to be size_t rather than int. Fix it.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If the file system is corrupted, there is a potential of a read-only
buffer overrun. Fortunately, we don't actually use the result of that
pointer dereference, and the overrun is at most 64k.
Google-Bug-Id: #158564737
Fixes: eb88b75174 ("libext2fs: make ext2fs_dirent_has_tail() more strict")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fix bitrot for building a restricted version of debugfs, which does
not require read/write access to the file system, and which only
allows access to the file system metadata.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When opening a file system which is mounted, it's possible that when
ext2fs_open2() is racing with the kernel modifying the orphaned inode
list, the superblock's checksum could be incorrect. So retry reading
the superblock in the hopes that the problem will self-correct.
Google-Bug-Id: 151453112
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When allocating blocks for an indirect block mapped file, accumulate
blocks to be zero'ed and then call ext2fs_zero_blocks2() to zero them
in large chunks instead of block by block.
This significantly speeds up mkfs.ext3 since we don't send a large
number of ZERO_RANGE requests to the kernel, and while the kernel does
batch write requests, it is not batching ZERO_RANGE requests. It's
more efficient to batch in userspace in any case, since it avoids
unnecessary system calls.
Reported-by: Mario Schuknecht <mario.schuknecht@dresearch-fe.de>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The pointer operand to the binary `+` operator must be to a complete
object type.
Signed-off-by: Michael Forney <mforney@mforney.org>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Similar to encrypt and verity, once the stable_inodes feature has been
enabled there may be files anywhere on the filesystem that require this
feature. Therefore, in general it's unsafe to allow clearing it. Don't
allow tune2fs to do so. Like encrypt and verity, it can still be
cleared with debugfs if someone really knows what they're doing.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The stable_inodes feature is intended to indicate that it's safe to use
IV_INO_LBLK_64 encryption policies, where the encryption depends on the
inode numbers and thus filesystem shrinking is not allowed. However
since inode numbers are not unique across filesystems, the encryption
also depends on the filesystem UUID, and I missed that there is a
supported way to change the filesystem UUID (tune2fs -U).
So, make 'tune2fs -U' report an error if stable_inodes is set.
We could add a separate stable_uuid feature flag, but it seems unlikely
it would be useful enough on its own to warrant another flag.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
There is an off-by-one error in dx_grow_tree() when checking whether we
can add another level to the tree. Thus we can grow tree too much
leading to possible crashes in the library or corrupted filesystem. Fix
the bug.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
dx_lookup() uses errcode_t return values. As such anything non-zero is
an error, not values less than zero. Fix the error checking to avoid
crashes on corrupted filesystems.
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The error code allows the kernel to bucket the possible cause of an
ext4 corruption by Unix errno codes (e.g., EIO, EFSBADCRC, ESHUTDOWN,
etc.)
Signed-off-by: Theodore Ts'o <tytso@mit.edu>