Casefold can be safely disabled if there are no directories with +F
attribute ( EXT4_CASEFOLD_FL ). This checks all inodes for that flag and in
case there isn't any, it disables casefold FS feature. When FS has
directories with +F attributes, user could convert these directories,
probably by mounting FS and executing some script or by doing it
manually. Afterwards, it would be possible to disable casefold FS flag
via tune2fs.
Link: https://lore.kernel.org/r/20220708122658.17907-1-slava@bacher09.org
Signed-off-by: Slava Bacherikov <slava@bacher09.org>
Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Currently in varisous e2fsprogs tools, most notably tune2fs and e2fsck
we will get the device name by passing the user provided string into
blkid_get_devname(). This library function however is primarily intended
for parsing "NAME=value" tokens. It will return the device matching the
specified token, NULL if nothing is found, or copy of the string if it's
not in "NAME=value" format.
However in case where we're passing in a file name that contains an
equal sign blkid_get_devname() will treat it as a token and will attempt
to find the device with the match. Likely finding nothing.
Fix it by checking existence of the file first and then attempt to call
blkid_get_devname(). In case of a collision, notify the user and
automatically prefer the one returned by blkid_get_devname(). Otherwise
return either the existing file, or NULL.
We do it this way to avoid some existing file in working directory (for
example LABEL=volume-name) masking an actual device containing the
matchin LABEL. User can specify full, or relative path (e.g.
./LABEL=volume-name) to make sure the file is used instead.
Link: https://lore.kernel.org/r/20220812130122.69468-1-lczerner@redhat.com
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reported-by: Daniel Ng <danielng@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Neither of these two warnings can actually happen (other limits will
be hit first), but widening the integer to a 64-bit unsigned integer
is an cheap and effective way to silence the Coverity warnings.
Addresses-Coverity-Bug: 1500760
Addresses-Coverity-Bug: 1507886
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
It is logal (albeit rare) for the number of block groups per flex_bg
to 2**31 (which effectively means to put all of the block groups into
a single flex_bg). However, in that case "1 << 31" is undefined on
architectures with a 32-bit integer. Fix this UBSAN complaint by
using "1U << 31" instead.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fix the "chattr -h" usage message to properly document that the
"-p" option takes a project argument, like "-v" takes a version.
Update the man page formatting to emphasize literal strings.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Implement support for FS_IOC_SETFSLABEL and FS_IOC_GETFSLABEL ioctls.
Try to use the ioctls if possible even before we open the file system
since we don't need it. Only fall back to the old method in the case the
file system is not mounted, is mounted read only in the set label case,
or the ioctls are not suppported by the kernel.
The new ioctls can also be supported by file system drivers other than
ext4. As a result tune2fs and e2label will work for those file systems
as well as long as the file system is mounted. Note that we still truncate
the label exceeds the supported lenghth on extN file system family, while
we keep the label intact for others.
Update tune2fs and e2label as well.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This patch adds an extended option "assume_storage_prezeroed" to
mke2fs. When enabled, this option acts as a hint to mke2fs that the
underlying block device was zeroed before mke2fs was called. This
allows mke2fs to optimize out the zeroing of the inode table and the
journal, which speeds up the filesystem creation time.
Additionally, on thinly provisioned storage devices (like Ceph,
dm-thin, newly created sparse loopback files), reads on unmapped
extents return zero. This property allows mke2fs (with
assume_storage_prezeroed) to avoid pre-allocating metadata space for
inode tables for the entire filesystem and saves space that would
normally be preallocated for zero inode tables.
Tests
-----
1) Running 'mke2fs -t ext4' on 10G sparse files on an ext4
filesystem drops the time taken by mke2fs from 0.09s to 0.04s
and reduces the initial metadata space allocation (stat on
sparse file) from 139736 blocks (545M) to 8672 blocks (34M).
2) On ChromeOS (running linux kernel 4.19) with dm-thin
and 200GB thin logical volumes using 'mke2fs -t ext4 <dev>':
- Time taken by mke2fs drops from 1.07s to 0.08s.
- Avoiding zeroing out the inode table and journal reduces the
initial metadata space allocation from 0.48% to 0.01%.
- Lazy inode table zeroing results in a further 1.45% of logical
volume space getting allocated for inode tables, even if no file
data is added to the filesystem. With assume_storage_prezeroed,
the metadata allocation remains at 0.01%.
[ Fixed regression test to work on newer versions of e2fsprogs -- TYT ]
Signed-off-by: Sarthak Kukreti <sarthakkukreti@chromium.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Print inode number of orphan file in outputs, dump e2image file to
filesystem image.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When tune2fs is enabling quota feature, it looks for old-style quota
files and tries to transfer limits stored in these files into newly
created hidded quota files. However the code doing the transfer setups
the quota scan wrongly and instead of transferring limits we transfer
usage. So not only quota limits are lost (at least they can still be
recovered from the old quota files) but also usage information may be
wrong if the accounting in e2fsprogs does not exactly match the
accounting in quota-tools (which is actually the case). Fix the setup of
the quota scan.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
quota_update_limits() is a misnomer because what it actually does is
that it updates 'usage' counters and leaves 'limit' counters intact.
Rename quota_update_limits() to quota_read_all_dquots() and while
changing prototype also add a flags argument so that callers can control
which quota information is actually updated from the disk.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This is useful for keeping the regression test results consistent
when run on non-Linux systems, especially on GNU Hurd.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Since we have done a lot of testing with a cluster size equal to 64k
(or 16 times the default 4k block size), mke2fs will only warn for
bigalloc file systems where the cluster size is greater than 16 times
the block size.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Filesystems with 128-byte inodes do not support timestamps beyond the
year 2038. Since we're now less than 16.5 years away from that point,
it's time to start warning users about this lack of support when they
format an ext4 filesystem with small inodes.
(Note that even for ext2 and ext3, we changed the default for
non-small file systems in 2008 in commit commit b1631cce64 ("Create
new filesystems with 256-byte inodes by default").)
So change the mke2fs.conf file to specify 256-byte inodes even for
small filesystems, and then add a warning to mke2fs itself if someone
is trying to make us format a file system with 128-byte inodes. This
can be suppressed by setting the boolean option warn_y2038_dates in
the mke2fs.conf file to false, which we do in the case of GNU Hurd,
since it only supports 128 byte inodes as of this writing.
[ Patch reworked by tytso to only warn in the case of GNU Hurd, since
the default for ext2/ext3 was changed for all but small file systems
in 2008. ]
Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Dump quota files to resulting filesystem image. They are fs metadata.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In lsattr_dir_proc(), if malloc() return NULL, it will cause
a segmentation fault problem.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The mke2fs program should allow creating a file system image when an
explicit file system size is specified, even if the file doesn't yet
exist. By deferring the call to check_plausible() in commit
942b00cb9d ("mke2fs: do not warn about a pre-existing partition
table when using a non-zero offset") this behaviour was broken.
Fix this regression by explicitly creating the file if the file system
size is specified.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Use target.bionic.system_shared_libs when it is used to limit the
default shared libraries (as opposed to remove them completely).
This avoids attempting to add a host dependency on libc when
system_shared_libs is modified to apply to all variants.
Also remove system_shared_libs from static binaries where it has
no effect, and consolidate it into e2fsprogs-defaults.
Bug: 193559105
Test: m checkbuild
Change-Id: I2d447b006afc783f4acd6c1acd93f338a68a01ed
From AOSP commit: 48fa7248112701c30d3cabfb8d3360b2408d6491
Address a number of signed vs. unsigned comparison errors, unused
function parameters, casts which drop const, etc.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The existing code attempted to avoid warning about a pre-existing file
system with a non-zero offset, but because the offset was not set at
the time of the check, this intention was not actually working. So
this commit will suppress warnings about pre-existing a partition
table as well as pre-existing file system when there is a non-zero
offset.
Addresses-Debian-Bug: #989612
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Commit d2bfdc7ff1 ("Use punch hole as "discard" on regular files")
added a test to see if the storage device actually supports discard.
The intent was to try discarding the first block but since
io_channel_discard() interprets the offset and count arguments in
blocks, and not bytes, mke2fs was actually discarding the first 16
megabytes (when the block size is 4k). This is normally not a
problem, since most file systems are larger than that, and requests to
discard beyond the end of the block device are ignored.
However, when creating a small file system as part of a image
containing multiple partitions, the initial test discard can end up
discarding data beyond the file system being created.
Addresses-Debian-Bug: #989630
Reported-by: Josh Triplett <josh@joshtriplett.org>
Fixes: d2bfdc7ff1 ("Use punch hole as "discard" on regular files")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In scandir(), temp_list[num_dent] is allocated by calling
malloc(), we should check whether malloc() returns NULL before
accessing temp_list[num_dent].
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In zap_sector(), need free buf before return,
otherwise it will cause memory leak.
Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com>
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Reviewed-by: Wu Bo <wubo40@huawei.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Fix all warnings about unused variables that were introduced since
e2fsprogs v1.45.4.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Since commit e8c858047b ("libext2fs: fix build issue for on
Windows/Cygwin systems"), ext2fs_get_device_size2() is available in
Windows builds of libext2fs. So there is no need for mke2fs to call
ext2fs_get_device_size() instead.
This fixes a -Wincompatible-pointer-types warning because
ext2fs_get_device_size() was being passed a 'blk64_t *', but it expected
a 'blk_t *'.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When adding or removing journal from a filesystem, we also need to add /
remove journal blocks from overhead stored in the superblock. Otherwise
total number of blocks in the filesystem as reported by statfs(2) need
not match reality and could lead to odd results like negative number of
used blocks reported by df(1).
Fixes: 9046b4dfd0ce ("mke2fs: set overhead in super block")
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Add '-V' to filefrag to print the installed version of the tool.
If '-V' is used twice, print out the list of supported FIEMAP flags.
This can be used to check if filefrag understands a specific feature.
Include FIEMAP in the error message printed when filefrag cannot
get the file layout. Since FIEMAP is commonly available and tried
first, it should also be mentioned in the error message unless it
was requested to only run FIBMAP.
Update filefrag.1.in man page to cover the new -V option.
Fix a formatting error with the recently added '-P' options, and
include '-E' and '-P' in the SYNOPSIS section.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Lustre-bug-id: https://jira.whamcloud.com/browse/LU-11848
Reviewed-by: Wang Shilong <wshilong@whamcloud.com>
Change-Id: Ib126bdd70efa1775aef6db761f54e27a593ebbe5
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reorganize the e2image.8 man page so that the command-line options
are listed in a dedicated OPTIONS section, rather than being
interspersed among the text in the DESCRIPTION section. Otherwise,
it is difficult to determine which options are available, and to
find where each option is described.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In set_inode_xattr(), there are two returns as follows,
-
return retval;
return 0;
-
Here, we remove useless 'return 0;' code.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This adds support for setting/querying the FS_NOCOMP_FL/EXT2_NOCOMPR_FL
file flag to chattr/lsattr. I picked the character "m" because it was
so far unused and all other characters that were more obvious candidates
were already taken.
The flag is available on btrfs, and with this patch it is possible to
manage it correctly.
Signed-off-by: Lennart Poettering <lennart@poettering.net>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Replace the remaining loff_t uses with ext2_loff_t, as
was done in patch 1df6a4555, since loff_t is a GCC'ism
and is not portable.
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Follow the example in do_mkdir_internal, and do_symlink_internal,
and find the correct parent directory if the destination
is not just a plain basename.
https://github.com/tytso/e2fsprogs/issues/61
Signed-off-by: Earl Chew <earl_chew@yahoo.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Also remove a dead assignment (the value of try is overwritten after
the continue statement).
Addresses-Coverity-Bug: 1297515
Addresses-Coverity-Bug: 1464573
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
On most systems where we compile e2fsprogs, the u64 type is an
unsigned long long. However, there are platforms (such as the
PowerPC) where a long 64-bits and so u64 is typedef'ed to be unsigned
long instead of a unsigned long long. Fix this by using explicit
casts in printf statements. For scanf calls, we need to receive the
value into a unsigned long long, and then assign it to a u64, after
doing range checks.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If an inode which is copied into a file system using "mke2fs -d" has
an ACL (or extended attributes) and it is also using inline data, when
the extended attribute(s) are copied in, the inline data gets dropped due to a missing call to ext2fs_xattrs_read().
Addresses-Debian-Bug: #971014
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The vendor_ramdisk variant is dynamic, unlike the ramdisk variant.
Test: builds
Google-Bug-Id: 173425293
Change-Id: I45547b5ea99aae98727121c038129844b7930ed6
From AOSP commit: 073ede3200afeffd82889cb61a71fa1947314476
In preparation for upcoming kernel changes that will make the kernel
support both encryption and casefolding at the same time, allow mke2fs
to enable both features at the same time.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Google-Bug-Id: 138322712
Test: Create fs with casefold and encryption enabled via mke2fs
Change-Id: I4e2350e43e21cffb3d972310cd74df1e662bf87e
From AOSP commit: f8fc427df385260f3424e1e9d5485c8640606920
In preparation for upcoming kernel changes that will make the kernel
support both encryption and casefolding at the same time, allow tune2fs
to enable both features at the same time.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Google-Bug-Id: 138322712
Test: Create fs with casefold and encryption enabled via tune2fs
Change-Id: I36537a8b6dc5e2997b7016212f9b574c76673067
From AOSP commit: 44ac9dccd9853cc9807f0d6ce952bdcdeb93954a
This allows tune2fs to enable casefolding on an existing filesystem.
At the moment, casefolding is incompatible with encryption.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Google-Bug-Id: 138322712
Test: Create fs without casefold and enable it via tune2fs
Change-Id: Ic9ed63180ef28c36e083cee85ade432e4bfcc654
From AOSP commit: eb5b168decac07058e90ead191350be80c75aff4
Refering to EXT4_INCOMPAT_CASEFOLD as encoding is not as meaningful as
saying casefold.
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The main reason we didn't allow this before was because !CASEFOLDED
directories were expected to be normalized(). Since this is no longer
the case, and as long as the encrypt feature is not enabled, it should
be safe to enable this feature.
Disabling the feature is trickier, since we need to make sure there are
no existing +F directories in the filesystem. Leave that for a future
patch.
Also, enabling strict mode requires some filesystem-wide verification,
so ignore that for now.
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
Signed-off-by: Arnaud Ferraris <arnaud.ferraris@collabora.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Clang gets unhappy when passing an unsigned char to string functions.
For better or for worse we use __u8[] in the definition of the
superblock. So cast them these to "char *" to prevent clang
build-time warnings.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Explain which valid block sizes mke2fs supports in more detail.
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The $(SYSLIBS) was missing when linking the e4crypt application. This is
available in the e4crypt.profiled variant, so I assume this was just
missing in the normal variant and is not left out intentionally.
Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We were not checking the return value of check_fsck_needed() when
checking to clear the dir_index feature. As a result, tune2fs would
print that the file system needed to be checked first, but then go
ahead and clear the dir_index flag.
Addresses-Coverity-Bug: 1467671
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The function calculate_summary_stats sets the global metadata of the
file system. Tune2fs had this function defined statically in
tune2fs.c. Fast commit replay needs this function to set global
metadata at the end of the replay phase. So, move this function to
libext2fs.
Signed-off-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com>
Reviewed-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Currently, when constructing the <default> configuration pseudo-file using
the profile-to-c.awk script we will just pass the double quotes as they
appear in the mke2fs.conf.
This is problematic, because the resulting default_profile.c will either
fail to compile because of syntax error, or leave the resulting
configuration invalid.
It can be reproduced by adding the following line somewhere into
mke2fs.conf configuration and forcing mke2fs to use the <default>
configuration by specifying nonexistent mke2fs.conf
MKE2FS_CONFIG="nonexistent" ./misc/mke2fs -T ext4 /dev/device
default_mntopts = "acl,user_xattr"
^ this will fail to compile
default_mntopts = ""
^ this will result in invalid config file
Syntax error in mke2fs config file (<default>, line #4)
Unknown code prof 17
Fix it by escaping the double quotes with a backslash in
profile-to-c.awk script.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
It is possible to crash filefrag with a "Floating point exception" in
two different scenarios:
1. When fstat() returns a device ID set to 0
2. When FIGETBSZ ioctl returns a blocksize of 0
In both scenarios a divide-by-zero will occur in frag_report() because
variable blksize will be set to zero.
I've managed to trigger this crash with an old CephFS kernel client,
using xfstest generic/519. The first scenario has been fixed by kernel
commit 75c9627efb72 ("ceph: map snapid to anonymous bdev ID"). The
second scenario is also fixed with commit 8f97d1e99149 ("vfs: fix
FIGETBSZ ioctl on an overlayfs file").
However, it is desirable to handle these two scenarios gracefully by
checking these conditions explicitly.
Signed-off-by: Luis Henriques <lhenriques@suse.de>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
populate_fs do copy the xattrs for all files and directories, but the
root directory is skipped and as a result its extended attributes aren't
set. This is an issue when using mkfs to build a full system image that
can be used with SElinux in enforcing mode without making any runtime
fix at first boot.
This patch adds logic to set the root directory's extended attributes.
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In the case where mkdir -p is not thread-safe (for example, if the
build environment is using busybox's mkdir) the configure script will
fall back to the slow (but safe) install-sh script. In that case
MKDIR_P will be using a relative pathname; so we can't use speed
optimization of defining configure substitutions in MCONFIG.in, since
the substitution will be different depending on depth of the
subdirectory in the Makefile.in file.
https://github.com/tytso/e2fsprogs/issues/51
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Use the letter 'x' to set/get dax attribute on a directory/file.
Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
While file has a large inode number (> 0xFFFFFFFF), mkfs.ext4 could
not parse hardlink correctly.
Prepare three hardlink files for mkfs.ext4
$ ls -il rootfs_ota/a rootfs_ota/boot/b rootfs_ota/boot/c
11026675846 -rw-r--r-- 3 hjia users 0 Jul 20 17:44 rootfs_ota/a
11026675846 -rw-r--r-- 3 hjia users 0 Jul 20 17:44 rootfs_ota/boot/b
11026675846 -rw-r--r-- 3 hjia users 0 Jul 20 17:44 rootfs_ota/boot/c
$ truncate -s 1M rootfs_ota.ext4
$ mkfs.ext4 -F -i 8192 rootfs_ota.ext4 -L otaroot -U fd5f8768-c779-4dc9-adde-165a3d863349 -d rootfs_ota
$ mkdir mnt && sudo mount rootfs_ota.ext4 mnt
$ ls -il mnt/a mnt/boot/b mnt/boot/c
12 -rw-r--r-- 1 hjia users 0 Jul 20 17:44 mnt/a
14 -rw-r--r-- 1 hjia users 0 Jul 20 17:44 mnt/boot/b
15 -rw-r--r-- 1 hjia users 0 Jul 20 17:44 mnt/boot/c
After applying this fix
$ ls -il mnt/a mnt/boot/b mnt/boot/c
12 -rw-r--r-- 3 hjia users 0 Jul 20 17:44 mnt/a
12 -rw-r--r-- 3 hjia users 0 Jul 20 17:44 mnt/boot/b
12 -rw-r--r-- 3 hjia users 0 Jul 20 17:44 mnt/boot/c
Since commit [382ed4a1 e2fsck: use proper types for variables][1]
applied, it used ext2_ino_t instead of ino_t for referencing inode
numbers, but the type of is_hardlink's `ino' should not be instead,
The ext2_ino_t is 32bit, if inode > 0xFFFFFFFF, its value will be
truncated.
Add a debug printf to show the value of inode, when it check for hardlink
files, it will always return false if inode > 0xFFFFFFFF
|--- a/misc/create_inode.c
|+++ b/misc/create_inode.c
|@@ -605,6 +605,7 @@ static int is_hardlink(struct hdlinks_s *hdlinks, dev_t dev, ext2_ino_t ino)
| {
| int i;
|
|+ printf("%s %d, %lX, %lX\n", __FUNCTION__, __LINE__, hdlinks->hdl[i].src_ino, ino);
| for (i = 0; i < hdlinks->count; i++) {
| if (hdlinks->hdl[i].src_dev == dev &&
| hdlinks->hdl[i].src_ino == ino)
Here is debug message:
is_hardlink 608, 2913DB886, 913DB886
The length of ext2_ino_t is 32bit (typedef __u32 __bitwise ext2_ino_t;),
and ino_t is 64bit on 64bit system (such as x86-64), recover `ino' to ino_t.
[1] https://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git/commit/?id=382ed4a1c2b60acb9db7631e86dda207bde6076e
Signed-off-by: Hongxu Jia <hongxu.jia@windriver.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we are creating filesystem on DAX capable device, warn if set block
size is incompatible with DAX to give admin some hint why DAX might not
be available.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Providing -S and a path to 'add_key' previously exhibited an
unintuitive behavior: instead of using the salt explicitly provided by
the user, e4crypt would use the salt obtained via
EXT4_IOC_GET_ENCRYPTION_PWSALT on the path. This was because
set_policy() was still called with NULL as salt.
With this change we now remember the explicitly provided salt (if any)
and use it as argument for set_policy().
Eventually
e4crypt add_key -S s:my-spicy-salt /foo
will now actually use 'my-spicy-salt' and not something else as salt
for the policy set on /foo.
Signed-off-by: Florian Schmaus <flo@geekplace.eu>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If tune2fs cannot perform the requested change, ensure that the MMP
block is reset to the unused state before exiting. Otherwise, the
filesystem will be left with mmp_seq = EXT4_MMP_SEQ_FSCK set, which
prevents it from being mounted afterward:
EXT4-fs warning (device dm-9): ext4_multi_mount_protect:311:
fsck is running on the filesystem
Add a test to try some failed tune2fs operations and verify that the
MMP block is left in a clean state afterward.
Lustre-bug-id: https://jira.whamcloud.com/browse/LU-13672
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Similar to encrypt and verity, once the stable_inodes feature has been
enabled there may be files anywhere on the filesystem that require this
feature. Therefore, in general it's unsafe to allow clearing it. Don't
allow tune2fs to do so. Like encrypt and verity, it can still be
cleared with debugfs if someone really knows what they're doing.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The stable_inodes feature is intended to indicate that it's safe to use
IV_INO_LBLK_64 encryption policies, where the encryption depends on the
inode numbers and thus filesystem shrinking is not allowed. However
since inode numbers are not unique across filesystems, the encryption
also depends on the filesystem UUID, and I missed that there is a
supported way to change the filesystem UUID (tune2fs -U).
So, make 'tune2fs -U' report an error if stable_inodes is set.
We could add a separate stable_uuid feature flag, but it seems unlikely
it would be useful enough on its own to warrant another flag.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The loff_t type is a glibc'ism and is not fully portable. Use
ext2_loff_t instead.
Fixes: 382ed4a1c2 ("e2fsck: use proper types for variables")
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reported-by: Matthias Andree <matthias.andree@gmx.de>
Enable the build of e2freefrag to monitor the free space fragmentation
in ext2/3/4 filesystems.
Bug: 146078546
Test: m + e2freefrag on device
Change-Id: Ia56e443a789ae881a03bdaeae1093567e1736c4c
Signed-off-by: Alessio Balsini <balsini@google.com>
From AOSP commit: ab77f6c79f3dab697cd56ad3b793e7d555ac9415
We want to start shipping the toybox chattr and lsattr on the device all
the time, so the build system rightly complains that then we'd have two
modules with the same name.
I went with a suffix rather than a prefix so that tab completion works
for folks still wanting to use the e2fsprogs versions.
Bug: http://b/147769529
Test: builds
Change-Id: Ib904fa6c709d29ce709302c61e452383c02cb9a3
From AOSP commit: 8525a455e7410461560a99a42feb0dbfabab5c8e
When clearing dir_index feature while metadata_csum is enabled, we have
to rewrite checksums of all indexed directories to update checksums of
internal tree nodes.
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Plural form "directories" should be used along with "files".
"id's" should be "ids" (i.e., plural form, not apostrophe).
"much" should "must".
Signed-off-by: Sawood Alam <ibnesayeed@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Introduced with commit e7236a9476,
it was later renamed to encoding, and turned into a fs_type-only
option with commit 28887533bb.
Hence, remove an option that does not exist in the default
configuration.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Set the directory for directories in cases where the owner permissions
is not rwx. This was reported[1] by Robert Yang but we are using a
different approach to fixing the issue.
[1] https://lore.kernel.org/r/1582542522-97508-1-git-send-email-liezhi.yang@windriver.com
Also set the permissions in a more portable way by making a
distinction between the host OS's permissions stats and Linux's
permissions. We still assume the low 12 bits are the historical Unix
assignments, but we don't assume ST_IFMT bits are the same as Linux's.
Reported-by: Robert Yang <liezhi.yang@windriver.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Avoid overflowing the column-width calc printing files over 4B blocks.
Document the [KMG] suffixes for the "-b <blocksize>" option.
The blocksize is limited to at most 1GiB blocksize to avoid shifting
all extents down to zero GB in size. Even the use of 1GB blocksize
is unlikely, but non-ext4 filesystems may use multi-GB extents.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Lustre-bug-id: https://jira.whamcloud.com/browse/LU-13197
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Even though we don't have support for filesystems with over 4B inodes
in the current e2fsprogs, this may happen in the future. There are
latent overflow bugs when calculating the number of inodes in the
filesystem that can trivially be fixed now, rather than waiting for
them to be hit at some point in the future. The block number calcs
are already correct in this code.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Lustre-bug-id: https://jira.whamcloud.com/browse/LU-13197
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Use ext2_ino_t instead of ino_t for referencing inode numbers.
Use loff_t for for file offsets, and dgrp_t for group numbers.
Cast products to ssize_t before multiplication to avoid overflow.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Reviewed-by: Shilong Wang <wshilong@ddn.com>
Lustre-bug-id: https://jira.whamcloud.com/browse/LU-13197
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
By convention, lists of options in man pages use a label followed by an
indented description, such as this example from the Options section:
-R Recursively change attributes of directories and
their contents.
But the Attributes section places the available attributes mid-sentence,
which makes it visually more difficult to parse:
A file with the 'a' attribute set can only be opened
in append mode for writing. [...]
When a file with the 'A' attribute set is accessed, its
atime record is not modified. [...]
This patch places a label beside each attribute description, which (in
my opinion) improves readability, especially when visually skimming the
list. For example:
a A file with the 'a' attribute set can only be
opened in append mode for writing.
A When a file with the 'A' attribute set is accessed,
its atime record is not modified.
Signed-off-by: Jeremy Visser <jeremyvisser@google.com>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If overhead is not recorded in the super block, it is calculated
during mount in kernel, for bigalloc file systems it takes
O(groups**2) in time. For a 1PB device with 32K cluster size it takes
~12 mins to mount, with most of the time spent on figuring out
overhead.
While we can not improve the overhead algorithm in kernel due to the
nature of bigalloc, we can work out the overhead during mke2fs and set
it in the super block, avoiding calculating it every time when it
mounts.
Overhead is s_first_data_block plus internal journal blocks plus the
block and inode bitmaps, inode table, super block backups and group
descriptor blocks for every group. This patch introduces
ext2fs_count_used_clusters(), which calculates the clusters used in
the block bitmap for the given range.
When bad blocks are involved, it gets tricky because the blocks
counted as overhead and the bad blocks can end up in the same
allocation cluster. In this case we will unmark the bad blocks from
the block bitmap, convert to cluster bitmap and get the overhead, then
mark the bad blocks back in the cluster bitmap.
Reset the overhead to zero when resizing, we can not simply count the
used blocks as overhead like we do when mke2fs. The overhead can be
calculated by kernel side during mount.
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The printf("%.*s") format requires both the buffer size and buffer
pointer to be specified for each use. Since this is repeatedly given
as "(int)sizeof(buf), (char *)buf" for mmp_nodename and mmp_bdevname
fields, with typecasts to avoid compiler warnings.
Add a helper macro EXT2_LEN_STR() to avoid repeated boilerplate code.
This can also be used for other superblock buffer fields that may not
have NUL-terminated strings (e.g. s_volume_name, s_last_mounted,
s_{first,last}_error_func, s_mount_opts) to simplify code and avoid
the need for temporary buffers for NUL-termination.
Annotate the superblock string fields that may not be NUL-terminated.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Don't assume that mmp_nodename and mmp_bdevname are NUL terminated,
since very long node/device names may completely fill the buffers.
Limit string printing to the maximum buffer size for safety, and
change the field definitions to __u8 to make it more clear that
they are not NUL-terminated strings, as is done with other strings
in the superblock that do not have NUL termination.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Ext4 has an extent status cache; add the fiemap extensions so we can
query the kernel for the extent status cache information.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We mark the bad blocks as used on fs->block_map before allocating
group tables. Don't translate the block number to cluster number when
doing this, the fs->block_map is still a block-granularity allocation
map, it will be coverted later by ext2fs_convert_subcluster_bitmap().
Signed-off-by: Li Dongyang <dongyangli@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Also, add a missing dash and two missing brackets and two missing
spaces, and remove three excess spaces.
Signed-off-by: Benno Schulenberg <bensberg@telfort.nl>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The function ext2fs_inode_i_blocks() is a bit confusing whether it is
returning the inode's i_blocks value, or whether it is returning the
value ala the stat(2) system call, which returns i_blocks in units of
512 byte sectors. This caused ext2fs_inode_i_blocks() to be
incorrectly used in fuse2fs and the function quota_compute_usage().
To address this, we add a new function, ext2fs_get_stat_i_blocks()
which is clearly labelled what it is returning, and use it in fuse2fs
and quota_compute_usage(). It's also a bit more convenient to use it
in e2fsck, so use it there too.
Reported-by: Wang Shilong <wangshilong1991@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Previously, uids were truncated at 16 bits because we weren't properly
handling i_uid_high and i_gid_high.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This program calls a few ext2fs library functions used by the current
generation of libext2fs fuzzers, and is helpful in reproducing UBSAN
failures reported externally.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
An internal customer followed an erroneous AskUbuntu article[1] to try to
change the UUID of a live ext4 filesystem. The article claims that you
can work around tune2fs' "cannot change UUID on live fs" error by
disabling uninit_bg, changing the UUID, and re-enabling the feature.
This led to metadata corruption because tune2fs' journal descriptor
rewrite races with regular filesystem writes. Therefore, prevent
administrators from turning on or off uninit_bg on a mounted fs.
[1] https://askubuntu.com/questions/132079/how-do-i-change-uuid-of-a-disk-to-whatever-i-want/195839#459097
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Setting a bit to 0 is normally called "clearing", not "resetting"; and
chattr.1 already says "clear" in some places. Use "clear" consistently.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
This old text was never updated to mention ext4 in addition to ext2 and
ext3. Do so now. Also don't bother to mention old unmerged patches.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
- "can only be open" => "can only be opened"
- "is not candidate" => "is not a candidate"
- "written ... on the disk" => "written ... to the disk"
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When the casefold attribute ('F') was added to the chattr man page, it
was forgotten to add it to the mode string. Add it.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Adjust the documentation for the encryption attribute ('E') to clarify
that encryption isn't experimental anymore and isn't restricted to
regular files, and that the encryption is done by the filesystem.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
We had previously stuck to using the names from ext3/jbd kernel files,
and used a script in contrib/jbd2-resync.sh to convert the kernel
files to use the ext3/jbd conventions so we could keep the files
e2fsck/recovery.c and e2fsck/revoke.c in sync with jbd2/recovery.c and
jbd2/revoke.c, respectively.
This has been getting harder and harder, so let's make a global sweep
through e2fsprogs to use the jbd2 names. Fortunately none of the
ext3/jbd names had leaked out into publically exported header files,
so this is only an internal change. Which looks scary, but it's
basically a search and replace, so if it compiles it's going to be
correct.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reserve the codepoint for EXT4_FEATURE_COMPAT_STABLE_INODES, allow it to
be set and cleared, and teach resize2fs to forbid shrinking the
filesystem if it is set.
This feature will allow the use of encryption policies where the inode
number is included in the IVs (initialization vectors) for encryption,
so data would be corrupted if the inodes were to be renumbered.
For more details, see the kernel patchset:
https://lkml.kernel.org/linux-fsdevel/20191021230355.23136-1-ebiggers@kernel.org/T/#u
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
From AOSP commit: 9aa30c254dd57df54f00c5d520b7ac867ad7ca68
Opening the file system with EXT2_FLAG_SUPER_ONLY will leave
fs->group_desc to be NULL and modify "dumpe2fs -h" and tune2fs when it
is emulating e2label to use this flag. This speeds up "dumpe2fs -h"
and "e2label" when operating on very large file systems.
To allow other libext2fs functions to work without too many surprises,
ext2fs_group_desc() will read in the block group descriptors on
demand. This allows "dumpe2fs -h" to be able to read the journal
inode, for example.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Cray-bug-id: LUS-5777
Try to make it clearer that enabling 'encrypt' just enables *support*
for encryption; it doesn't actually encrypt anything by itself.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Try to make it clearer that enabling 'encrypt' just enables *support*
for encryption; it doesn't actually encrypt anything by itself.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The features are listed in alphabetic order, so put the casefold feature
in the right place.
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Teach fuse2fs the "-o norecovery" option, which will suppress any
journal replay that might be necessary, and mounts the file system
read-only.
Addresses-Debian-Bug: #878927
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The nonempty option isn't supported by fuse3, and so if fusermount is
from fuse3, having fuse2fs specify nonempty automatically will prevent
fuse2fs from working correctly.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Almost all posix_ functions return a positive errno value (without
setting errno) rather than -1 and setting errno. Most calls in this
project were correct, but these two weren't.
Reported-by: Hughes <enh@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In some cases if the translation file is missing some translations,
mke2fs can end up printing an English message, e.g.:
% LANG=it_IT.UTF-8 ./mke2fs /tmp/foo.img 8M
mke2fs 1.45.1 (12-May-2019)
/tmp/foo.img contiene un file system ext4
created on Mon May 27 19:35:48 2019
Proceed anyway? (y,N)
However, if there is a translation for string to match with "yY"
(e.g., to "sS" for Italian), then 'y' won't work. Fix this by falling
back to the english 'yY' characters.
Addresses-Debian-Bug: #907034
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The check in mke2fs is intended to be for the number of blocks in the
filesystem exceeding the maximum number of addressable blocks in 2^32
bitmaps, which is (2^32 * 8 bits/byte * blocksize) = 2^47 blocks,
or 2^59 bytes = 512PiB for the common 4KiB blocksize.
However, s_log_blocksize holds log2(blocksize_in_kb), so the current
calculation is a factor of 2^10 too small. This caused mke2fs to fail
while trying to format a 900TB filesystem.
Fixes: 101ef2e93c ("mke2fs: Avoid crashes / infinite loops for absurdly large devices")
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Also change mke2fs so that the encoding and encoding flags are
specified in mke2fs.conf in the fs_types and defaults stanzas instead
of the options stanza.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>