If -E unshare_blocks is used with -n, it will normally fail since the
filesystem is read-only. For Android's "adb remount" it is more useful
to report whether or not the unshare operation would succeed, were the
filesystem writable. We do that here by ignoring certain write
operations if -E unshare_blocks is specified with -n. It is not perfect,
since the actual unshare operation could still fail (for example if
new extents need to consume additional blocks).
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Google-Bug-Id: 64109868
Test: e2fsck -f -n -E unshare_blocks on deduplicated image
Change-Id: Ia50ceb7b3745fdf8766cff06c697818f07411635
From AOSP commit: 9e76dc0f65d8a8dec27f57b9020e81cbbbe12faf
Add an -E unshare_blocks flag for unsharing blocks that were created for
a filesystem with block sharing enabled. If the filesystem does not have
this feature enabled, the flag has no effect. If the filesystem does not
have free space, e2fsck will error.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Google-Bug-Id: 64109868
Test: f_unshare_blocks_no_space, f_unshare_blocks_ok
Change-Id: I8821353e9e6200c6c0c71dd22f4f43d796fc720c
From AOSP commit: 8ba190e3135d61501d3a694b6960c2fbee98e7a6
As e2fsck processes each file in pass1, the actual file system quota is
increased by the number of blocks discovered in the file. This can
include both non-multiply-claimed and multiply-claimed blocks, if the
latter exist. However, if a file containing multiply-claimed blocks
is then deleted in pass1b, those blocks are not taken into account when
decreasing the actual quota. In this case, the new quota values written
to the file system by e2fsck overstate the space actually consumed.
And, e2fsck must be run twice on the file system to fully correct
quota.
Fix this by counting multiply-claimed blocks as a debit to quota when
deleting files in pass1b.
Signed-off-by: Eric Whitney <enwlinux@gmail.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andreas Dilger <adilger@dilger.ca>
Use a large_inode so that when e2fsck is fixing a file system with
project quota enabled, the correct project id's quota is adjusted when
a corrupted inode is deleted.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Create separate predicate functions to test/set/clear feature flags,
thereby replacing the wordy old macros. Furthermore, clean out the
places where we open-coded feature tests.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The quota code required that we included dict.o in libsupport.a, so we
might as well just move dict.c and dict.h to lib/support, and then
have e2fsck use the version of dict.c in libsupport.a. This
simplifies the build system and eliminates having two identical copies
of dict.o floating around in the build tree.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The compression patches were an out-of-kernel patch set that was (a)
only available for ext2, (b) something that was never could be
stablized due to file system corruption, and (c) the most recent
patches were for 3.1, last updated in 2011.
The history of the compression patches has been a bit checkered.
There is a long history here at http://e2compr.sourceforge.net which
lists the perspective of the people working on it from the e2compr
side.
From the ext2/3/4 mainline developers' perspective, initial
compression support was added to e2fsprogs in 2000 (in the Linux 2.2
era), but due to stability concerns the kernel patches were never
merged into the mainline kernel. While there were some sporadic
efforts to try to get the ext2 compression patches working in the 2.4
and 2.6 era, by that time mainline work had moved on to ext4, and the
e2compr approach could only work with 32-bit block numbers and
indirect mapped files.
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Decrement the bad count *after* we've shown that (a) we can allocate a
replacement block and (b) remap the file block. Unfortunately,
the only way to tell if the remapping succeeded is to wait until the
next clone_file_block() call or block_iterate3() returns.
Otherwise, there's a corruption error: we decrease the badcount once in
preparation to remap, then the remap fails (either we can't find a
replacement block or we have to split the extent tree and can't find a
new extent block), so we delete the file, which decreases the badcount
on the block a second time. Later on e2fsck will think that it's
straightened out all the duplicate blocks, which isn't true.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we're forced to delete a crosslinked file, only call
ext2fs_block_alloc_stats2() on cluster boundaries, since the block
bitmaps are all cluster bitmaps at this point. It's safe to do this
only once per cluster since we know all the blocks are going away.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
As far as I can tell, logical block mappings on a bigalloc filesystem are
supposed to follow a few constraints:
* The logical cluster offset must match the physical cluster offset.
* A logical cluster may not map to multiple physical clusters.
Since the multiply-claimed block recovery code can be used to fix these
problems, teach e2fsck to find these transgressions and fix them.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
During the later passes of efsck, we sometimes need to allocate and
map blocks into a file. This can happen either by fsck directly
calling new_block() or indirectly by the library calling new_block
because it needs to allocate a block for lower level metadata (bmap2()
with BMAP_SET; block_iterate3() with BLOCK_CHANGED).
We need to force new_block to allocate blocks from the found block
map, because the FS block map could be inaccurate for various reasons:
the map is wrong, there are missing blocks, the checksum failed, etc.
Therefore, any time fsck does something that could to allocate blocks,
we need to intercept allocation requests so that they're sourced from
the found block map. Remove the previous code that swapped bitmap
pointers as this is now unneeded.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If we encounter an inode with IND/DIND/TIND blocks or internal extent
tree blocks that point into critical FS metadata such as the
superblock, the group descriptors, the bitmaps, or the inode table,
it's quite possible that the validation code for those blocks is not
going to like what it finds, and it'll ask to try to fix the block.
Unfortunately, this happens before duplicate block processing (pass
1b), which means that we can end up doing stupid things like writing
extent blocks into the inode table, which multiplies e2fsck'
destructive effect and can render a filesystem unfixable.
To solve this, create a bitmap of all the critical FS metadata. If
before pass1b runs (basically check_blocks) we find a metadata block
that points into these critical regions, continue processing that
block, but avoid making any modifications, because we could be
misinterpreting inodes as block maps. Pass 1b will find the
multiply-owned blocks and fix that situation, which means that we can
then restart e2fsck from the beginning and actually fix whatever
problems we find.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If there's a problem with the inode scan during pass 1b, report the
inode that we were trying to examine when the error happened, not the
inode that just went through the checker.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
When pass1 finds blocks that are mapped to multiple files, it will
print every duplicated block. If there are long sequences of
duplicate blocks (e.g. the e_pblk field is wrong in an extent), this
can cause a gigantic flood of output when a range could convey the
same information. Therefore, teach pass1b to print ranges when
possible.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Compiling with LLVM generates a large number of warnings due
to the use of _() for wrapping strings for i18n:
warning: format string is not a string literal
(potentially insecure) [-Wformat-security]
./nls-enable.h:4:14: note: expanded from macro '_'
#define _(a) (gettext (a))
^~~~~~~~~~~~
These warnings are fixed by using "%s" as the format string,
and then _() is used as the string argument.
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
We need to store some error codes using an int to keep recovery.c as
close as possible to the recovery.c source file in the kernel.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Perhaps the most serious fix up is a type-punning warning which could
result in miscompilation with overly enthusiastic compilers.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Calculate and verify the checksum for separate (i.e. not in the inode)
extended attribute blocks; the checksum lives in the header.
[ Merged in change from Tao so that we always use the fs checksum seed
for the xattr blocks. ]
Signed-off-by: Darrick J. Wong <djwong@us.ibm.com>
Signed-off-by: Tao Ma <boyu.mt@taobao.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Now that we have multiple backend implementations of the bitmap code,
this commit teaches e2fsck to use either the most appropriate backend
for each use case.
Since we don't know for sure if we will get it all right, the default
choices can be overridden via e2fsck.conf. The various definitions
are shown here, with the current defaults (which may change as we add
more bitmap implementations and as learn what works better).
; EXT2FS_BAMP64_BITARRAY is 1
; EXT2FS_BMAP64_RBTREE is 2
; EXT2FS_BMAP64_AUTODIR is 3
[bitmaps]
inode_used_map = 2 ; pass1
inode_dir_map = 3 ; pass1
inode_reg_map = 2 ; pass1
block_found_map = 2 ; pass1
inode_bad_map = 2 ; pass1
inode_imagic_map = 2 ; pass1
block_dup_map = 2 ; pass1
block_ea_map = 2 ; pass1
inode_link_info = 2 ; pass1
inode_dup_map = 2 ; pass1b
inode_done_map = 3 ; pass3
inode_loop_detect = 3 ; pass3
fs_bitmaps = 2
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
We need to do an accounting of duplicate clusters on a per-cluster
instead of a per-block basis so we know when we've correctly accounted
for all of the multiply claimed blocks in a particular inode.
Thanks to Robin Dong for reporting this bug.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The ext2fs_file_acl_block() and ext2fs_set_file_acl_block() needs to
only check i_file_acl_high if the 64-bit flag is set. This is needed
because otherwise we will run into problems on Hurd systems which
actually use that field for h_i_mode_high.
This involves an ABI change since we need to pass ext2_filsys to these
functions. Fortunately these functions were first included in the
1.42-WIP series, so it's OK for us to change them now. (This is why
we have 1.42-WIP releases. :-)
Addresses-Sourceforge-Bug: #3379227
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Multi-mount protection is feature that allows mke2fs, e2fsck, and
others to detect if the filesystem is mounted on a remote node (on
SAN disks) and avoid corrupting the filesystem. For e2fsprogs this
means that it checks the MMP block to see if the filesystem is in use,
and marks the filesystem busy while e2fsck is running on the system.
This is useful on SAN disks that are shared between high-availability
servers, or accessible by multiple nodes that aren't in HA pairs. MMP
isn't intended to serve as a primary HA exclusion mechanism, but as a
failsafe to protect against user, software, or hardware errors.
There is no requirement that e2fsck updates the MMP block at regular
intervals, but e2fsck does this occasionally to provide useful
information to the sysadmin in case of a detected conflict.
For the kernel (since Linux 3.0) MMP adds a "heartbeat" mechanism to
periodically write to disk (every few seconds by default) to notify
other nodes that the filesystem is still in use and unsafe to modify.
Originally-by: Kalpak Shah <kalpak@clusterfs.com>
Signed-off-by: Johann Lombardi <johann@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The DEFS line in MCONFIG had gotten so long that it exceeded 4k, and
this was starting to cause some tools heartburn. It also made "make
V=1" almost useless, since trying to following the individual commands
run by make was lost in the noise of all of the defines.
So fix this by putting the configure-generated defines in lib/config.h
and the directory pathnames to lib/dirpaths.h.
In addition, clean up some vestigal defines in configure.in and in the
Makefiles to further shorten the cc command lines.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This patch adds support for doing quota accounting during full
e2fsck scan if the 'quota' feature was set on the superblock.
If user-visible quota inodes are in use, they will be hidden
and converted to the reserved quota inodes.
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
I had an extremely corrupted customer filesystem which, after thousands
of lines of e2fsck output, found one more problem on an immediately
subsequent e2fsck. In short, a file had had its i_file_acl block
cloned due to being a duplicate. That ultimately got cleared
because the fs did not have the xattr feature, and the inode
was subsequently removed due to invalid mode.
The 2nd e2fsck pass found the cloned xattr block as in use, but
not owned by any file, and had to fix up the block bitmaps.
Simply skipping the processing of duplicate xattr blocks on a
non-xattr filesystem seems reasonable, since they will be cleared
later in any case.
(also fix existing brace misalignment)
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
in the case of ! defined RESOURCE_TRACK, so that we can clean up #ifdef
throughout e2fsck source.
Signed-off-by: Ken Chen <kenchen@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
On ext2, time tracking for pass1 includes both error detection and
specific type of fs fix-up phase (e.g. block referenced by multiple
inodes). The multi-reference fix-up phase some time take significant
amount of time to complete. We would like to track time spent in sub
component of pass1 by having a finer granularity during pass1b through
pass1d phase.
Signed-off-by: Ken Chen <kenchen@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Factor out code to clear a bogus inode and update e2fsck's internal
data structures accordingly into a common routine,
e2fsck_clear_inode(). This saves about 200 bytes in the compiled x86
e2fsck executable, and makes the code more maintainable in the
long-term.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Another small bug I think: if the root directory contains shared
blocks, e2fsck pass1c search_dirent_proc() will be looking for
one more containing directory than it will ever find, and thus
loses an opportunity to terminate early.
Signed-off-by: Jim Garlick <garlick@llnl.gov>
I think this is a small buglet in e2fsck: if a file has multiple hard
links, e2fsck pass1c search_dirent_proc() doesn't maintain its count
properly and may return DIRENT_ABORT before it has found containing
directories for all inodes sharing blocks.
Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>