As far as I can tell, logical block mappings on a bigalloc filesystem are
supposed to follow a few constraints:
* The logical cluster offset must match the physical cluster offset.
* A logical cluster may not map to multiple physical clusters.
Since the multiply-claimed block recovery code can be used to fix these
problems, teach e2fsck to find these transgressions and fix them.
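
As a rough illustration of the two constraints listed above, the checks
might look like the following sketch. The helper names and the
cluster_ratio parameter (blocks per cluster, assumed to be a power of
two) are hypothetical and are not taken from e2fsck.

```c
#include <assert.h>
#include <stdint.h>

/* Rule 1: the offset of a block inside its cluster must be the same on
 * the logical and the physical side. Returns 1 if the rule holds. */
static int cluster_offsets_match(uint64_t lblk, uint64_t pblk,
                                 uint32_t cluster_ratio)
{
    uint64_t mask;

    /* Assumption of this sketch: cluster_ratio is a power of two. */
    assert(cluster_ratio != 0 && (cluster_ratio & (cluster_ratio - 1)) == 0);
    mask = (uint64_t)cluster_ratio - 1;
    return (lblk & mask) == (pblk & mask);
}

/* Rule 2: two blocks in the same logical cluster must live in the same
 * physical cluster. Returns 1 if the pair of mappings is consistent. */
static int logical_cluster_consistent(uint64_t lblk_a, uint64_t pblk_a,
                                      uint64_t lblk_b, uint64_t pblk_b,
                                      uint32_t cluster_ratio)
{
    assert(cluster_ratio != 0);
    if (lblk_a / cluster_ratio != lblk_b / cluster_ratio)
        return 1;   /* different logical clusters: nothing to compare */
    return pblk_a / cluster_ratio == pblk_b / cluster_ratio;
}
```

A mapping that fails either check would be a candidate for the
multiply-claimed block recovery code mentioned above.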
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
During the later passes of e2fsck, we sometimes need to allocate and
map blocks into a file. This can happen either by fsck directly
calling new_block() or indirectly by the library calling new_block
because it needs to allocate a block for lower level metadata (bmap2()
with BMAP_SET; block_iterate3() with BLOCK_CHANGED).
We need to force new_block to allocate blocks from the found block
map, because the FS block map could be inaccurate for various reasons:
the map is wrong, there are missing blocks, the checksum failed, etc.
Therefore, any time fsck does something that could allocate blocks,
we need to intercept allocation requests so that they're sourced from
the found block map. Remove the previous code that swapped bitmap
pointers as this is now unneeded.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
If there's a problem with the inode scan during pass 1b, report the
inode that we were trying to examine when the error happened, not the
inode that just went through the checker.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Compiling with LLVM generates a large number of warnings due
to the use of _() for wrapping strings for i18n:
warning: format string is not a string literal
(potentially insecure) [-Wformat-security]
./nls-enable.h:4:14: note: expanded from macro '_'
#define _(a) (gettext (a))
^~~~~~~~~~~~
These warnings are fixed by using "%s" as the format string,
and then _() is used as the string argument.
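
A minimal sketch of the pattern, assuming a stand-in _() macro so that
it builds without gettext; format_translated() is a hypothetical helper
and not a function from e2fsprogs.

```c
#include <stdio.h>

/* Stand-in for the gettext wrapper so the sketch builds without NLS. */
#define _(a) (a)

/* Writes a (possibly translated) message into buf.
 *
 * The flagged form would be:  snprintf(buf, len, _(msg));
 * Here the format string is the literal "%s" and the translated text is
 * passed as an argument, so any '%' in it is copied through verbatim. */
static int format_translated(char *buf, size_t len, const char *msg)
{
    return snprintf(buf, len, "%s", _(msg));
}
```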
Signed-off-by: Andreas Dilger <adilger@dilger.ca>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
We need to store some error codes using an int to keep recovery.c as
close as possible to the recovery.c source file in the kernel.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Perhaps the most serious fix up is a type-punning warning which could
result in miscompilation with overly enthusiastic compilers.
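
The message does not say how the warning was fixed. One common way to
avoid this class of warning, shown here only as an assumption about the
general technique, is to copy bytes with memcpy() instead of casting
between incompatible pointer types. load_u32() is a hypothetical name.

```c
#include <stdint.h>
#include <string.h>

/* Reads a 32-bit value in host byte order from a byte buffer.
 *
 * A cast such as *(const uint32_t *)p may violate the strict aliasing
 * rules (and alignment requirements); memcpy() expresses the same read
 * in a way the compiler must honor. */
static uint32_t load_u32(const unsigned char *p)
{
    uint32_t v;

    memcpy(&v, p, sizeof(v));
    return v;
}
```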
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Now that we have multiple backend implementations of the bitmap code,
this commit teaches e2fsck to use the most appropriate backend for
each use case.
Since we don't know for sure if we will get it all right, the default
choices can be overridden via e2fsck.conf. The various definitions
are shown here, with the current defaults (which may change as we add
more bitmap implementations and as we learn what works better).
; EXT2FS_BMAP64_BITARRAY is 1
; EXT2FS_BMAP64_RBTREE is 2
; EXT2FS_BMAP64_AUTODIR is 3
[bitmaps]
inode_used_map = 2 ; pass1
inode_dir_map = 3 ; pass1
inode_reg_map = 2 ; pass1
block_found_map = 2 ; pass1
inode_bad_map = 2 ; pass1
inode_imagic_map = 2 ; pass1
block_dup_map = 2 ; pass1
block_ea_map = 2 ; pass1
inode_link_info = 2 ; pass1
inode_dup_map = 2 ; pass1b
inode_done_map = 3 ; pass3
inode_loop_detect = 3 ; pass3
fs_bitmaps = 2
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
We need to do an accounting of duplicate clusters on a per-cluster
instead of a per-block basis so we know when we've correctly accounted
for all of the multiply claimed blocks in a particular inode.
Thanks to Robin Dong for reporting this bug.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The ext2fs_file_acl_block() and ext2fs_set_file_acl_block() functions
need to check i_file_acl_high only if the 64-bit flag is set. This is needed
because otherwise we will run into problems on Hurd systems which
actually use that field for h_i_mode_high.
This involves an ABI change since we need to pass ext2_filsys to these
functions. Fortunately these functions were first included in the
1.42-WIP series, so it's OK for us to change them now. (This is why
we have 1.42-WIP releases. :-)
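
The conditional read might be sketched as follows. The sketch_* types
and SKETCH_FEATURE_64BIT are simplified, hypothetical stand-ins for
ext2_filsys, struct ext2_inode and the 64-bit feature flag; the real
functions in libext2fs may differ in detail.

```c
#include <stdint.h>

/* Hypothetical flag value, assumed to mirror the 64-bit incompat bit. */
#define SKETCH_FEATURE_64BIT 0x0080

/* Simplified stand-ins for the filesystem handle and on-disk inode. */
struct sketch_fs {
    uint32_t feature_incompat;
};

struct sketch_inode {
    uint32_t i_file_acl;        /* low 32 bits of the EA block number */
    uint16_t i_file_acl_high;   /* high bits; reused by Hurd for mode */
};

/* Returns the EA block number, consulting the high field only when the
 * filesystem actually has the 64-bit feature enabled. */
static uint64_t sketch_file_acl_block(const struct sketch_fs *fs,
                                      const struct sketch_inode *inode)
{
    uint64_t blk = inode->i_file_acl;

    if (fs && (fs->feature_incompat & SKETCH_FEATURE_64BIT))
        blk |= (uint64_t)inode->i_file_acl_high << 32;
    return blk;
}
```

Passing the filesystem handle is what makes the feature check possible,
which is why the function signatures had to change.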
Addresses-Sourceforge-Bug: #3379227
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Multi-mount protection is a feature that allows mke2fs, e2fsck, and
others to detect if the filesystem is mounted on a remote node (on
SAN disks) and avoid corrupting the filesystem. For e2fsprogs this
means that it checks the MMP block to see if the filesystem is in use,
and marks the filesystem busy while e2fsck is running on the system.
This is useful on SAN disks that are shared between high-availability
servers, or accessible by multiple nodes that aren't in HA pairs. MMP
isn't intended to serve as a primary HA exclusion mechanism, but as a
failsafe to protect against user, software, or hardware errors.
There is no requirement that e2fsck updates the MMP block at regular
intervals, but e2fsck does this occasionally to provide useful
information to the sysadmin in case of a detected conflict.
For the kernel (since Linux 3.0) MMP adds a "heartbeat" mechanism to
periodically write to disk (every few seconds by default) to notify
other nodes that the filesystem is still in use and unsafe to modify.
Originally-by: Kalpak Shah <kalpak@clusterfs.com>
Signed-off-by: Johann Lombardi <johann@whamcloud.com>
Signed-off-by: Andreas Dilger <adilger@whamcloud.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
The DEFS line in MCONFIG had gotten so long that it exceeded 4k, and
this was starting to cause some tools heartburn. It also made "make
V=1" almost useless, since the individual commands run by make were
lost in the noise of all of the defines.
So fix this by putting the configure-generated defines in lib/config.h
and the directory pathnames in lib/dirpaths.h.
In addition, clean up some vestigial defines in configure.in and in the
Makefiles to further shorten the cc command lines.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
This patch adds support for doing quota accounting during a full
e2fsck scan if the 'quota' feature is set on the superblock.
If user-visible quota inodes are in use, they will be hidden
and converted to the reserved quota inodes.
Signed-off-by: Aditya Kali <adityakali@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
I had an extremely corrupted customer filesystem which, after thousands
of lines of e2fsck output, found one more problem on an immediately
subsequent e2fsck. In short, a file had had its i_file_acl block
cloned due to being a duplicate. That ultimately got cleared
because the fs did not have the xattr feature, and the inode
was subsequently removed due to invalid mode.
The 2nd e2fsck pass found the cloned xattr block as in use, but
not owned by any file, and had to fix up the block bitmaps.
Simply skipping the processing of duplicate xattr blocks on a
non-xattr filesystem seems reasonable, since they will be cleared
later in any case.
(also fix existing brace misalignment)
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
in the case of ! defined RESOURCE_TRACK, so that we can clean up #ifdef
throughout e2fsck source.
Signed-off-by: Ken Chen <kenchen@google.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
On ext2, time tracking for pass1 includes both error detection and a
specific type of fs fix-up phase (e.g. blocks referenced by multiple
inodes). The multi-reference fix-up phase sometimes takes a significant
amount of time to complete. We would like to track the time spent in
the sub-components of pass1 with finer granularity, covering the pass1b
through pass1d phases.
Signed-off-by: Ken Chen <kenchen@google.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Factor out code to clear a bogus inode and update e2fsck's internal
data structures accordingly into a common routine,
e2fsck_clear_inode(). This saves about 200 bytes in the compiled x86
e2fsck executable, and makes the code more maintainable in the
long-term.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Another small bug I think: if the root directory contains shared
blocks, e2fsck pass1c search_dirent_proc() will be looking for
one more containing directory than it will ever find, and thus
loses an opportunity to terminate early.
Signed-off-by: Jim Garlick <garlick@llnl.gov>
I think this is a small buglet in e2fsck: if a file has multiple hard
links, e2fsck pass1c search_dirent_proc() doesn't maintain its count
properly and may return DIRENT_ABORT before it has found containing
directories for all inodes sharing blocks.
Signed-off-by: Jim Garlick <garlick@llnl.gov>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The dict_lookup() function can potentially return a NULL dnode_t. It is
not checked in two places in the clone_file() function. Looks to be
safe to continue if n is NULL, so just print a warning message and
continue.
Coverity ID: 9: Null Returns
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
For loops iterating over all group descriptors, consistently define
first_block and last_block in a way that they are inclusive of the
range, and do not overflow.
Previously, on the last block group, we did a test of <= first +
dec_blocks; this would actually wrap back to 0 for a total block count
of 2^32-1.
Also add handling for the last block group, which may be smaller.
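
A minimal sketch of an inclusive, overflow-safe range computation with
32-bit block numbers. The names group_range() and struct blk_range are
hypothetical, and the sketch assumes the group number is valid, so that
the group's first block lies inside the filesystem.

```c
#include <stdint.h>

struct blk_range {
    uint32_t first;   /* first block of the group (inclusive) */
    uint32_t last;    /* last block of the group (inclusive) */
};

static struct blk_range group_range(uint32_t group,
                                    uint32_t first_data_block,
                                    uint32_t blocks_per_group,
                                    uint32_t blocks_count)
{
    struct blk_range r;
    uint32_t fs_last = blocks_count - 1;   /* last valid block number */

    r.first = first_data_block + group * blocks_per_group;
    /* Compare distances instead of computing first + blocks_per_group,
     * which can wrap to 0 when the filesystem has 2^32-1 blocks. */
    if (fs_last - r.first < blocks_per_group - 1)
        r.last = fs_last;                  /* short last group */
    else
        r.last = r.first + (blocks_per_group - 1);
    return r;
}
```

Because both ends are inclusive, a loop over the group can use
blk <= r.last only if it also stops before incrementing past r.last.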
Signed-off-by: Eric Sandeen <esandeen@redhat.com>
This fixes some (but not all) of the compatibility bugs which prevented
e2fsprogs from being compiled on a Linux 2.0.35 system. There are still
some unprotected uses of long long, and apparently some type problems
with the uuid library, but these can be fixed up later.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Change the format strings (%d, %ld) used for block numbers and inode
numbers to %u or %lu.
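
A small sketch of why the unsigned conversion matters; format_blk() is
a hypothetical helper, not an e2fsprogs function.

```c
#include <stdio.h>

/* Formats a 32-bit block (or inode) number into buf.
 *
 * With "%d", a value above INT_MAX such as 3000000000 would typically
 * be shown as a negative number; "%u" prints it as stored. */
static int format_blk(char *buf, size_t len, unsigned int blk)
{
    return snprintf(buf, len, "%u", blk);
}
```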
Signed-off-by: Takashi Sato <sho@tnes.nec.co.jp>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
autoconf 2.13 version of AC_CHECK_TYPE. Otherwise, on some platforms
intptr_t might get erroneously #define'd to be long. (Addresses
Debian Bug #289133)
field calculation so that it only counts EA block entries
as a single multiply claimed block (since once we clone
the EA blocks for one inode, we fix the problem for all of
the other inodes). Also, I moved the num_bad calculation
from process_pass1b_block to the end of pass1b. This
fixes a *significant* performance bug in pass1b which hit
people who had a lot of multiply claimed blocks.
(Can you say O(n**3) boys and girls? I knew you could...
Fortunately, this case didn't happen that much in actual practice.)
blocks. Moved free of block_buf to after the code which clones the
extattr block, and fixed logic for changing pointers to the extended
attribute field in the inodes which were affected.
(decrement_badcount): New function which is used whenever we need to
decrement the number of files which claim a particular bad block.
Fixed bug where delete_file wasn't checking check_if_fs_block() before
clearing the entry in block_dup_map. This could cause a block which
was claimed by multiple files as well as the filesystem metadata to
not be completely fixed.
pass1.c (pass1_get_blocks, pass1_read_inode, pass1_write_inode,
pass1_check_directory): Add a safety check to make sure
ctx->stashed_inode is non-zero.
pass1b.c (pass1b): Use e2fsck_use_inode_shortcuts() to disable the
inode shortcut processing, instead of manually clearing only half of
the function pointers that needed to be NULL'ed out. This caused
nasty bugs if the last inode in the filesystem needed dup block
processing.
pass1b.c (clone_file_block): When cloning a directory's metadata
block, don't try to update the directory block list database, since
indirect blocks aren't stored in the database and the resulting error
will abort the file clone operation.