Provide the user with an option to create an undo file so that they
can roll back a failed repair operation.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Teach e2fsck to (re)construct extent trees. This enables us to do
either of the following: compress a highly sparse extent tree into
fewer ETB blocks; or convert a ext3-style block mapped file to an
extent file. The reconstruction is performed during pass 1E or 3A,
as detailed below.
For files that are already extent based, this algorithm will
automatically run (pending user approval) if pass1 determines either
(1) that a whole level of extent tree will fit into a higher level of
the tree; (2) that the size of any level can be reduced by at least
one ETB block; or (3) the extent tree is unnecessarily deep. It will
not run at all if errors are found and the user declines to fix the
errors.
The option "-E bmap2extent" can be used to force e2fsck to convert all
block map files to extent trees, and to rebuild all extent files'
extent trees. After conversion, files larger than 12 blocks should be
defragmented to eliminate empty holes where a block lives.
The extent tree constructor is pretty dumb -- it creates a list of
leaf extents (adjacent extents are collapsed), marks all indirect
blocks / ETB blocks free, installs a new extent tree root in the
inode, then loads the leaf extents into the tree.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
e2fsck pass1 is modified to use the block group data prefetch function
to try to fetch the inode tables into the pagecache before it is
needed. We iterate through the blockgroups until we have enough inode
tables that need reading such that we can issue readahead; then we sit
and wait until the last inode table block read of the last group to
start fetching the next bunch.
pass2 is modified to use the dirblock prefetching function to prefetch
the list of directory blocks that are assembled in pass1. We use the
"iterate a subset of a dblist" and avoid copying the dblist. Directory
blocks are fetched incrementally as we walk through the directory
block list. In previous iterations of this patch we would free the
directory blocks after processing, but the performance hit to e2fsck
itself wasn't worth it. Furthermore, it is anticipated that most
users will then mount the FS and start using the directories, so they
may as well remain in the page cache.
pass4 is modified to prefetch the block and inode bitmaps in
anticipation of pass 5, because pass4 is entirely CPU bound.
In general, these mechanisms can decrease fsck time by 10-40%, if the
host system has sufficient memory and the storage system can provide a
lot of IOPs. Pretty much any storage system capable of handling
multiple IOs in-flight at any time will see a fairly large performance
boost. (Single-issue USB mass storage disks seem to suffer badly.)
By default, the readahead buffer size will be set to the size of a block
group's inode table (which is 2MiB for a regular ext4 FS). The -E
readahead_kb= option can be given to specify the amount of memory to
use for readahead or zero to disable it entirely; or an option can be
given in e2fsck.conf.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Provide a mechanism for a user to switch fsck into '-y' mode if they
start an interactive session and then get tired of pressing 'y' in
response to numerous prompts.
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
As recently discussed on linux-ext4@vger.kernel.org add an option to e2fsck
to allow to replay the journal only. That will allow scripts, such as
pacemakers 'Filesystem' RA to first replay the journal and if that sets
an error state from the journal replay, further check for that error
(dumpe2fh -h | grep "Filesystem state:") and if that shows and error
to refuse to mount. It also allows automatic e2fsck scripts to first
replay the journal and on a second run after the real pass1 to passX checks
to test for the return code.
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
In Pass 5 when we are checking block and inode bitmaps we have great
opportunity to discard free space and unused inodes on the device,
because bitmaps has just been verified as valid. This commit takes
advantage of this opportunity and discards both, all free space and
unused inodes.
I have added new set of options, 'nodiscard' and 'discard'. When the
underlying devices does not support discard, or discard ends with an
error, or when any kind of error occurs on the filesystem, no further
discard attempt will be made and the e2fsck will behave as it would
with nodiscard option provided.
As an addition, when there is any not-yet-zeroed inode table and
discard zeroes data, then inode table is marked as zeroed.
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
A user was surprised when -n -D caused the file system to be opened
read/write, and then outsmarted himself when e2fsck asked the question:
WARNING!!! Running e2fsck on a mounted filesystem may cause
SEVERE filesystem damage.
Do you really want to continue (y/n)?
This is partially our fault for not documenting the fact that -D
overrode opening the filesystem read-write. But the bottom line is it
much safer if -n *always* opens the file system read-only, so there
can be no confusion. This means that we have to disable certain
combination of options, such as "-n -c", "-n -l", and "-n -L", and
"-n -D", but the utility of these combinations is pretty low, and
is more than offset by making e2fsck idiot-proof.
Addresses-Launchpad-Bug: #537483
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The e2fsprogs programs have historically just said that they operate
on ext2 and ext3 file system in their man pages. Update them to say
that they also operate on ext4 file systems.
Addresses-Launchpad-bug: #381854
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Also added support for "e2fsck -E fragcheck" which issues a
comprehensive report of discontiguous file extents.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
If a negative progress argument is given to -C, initially suppress the
progress information. It can be enabled later by sending the e2fsck
process a SIGUSR1 signal.
Addresses-Launchpad-Bug: #203323
Addresses-Sourceforge-Bug: #1926023
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Document in the e2fsck man page that e2fsck finds duplicate filenames
only when the -D option is passed to e2fsck.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Add an explanation of how e2fsck might decide to optimize a few
directories even without the -D option being specified.
Addresses-Debian-Bug: #441872
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
The need for fixing byte-swapped filesystems is long-gone, and this is
getting in the way of cleaning up e2fsprogs's bitmaps code. So let's
get rid of it; modern kernels haven't been able to deal with a
byte-swapped filesystem in in about 9 years.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
A user was confused about whether or not e2fsck -c performed a destructive
test on the filesystem, since it stated that -cc resulted in a non-destructive
read/write test. Clarify that -c does a read/only test.
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
make sure we gracefully clean up and only exit at safe points.
For fsck, we pass the SIGINT/SIGTERM signal to the child processes,
so they can do their own cleanup.
a read/write test on the disk. Update the man pages to encourage
using the -c option, and to discouraging running badblocks separately,
since users tend to forget to set the blocksize when running
badblocks.
journal.c: implement loading of ext3 journal for recovery code
problem.c (fix_problem): return answer from PR_AFTER_CODE to caller.
Add journal problems.
recovery.c (journal_recover): user-space ext3 journal recovery code
unix.c (main) : check journal and do recovery in separate steps
jfs.h, recovery.c: Files ext3 kernel code.
jfs_compat.h: Compatibility header file to allow kernel code to be
linked to e2fsck.
badblocks.8.in, chattr.1.in, dumpe2fs.8.in, e2label.8.in,
fsck.8.in, lsattr.1.in, mke2fs.8.in, mklost+found.8.in,
tune2fs.8.in, uuidgen.1.in: Update man page to use a more standard
format (bold option flags and italicized variables), as suggested by
Andreas Dilger (adilger@enel.ucalgary.ca)
ChangeLog, e2fsck.8.in:
e2fsck.8.in: Update man page to use a more standard format (bold
option flags and italicized variables), as suggested by Andreas Dilger
(adilger@enel.ucalgary.ca)
ChangeLog, uuid_generate.3.in:
uuid_generate.8.in: Update man page to use a more standard format
(bold option flags and italicized variables), as suggested by Andreas
Dilger (adilger@enel.ucalgary.ca)
unix.c: Add support for calculating a progress bar if the -C0 option
is given. The function e2fsck_clear_progbar() clears the progress bar
and must be called before any message is issued. SIGUSR1 will enable
the progress bar, and SIGUSR2 will disable the progress bar. This is
used by fsck to handle parallel filesystem checks. Also, set the
device_name from the filesystem label if it is available.
e2fsck.h: Add new flags E2F_FLAG_PROG_BAR and E2F_FLAG_PROG_SUPRESS.
Add new field in the e2fsck structure which contains the last tenth of
a percent printed for the user.
message.c (print_e2fsck_message): Add call to e2fsck_clear_progbar().
pass1.c (e2fsck_pass1):
pass2.c (e2fsck_pass2):
pass3.c (e2fsck_pass3):
pass4.c (e2fsck_pass4):
pass5.c (e2fsck_pass5): Add call to e2fsck_clear_progbar when printing
the resource tracking information.
pass5.c (check_block_bitmaps, check_inode_bitmaps): If there is an
error in the bitmaps, suppress printing the progress bar using the
suppression flag for the remainder of the check, in order to clean up
the display.
unix.c (PRS): Added new option -C, which causes e2fsck to print
progress updates so that callers can keep track of the completion
progress of e2fsck. Designed for use by progress, except for -C 0,
which prints a spinning report which may be useful for some users.
pass5.c (e2fsck_pass5): Use a finer-grained progress reporting scheme
(useful for larger filesystems).
e2fsck.h: Add progress_fd and progress_pos, for use by the Unix
progress reporting functions.
e2fsck.h: Add new field, priv_data to the e2fsck context structure.
It should be used by callers of the e2fsck functions only, and not by
anything in e2fsck itself.
super.c: Instead of call ext2fs_get_device_size(), define and call
e2fsck_get_device_size(). (This function may be stubbed out in
special versions of e2fsck.)
pass3.c, pass4.c: Remove extra calls to the progress function that
weren't needed.
mke2fs.8.in: Update man page to note that the format of the bad block
file is the same as the one generated by badblocks.
unix.c (main): In the case where the filesystem revision is too high,
print the message about the superblock possibly being corrupt.
e2fsck.8.in: Add expanded comments about how the -b option works.