We've grown a fair amount of code for managing recovery passes; tracking
which ones we're running, which ones need to be run, and flagging in the
superblock which ones need to be run on the next recovery.
So it's worth splitting out into its own file, this code is pretty
different from the code in recovery.c.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This converts -EIOs related to btree node errors to private error codes,
which will help with some ongoing debugging by giving us better error
messages.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
BCH_FS_fsck_done -> BCH_FS_fsck_running; set when we might be fixing
fsck errors. Also; set fix_errors to ask by default when fsck is
running.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Now we can print out filesystem flags in sysfs, useful for debugging
various "what's my filesystem doing" issues.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Add two new superblock fields. Since the main section of the superblock
is now fully, we have to add a new variable length section for them -
bch_sb_field_ext.
- recovery_passes_requried: recovery passes that must be run on the
next mount
- errors_silent: errors that will be silently fixed
These are to improve upgrading and dwongrading: these fields won't be
cleared until after recovery successfully completes, so there won't be
any issues with crashing partway through an upgrade or a downgrade.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This patch adds a superblock error counter for every distinct fsck
error; this means that when analyzing filesystems out in the wild we'll
be able to see what sorts of inconsistencies are being found and repair,
and hence what bugs to look for.
Errors validating bkeys are not yet considered distinct fsck errors, but
this patch adds a new helper, bkey_fsck_err(), in order to add distinct
error types for them as well.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Add a new superblock section to keep counts of errors seen since
filesystem creation: we'll be addingcounters for every distinct fsck
error.
The new superblock section has entries of the for [ id, count,
time_of_last_error ]; this is intended to let us see what errors are
occuring - and getting fixed - via show-super output.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
We now track IO errors per device since filesystem creation.
IO error counts can be viewed in sysfs, or with the 'bcachefs
show-super' command.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
More reorganization, this splits up io.c into
- io_read.c
- io_misc.c - fallocate, fpunch, truncate
- io_write.c
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This adds bch2_run_explicit_recovery_pass(), for rewinding recovery and
explicitly running a specific recovery pass - this is a more general
replacement for how we were running topology repair before.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Before, it was parsed as a bool but internally it was really an enum:
this lets us pass in all the possible values.
But we special case the option parsing: no supplied value is parsed as
FSCK_FIX_yes, to match the previous behaviour.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
- getline() output includes a newline, without stripping that we were
just looping
- Make the prompt clearer
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
We had some bugs with setting/using first_this_inode in the inode
walker in the dirents/xattr code.
This patch changes to not clear first_this_inode until after
initializing the new hash info.
Also, we fix an error message to not print on transaction restart, and
add a comment to related fsck error code.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This changes the ask_yn() function used by fsck to accept Y or N,
meaning yes or no for all errors of a given type.
With this, the user can be prompted only for distinct error types -
useful when a filesystem has lots of errors.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This ensures that failure to read a btree node error is treated as a
topology error, and returns the correct error so that the topology
repair pass will be run.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
- Marking a non-static function as inline doesn't actually work and is
now causing problems - drop that
- Introduce BCACHEFS_LOG_PREFIX for when we want to prefix log messages
with bcachefs (filesystem name)
- Userspace doesn't have real percpu variables (maybe we can get this
fixed someday), put an #ifdef around bch2_disk_reservation_add()
fastpath
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
When errors=panic, we want to make sure we print the error before
calling bch2_inconsistent_error().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
- factor out fsck_err_get()
- if the "bcachefs (%s):" prefix has already been applied, don't
duplicate it
- convert to printbufs instead of static char arrays
- tidy up control flow a bit
- use bch2_print_string_as_lines(), to avoid messages getting truncated
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Change the error messages in bch2_inconsistent_error() and
bch2_fatal_error() so we can distinguish them.
Also, prefer bch2_fs_fatal_error() (which also logs an error message) to
bch2_fatal_error(), and change a call to bch2_inconsistent_error() to
bch2_fatal_error() when we can't continue.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
It's unhelpful if we see "Halting mark and sweep to start topology
repair" but we don't see the error that triggered it.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This splits out btree topology repair into a separate pass, and makes
some improvements:
- When we have to pick which of two overlapping nodes to drop keys
from, we use the btree node header sequence number to preserve the
newer node
- the gc code has been changed so that it doesn't bail out if we're
continuing/ignoring on fsck error - this way the dump tool can skip
running the repair pass but still walk all reachable metadata
- add a new superblock flag indicating when a filesystem is known to
have btree topology issues, and the topology repair pass should be
run
- changing the start/end of a node might mean keys in that node have to
be deleted: this patch handles that better by splitting it out into a
separate function and running it explicitly in the topology repair
code, previously those keys were only being dropped when the btree
node was read in.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This patch standardizes all the enums that have associated string tables
(probably more enums should have string tables).
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Now that all filesystem operatinos that manipulate the filesystem
heirachy and i_nlink are fully atomic, we can add a feature bit to
indicate i_nlink doesn't need to be checked.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write
filesystem with every feature you could possibly want.
Website: https://bcachefs.org
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>