This implements code for storing small bkeys on the stack and allocating
out of a mempool if they're too big.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
We weren't checking for errors when trying to delet stripes, which meant
ec_stripe_delete_work() would spin trying to delete the same stripe over
and over.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
mark_lock is a frequently taken lock, and there's also potential for
deadlocks since currently bch2_clear_page_bits which is called from
memory reclaim has to take it to drop disk reservations.
The disk reservation get path takes it when it recalculates the number
of sectors known to be available, but it's not really needed for
consistency. We just want to make sure we only have one thread updating
the sectors_available count, which we can do with a dedicated mutex.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Now, we store blacklisted journal sequence numbers in the superblock,
not the journal: this helps to greatly simplify the code, and more
importantly it's now implemented in a way that doesn't require all btree
nodes to be visited before starting the journal - instead, we
unconditionally blacklist the next 4 journal sequence numbers after an
unclean shutdown.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
In flight btree updates could update alloc info until they're flushed -
so we have to try writing again after they've been flushed.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
- Does not persist alloc info for stripes yet
- Also does not yet include filesystem block/sector counts yet, from
struct fs_usage
- Not made use of just yet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
journal reclaim writes btree nodes, which can end up waiting for in
flight btree writes to complete, and btree write completions run out of
workqueues - so we can't run out of the same workqueue or we risk
deadlock
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
this lets us get rid of a lot of extra switch statements - in a lot of
places we dispatch on the btree node type, and then the key type, so
this is a nice cleanup across a lot of code.
Also improve the on disk format versioning stuff.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This means we can now use gc to verify the allocation information -
important for testing persistant alloc info
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This way the last mount time is actually meaningful instead of just being
various times from 1970 (which happens with the monotonic clock).
Also, roundup_pow_of_two() is undefined when passed in 0, so check before
calling it.
Signed-off-by: Tim Schlueter <schlueter.tim@linux.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Due to compression, the different replicas of a replicated extent don't
necessarily have to take up the same amount of space - so replicated
data sector counts shouldn't be stored divided by the number of
replicas.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write
filesystem with every feature you could possibly want.
Website: https://bcachefs.org
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>