We have to free the old (in memory) btree node _before_ unlocking the
new nodes - else, some other thread with a read lock on the old node
could see stale data after another thread has already updated the new
node.
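As a rough sketch of the ordering this implies (function and field names below
are illustrative assumptions, not the actual call sites in this patch):

    /* Illustrative only: retire the stale in-memory node before readers can
     * reach the new nodes; names are assumptions, not this patch's code. */
    bch2_btree_node_free_inmem(c, old, iter);   /* old node goes away first */
    six_unlock_write(&n1->lock);                /* only now may readers see n1/n2 */
    six_unlock_write(&n2->lock);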
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
It's possible to get -EIO in __btree_iter_traverse_all() after looping,
with orig_iter NULL.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
The btree_trans struct needs to memoize/cache btree iterators, so that
on transaction restart we don't have to completely redo btree lookups,
and so that we can do them all at once in the correct order when the
transaction had to restart to avoid a deadlock.
This switches the btree iterator lookups to work based on iterator
position, instead of trying to match them up based on the stack trace.
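A hedged usage sketch of what position-based lookup means in practice (the
helper name, flags and reuse semantics below are assumptions for illustration,
not the verbatim interface):

    /* Illustrative only: within one transaction, asking for an iterator at
     * the same (btree_id, pos) should find the memoized iterator instead of
     * redoing the lookup; on restart, all iterators can be re-traversed in
     * sorted position order. */
    struct btree_iter *iter = bch2_trans_get_iter(&trans, BTREE_ID_EXTENTS,
                                                  POS(inum, offset),
                                                  BTREE_ITER_INTENT);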
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Last of the basic operations for iterating forwards and backwards over
the btree: we now have
- peek(), returns key >= iter->pos
- next(), returns key > iter->pos
- peek_prev(), returns key <= iter->pos
- prev(), returns key < iter->pos
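A usage sketch of the four operations (the bch2_btree_iter_* names and the
struct bkey_s_c return type are assumptions based on the semantics above):

    struct bkey_s_c k;

    k = bch2_btree_iter_peek(iter);      /* first key >= iter->pos */
    k = bch2_btree_iter_next(iter);      /* first key >  iter->pos */
    k = bch2_btree_iter_peek_prev(iter); /* last key  <= iter->pos */
    k = bch2_btree_iter_prev(iter);      /* last key  <  iter->pos */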
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Call bch2_btree_iter_verify from bch2_btree_node_iter_fix(); also verify
in btree_iter_peek_uptodate() that iter->k matches what's in the btree.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Being more rigorous about noting when the key the iterator currently
points to has changed - which should also give us a nice performance
improvement, since we don't have to check as often whether we need to
skip backwards over other bsets.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
bch2_btree_node_iter_prev_all() depends on an invariant that wasn't
being maintained for extent leaf nodes - specifically, the node iterator
may not have advanced past any keys that compare after the key the node
iterator points to.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This is prep work for the btree key cache: btree iterators will point to
either struct btree, or a new struct bkey_cached.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
If upgrade fails on one iterator, but it was copied from another
iterator and will be freed before transaction restart, then the original
iterator will get traversed first, so we need to make sure the required
btree nodes on the original iterator will be traversed too.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Now, we store blacklisted journal sequence numbers in the superblock,
not the journal: this greatly simplifies the code, and more importantly
it's now implemented in a way that doesn't require all btree nodes to be
visited before starting the journal - instead, we unconditionally
blacklist the next 4 journal sequence numbers after an unclean shutdown.
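A minimal sketch of that behaviour, assuming a range-based blacklist helper
along the lines of bch2_journal_seq_blacklist_add() (name, signature and
inclusive range are assumptions):

    /* Illustrative only: after an unclean shutdown, never trust the next 4
     * journal sequence numbers. */
    if (!clean_shutdown)
            ret = bch2_journal_seq_blacklist_add(c, journal_seq + 1,
                                                 journal_seq + 4);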
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>