1215594 Commits

Author SHA1 Message Date
Kent Overstreet
3826ee0b17 bcachefs: Add a lock to bch_page_state
We can't use the page lock to protect it, because on writeback IO error
we need to access the page state before calling end_page_writeback() and
the page lock semantics are completely insane, so taking it there deadlocks.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:29 -04:00
Kent Overstreet
43de7376f3 bcachefs: Fix erasure coding disk space accounting
Disk space accounting for erasure coding + compression was completely
broken - we need to calculate the parity sectors delta the same way we
calculate disk_sectors, by calculating the old and new usage and
subtracting to get the difference.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:29 -04:00
Kent Overstreet
9ec211b0ff bcachefs: Fix ec_stripes_read()
The bkey_s_c returned by btree_iter_(peek|next) points into the btree
iter type, so advancing the iterator and then using the one previously
returned is a bug...

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:29 -04:00
Kent Overstreet
37954a275f bcachefs: Limit pointers to being in only one stripe
This makes the disk accounting code saner, and it's not clear why we'd
ever want the same data to be in multiple stripes simultaneously.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:29 -04:00
Kent Overstreet
9ef6068c4d bcachefs: Fix bch2_extent_ptr_durability()
We were looking up the wrong entry in the stripes radix tree.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:29 -04:00
Kent Overstreet
332c6e5370 bcachefs: Fix bch2_mark_extent()
If an extent only contains cached or erasure coded pointers, there
won't be any devices in the normal dirty replicas list or an entry to
update.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:29 -04:00
Justin Husted
bf974f9203 bcachefs: Initialize journal pad data in bch_replica_entry objects.
Running the filesystem under valgrind exposed some garbage data being
written to disk in bch2_journal_super_entries_add_common(), in the
portion which encodes bch_replica_entry objects.

Signed-off-by: Justin Husted <sigstop@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:29 -04:00
Justin Husted
f7c0fcdd39 bcachefs: Fix uninitialized data in bch2_gc_btree()
Running the filesystem under valgrind exposed a path where the max_stale
variable in bch2_gc_btree() might not be initialized before use in a
rare case when there are no btree nodes in a transaction.

Signed-off-by: Justin Husted <sigstop@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:29 -04:00
Kent Overstreet
a40d97a771 bcachefs: Fix incorrect use of bch2_extent_atomic_end()
bch2_extent_atomic_end counts the number of iterators required for
marking overwrites - but journal replay never marks overwrites, so that
part was incorrect. And counting iterators for the key being inserted
should be unnecessary because we did that prior to the key being
inserted before it was first journalled.

This should fix an iterator overflow bug - the iterators for walking
overwrites were totally unneeded.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
63fbf458cb bcachefs: Can't be holding read locks while taking write locks
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
0741d37821 bcachefs: Don't allocate memory under mark_lock
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
137b0ed907 bcachefs: bch2_extent_atomic_end() now traverses iter
This fixes a bug in io.c bch2_write_index_default() - it was missing the
traverse call, but bch2_extent_atomic_end returns an error now and can
just call it itself.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
9638574229 bcachefs: Factor out fs-common.c
This refactoring makes the code easier to understand by separating the
bcachefs btree transactional code from the linux VFS code - but more
importantly, it's also to share code with the fuse port.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
73501ab82c bcachefs: Don't use sha256 for siphash str hash key
With the refactoring that's coming to add fuse support, we want
bch2_hash_info_init() to be cheaper so we don't have to rely on anything
cached besides the inode in the btree.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
bd09d26897 bcachefs: Only look up inode io opts in extents btree
We currently don't have a way to propagate inode io opts to indirect
extents. This is a problem...

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
d65a8551c8 bcachefs: Fix deref of error pointer
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
58677a1d40 bcachefs: bch2_inode_peek()/bch2_inode_write()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
618b9e575b bcachefs: Fix undefined behaviour
roundup_pow_of_two(0) is undefined

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
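
As an illustration of the roundup_pow_of_two(0) problem named in the commit above: the kernel documents the result of roundup_pow_of_two() as undefined when the argument is 0. The following is a minimal userspace sketch, not the kernel macro and not the fix the commit applied. The helper name roundup_pow2_checked() is hypothetical, and mapping 0 to 1 is an assumption chosen only for the example.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical userspace helper (not the kernel's roundup_pow_of_two()):
 * round n up to the next power of two, giving 0 an explicit, defined result.
 */
static unsigned long roundup_pow2_checked(unsigned long n)
{
	/* Largest power of two representable in unsigned long. */
	const unsigned long top = ~(ULONG_MAX >> 1);
	unsigned long r = 1;

	/* Assumption for this sketch: treat 0 like 1 instead of leaving it undefined. */
	if (n <= 1)
		return 1;

	/* Anything above the top power of two cannot be rounded up without overflow. */
	assert(n <= top);

	/* Double r until it reaches or passes n; cannot overflow because n <= top. */
	while (r < n)
		r <<= 1;

	return r;
}
```

A caller that may pass 0 can either guard the call site or use a wrapper like this one. Which of the two is appropriate depends on what a zero-sized request should mean for that caller.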
Kent Overstreet
ab9ff73322 bcachefs: Fix an error path
It's possible to get -EIO in __btree_iter_traverse_all() after looping,
with orig_iter NULL.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
8de819f834 bcachefs: Fix __bch2_buffered_write() returning -ENOMEM
When grab_cache_page_write_begin() fails but we did pin some pages, we
shouldn't return -ENOMEM, we should do a partial write.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
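
The partial-write rule described in the commit above can be sketched in userspace C. This is a simplified model under stated assumptions, not the bcachefs code: model_pin_pages(), model_buffered_write(), and MODEL_PAGE_SIZE are hypothetical names, and page pinning is simulated by a simple count of available pages.

```c
#include <errno.h>

#define MODEL_PAGE_SIZE 4096L	/* assumed page size, for illustration only */

/*
 * Hypothetical stand-in for pinning page cache pages: only the first
 * `avail` pages can be pinned, simulating an allocation failure after that.
 */
static long model_pin_pages(long wanted, long avail)
{
	return wanted < avail ? wanted : avail;
}

/*
 * Returns the number of bytes written, or -ENOMEM if no page could be
 * pinned at all. A failure after some progress yields a short count.
 */
static long model_buffered_write(long nr_pages, long avail)
{
	long pinned = model_pin_pages(nr_pages, avail);

	/* No progress at all: report the allocation failure. */
	if (nr_pages > 0 && pinned == 0)
		return -ENOMEM;

	/* Some pages were pinned: write what we can and return a short count. */
	return pinned * MODEL_PAGE_SIZE;
}
```

Returning a short count lets the caller retry the remainder, which matches the usual write() contract where an error is reported only if nothing was written.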
Kent Overstreet
6988e85be5 bcachefs: Trust inode in btree over bch_inode_info
This is the start of some refactoring work to make less code depend on
the linux VFS - here the inode cache - to make e.g. the fuse port
easier.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
21ad9ddee8 bcachefs: Fix counting iterators for reflink pointers
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
64bc001153 bcachefs: Rework btree iterator lifetimes
The btree_trans struct needs to memoize/cache btree iterators, so that
on transaction restart we don't have to completely redo btree lookups,
and so that we can do them all at once in the correct order when the
transaction had to restart to avoid a deadlock.

This switches the btree iterator lookups to work based on iterator
position, instead of trying to match them up based on the stack trace.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
a7199432c3 bcachefs: Kill deferred btree updates
Will be replaced by cached btree iterators

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
877dfb348d bcachefs: Fix for partial buffered writes
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
bbd8d2038b bcachefs: BTREE_ITER_SLOTS isn't a type of btree iter
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
ef9f95ba41 bcachefs: Improve error handling for for_each_btree_key_continue()
Change it to match for_each_btree_key()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
b43a0f60a6 bcachefs: Cleanup i_nlink handling
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:28 -04:00
Kent Overstreet
d55460bb09 bcachefs: Trivial cleanup
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
fb472ac528 bcachefs: Convert a BUG_ON() to a warning
We shouldn't ever be writing past i_size - but, apparently there's still
a bug to track down.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
fe9cdf61cc bcachefs: Count iterators for reflink_p overwrites correctly
In order to avoid trying to allocate too many btree iterators,
bch2_extent_atomic_end() needs to count how many iterators are going to
be needed for insertions and overwrites - but we weren't counting the
iterators for deleting a reflink_v when the refcount goes to 0.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
fb975d14b7 bcachefs: Drop unnecessary rcu_read_lock()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
fdfab313b6 bcachefs: Update path microoptimizations
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
0a426c3239 bcachefs: Handle bio_iov_iter_get_pages() returning unaligned bio
If the user buffer isn't aligned to the filesystem block size, on a
large enough IO - where it won't fit into a single bio -
bio_iov_iter_get_pages() won't necessarily return a bio with the proper
alignment.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
ec350b90d7 bcachefs: Drop unused arg to bch2_open_buckets_stop_dev()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
eab32c8e4e bcachefs: Fix validation of replicas entries
When an extent is erasure coded, we need to record a replicas entry to
indicate that data is present on the devices that extent has pointers to
- but nr_required should be 0, because it's erasure coded.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
5f786787ad bcachefs: Add support for FALLOC_FL_INSERT_RANGE
Somewhat tricky and ugly, because iterating over extents backwards is a
pain.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
ccf5a10958 bcachefs: bch2_btree_iter_peek_prev()
Last of the basic operations for iterating forwards and backwards over
the btree: we now have
 - peek(),      returns key >= iter->pos
 - next(),      returns key >  iter->pos
 - peek_prev(), returns key <= iter->pos
 - prev(),      returns key <  iter->pos

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
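
The four comparisons listed in the bch2_btree_iter_peek_prev() commit above can be modelled over a sorted array. This is a sketch of the comparison semantics only: the model_* functions are hypothetical, keys are plain ints rather than btree keys, and the real iterators also update their own position, which this model does not attempt to show.

```c
#include <stddef.h>

/*
 * Hypothetical model: `keys` is a sorted array of distinct ints, `pos`
 * plays the role of iter->pos. Each function returns the index of the
 * matching key, or -1 if there is none.
 */

/* First key >= pos. */
static int model_peek(const int *keys, size_t nr, int pos)
{
	for (size_t i = 0; i < nr; i++)
		if (keys[i] >= pos)
			return (int) i;
	return -1;
}

/* First key > pos. */
static int model_next(const int *keys, size_t nr, int pos)
{
	for (size_t i = 0; i < nr; i++)
		if (keys[i] > pos)
			return (int) i;
	return -1;
}

/* Last key <= pos, scanning backwards. */
static int model_peek_prev(const int *keys, size_t nr, int pos)
{
	for (size_t i = nr; i-- > 0;)
		if (keys[i] <= pos)
			return (int) i;
	return -1;
}

/* Last key < pos, scanning backwards. */
static int model_prev(const int *keys, size_t nr, int pos)
{
	for (size_t i = nr; i-- > 0;)
		if (keys[i] < pos)
			return (int) i;
	return -1;
}
```

With keys {10, 20, 30} and pos 20, peek and peek_prev both land on 20, while next lands on 30 and prev on 10. With pos 25, peek and next agree on 30, and peek_prev and prev agree on 20.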
Kent Overstreet
6cc3535dcb bcachefs: Don't write past eof
When converting from PAGE_SIZE to block_size, the .mkwrite path was
missed

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
9c37b63207 bcachefs: Check for extents past eof correctly
bcachefs used to work mostly in terms of PAGE_SIZE, not block size at
the vfs level - but that has since been fixed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
059e4134d2 bcachefs: Debug assertion improvements
Call bch2_btree_iter_verify from bch2_btree_node_iter_fix(); also verify
in btree_iter_peek_uptodate() that iter->k matches what's in the btree.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
554d219ebb bcachefs: Add missing bch2_btree_node_iter_fix() call
Any time we're modifying what's in the btree, iterators potentially have
to be updated - this one was exposed by the reflink code.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
4d13e818f5 bcachefs: Avoid deadlocking on the allocator
The allocator needs to make sure there are buckets available on the
RESERVE_NONE freelist if at all possible - otherwise foreground IO will
get stuck.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
f4b613410c bcachefs: More btree iter improvements
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
3745efd618 bcachefs: Improve btree_iter_pos_in_node()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
a9d1f91018 bcachefs: Debug code improvements
.key_debugcheck no longer needs to take a pointer to the btree node

Also, try to make sure it is called wherever we're inserting or
modifying keys in the btree.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
4cac0bf2c2 bcachefs: Add missing bch2_btree_node_iter_fix() calls
With multiple iterators, if another iterator points to the key being
modified, we need to call bch2_btree_node_iter_fix() to re-unpack the
key into the iter->k

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
9b02d1c49a bcachefs: Optimize calls to bch2_btree_iter_traverse()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:27 -04:00
Kent Overstreet
5a8a52d610 bcachefs: Fix a typo
_iter, not iter

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:26 -04:00
Kent Overstreet
6309589468 bcachefs: Improved bch2_fcollapse()
Move extents instead of copying them - this way, we can iterate over
only live extents, not the entire keyspace. Also, this means we can
mostly skip running triggers.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:26 -04:00