korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-15 16:24:13 +08:00

Author	SHA1	Message	Date
Kent Overstreet	73bd774d28	bcachefs: Assorted sparse fixes - endianness fixes - mark some things static - fix a few __percpu annotations - fix silent enum conversions Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:06 -04:00
Kent Overstreet	e53a961c6b	bcachefs: Rename enum alloc_reserve -> bch_watermark This is prep work for consolidating with JOURNAL_WATERMARK. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:04 -04:00
Kent Overstreet	9473cff989	bcachefs: Fix more lockdep splats in debug.c Similar to previous fixes, we can't incur page faults while holding btree locks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:04 -04:00
Kent Overstreet	bb125baf51	bcachefs: Delete warning from promote_alloc() It's possible to see a -BCH_ERR_ENOSPC_disk_reservation here, and that's fine. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:03 -04:00
Kent Overstreet	4c4a8f20d1	bcachefs: Fix error handling in promote path The promote path had a BUG_ON() for unknown error type, which we're now seeing: change it to a WARN_ON() - because we're curious what this is - and otherwise handle it in the normal error path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:03 -04:00
Kent Overstreet	19c304bebd	bcachefs: GFP_NOIO -> GFP_NOFS GFP_NOIO dates from the bcache days, when we operated under the block layer. Now, GFP_NOFS is more appropriate, so switch all GFP_NOIO uses to GFP_NOFS. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:03 -04:00
Kent Overstreet	dbda63bbb0	bcachefs: bch2_bkey_make_mut() now calls bch2_trans_update() It's safe to call bch2_trans_update with a k/v pair where the value hasn't been filled out, as long as the key part has been and the value is filled out by transaction commit time. This patch folds the bch2_trans_update() call into bch2_bkey_make_mut(), eliminating a bit of boilerplate. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:01 -04:00
Kent Overstreet	f12a798a89	bcachefs: bch2_bkey_get_mut() now calls bch2_trans_update() It's safe to call bch2_trans_update with a k/v pair where the value hasn't been filled out, as long as the key part has been and the value is filled out by transaction commit time. This patch folds the bch2_trans_update() call into bch2_bkey_get_mut(), eliminating a bit of boilerplate. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:01 -04:00
Kent Overstreet	34dfa5db19	bcachefs: bch2_bkey_get_mut() improvements - bch2_bkey_get_mut() now handles types increasing in size, allocating a buffer for the type's current size when necessary - bch2_bkey_make_mut_typed() - bch2_bkey_get_mut() now initializes the iterator, like bch2_bkey_get_iter() Also, refactor so that most of the code is in functions - now macros are only used for wrappers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:01 -04:00
Kent Overstreet	bcb79a51cb	bcachefs: bch2_bkey_get_iter() helpers Introduce new helpers for a common pattern: bch2_trans_iter_init(); bch2_btree_iter_peek_slot(); - bch2_bkey_get_iter_type() returns -ENOENT if it doesn't find a key of the correct type - bch2_bkey_get_val_typed() copies the val out of the btree to a (typically stack allocated) variable; it handles the case where the value in the btree is smaller than the current version of the type, zeroing out the remainder. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:00 -04:00
Kent Overstreet	dc6274bcb8	bcachefs: Fix nocow write path closure bug With regular waitlists, we need to ensure we always call finish_wait(). With closures, the equivalent is that we need to call closure_sync() before returning with a stack-allocated closure. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:58 -04:00
Kent Overstreet	ac77810cb4	bcachefs: Nocow write error path fix The nocow write error path was iterating over pointers in an extent, after we'd dropped btree locks - oops. Fortunately we'd already stashed what we need in nocow_lock_bucket, so use that instead. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:58 -04:00
Kent Overstreet	abab7609de	bcachefs: Fix bch2_extent_fallocate() in nocow mode When we allocate disk space, we need to be incrementing the WRITE io clock, which perhaps should be renamed to sectors allocated - copygc uses this io clock to know when to run. Also, we should be incrementing the same clock when allocating btree nodes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:57 -04:00
Kent Overstreet	b40901b0f7	bcachefs: New erasure coding shutdown path This implements a new shutdown path for erasure coding, which is needed for the upcoming BCH_WRITE_WAIT_FOR_EC write path. The process is: - Cancel new stripes being built up - Close out/cancel open buckets on write points or the partial list that are for stripes - Shutdown rebalance/copygc - Then wait for in flight new stripes to finish With BCH_WRITE_WAIT_FOR_EC, move ops will be waiting on stripes to fill up before they complete; the new ec shutdown path is needed for shutting down copygc/rebalance without deadlocking. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:57 -04:00
Kent Overstreet	b9fa375bab	bcachefs: bch2_fs_moving_ctxts_to_text() This also adds bch2_write_op_to_text(): now we can see outstanding moves, useful for debugging shutdown with the upcoming BCH_WRITE_WAIT_FOR_EC and likely for other things in the future. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:57 -04:00
Kent Overstreet	65d48e3525	bcachefs: Private error codes: ENOMEM This adds private error codes for most (but not all) of our ENOMEM uses, which makes it easier to track down assorted allocation failures. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:57 -04:00
Kent Overstreet	751c025f0d	bcachefs: Kill bch_write_op->btree_update_ready This changes the write path to not add write ops to the write_point's list of pending work items until it's ready; this means we have to change the lock protecting it to an irq-safe lock, but means bch2_write_point_do_index_updates() no longer has to iterate over the list, which is beneficial with the way the new BCH_WRITE_WAIT_FOR_EC code works. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:57 -04:00
Kent Overstreet	db64a8e8a1	fixup bcachefs: Use for_each_btree_key_upto() more consistently Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:56 -04:00
Kent Overstreet	0b943b973c	bcachefs: Free move buffers as early as possible Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:56 -04:00
Kent Overstreet	ac2ccddc26	bcachefs: Drop some anonymous structs, unions Rust bindgen doesn't cope well with anonymous structs and unions. This patch drops the fancy anonymous structs & unions in bkey_i that let us use the same helpers for bkey_i and bkey_packed; since bkey_packed is an internal type that's never exposed to outside code, it's only a minor inconvenience. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:55 -04:00
Kent Overstreet	e902095868	bcachefs: bch2_write_queue() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:55 -04:00
Kent Overstreet	33669e0cc9	bcachefs: Add option for completely disabling nocow This adds an option for completely disabling nocow mode, including the locking in the data move path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:54 -04:00
Daniel Hill	3482dd6a25	bcachefs: don't block reads if we're promoting The promote path calls data_update_init() and now that we take locks here, there's potential for promote to block our read path, just error when we can't take the lock instead of blocking. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:52 -04:00
Kent Overstreet	0093b9e970	bcachefs: Fix promote path leak Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:52 -04:00
Kent Overstreet	7ffb6a7ec6	bcachefs: Fix deadlock on nocow locks in data move path The recent nocow locking rework introduced a deadlock in the data move path: the new nocow locking scheme uses a hash table with a fixed size array for chaining, meaning on hash collision we may have to wait for other locks to be released before we can lock a bucket. And since the data move path needs to submit writes from the same thread that's taking nocow locks and submitting reads, this introduces a deadlock. This shouldn't happen often in practice, but since the data move path can keep large numbers of IOs in flight simultaneously, it's something we have to handle. This patch makes move_ctxt_wait_event() available to bch2_data_update_init() and uses it when appropriate, which is our normal solution to this kind of thing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:52 -04:00
Kent Overstreet	350175bf9b	bcachefs: Improved nocow locking This improves the nocow lock table so that hash table entries have multiple locks, and locks specify which bucket they're for - i.e. we can now resolve hash collisions. This is important because the allocator has to skip buckets that are locked in the nocow lock table, and previously hash collisions would cause it to spuriously skip unlocked buckets. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:52 -04:00
Kent Overstreet	a8b3a677e7	bcachefs: Nocow support This adds support for nocow mode, where we do writes in-place when possible. Patch components: - New boolean filesystem and inode option, nocow: note that when nocow is enabled, data checksumming and compression are implicitly disabled - To prevent in-place writes from racing with data moves (data_update.c) or bucket reuse (i.e. a bucket being reused and re-allocated while a nocow write is in flight), we have a new locking mechanism. Buckets can be locked for either data update or data move, using a fixed size hash table of two_state_shared locks. We don't have any chaining, meaning updates and moves to different buckets that hash to the same lock will wait unnecessarily - we'll want to watch for this becoming an issue. - The allocator path also needs to check for in-place writes in flight to a given bucket before giving it out: thus we add another counter to bucket_alloc_state so we can track this. - Fsync now may need to issue cache flushes to block devices instead of flushing the journal. We add a device bitmask to bch_inode_info, ei_devs_need_flush, which tracks devices that need to have flushes issued - note that this will lead to unnecessary flushes when other codepaths have already issued flushes, we may want to replace this with a sequence number. - New nocow write path: look up extents, and if they're writable write to them - otherwise fall back to the normal COW write path. XXX: switch to sequence numbers instead of bitmask for devs needing journal flush XXX: ei_quota_lock being a mutex means bch2_nocow_write_done() needs to run in process context - see if we can improve this Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:51 -04:00
Kent Overstreet	792031116b	bcachefs: Unwritten extents support - bch2_extent_merge checks unwritten bit - read path returns 0s for unwritten extents without actually reading - reflink path skips over unwritten extents - bch2_bkey_ptrs_invalid() checks for extents with both written and unwritten extents, and non-normal extents (stripes, btree ptrs) with unwritten ptrs - fiemap checks for unwritten extents and returns FIEMAP_EXTENT_UNWRITTEN Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:51 -04:00
Kent Overstreet	2f1f7fe98d	bcachefs: bch2_extent_update_i_size_sectors() In the io path, when we do the extent update we also have to update the inode - for i_size and i_sectors updates, as well as for bi_journal_seq for fsync. This factors that out into a new helper which will be used in the new nocow mode, in the unwritten extent conversion path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:51 -04:00
Kent Overstreet	70de7a47e2	bcachefs: bch2_extent_fallocate() This factors out part of __bchfs_fallocate() in fs-io.c into a new, lower level io.c helper, which creates a single extent reservation. This is prep work for nocow support - the new helper will shortly gain the ability to create unwritten extents. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:51 -04:00
Kent Overstreet	9bcbc0307d	bcachefs: Skip inode unpack/pack in bch2_extent_update() This takes advantage of the new inode type to skip the expensive pack/unpack when inode updates are required in the extent update path. Additionally, we now skip the inode update entirely when i_sectors and i_size aren't changing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:51 -04:00
Kent Overstreet	b08b492ed3	bcachefs: Drop old maybe_extending optimization The extend update path had an optimization to avoid updating the inode if we knew we were definitely not extending the file. But now that we're updating inodes on every extent update - for fsync - that code can be deleted. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:51 -04:00
Kent Overstreet	8dd69d9f64	bcachefs: KEY_TYPE_inode_v3, metadata_version_inode_v3 Move bi_size and bi_sectors into the non-varint portion of the inode, so that the write path can update them without going through the relatively expensive unpack/pack operations. Other changes: - Add a field for the offset of the varint section, so we can add new non-varint fields without needing a new inode type, like alloc_v3 - Move bi_mode into the flags field, so that the varint section can be u64 aligned Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:51 -04:00
Kent Overstreet	c9828cea31	bcachefs: Delete in memory ec backpointers Post btree backpointers, these aren't needed anymore. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:51 -04:00
Kent Overstreet	60b5538877	bcachefs: trans->notrace_relock_fail When we unlock in order to submit IO, the next relock event is likely to fail if submit_bio() blocked - we shouldn't count those events in our _fail stats, since those are expected events and shouldn't cause test failures. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:50 -04:00
Kent Overstreet	d94189ad56	bcachefs: Debug mode for c->writes references This adds a debug mode where we split up the c->writes refcount into distinct refcounts for every codepath that takes a reference, and adds sysfs code to print the value of each ref. This will make it easier to debug shutdown hangs due to refcount leaks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:50 -04:00
Kent Overstreet	c72f687a1f	bcachefs: Use for_each_btree_key_upto() more consistently It's important that in BTREE_ITER_FILTER_SNAPSHOTS mode we always use peek_upto() and provide an end for the interval we're searching for - otherwise, when we hit the end of the inode the next inode may be in a different subvolume and not have any keys in the current snapshot, and we'd iterate over arbitrarily many keys before returning one. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:50 -04:00
Kent Overstreet	393a1f6863	bcachefs: Better inlining in core write path Provide inline versions of some allocation functions - bch2_alloc_sectors_done_inlined() - bch2_alloc_sectors_append_ptrs_inlined() and use them in the core IO path. Also, inline bch2_extent_update_i_size_sectors() and bch2_bkey_append_ptr(). In the core write path, function call overhead matters - every function call is a jump to a new location and a potential cache miss. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:49 -04:00
Kent Overstreet	87ced107f3	bcachefs: Convert EAGAIN errors to private error codes More error code cleanup, for better error messages and debuggability. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:49 -04:00
Kent Overstreet	858536c7ce	bcachefs: Convert EROFS errors to private error codes More error code improvements - this gets us more useful error messages. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:49 -04:00
Kent Overstreet	5f659376fc	bcachefs: Suppress -EROFS messages when shutting down This isn't actually an error condition, this just indicates a normal shutdown - no reason for these to be in the log. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:47 -04:00
Kent Overstreet	e88a75ebe8	bcachefs: New bpos_cmp(), bkey_cmp() replacements This patch introduces - bpos_eq() - bpos_lt() - bpos_le() - bpos_gt() - bpos_ge() and equivalent replacements for bkey_cmp(). Looking at the generated assembly these could probably be improved further, but we already see a significant code size improvement with this patch. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:47 -04:00
Kent Overstreet	42fab2695b	bcachefs: Kill BCH_FEATURE_incompressible This isn't needed anymore, we only support metadata versions that have this. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:47 -04:00
Kent Overstreet	7fec8266af	bcachefs: Error message improvement - Centralize format strings in bcachefs.h - Add bch2_fmt_inum_offset() and related helpers - Switch error messages for inodes to also print out the offset, in bytes Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:46 -04:00
Kent Overstreet	a1ee777bfc	bcachefs: Kill BCH_WRITE_FLUSH BCH_WRITE_FLUSH is a write flag that causes a journal flush. It's only used in the direct IO path, and this will allow for some consolidation with the regular fsync path, which will help with the upcoming nocow mode. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:45 -04:00
Kent Overstreet	07de1803b8	bcachefs: Kill bch2_alloc_sectors_start() Only used in one place, just inline it there. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:45 -04:00
Kent Overstreet	1df3e19996	bcachefs: BCH_WRITE_SYNC This adds a new flag for the write path, BCH_WRITE_SYNC, and switches the O_DIRECT write path to use it when we're not running asynchronously. It runs the btree update after the write in the original thread's context instead of a kworker, cutting context switches in half. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:45 -04:00
Kent Overstreet	d4bce63636	bcachefs: Kill BCH_WRITE_JOURNAL_SEQ_PTR Dead code, delete. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:45 -04:00
Daniel Hill	2d485df3da	bcachefs: fix bch2_write_extent() crc corruption. crc.compression_type & nonce get reset inside bch2_rechecksum_bio(), we set them back to the previous values calculated. This fixes incompressible extents being marked as uncompressed. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:44 -04:00
Kent Overstreet	e0eaf86259	bcachefs: Factor out bch2_write_drop_io_error_ptrs() Move slowpath code to a separate, non-inline function. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:42 -04:00

202 Commits