korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-15 08:14:15 +08:00

Author	SHA1	Message	Date
Kent Overstreet	3e3e02e6bc	bcachefs: Assorted checkpatch fixes checkpatch.pl gives lots of warnings that we don't want - suggested ignore list: ASSIGN_IN_IF UNSPECIFIED_INT - bcachefs coding style prefers single token type names NEW_TYPEDEFS - typedefs are occasionally good FUNCTION_ARGUMENTS - we prefer to look at functions in .c files (hopefully with docbook documentation), not .h file prototypes MULTISTATEMENT_MACRO_USE_DO_WHILE - we have _many_ x-macros and other macros where we can't do this Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:44 -04:00
Kent Overstreet	1be887979b	bcachefs: Handle dropping pointers in data_update path Cached pointers are generally dropped, not moved: this led to an assertion firing in the data update path when there were no new replicas being written. This patch adds a data_options field for pointers to be dropped, and tweaks move_extent() to check if we're only dropping pointers, not writing new ones, before kicking off a data update operation. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:42 -04:00
Kent Overstreet	674cfc2624	bcachefs: Add persistent counters for all tracepoints Also, do some reorganizing/renaming, convert atomic counters in bch_fs to persistent counters, and add a few missing counters. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:39 -04:00
Kent Overstreet	549d173c1b	bcachefs: EINTR -> BCH_ERR_transaction_restart Now that we have error codes, with subtypes, we can switch to our own error code for transaction restarts - and even better, a distinct error code for each transaction restart reason: clearer code and better debugging. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:37 -04:00
Kent Overstreet	d4bf5eecd7	bcachefs: Use bch2_err_str() in error messages Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:36 -04:00
Kent Overstreet	7c0732b88d	bcachefs: Fix move path when move_stats == NULL This isn't done very often, but it is legitimate Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:35 -04:00
Kent Overstreet	4081ace307	bcachefs: Get ref on c->writes in move.c There's no point reading an extent in order to move it if the write is going to fail because we're shutting down. This patch changes the move path so that moving_io now owns a ref on c->writes - as a bonus, rebalance and copygc will now notice that we're shutting down and exit quicker. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:35 -04:00
Kent Overstreet	0337cc7eee	bcachefs: move.c refactoring - add bch2_moving_ctxt_(init|exit) - split out __bch2_evacuate_bucket() which takes an existing moving_ctxt; this will be used for improving copygc performance by pipelining across multiple buckets Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:35 -04:00
Daniel Hill	c91996c50a	bcachefs: data jobs, including rebalance wait for copygc. move_ratelimit() now has a bool that specifies whether we want to wait for copygc to finish. When copygc is running, we're probably low on free buckets; instead of consuming the remaining buckets, we want to wait for copygc to finish. This should help with performance and runaway bucket fragmentation. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:35 -04:00
Kent Overstreet	7f5c5d20f0	bcachefs: Redo data_update interface This patch significantly cleans up and simplifies the data_update interface. Instead of only being able to specify a single pointer by device to rewrite, we're now able to specify any or all of the pointers in the original extent to be rewritten, as a bitmask. data_cmd is no more: the various pred functions now just return true if the extent should be moved/updated. All the data_update path does is rewrite existing replicas, or add new ones. This fixes a bug with background compression on replicated filesystems, where rebalance -> data_update would incorrectly drop the wrong old replica, and keep trying to recompress an extent pointer, each time failing to drop the right replica. Oops. Now, the data update path doesn't look at the io options to decide which pointers to keep and which to drop - it only goes off of the data_update_options passed to it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:34 -04:00
Kent Overstreet	c501fef6de	bcachefs: Pull out data_update.c This is the start of reorganizing the data IO paths. The plan is to also break apart io.c into data_read.c and data_write.c, and migrate_write will be renamed to the data_update path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:34 -04:00
Kent Overstreet	7bb61e8c0e	bcachefs: Make IO in flight by copygc/rebalance configurable This adds a new option, move_bytes_in_flight, for configuring the amount of IO in flight by copygc/rebalance - users with many devices in their filesystem will want to increase this. In the future we should be smarter about this, but this is an easy improvement. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:34 -04:00
Daniel Hill	104c69745f	bcachefs: Add persistent counters This adds a new superblock field for persisting counters and adds a sysfs interface in counters/ exposing these counters. The superblock field is ignored by older versions letting us avoid an on disk version bump. Each sysfs file outputs a counter that tracks since filesystem creation and a counter for the current mount session. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:32 -04:00
Kent Overstreet	1f93726e63	bcachefs: Tracepoint improvements Delete some obsolete tracepoints, organize alloc tracepoints better, make a few tracepoints more consistent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:32 -04:00
Kent Overstreet	c0960603e2	bcachefs: Shutdown path improvements We're seeing occasional firings of the assertion in the key cache shutdown code that nr_dirty == 0, which means we must sometimes be doing transaction commits after we've gone read only. Cleanups & changes: - BCH_FS_ALLOC_CLEAN renamed to BCH_FS_CLEAN_SHUTDOWN - new helper bch2_btree_interior_updates_flush(), which returns true if it had to wait - bch2_btree_flush_writes() now also returns true if there were btree writes in flight - __bch2_fs_read_only now checks if btree writes were in flight in the shutdown loop: btree write completion does a transaction update, to update the pointer in the parent node - assert that !BCH_FS_CLEAN_SHUTDOWN in __bch2_trans_commit Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:32 -04:00
Kent Overstreet	d905f67ec8	bcachefs: Copygc allocations shouldn't be nowait We don't actually want copygc allocations to be nowait - an allocation for copygc might fail and then later succeed due to a bucket needing to wait on journal commit, or to be discarded. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:29 -04:00
Kent Overstreet	3e1547116f	bcachefs: x-macroize alloc_reserve enum This makes an array of strings available, like our other enums. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	91d961badf	bcachefs: darrays Inspired by CCAN darray - simple, stupid resizable (dynamic) arrays. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:28 -04:00
Kent Overstreet	f61816d0fc	bcachefs: Fix a use after free In move_read_endio, we were checking if the next pending write has its read completed - but this can turn into a use after free (and we were accessing the list without a lock), so instead it's better to just unconditionally do the wakeup. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:24 -04:00
Kent Overstreet	eb331fe5a4	bcachefs: Check for stale dirty pointer before reads Since we retry reads when we discover we read from a pointer that went stale, if a dirty pointer is erroneously stale it would cause us to loop retrying that read forever - unless we check before issuing the read, while the btree is still locked, when we know that a dirty pointer should never be stale. This patch adds that check, along with printing some helpful debug info. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:24 -04:00
Kent Overstreet	52eef42c5f	bcachefs: Fix locking in data move path We need to ensure we don't have any btree locks held when calling do_pending_writes() - besides issuing IOs, upcoming allocator changes will have allocations doing btree lookups directly. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:23 -04:00
Kent Overstreet	8ede99101e	bcachefs: Handle transaction restarts in __bch2_move_data() We weren't checking for -EINTR in the main loop in __bch2_move_data - this code predates modern transaction restarts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:22 -04:00
Kent Overstreet	bf0fdb4d89	bcachefs: Don't erasure code cached ptrs It doesn't make much sense to be erasure coding cached pointers, we should be erasure coding one of the dirty pointers in an extent. This patch makes sure we're passing BCH_WRITE_CACHED when we expect the new pointer to be a cached pointer, and tweaks the write path to not allocate from a stripe when BCH_WRITE_CACHED is set - and fixes an assertion we were hitting in the ec path where when adding the stripe to an extent and deleting the other pointers the pointer to the stripe didn't exist (because dropping all dirty pointers from an extent turns it into a KEY_TYPE_error key). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:18 -04:00
Kent Overstreet	47b15c5760	bcachefs: Fix copygc sectors_to_move calculation With erasure coding, copygc's count of sectors to move was off, which matters for the debug statement it prints out when it's not able to move all the data it tried to. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:18 -04:00
Kent Overstreet	3e52c22255	bcachefs: Add journal_seq to inode & alloc keys Add fields to inode & alloc keys that record the journal sequence number when they were most recently modified. For alloc keys, this is needed to know what journal sequence number we have to flush before the bucket can be reused. Currently this is tracked in memory, but we'll be getting rid of the in memory bucket array. For inodes, this is needed for fsync when the inode has been evicted from the vfs cache. Currently we use a bloom filter per outstanding journal buf - but that mechanism has been broken since we added the ability to not issue a flush/fua for every journal write. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:16 -04:00
Kent Overstreet	961b2d6282	bcachefs: Assorted ec fixes - The backpointer that ec_stripe_update_ptrs() uses now needs to include the snapshot ID, which means we have to change where we add the backpointer to after getting the snapshot ID for the new extents - ec_stripe_update_ptrs() needs to be calling bch2_trans_begin() - improve error message in bch2_mark_stripe() Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	23af498cc4	bcachefs: Ensure we flush btree updates in evacuate path This fixes a possible race where we fail to remove a device because of btree nodes still on it, that are being deleted by in flight btree updates. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	f3cf0999ac	bcachefs: bch2_btree_node_rewrite() now returns transaction restarts We have been getting away from handling transaction restarts locally - convert bch2_btree_node_rewrite() to the newer style. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	b0d1b70af8	bcachefs: Must check for errors from bch2_trans_cond_resched() But we don't need to call it from outside the btree iterator code anymore, since it's called by bch2_trans_begin() and bch2_btree_path_traverse(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	b71717dac6	bcachefs: Handle transaction restarts in bch2_blacklist_entries_gc() It shouldn't be necessary when we're only using a single iterator and not doing updates, but that's harder to debug at the moment. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	9a796fdb06	bcachefs: bch2_trans_exit() no longer returns errors Now that peek_node()/next_node() are converted to return errors directly, we don't need bch2_trans_exit() to return errors - it's cleaner this way and wasn't used much anymore. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	d355c6f4f7	bcachefs: for_each_btree_node() now returns errors directly This changes for_each_btree_node() to work like for_each_btree_key(), and to that end bch2_btree_iter_peek_node() and next_node() also return error ptrs. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	b9a7d8ac5f	bcachefs: Fix implementation of KEY_TYPE_error When force-removing a device, we were silently dropping extents that we no longer had pointers for - we should have been switching them to KEY_TYPE_error, so that reads for data that was lost return errors. This patch adds the logic for switching a key to KEY_TYPE_error to bch2_bkey_drop_ptr(), and improves the logic somewhat. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:13 -04:00
Kent Overstreet	e8bde78a17	bcachefs: Fix rereplicate_pred() It was switching off of the key type incorrectly - this code must've been quite old, and not rereplicating anything that wasn't a btree_ptr_v1 or a plain old extent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:13 -04:00
Kent Overstreet	4b09ef12e7	bcachefs: Fix bch2_move_btree() bch2_trans_begin() is now required for transaction restarts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:13 -04:00
Kent Overstreet	18443cb9f0	bcachefs: Update data move path for snapshots The data move path operates on existing extents, and not within a subvolume as the regular IO paths do. It needs to change because it may cause existing extents to be split, and when splitting an existing extent in an ancestor snapshot we need to make sure the new split has the same visibility in child snapshots as the existing extent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:12 -04:00
Kent Overstreet	6fed42bb77	bcachefs: Plumb through subvolume id To implement snapshots, we need every filesystem btree operation (every btree operation without a subvolume) to start by looking up the subvolume and getting the current snapshot ID, with bch2_subvolume_get_snapshot() - then, that snapshot ID is used for doing btree lookups in BTREE_ITER_FILTER_SNAPSHOTS mode. This patch adds those bch2_subvolume_get_snapshot() calls, and also switches to passing around a subvol_inum instead of just an inode number. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:12 -04:00
Kent Overstreet	67e0dd8f0d	bcachefs: btree_path This splits btree_iter into two components: btree_iter is now the externally visible component, and it points to a btree_path which is now reference counted. This means we no longer have to clone iterators up front if they might be mutated - btree_path can be shared by multiple iterators, and cloned if an iterator would mutate a shared btree_path. This will help us use iterators more efficiently, as well as slimming down the main long lived state in btree_trans, and significantly cleans up the logic for iterator lifetimes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:11 -04:00
Kent Overstreet	8f54337dc6	bcachefs: Fix initialization of bch_write_op.nonce If an extent ends up with a replica that is encrypted and a replica that isn't encrypted (due to the user changing options), and then copygc/rebalance moves one of the replicas by reading from the unencrypted replica, we had a bug where we wouldn't correctly initialize op->nonce - for each crc field in an extent, crc.offset + crc.nonce must be equal. This patch fixes that by moving op.nonce initialization to bch2_migrate_write_init. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:11 -04:00
Kent Overstreet	5f8077cca8	bcachefs: Kill BTREE_ITER_SET_POS_AFTER_COMMIT BTREE_ITER_SET_POS_AFTER_COMMIT is used internally to automagically advance extent btree iterators on successful commit. But with the upcoming btree_path patch it's getting more awkward to support, and it adds overhead to core data structures that's only used in a few places, and can be easily done by the caller instead. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:11 -04:00
Brett Holman	8dd6ed9451	bcachefs: add progress stats to sysfs This adds progress stats to sysfs for copygc, rebalance, recovery, and the cmd_job ioctls. Signed-off-by: Brett Holman <bholman.devel@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:10 -04:00
Kent Overstreet	700c25b32a	bcachefs: Use bch2_trans_begin() more consistently Upcoming patch will require that a transaction restart is always immediately followed by bch2_trans_begin(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:09 -04:00
Kent Overstreet	8b3e9bd65f	bcachefs: Always check for transaction restarts On transaction restart iterators won't be locked anymore - make sure we're always checking for errors. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:09 -04:00
Kent Overstreet	e3a67bdb6e	bcachefs: Regularize argument passing of btree_trans btree_trans should always be passed when we have one - iter->trans is disfavoured. This mainly updates old code in btree_update_interior.c, some of which predates btree_trans. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:08 -04:00
Kent Overstreet	618b1c0e20	bcachefs: Split out SPOS_MAX Internal btree code really wants a POS_MAX with all fields ~0; external code more likely wants the snapshot field to be 0, because when we're passing it to bch2_trans_get_iter() it's used for the snapshot we're operating in, which should be 0 for most btrees that don't use snapshots. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:07 -04:00
Kent Overstreet	bc3f8b25f3	bcachefs: Check for errors from bch2_trans_update() Upcoming refactoring is going to change bch2_trans_update() to start returning transaction restarts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	c0ebe3e48c	bcachefs: Assorted endianness fixes Found by sparse Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	9f311f2166	bcachefs: Don't use bch_write_op->cl for delivering completions We already had op->end_io as an alternative mechanism to op->cl.parent for delivering write completions; this switches all code paths to using op->end_io. Two reasons: - op->end_io is more efficient, due to fewer atomic ops; this completes the conversion that was originally only done for the direct IO path. - We'll be restructuring the write path to use a different mechanism for punting to process context; refactoring to not use op->cl will make that easier. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:04 -04:00
Kent Overstreet	af17118319	bcachefs: Kill bch_write_op.index_update_fn This deletes bch_write_op.index_update_fn: indirect function calls have gotten considerably more expensive post spectre/meltdown, and we only have two different index_update_fns - this patch adds a flag to specify which one to use (normal vs. data move path). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:04 -04:00
Kent Overstreet	443d2760e5	bcachefs: Fix a null ptr deref bch2_btree_iter_peek() won't always return a key - whoops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:04 -04:00

112 Commits