korg/linux

mirror of https://mirrors.bfsu.edu.cn/git/linux.git synced 2024-11-13 14:24:11 +08:00

Author	SHA1	Message	Date
Pei Li	472237b69d	bcachefs: Fix shift-out-of-bounds in bch2_blacklist_entries_gc This series fix the shift-out-of-bounds issue in bch2_blacklist_entries_gc(). Instead of passing 0 to eytzinger0_first() when iterating the entries, we explicitly check 0 and initialize i to be 0. syzbot has tested the proposed patch and the reproducer did not trigger any issue: Reported-and-tested-by: syzbot+835d255ad6bc7f29ee12@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=835d255ad6bc7f29ee12 Signed-off-by: Pei Li <peili.dev@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-06-25 17:53:31 -04:00
Kent Overstreet	a5c3e265d3	bcachefs: Plumb bch_validate_flags to sb_field_ops.validate() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-09 16:23:36 -04:00
Kent Overstreet	f04158290d	bcachefs: journal seq blacklist gc no longer has to walk btree Since btree_ptr_v2, we no longer require the journal seq blacklist table for skipping blacklisted bsets (btree node entries); the pointer to a given node indicates how much data is present. Therefore there's no longer any need for journal seq blacklist gc to walk the btree - we can prune entries older than journal last_seq. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-08 17:29:20 -04:00
Kent Overstreet	5dd8c60e1e	bcachefs: iter/update/trigger/str_hash flag cleanup Combine iter/update/trigger/str_hash flags into a single enum, and x-macroize them for a to_text() function later. These flags are all for a specific iter/key/update context, so it makes sense to group them together - iter/update/trigger flags were already given distinct bits, this cleans up and unifies that handling. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-05-08 17:29:18 -04:00
Kent Overstreet	ca1e02f7e9	bcachefs: Etyzinger cleanups Pull out eytzinger.c and kill eytzinger_cmp_fn. We now provide eytzinger0_sort and eytzinger0_sort_r, which use the standard cmp_func_t and cmp_r_func_t callbacks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-04-03 14:44:18 -04:00
Erick Archer	3e48999816	bcachefs: Prefer struct_size over open coded arithmetic This is an effort to get rid of all multiplications from allocation functions in order to prevent integer overflows [1][2]. As the "op" variable is a pointer to "struct promote_op" and this structure ends in a flexible array: struct promote_op { [...] struct bio_vec bi_inline_vecs[]; }; and the "t" variable is a pointer to "struct journal_seq_blacklist_table" and this structure also ends in a flexible array: struct journal_seq_blacklist_table { [...] struct journal_seq_blacklist_table_entry { u64 start; u64 end; bool dirty; } entries[]; }; the preferred way in the kernel is to use the struct_size() helper to do the arithmetic instead of the argument "size + size * count" in the kzalloc() functions. This way, the code is more readable and safer. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments [1] Link: https://github.com/KSPP/linux/issues/160 [2] Signed-off-by: Erick Archer <erick.archer@gmx.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-03-13 21:22:26 -04:00
Kent Overstreet	6fa30fe7f7	bcachefs: journal_seq_blacklist_add() now handles entries being added out of order bch2_journal_seq_blacklist_add() was bugged when the new entry overlapped with multiple existing entries, and it also assumed new entries are being added in increasing order. This is true on any sane filesystem, but when trying to recover from very badly mangled filesystems we might end up with the journal sequence number rewinding vs. what the blacklist list knows about - easiest to just handle that here. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-03-10 15:09:59 -04:00
Kent Overstreet	3c471b6588	bcachefs: convert bch_fs_flags to x-macro Now we can print out filesystem flags in sysfs, useful for debugging various "what's my filesystem doing" issues. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:38 -05:00
Kent Overstreet	4637429e39	bcachefs: bch2_sb_field_get() refactoring Instead of using token pasting to generate methods for each superblock section, just make the type a parameter to bch2_sb_field_get(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:16 -04:00
Kent Overstreet	6bd68ec266	bcachefs: Heap allocate btree_trans We're using more stack than we'd like in a number of functions, and btree_trans is the biggest object that we stack allocate. But we have to do a heap allocation to initialize it anyways, so there's no real downside to heap allocating the entire thing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:13 -04:00
Kent Overstreet	65d48e3525	bcachefs: Private error codes: ENOMEM This adds private error codes for most (but not all) of our ENOMEM uses, which makes it easier to track down assorted allocation failures. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:57 -04:00
Kent Overstreet	78c0b75c34	bcachefs: More errcode cleanup We shouldn't be overloading standard error codes now that we have provisions for bcachefs-specific errorcodes: this patch converts super.c and super-io.c to per error site errcodes, with a bit of cleanup. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:48 -04:00
Kent Overstreet	549d173c1b	bcachefs: EINTR -> BCH_ERR_transaction_restart Now that we have error codes, with subtypes, we can switch to our own error code for transaction restarts - and even better, a distinct error code for each transaction restart reason: clearer code and better debugging. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:37 -04:00
Kent Overstreet	401ec4db63	bcachefs: Printbuf rework This converts bcachefs to the modern printbuf interface/implementation, synced with the version to be submitted upstream. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:33 -04:00
Kent Overstreet	12bf93a429	bcachefs: Add .to_text() methods for all superblock sections This patch improves the superblock .to_text() methods and adds methods for all types that were missing them. It also improves printbufs by allowing them to specify what units we want to be printing in, and adds new wrapper methods for unifying our kernel and userspace environments. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:24 -04:00
Kent Overstreet	9b6e2f1e70	Revert "bcachefs: Delete some obsolete journal_seq_blacklist code" This reverts commit f95b61228efd04c9c158123da5827c96e9773b29. It turns out, we're seeing filesystems in the wild end up with blacklisted btree node bsets - this should not be happening, and until we understand why and fix it we need to keep this code around. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:21 -04:00
Kent Overstreet	98c80d6df6	bcachefs: Fix bch2_journal_seq_blacklist_add() The old code correctly handled the case where we were blacklisting a range that exactly matched an existing entry, but not the case where the new range partially overlaps an existing entry. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:21 -04:00
Kent Overstreet	efe68e1d65	bcachefs: Improved superblock-related error messages This patch converts bch2_sb_validate() and the .validate methods for the various superblock sections to take printbuf, to which they can print detailed error messages, including printing the entire section that was invalid. This is a great improvement over the previous situation, where we could only return static strings that didn't have precise information about what was wrong. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:21 -04:00
Kent Overstreet	04f0f77df2	bcachefs: Delete some obsolete journal_seq_blacklist code Since metadata version bcachefs_metadata_version_btree_ptr_sectors_written, we haven't needed the journal seq blacklist mechanism for ignoring blacklisted btree node writes - we now only need it for ignoring journal entries that were written after the newest flush journal entry, and then we only need to keep those blacklist entries around until journal replay is finished. That means we can delete the code for scanning btree nodes to GC journal_seq_blacklist entries. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:20 -04:00
Kent Overstreet	b71717dac6	bcachefs: Handle transaction restarts in bch2_blacklist_entries_gc() It shouldn't be necessary when we're only using a single iterator and not doing updates, but that's harder to debug at the moment. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	d355c6f4f7	bcachefs: for_each_btree_node() now returns errors directly This changes for_each_btree_node() to work like for_each_btree_key(), and to that end bch2_btree_iter_peek_node() and next_node() also return error ptrs. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	67e0dd8f0d	bcachefs: btree_path This splits btree_iter into two components: btree_iter is now the externally visible component, and it points to a btree_path which is now reference counted. This means we no longer have to clone iterators up front if they might be mutated - btree_path can be shared by multiple iterators, and cloned if an iterator would mutate a shared btree_path. This will help us use iterators more efficiently, as well as slimming down the main long lived state in btree_trans, and significantly cleans up the logic for iterator lifetimes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:11 -04:00
Kent Overstreet	c0ebe3e48c	bcachefs: Assorted endianness fixes Found by sparse Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	adbcada43f	bcachefs: Don't require flush/fua on every journal write This patch adds a flag to journal entries which, if set, indicates that they weren't done as flush/fua writes. - non flush/fua journal writes don't update last_seq (i.e. they don't free up space in the journal), thus the journal free space calculations now check whether nonflush journal writes are currently allowed (i.e. are we low on free space, or would doing a flush write free up a lot of space in the journal) - write_delay_ms, the user configurable option for when open journal entries are automatically written, is now interpreted as the max delay between flush journal writes (default 1 second). - bch2_journal_flush_seq_async is changed to ensure a flush write >= the requested sequence number has happened - journal read/replay must now ignore, and blacklist, any journal entries newer than the most recent flush entry in the journal. Also, the way the read_entire_journal option is handled has been improved; struct journal_replay now has an entry, 'ignore', for entries that were read but should not be used. - assorted refactoring and improvements related to journal read in journal_io.c and recovery.c Previously, we'd have to issue a flush/fua write every time we accumulated a full journal entry - typically the bucket size. Now we need to issue them much less frequently: when an fsync is requested, or it's been more than write_delay_ms since the last flush, or when we need to free up space in the journal. This is a significant performance improvement on many write heavy workloads. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:49 -04:00
Kent Overstreet	9f115ce9e9	bcachefs: Fix a bug with the journal_seq_blacklist mechanism Previously, we would start doing btree updates before writing the first journal entry; if this was after an unclean shutdown, this could cause those btree updates to not be blacklisted. Also, move some code to headers for userspace debug tools. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:43 -04:00
Kent Overstreet	1c3ff72c0f	bcachefs: Convert some enums to x-macros Helps for preventing things from getting out of sync. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:33 -04:00
Kent Overstreet	20bceecb31	bcachefs: More work to avoid transaction restarts Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:22 -04:00
Kent Overstreet	3ea2b1e128	bcachefs: cmp_int() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:20 -04:00
Kent Overstreet	1dd7f9d98d	bcachefs: Rewrite journal_seq_blacklist machinery Now, we store blacklisted journal sequence numbers in the superblock, not the journal: this helps to greatly simplify the code, and more importantly it's now implemented in a way that doesn't require all btree nodes to be visited before starting the journal - instead, we unconditionally blacklist the next 4 journal sequence numbers after an unclean shutdown. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:20 -04:00
Kent Overstreet	424eb88130	bcachefs: Only get btree iters from btree transactions Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:18 -04:00
Kent Overstreet	1c6fdbd8f2	bcachefs: Initial commit Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write filesystem with every feature you could possibly want. Website: https://bcachefs.org Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:08:07 -04:00

31 Commits
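
Two of the commits above (98c80d6df6 and 6fa30fe7f7) describe the same problem: adding a blacklist range that partially overlaps one or several existing entries, possibly out of order. The following is a minimal userspace sketch of that idea, not the kernel implementation. The names seq_range, seq_range_table, seq_range_add, seq_range_contains and MAX_RANGES are hypothetical, the fixed-size array stands in for the superblock field, and the half-open [start, end) convention is an assumption made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a blacklist entry: the half-open range [start, end). */
struct seq_range {
	uint64_t start;
	uint64_t end;
};

#define MAX_RANGES 16	/* arbitrary capacity chosen for this sketch */

/* Hypothetical fixed-capacity table; entries are kept unsorted. */
struct seq_range_table {
	size_t nr;
	struct seq_range e[MAX_RANGES];
};

/*
 * Add [start, end) to the table. Every existing entry that overlaps or
 * touches the new range is folded into it, so the table never holds two
 * overlapping entries, regardless of the order ranges are added in.
 * Returns 0 on success, -1 if the table is full.
 */
static int seq_range_add(struct seq_range_table *t, uint64_t start, uint64_t end)
{
	size_t i = 0;

	assert(start < end);

	while (i < t->nr) {
		struct seq_range *r = &t->e[i];

		if (r->start <= end && start <= r->end) {
			/* Overlap or adjacency: widen the new range... */
			if (r->start < start)
				start = r->start;
			if (r->end > end)
				end = r->end;
			/* ...and drop the old entry by moving the last one into its slot. */
			t->e[i] = t->e[--t->nr];
		} else {
			i++;
		}
	}

	if (t->nr == MAX_RANGES)
		return -1;

	t->e[t->nr].start = start;
	t->e[t->nr].end = end;
	t->nr++;
	return 0;
}

/* Returns 1 if seq falls inside any stored range, 0 otherwise. */
static int seq_range_contains(const struct seq_range_table *t, uint64_t seq)
{
	for (size_t i = 0; i < t->nr; i++)
		if (seq >= t->e[i].start && seq < t->e[i].end)
			return 1;
	return 0;
}
```

Because each overlapping entry is absorbed into the range being added, a single pass over the table is enough: entries that were skipped earlier cannot overlap the widened range, since they overlapped neither the new range nor the entry that was merged into it. A sorted table, or the eytzinger layout mentioned in the log, would make lookups faster; this sketch uses a linear scan to keep the merge logic visible.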