Use the jiffies macros instead of comparing jiffies directly, so that
wraparound is handled correctly.
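A minimal illustration (not the patched code) of why the helpers from
<linux/jiffies.h> matter:

  #include <linux/jiffies.h>

  static bool deadline_passed(unsigned long deadline)
  {
  	/*
  	 * A direct comparison ("jiffies > deadline") breaks once the
  	 * counter wraps; time_after() compares in wraparound-safe
  	 * modular arithmetic:
  	 */
  	return time_after(jiffies, deadline);
  }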
Signed-off-by: Chen Yufan <chenyufan@vivo.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
bch2_bset_fix_lookup_table() is too complicated to be easily understood,
and the comment "l now > where" there is also incorrect when where ==
t->end_offset. This patch therefore refactors the function; the idea is
that when where >= rw_aux_tree(b, t)[t->size - 1].offset, we don't need
to adjust the rw aux tree at all.
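A hedged sketch of the resulting early-out (not the exact patch;
surrounding logic omitted):

  /*
   * If the insert position is at or past the offset tracked by the
   * last rw aux tree entry, no entry needs adjusting:
   */
  if (where >= rw_aux_tree(b, t)[t->size - 1].offset)
  	return;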
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
We rely on trans->locked to know whether a trans has nodes locked, for
assertions about deadlocks: there can't be more than one locked trans
in the same process.
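A hedged illustration of the kind of assertion this enables (the helper
named here is hypothetical):

  /* At most one btree_trans per process may hold node locks: */
  BUG_ON(trans->locked && process_has_other_locked_trans(trans));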
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
folio_has_private() is an attractive nuisance; filesystem authors
generally don't realise that it actually checks two flags (one of which
is never set by bcachefs). There's no need to check the private flag at
all; for folios owned by bcachefs, we know that folio->private is NULL
when the private flag is clear and non-NULL when the private flag is set.
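An illustrative before/after (not the complete patch):

  -	if (folio_has_private(folio))
  +	if (folio->private)

The former tests both PG_private and PG_private_2; the latter tests the
pointer bcachefs actually cares about.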
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
It's really not needed: the only locks used here are the btree cache
lock, which we drop for GFP_WAIT allocations, and btree node locks - but
we also drop those for GFP_WAIT allocations.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Use helper functions to make the code more readable, similar to commit
a5488f2983 ("fs: simplify ->listxattr() implementation").
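A hedged sketch of the pattern that commit introduced, as it would
appear in ->listxattr() (buffer bookkeeping abbreviated):

  err = posix_acl_listxattr(inode, &buffer, &remaining_size);
  if (err)
  	return err;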
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Remove the nop_posix_acl_{access,default} handlers from bcachefs, which
doesn't depend on xattr handlers in its inode->i_op->listxattr()
method in any way. There's nothing more to do than to simply remove
them; they've been effectively unused ever since we introduced the new
posix acl api. See [1] for details.
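An illustrative sketch of the removal (the bcachefs handler array is
abbreviated, and its other entries are assumed here):

   const struct xattr_handler *bch2_xattr_handlers[] = {
   	&bch_xattr_user_handler,
  -	&nop_posix_acl_access,
  -	&nop_posix_acl_default,
   	&bch_xattr_trusted_handler,
   	/* ... */
   };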
Link: https://patchwork.kernel.org/project/linux-fsdevel/cover/20230125-fs-acl-remove-generic-xattr-handlers-v3-0-f760cc58967d@kernel.org/ [1]
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
After reducing the search range when building the aux tree, the prev array
stuff is no longer useful, so remove it.
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
When the search key's mantissa is larger than node i's, we know that
the search key is larger than the first key of the cacheline
corresponding to node i, so when calculating the mantissas of the nodes
in node i's right subtree, the left end of the search range can be the
first key of node i. Once the search range is narrowed, the mantissa we
calculate has more useful bits: the mantissa is taken from the bits
below the range endpoints' common prefix, so a narrower range leaves
more bits that actually distinguish keys, reducing slow path
comparisons. Besides, we can now remove all the prev array stuff.
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This patch replaces an open-coded computation of the eytzinger extra
count with the eytzinger1_extra() helper.
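A hedged before/after sketch (the exact open-coded expression in the
tree may differ):

  -	t->extra = (t->size - rounddown_pow_of_two(t->size - 1)) << 1;
  +	t->extra = eytzinger1_extra(t->size - 1);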
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
This logic is no longer useful since commit
3ce8b463e3 ("bcachefs: kill bset_tree->max_key"), so remove it.
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
The idx parameter of bkey_mantissa_bits_dropped is unused; remove it.
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
The idx parameter of bkey_mantissa became unused since commit
b904a79918 ("bcachefs: Go back to 16 bit mantissa bkey floats"),
so remove it.
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
In the macro definition of bkey_crc_next, five parameters
were accepted, but only four of them were used. Let's remove
the unused one.
The patch has only passed compilation tests, but it should be fine.
Signed-off-by: Julian Sun <sunjunchao2870@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
The macro allocate_dropping_locks accepts a parameter _trans but
doesn't use it; instead it refers directly to the identifier trans,
which may be a local variable inside the function that invokes the
macro.
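A generic illustration of this class of macro-hygiene bug (hypothetical
macro, not the one in the tree):

  #define free_and_clear(_ptr)				\
  do {							\
  	kfree(ptr);	/* bug: should be (_ptr) */	\
  	(_ptr) = NULL;					\
  } while (0)

This only compiles (and then misbehaves) when the caller happens to
have a variable literally named ptr in scope.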
Signed-off-by: Julian Sun <sunjunchao2870@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
The macro allocate_dropping_locks_errcode accepts a parameter _trans
but doesn't use it; instead it refers directly to the identifier trans,
which may be a local variable inside the function that invokes the
macro.
Signed-off-by: Julian Sun <sunjunchao2870@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
The macro bch2_kthread_wait_event_ioclock_timeout is no longer used;
let's remove it.
The patch has passed compilation testing.
Signed-off-by: Julian Sun <sunjunchao2870@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Same as the recent change for __bch2_read(); also, kill the
now-unnecessary btree_trans_too_many_iters() calls.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
After commit 230e9fc286 ("slab: add SLAB_ACCOUNT flag"), we need to mark
the inode cache as SLAB_ACCOUNT, similar to commit 5d097056c9 ("kmemcg:
account for certain kmem allocations to memcg").
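A hedged sketch of the change (cache creation details abbreviated;
flags other than SLAB_ACCOUNT are assumed here):

  bch2_inode_cache = KMEM_CACHE(bch_inode_info,
  				SLAB_RECLAIM_ACCOUNT | SLAB_ACCOUNT);

With SLAB_ACCOUNT set, allocations from the inode cache are charged to
the allocating task's memcg.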
Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Add the __counted_by compiler attribute to the flexible array member
bucket to improve access bounds-checking via CONFIG_UBSAN_BOUNDS and
CONFIG_FORTIFY_SOURCE.
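A generic illustration of the annotation (this struct is hypothetical,
not the one being patched):

  struct bucket_table {
  	u16		nr_buckets;
  	struct bucket	bucket[] __counted_by(nr_buckets);
  };

__counted_by() tells the compiler which member holds the element count,
so CONFIG_UBSAN_BOUNDS/CONFIG_FORTIFY_SOURCE can check accesses to
bucket[] against nr_buckets at runtime.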
Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
When building for a 32-bit architecture, for which 'size_t' is
'unsigned int', there is a compiler warning due to use of '%lu':
In file included from fs/bcachefs/vstructs.h:5,
from fs/bcachefs/bcachefs_format.h:80,
from fs/bcachefs/bcachefs.h:207,
from fs/bcachefs/btree_key_cache.c:3:
fs/bcachefs/btree_key_cache.c: In function 'bch2_btree_key_cache_to_text':
fs/bcachefs/btree_key_cache.c:795:25: error: format '%lu' expects argument of type 'long unsigned int', but argument 3 has type 'size_t' {aka 'unsigned int'} [-Werror=format=]
795 | prt_printf(out, "pending:\t%lu\r\n", per_cpu_sum(bc->nr_pending));
| ^~~~~~~~~~~~~~~~~~~
fs/bcachefs/util.h:78:63: note: in definition of macro 'prt_printf'
78 | #define prt_printf(_out, ...) bch2_prt_printf(_out, __VA_ARGS__)
| ^~~~~~~~~~~
fs/bcachefs/btree_key_cache.c:795:38: note: format string is defined here
795 | prt_printf(out, "pending:\t%lu\r\n", per_cpu_sum(bc->nr_pending));
| ~~^
| |
| long unsigned int
| %u
cc1: all warnings being treated as errors
Use the proper specifier, '%zu', to resolve the warning.
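The fix, as a diff:

  -	prt_printf(out, "pending:\t%lu\r\n", per_cpu_sum(bc->nr_pending));
  +	prt_printf(out, "pending:\t%zu\r\n", per_cpu_sum(bc->nr_pending));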
Fixes: e447e49977 ("bcachefs: key cache can now allocate from pending")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
btree_trans objects can hold the btree_trans_barrier srcu read lock for
an extended amount of time (they shouldn't, but it's difficult to
guarantee).
The srcu barrier blocks memory reclaim, so to avoid stranding too many
key cache items, this uses the new pending_rcu_items code to allocate
from pending items - like we did before, but now without a global lock
on the key cache.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Generic data structure for explicitly tracking pending RCU items,
allowing items to be dequeued (i.e. allocate from items pending
freeing). Works with conventional RCU and SRCU, and possibly other RCU
flavors in the future, meaning this can serve as a more generic
replacement for SLAB_TYPESAFE_BY_RCU.
Pending items are tracked in radix trees; if memory allocation fails, we
fall back to linked lists.
A rcu_pending is initialized with a callback, which is invoked when
pending items' grace periods have expired. Two types of callback
processing are handled specially:
- RCU_PENDING_KVFREE_FN
  New backend for kvfree_rcu(). Slightly faster, and eliminates the
  synchronize_rcu() slowpath in kvfree_rcu_mightsleep() - instead, an
  rcu_head is allocated if we don't have one and can't use the radix
  tree.
  TODO:
  - add a shrinker (as in the existing kvfree_rcu implementation) so
    that memory reclaim can free expired objects if callback processing
    isn't keeping up, and to expedite a grace period if we're under
    memory pressure and too much memory is stranded by RCU
  - add a counter for the amount of memory pending
- RCU_PENDING_CALL_RCU_FN
  Accelerated backend for call_rcu() - pending callbacks are tracked in
  a radix tree to eliminate linked list overhead.
These are intended to serve as replacement backends for kvfree_rcu()
and call_rcu(); they may also be of interest to other users (e.g.
SLAB_TYPESAFE_BY_RCU users).
Note:
Internally, we're using a single rearming call_rcu() callback for
notifications from the core RCU subsystem that objects are ready to be
processed.
Ideally we would be getting a callback every time a grace period
completes for which we have objects, but that would require multiple
rcu_heads in flight, and since the number of gp sequence numbers with
uncompleted callbacks is not bounded, we can't do that yet.
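A hedged usage sketch (function names and signatures are assumed from
the description above, not quoted from the header):

  struct my_obj {
  	struct rcu_head	rcu;
  	/* ... */
  };

  /* Process callback: invoked once an item's grace period expires: */
  static void my_process(struct rcu_pending *pending, struct rcu_head *obj)
  {
  	kfree(container_of(obj, struct my_obj, rcu));
  }

  rcu_pending_init(&pending, NULL /* conventional RCU */, my_process);

  /* Defer freeing until a grace period has elapsed: */
  rcu_pending_enqueue(&pending, &obj->rcu);

  /* Or reuse an item whose grace period has already expired,
   * instead of allocating: */
  struct rcu_head *expired = rcu_pending_dequeue(&pending);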
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
We can't call __wait_on_freeing_inode() with btree locks held; we're
waiting on another thread that's in evict(), and before it clears that
bit it needs to write that inode to flush timestamps - deadlock.
Fixing this involves a fair amount of re-jiggering to plumb a new
transaction restart.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
The standard VFS inode hash table suffers from painful lock contention;
this change is long overdue.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>