linux/fs/afs
David Howells 88c853c3f5 afs: Fix cell refcounting by splitting the usage counter
Management of the lifetime of the afs_cell struct has some problems because
the usage counter does two jobs: it indicates whether a cell is in active
use and also whether anyone still holds a reference to the structure.

This is made trickier by cell objects being cached for a period of time in
case they're quickly reused as they hold the result of a setup process that
may be slow (DNS lookups, AFS RPC ops).

Problems include the cached root volume from alias resolution pinning its
parent cell record, and rmmod occasionally hanging or producing assertion
failures.

Fix this by splitting the count of active users from the struct reference
count.  Things then work as follows:

 (1) The cell cache keeps +1 on the cell's activity count and this has to
     be dropped before the cell can be removed.  afs_manage_cell() tries to
     exchange the 1 for a 0 with the cells_lock write-locked, and if
     successful, the record is removed from net->cells.

 (2) One struct ref is 'owned' by the activity count.  That is put when the
     active count is reduced to 0 (final_destruction label).

 (3) A ref can be held on a cell whilst it is queued for management on a
     work queue without confusing the active count.  afs_queue_cell() is
     added to wrap this.

 (4) The queue's ref is dropped at the end of the management.  This is
     split out into a separate function, afs_manage_cell_work().

 (5) The root volume record is put after a cell is removed (at the
     final_destruction label) rather than in the RCU destruction routine.

 (6) Volumes hold struct refs, but aren't active users.

 (7) Both counts are displayed in /proc/net/afs/cells.
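
The split described in points (1), (2) and (6) can be modelled in userspace
roughly as follows.  This is only a sketch under stated assumptions, not the
kernel code: struct cell_model and all the cell_*() helpers are hypothetical
names, C11 atomics stand in for the kernel's atomic_t, and locking, timers and
RCU are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace model; not the kernel's struct afs_cell. */
struct cell_model {
	atomic_int ref;		/* struct refs: keep the memory alive */
	atomic_int active;	/* active users: keep the cell cached */
	bool in_cache;		/* stands in for membership of net->cells */
	bool destroyed;		/* set where the real code would free */
};

static inline void cell_init(struct cell_model *c)
{
	atomic_init(&c->ref, 1);	/* the ref 'owned' by the active count */
	atomic_init(&c->active, 1);	/* the cache's +1 on the active count */
	c->in_cache = true;
	c->destroyed = false;
}

/* Struct refs only; these never touch the active count. */
static inline void cell_get(struct cell_model *c)
{
	atomic_fetch_add(&c->ref, 1);
}

static inline void cell_put(struct cell_model *c)
{
	if (atomic_fetch_sub(&c->ref, 1) == 1)
		c->destroyed = true;
}

/* Active users only; these never touch the struct refcount. */
static inline void cell_use(struct cell_model *c)
{
	atomic_fetch_add(&c->active, 1);
}

static inline void cell_unuse(struct cell_model *c)
{
	atomic_fetch_sub(&c->active, 1);
}

/*
 * Manager step: try to swap the active count from 1 to 0.  On success the
 * cell leaves the cache and the ref owned by the active count is put.
 */
static inline bool cell_try_remove(struct cell_model *c)
{
	int expected = 1;

	if (!atomic_compare_exchange_strong(&c->active, &expected, 0))
		return false;	/* someone is still actively using it */
	c->in_cache = false;
	cell_put(c);
	return true;
}
```

In this model a holder of a plain struct ref (as a volume would be) delays
destruction but does not stop the cell being removed from the cache, whereas
an active user blocks removal until it calls cell_unuse().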

There are some management function changes:

 (*) afs_put_cell() now just decrements the refcount and triggers the RCU
     destruction if it becomes 0.  It no longer sets a timer to have the
     manager do this.

 (*) afs_use_cell() and afs_unuse_cell() are added to increase and decrease
     the active count.  afs_unuse_cell() sets the management timer.

 (*) afs_queue_cell() is added to queue a cell with appropriate refs.
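
The queueing rule (a ref is taken for the work item and dropped at the end of
management) might be sketched as below.  This is a userspace illustration
only: struct queued_cell, fake_queue_work(), queue_cell() and
manage_cell_work() are hypothetical names, and fake_queue_work() merely
imitates a work queue that refuses to queue an item that is already pending.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a cell that can sit on a work queue. */
struct queued_cell {
	atomic_int ref;		/* struct refcount */
	bool pending;		/* true while the work item is queued */
	int runs;		/* number of times the manager has run */
};

static inline void qc_init(struct queued_cell *c)
{
	atomic_init(&c->ref, 1);
	c->pending = false;
	c->runs = 0;
}

/* Assumed behaviour: returns false if the item was already pending. */
static inline bool fake_queue_work(struct queued_cell *c)
{
	if (c->pending)
		return false;
	c->pending = true;
	return true;
}

/* Take a ref for the queue; give it back if nothing new was queued. */
static inline void queue_cell(struct queued_cell *c)
{
	atomic_fetch_add(&c->ref, 1);
	if (!fake_queue_work(c))
		atomic_fetch_sub(&c->ref, 1);
}

/* Work function: do the management, then drop the queue's ref. */
static inline void manage_cell_work(struct queued_cell *c)
{
	c->pending = false;
	c->runs++;			/* stands in for the real management */
	atomic_fetch_sub(&c->ref, 1);	/* the queue's ref */
}
```

Queueing the same cell twice before the manager runs leaves exactly one extra
ref outstanding, so the single run of the work function balances it and the
active count is never involved.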

There are also some other fixes:

 (*) Don't let /proc/net/afs/cells access a cell's vllist if it's NULL.

 (*) Make sure that candidate cells in lookups are properly destroyed
     rather than being simply kfree'd.  This ensures the bits they point to
     are destroyed also.

 (*) afs_dec_cells_outstanding() is now called in cell destruction rather
     than at "final_destruction".  This ensures that cell->net is still
     valid to the end of the destructor.

 (*) As a consequence of the previous two changes, move the increment of
     net->cells_outstanding that was at the point of insertion into the
     tree to the allocation routine to correctly balance things.
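
The balancing argument in the last three points can be illustrated with a
small userspace sketch.  The names struct net_model, struct cand_cell,
cand_alloc() and cand_destroy() are hypothetical, and malloc()/free() stand
in for the kernel allocators; the point is only that counting at allocation
pairs with uncounting in the destructor, even for a candidate that is never
inserted into the tree.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the per-net state and a cell record. */
struct net_model {
	atomic_int cells_outstanding;
};

struct cand_cell {
	struct net_model *net;
	char *name;		/* one of the things the cell points to */
};

/* Count the cell as soon as it exists, inserted or not. */
static inline struct cand_cell *cand_alloc(struct net_model *net,
					   const char *name)
{
	size_t len = strlen(name) + 1;
	struct cand_cell *cell = malloc(sizeof(*cell));

	if (!cell)
		return NULL;
	cell->name = malloc(len);
	if (!cell->name) {
		free(cell);
		return NULL;
	}
	memcpy(cell->name, name, len);
	cell->net = net;
	atomic_fetch_add(&net->cells_outstanding, 1);
	return cell;
}

/* Full destructor: frees what the cell points to, then uncounts it. */
static inline void cand_destroy(struct cand_cell *cell)
{
	struct net_model *net = cell->net;	/* still valid here */

	free(cell->name);
	free(cell);
	atomic_fetch_sub(&net->cells_outstanding, 1);
}
```

A lookup that allocates a candidate and then loses the race to another
inserter can call cand_destroy() on the loser and the counter still returns
to zero once the winner is destroyed too.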

Fixes: 989782dcdc ("afs: Overhaul cell database management")
Signed-off-by: David Howells <dhowells@redhat.com>
2020-10-16 14:38:22 +01:00
addr_list.c afs: Use kfree_rcu() instead of casting kfree() to rcu_callback_t 2020-03-13 10:47:33 -07:00
afs_cm.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
afs_fs.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
afs_vl.h afs: Implement client support for the YFSVL.GetCellName RPC op 2020-06-04 15:37:57 +01:00
afs.h afs: Implement client support for the YFSVL.GetCellName RPC op 2020-06-04 15:37:57 +01:00
cache.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
callback.c afs: Fix the by-UUID server tree to allow servers with the same UUID 2020-06-04 15:37:57 +01:00
cell.c afs: Fix cell refcounting by splitting the usage counter 2020-10-16 14:38:22 +01:00
cmservice.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
dir_edit.c afs: Remove set but not used variables 'before', 'after' 2019-11-21 20:36:00 +00:00
dir_silly.c afs: Fix silly rename 2020-06-16 22:00:28 +01:00
dir.c treewide: Remove uninitialized_var() usage 2020-07-16 12:35:15 -07:00
dynroot.c afs: Fix cell refcounting by splitting the usage counter 2020-10-16 14:38:22 +01:00
file.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
flock.c afs: Remove erroneous fallthough annotation 2020-08-27 14:33:01 -05:00
fs_operation.c afs: Fix key ref leak in afs_put_operation() 2020-08-20 10:41:45 -07:00
fs_probe.c rxrpc: Make rxrpc_kernel_get_srtt() indicate validity 2020-08-20 18:21:28 +01:00
fsclient.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
inode.c afs: Fix deadlock between writeback and truncate 2020-10-08 10:50:55 -07:00
internal.h afs: Fix cell refcounting by splitting the usage counter 2020-10-16 14:38:22 +01:00
Kconfig docs: filesystems: fix renamed references 2020-04-20 15:45:22 -06:00
main.c afs: Fix rapid cell addition/removal by not using RCU on cells tree 2020-10-16 14:04:59 +01:00
Makefile afs: Detect cell aliases 1 - Cells with root volumes 2020-06-04 15:37:57 +01:00
misc.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
mntpt.c afs: Fix cell refcounting by splitting the usage counter 2020-10-16 14:38:22 +01:00
proc.c afs: Fix cell refcounting by splitting the usage counter 2020-10-16 14:38:22 +01:00
protocol_uae.h afs: Add support for the UAE error table 2019-06-28 18:37:53 +01:00
protocol_yfs.h afs: Implement client support for the YFSVL.GetCellName RPC op 2020-06-04 15:37:57 +01:00
rotate.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
rxrpc.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
security.c treewide: Remove uninitialized_var() usage 2020-07-16 12:35:15 -07:00
server_list.c afs: Reorganise volume and server trees to be rooted on the cell 2020-06-04 15:37:57 +01:00
server.c afs: Fix hang on rmmod due to outstanding timer 2020-06-20 12:01:58 -07:00
super.c afs: Fix cell refcounting by splitting the usage counter 2020-10-16 14:38:22 +01:00
vl_alias.c afs: Fix cell refcounting by splitting the usage counter 2020-10-16 14:38:22 +01:00
vl_list.c afs: Don't use VL probe running state to make decisions outside probe code 2020-08-20 18:21:28 +01:00
vl_probe.c afs: Don't use VL probe running state to make decisions outside probe code 2020-08-20 18:21:28 +01:00
vl_rotate.c afs: Fix cell refcounting by splitting the usage counter 2020-10-16 14:38:22 +01:00
vlclient.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00
volume.c afs: Fix cell refcounting by splitting the usage counter 2020-10-16 14:38:22 +01:00
write.c afs: Fix deadlock between writeback and truncate 2020-10-08 10:50:55 -07:00
xattr.c afs: Build an abstraction around an "operation" concept 2020-06-04 15:37:17 +01:00
xdr_fs.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
yfsclient.c treewide: Use fallthrough pseudo-keyword 2020-08-23 17:36:59 -05:00