linux/fs/xfs
Shiyang Ruan fa422b353d mm, pmem, xfs: Introduce MF_MEM_PRE_REMOVE for unbind
Currently, if we suddenly remove a PMEM device (by calling unbind) that
hosts an FSDAX filesystem while programs are still accessing data on it,
e.g.:
```
 $FSSTRESS_PROG -d $SCRATCH_MNT -n 99999 -p 4 &
 # $FSX_PROG -N 1000000 -o 8192 -l 500000 $SCRATCH_MNT/t001 &
 echo "pfn1.1" > /sys/bus/nd/drivers/nd_pmem/unbind
```
the system can end up in an unacceptable state:
  1. the device is gone but the mount point still exists, and umount
       fails with "target is busy"
  2. programs hang and cannot be killed
  3. the kernel may crash with a NULL pointer dereference

To fix this, introduce an MF_MEM_PRE_REMOVE flag to let the filesystem
know that the whole device is about to be removed, and make sure all
related processes are notified so that they can exit gracefully.

This patch is inspired by Dan's "mm, dax, pmem: Introduce
dev_pagemap_failure()"[1].  With the help of the dax_holder and
->notify_failure() mechanism, the pmem driver can ask the filesystem
on it to unmap all files in use and to notify the processes that are
using those files.

Call trace:
trigger unbind
 -> unbind_store()
  -> ... (skip)
   -> devres_release_all()
    -> kill_dax()
     -> dax_holder_notify_failure(dax_dev, 0, U64_MAX, MF_MEM_PRE_REMOVE)
      -> xfs_dax_notify_failure()
      `-> freeze_super()             // freeze (kernel call)
      `-> do xfs rmap
      ` -> mf_dax_kill_procs()
      `  -> collect_procs_fsdax()    // all associated processes
      `  -> unmap_and_kill()
      ` -> invalidate_inode_pages2_range() // drop file's cache
      `-> thaw_super()               // thaw (both kernel & user call)

Introduce MF_MEM_PRE_REMOVE to let the filesystem know this is a remove
event.  Use the exclusive freeze/thaw[2] to lock the filesystem and
prevent new dax mappings from being created.  Do not shut down the
filesystem directly if the configuration is not supported, or if the
failure range includes the metadata area.  Make sure all files and
processes (not only the current process) are handled correctly.  Also
drop the cache of associated files before the pmem is removed.
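
The ordering described above (freeze, walk the files, notify and unmap
their users, drop cached mappings, thaw) can be illustrated with a small
userspace model.  This is only a sketch, not the kernel code: every
`sketch_*` and `SKETCH_*` name is hypothetical, the loop over `files`
merely stands in for the rmap walk, and the assumption that the freeze
and cache-drop steps happen only when the pre-remove flag is set is
inferred from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's MF_MEM_PRE_REMOVE flag bit. */
#define SKETCH_MF_MEM_PRE_REMOVE (1u << 0)

/* Toy model of a DAX file; the fields are illustrative only. */
struct sketch_file {
	const char *name;
	int nr_mappers;		/* processes that have the file mapped */
	bool cache_valid;	/* cached mappings still present */
};

/* Toy model of a mounted filesystem on the device being removed. */
struct sketch_fs {
	bool frozen;
	struct sketch_file *files;
	size_t nr_files;
	int notified;		/* total processes notified so far */
};

/* Models the order of the steps in the call trace; returns 0 on success. */
int sketch_notify_failure(struct sketch_fs *fs, unsigned int mf_flags)
{
	bool pre_remove = (mf_flags & SKETCH_MF_MEM_PRE_REMOVE) != 0;
	size_t i;

	assert(fs != NULL);

	if (pre_remove)
		fs->frozen = true;	/* freeze: no new dax mappings */

	for (i = 0; i < fs->nr_files; i++) {	/* stands in for the rmap walk */
		struct sketch_file *f = &fs->files[i];

		fs->notified += f->nr_mappers;	/* notify every user of the file */
		f->nr_mappers = 0;		/* unmap the file from all of them */
		if (pre_remove)
			f->cache_valid = false;	/* drop the file's cache */
	}

	if (pre_remove)
		fs->frozen = false;	/* thaw once every file is handled */
	return 0;
}
```

A caller would fill a `struct sketch_fs` with its files and invoke
`sketch_notify_failure(&fs, SKETCH_MF_MEM_PRE_REMOVE)`; afterwards no
file has mappers or valid cached mappings, and the model is thawed again.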

[1]: https://lore.kernel.org/linux-mm/161604050314.1463742.14151665140035795571.stgit@dwillia2-desk3.amr.corp.intel.com/
[2]: https://lore.kernel.org/linux-xfs/169116275623.3187159.16862410128731457358.stg-ugh@frogsfrogsfrogs/

Signed-off-by: Shiyang Ruan <ruansy.fnst@fujitsu.com>
Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
2023-12-07 14:34:26 +05:30
..
libxfs xfs: force small EFIs for reaping btree extents 2023-12-06 18:45:19 -08:00
scrub xfs: force small EFIs for reaping btree extents 2023-12-06 18:45:19 -08:00
Kconfig xfs: fix again select in kconfig XFS_ONLINE_SCRUB_STATS 2023-11-13 09:11:41 +05:30
kmem.c
kmem.h
Makefile xfs: implement block reservation accounting for btrees we're staging 2023-12-06 18:45:18 -08:00
mrlock.h
xfs_acl.c xfs: convert to ctime accessor functions 2023-07-24 10:30:06 +02:00
xfs_acl.h fs: port ->set_acl() to pass mnt_idmap 2023-01-19 09:24:27 +01:00
xfs_aops.c New code for 6.6: 2023-08-30 12:34:12 -07:00
xfs_aops.h
xfs_attr_inactive.c xfs: make inode unlinked bucket recovery work with quotacheck 2023-09-12 10:31:07 -07:00
xfs_attr_item.c xfs: elide ->create_done calls for unlogged deferred work 2023-12-06 18:45:17 -08:00
xfs_attr_item.h xfs: share xattr name and value buffers when logging xattr updates 2022-05-23 08:43:46 +10:00
xfs_attr_list.c xfs: use XFS_IFORK_Q to determine the presence of an xattr fork 2022-07-09 15:17:21 -07:00
xfs_bio_io.c fs/xfs: Use the enum req_op and blk_opf_t types 2022-07-14 12:14:33 -06:00
xfs_bmap_item.c xfs: move ->iop_relog to struct xfs_defer_op_type 2023-12-06 18:45:17 -08:00
xfs_bmap_item.h
xfs_bmap_util.c New code for 6.7: 2023-11-08 13:22:16 -08:00
xfs_bmap_util.h xfs: xfs_bmap_punch_delalloc_range() should take a byte range 2022-11-29 09:09:17 +11:00
xfs_buf_item_recover.c xfs: verify buffer contents when we skip log replay 2023-04-12 15:49:23 +10:00
xfs_buf_item.c xfs: buffer pins need to hold a buffer reference 2023-06-05 04:05:27 +10:00
xfs_buf_item.h
xfs_buf.c Many singleton patches against the MM code. The patch series which are 2023-11-02 19:38:47 -10:00
xfs_buf.h Many singleton patches against the MM code. The patch series which are 2023-11-02 19:38:47 -10:00
xfs_dahash_test.c xfs: test the ascii case-insensitive hash 2023-04-11 19:05:05 -07:00
xfs_dahash_test.h xfs: test dir/attr hash when loading module 2023-03-19 09:55:49 -07:00
xfs_dir2_readdir.c xfs: rearrange the logic and remove the broken comment for xfs_dir2_isxx 2022-10-04 16:39:58 +11:00
xfs_discard.c xfs: abort fstrim if kernel is suspending 2023-10-04 09:25:04 +11:00
xfs_discard.h xfs: move log discard work to xfs_discard.c 2023-10-04 09:24:02 +11:00
xfs_dquot_item_recover.c xfs: dquot recovery does not validate the recovered dquot 2023-11-22 23:39:36 +05:30
xfs_dquot_item.c
xfs_dquot_item.h
xfs_dquot.c xfs: clean up dqblk extraction 2023-11-22 23:39:27 +05:30
xfs_dquot.h xfs: remove warning counters from struct xfs_dquot_res 2022-05-11 17:12:09 +10:00
xfs_drain.c xfs: minimize overhead of drain wakeups by using jump labels 2023-04-11 18:59:59 -07:00
xfs_drain.h xfs: minimize overhead of drain wakeups by using jump labels 2023-04-11 18:59:59 -07:00
xfs_error.c xfs: make kobj_type structures constant 2023-02-10 08:59:48 -08:00
xfs_error.h xfs: allow setting full range of panic tags 2023-02-09 18:36:17 -08:00
xfs_export.c xfs: fix reloading entire unlinked bucket lists 2023-09-24 18:12:13 -07:00
xfs_export.h
xfs_extent_busy.c xfs: process free extents to busy list in FIFO order 2023-10-11 12:35:21 -07:00
xfs_extent_busy.h xfs: reduce AGF hold times during fstrim operations 2023-10-04 09:24:52 +11:00
xfs_extfree_item.c xfs: automatic freeing of freshly allocated unwritten space 2023-12-06 18:45:18 -08:00
xfs_extfree_item.h xfs: refactor all the EFI/EFD log item sizeof logic 2022-10-31 08:58:20 -07:00
xfs_file.c xfs: allow read IO and FICLONE to run concurrently 2023-10-23 12:02:26 +05:30
xfs_filestream.c xfs: fix double xfs_perag_rele() in xfs_filestream_pick_ag() 2023-06-05 14:48:15 +10:00
xfs_filestream.h xfs: pass perag to filestreams tracing 2023-02-13 09:14:56 +11:00
xfs_fsmap.c xfs: convert do_div calls to xfs_rtb_to_rtx helper calls 2023-10-17 16:25:55 -07:00
xfs_fsmap.h
xfs_fsops.c Minor cleanups for 6.5: 2023-07-09 09:50:42 -07:00
xfs_fsops.h
xfs_globals.c xfs: allow setting full range of panic tags 2023-02-09 18:36:17 -08:00
xfs_health.c
xfs_icache.c xfs: dynamically allocate the xfs-inodegc shrinker 2023-10-04 10:32:26 -07:00
xfs_icache.h xfs: use per-mount cpumask to track nonempty percpu inodegc lists 2023-09-11 08:39:03 -07:00
xfs_icreate_item.c xfs: fix potential log item leak 2022-05-04 11:45:11 +10:00
xfs_icreate_item.h
xfs_inode_item_recover.c xfs: recovery should not clear di_flushiter unconditionally 2023-11-13 09:11:41 +05:30
xfs_inode_item.c New code for 6.7: 2023-11-08 13:22:16 -08:00
xfs_inode_item.h xfs: fix AGF vs inode cluster buffer deadlock 2023-06-05 04:08:27 +10:00
xfs_inode.c New code for 6.7: 2023-11-08 13:22:16 -08:00
xfs_inode.h xfs: respect the stable writes flag on the RT device 2023-11-20 15:05:19 +01:00
xfs_ioctl32.c fs: port i_{g,u}id_into_vfs{g,u}id() to mnt_idmap 2023-01-19 09:24:29 +01:00
xfs_ioctl32.h arch: Remove Itanium (IA-64) architecture 2023-09-11 08:13:17 +00:00
xfs_ioctl.c xfs: respect the stable writes flag on the RT device 2023-11-20 15:05:19 +01:00
xfs_ioctl.h fs: port ->fileattr_set() to pass mnt_idmap 2023-01-19 09:24:27 +01:00
xfs_iomap.c xfs: don't allocate into the data fork for an unshare request 2023-05-02 09:14:51 +10:00
xfs_iomap.h xfs: use iomap_valid method to detect stale cached iomaps 2022-11-29 09:09:17 +11:00
xfs_iops.c xfs: respect the stable writes flag on the RT device 2023-11-20 15:05:19 +01:00
xfs_iops.h fs: port ->setattr() to pass mnt_idmap 2023-01-19 09:24:02 +01:00
xfs_itable.c xfs: convert to new timestamp accessors 2023-10-18 14:08:29 +02:00
xfs_itable.h fs: port i_{g,u}id_into_vfs{g,u}id() to mnt_idmap 2023-01-19 09:24:29 +01:00
xfs_iunlink_item.c xfs: create traced helper to get extra perag references 2023-04-11 18:59:55 -07:00
xfs_iunlink_item.h xfs: add in-memory iunlink log item 2022-07-14 11:47:42 +10:00
xfs_iwalk.c xfs: create traced helper to get extra perag references 2023-04-11 18:59:55 -07:00
xfs_iwalk.h
xfs_linux.h xfs: use shifting and masking when converting rt extents, if possible 2023-10-17 16:26:25 -07:00
xfs_log_cil.c xfs: move log discard work to xfs_discard.c 2023-10-04 09:24:02 +11:00
xfs_log_priv.h xfs: use xfs_defer_pending objects to recover intent items 2023-12-06 18:45:14 -08:00
xfs_log_recover.c xfs: move ->iop_recover to xfs_defer_op_type 2023-12-06 18:45:15 -08:00
xfs_log.c xfs: use xfs_defer_pending objects to recover intent items 2023-12-06 18:45:14 -08:00
xfs_log.h xfs: move CIL ordering to the logvec chain 2022-07-07 18:56:08 +10:00
xfs_message.c Merge branch 'guilt/xfs-unsigned-flags-5.18' into xfs-5.19-for-next 2022-04-21 16:45:03 +10:00
xfs_message.h xfs: implement per-mount warnings for scrub and shrink usage 2022-05-27 10:31:34 +10:00
xfs_mount.c xfs: dynamically allocate the xfs-inodegc shrinker 2023-10-04 10:32:26 -07:00
xfs_mount.h New code for 6.7: 2023-11-08 13:22:16 -08:00
xfs_mru_cache.c
xfs_mru_cache.h
xfs_notify_failure.c mm, pmem, xfs: Introduce MF_MEM_PRE_REMOVE for unbind 2023-12-07 14:34:26 +05:30
xfs_ondisk.h xfs: use accessor functions for summary info words 2023-10-18 16:53:00 -07:00
xfs_pnfs.c fs: port ->setattr() to pass mnt_idmap 2023-01-19 09:24:02 +01:00
xfs_pnfs.h
xfs_pwork.c
xfs_pwork.h
xfs_qm_bhv.c
xfs_qm_syscalls.c xfs: introduce xfs_inodegc_push() 2022-06-23 13:34:38 -07:00
xfs_qm.c xfs: dynamically allocate the xfs-qm shrinker 2023-10-04 10:32:26 -07:00
xfs_qm.h xfs: dynamically allocate the xfs-qm shrinker 2023-10-04 10:32:26 -07:00
xfs_quota.h
xfs_quotaops.c xfs: don't set quota warning values 2022-05-11 17:12:09 +10:00
xfs_refcount_item.c xfs: move ->iop_relog to struct xfs_defer_op_type 2023-12-06 18:45:17 -08:00
xfs_refcount_item.h
xfs_reflink.c xfs: remove __xfs_free_extent_later 2023-12-06 18:45:18 -08:00
xfs_reflink.h xfs: pass perag to xfs_alloc_read_agf() 2022-07-07 19:07:40 +10:00
xfs_rmap_item.c xfs: move ->iop_relog to struct xfs_defer_op_type 2023-12-06 18:45:17 -08:00
xfs_rmap_item.h
xfs_rtalloc.c xfs: don't allow overly small or large realtime volumes 2023-12-06 18:45:17 -08:00
xfs_rtalloc.h xfs: convert rt extent numbers to xfs_rtxnum_t 2023-10-17 16:24:22 -07:00
xfs_stats.c xfs: replace unnecessary seq_printf with seq_puts 2022-09-19 06:48:14 +10:00
xfs_stats.h
xfs_super.c New code for 6.7: 2023-11-08 13:22:16 -08:00
xfs_super.h xfs: create scaffolding for creating debugfs entries 2023-08-10 07:48:07 -07:00
xfs_symlink.c fs: port fs{g,u}id helpers to mnt_idmap 2023-01-19 09:24:30 +01:00
xfs_symlink.h fs: port inode_init_owner() to mnt_idmap 2023-01-19 09:24:28 +01:00
xfs_sysctl.c xfs: simplify two-level sysctl registration for xfs_table 2023-04-13 11:49:35 -07:00
xfs_sysctl.h xfs: Add larp debug option 2022-05-11 17:01:22 +10:00
xfs_sysfs.c xfs: document what LARP means 2023-12-06 18:45:17 -08:00
xfs_sysfs.h xfs: make kobj_type structures constant 2023-02-10 08:59:48 -08:00
xfs_trace.c xfs: add debug knob to slow down writeback for fun 2022-11-28 17:24:35 -08:00
xfs_trace.h xfs: allow pausing of pending deferred work items 2023-12-06 18:45:18 -08:00
xfs_trans_ail.c xfs: don't reverse order of items in bulk AIL insertion 2023-06-29 09:28:23 -07:00
xfs_trans_buf.c
xfs_trans_dquot.c xfs: remove quota warning limit from struct xfs_quota_limits 2022-05-11 17:12:09 +10:00
xfs_trans_priv.h xfs: convert log vector chain to use list heads 2022-07-07 18:55:59 +10:00
xfs_trans.c xfs: use shifting and masking when converting rt extents, if possible 2023-10-17 16:26:25 -07:00
xfs_trans.h xfs: move ->iop_relog to struct xfs_defer_op_type 2023-12-06 18:45:17 -08:00
xfs_xattr.c vfs-6.7.xattr 2023-10-30 09:29:44 -10:00
xfs_xattr.h xfs: move xfs_xattr_handlers to .rodata 2023-10-10 13:49:20 +02:00
xfs.h