2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-15 00:34:10 +08:00
linux-next/fs/xfs
Dave Chinner 5b257b4a1f xfs: fix race in inode cluster freeing failing to stale inodes
When an inode cluster is freed, it needs to mark all inodes in memory as
XFS_ISTALE before marking the buffer as stale. This is eeded because the inodes
have a different life cycle to the buffer, and once the buffer is torn down
during transaction completion, we must ensure none of the inodes get written
back (which is what XFS_ISTALE does).

Unfortunately, xfs_ifree_cluster() has some bugs that lead to inodes not being
marked with XFS_ISTALE. This shows up when xfs_iflush() is called on these
inodes either during inode reclaim or tail pushing on the AIL.  The buffer is
read back, but no longer contains inodes and so triggers assert failures and
shutdowns. This was reproducable with at run.dbench10 invocation from xfstests.

There are two main causes of xfs_ifree_cluster() failing. The first is simple -
it checks in-memory inodes it finds in the per-ag icache to see if they are
clean without holding the flush lock. if they are clean it skips them
completely. However, If an inode is flushed delwri, it will
appear clean, but is not guaranteed to be written back until the flush lock has
been dropped. Hence we may have raced on the clean check and the inode may
actually be dirty. Hence always mark inodes found in memory stale before we
check properly if they are clean.

The second is more complex, and makes the first problem easier to hit.
Basically the in-memory inode scan is done with full knowledge it can be racing
with inode flushing and AIl tail pushing, which means that inodes that it can't
get the flush lock on might not be attached to the buffer after then in-memory
inode scan due to IO completion occurring. This is actually documented in the
code as "needs better interlocking". i.e. this is a zero-day bug.

Effectively, the in-memory scan must be done while the inode buffer is locked
and Io cannot be issued on it while we do the in-memory inode scan. This
ensures that inodes we couldn't get the flush lock on are guaranteed to be
attached to the cluster buffer, so we can then catch all in-memory inodes and
mark them stale.

Now that the inode cluster buffer is locked before the in-memory scan is done,
there is no need for the two-phase update of the in-memory inodes, so simplify
the code into two loops and remove the allocation of the temporary buffer used
to hold locked inodes across the phases.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-06-03 16:22:29 +10:00
..
linux-2.6 xfs: fix access to upper inodes without inode64 2010-05-28 15:19:56 -05:00
quota fs/xfs/quota: Add missing mutex_unlock 2010-05-28 15:19:41 -05:00
support xfs: event tracing support 2009-12-14 23:08:16 -06:00
Kconfig xfs: use generic Posix ACL code 2009-06-10 17:07:47 +02:00
Makefile xfs: Introduce delayed logging core code 2010-05-24 10:38:03 -05:00
xfs_acl.h xfs: convert attr to use unsigned names 2010-01-20 10:47:48 +11:00
xfs_ag.h xfs: fix access to upper inodes without inode64 2010-05-28 15:19:56 -05:00
xfs_alloc_btree.c xfs: Improve scalability of busy extent tracking 2010-05-24 10:34:00 -05:00
xfs_alloc_btree.h [XFS] Always use struct xfs_btree_block instead of short / longform 2008-10-30 17:14:34 +11:00
xfs_alloc.c xfs: Improve scalability of busy extent tracking 2010-05-24 10:34:00 -05:00
xfs_alloc.h xfs: Improve scalability of busy extent tracking 2010-05-24 10:34:00 -05:00
xfs_arch.h xfs: use generic Posix ACL code 2009-06-10 17:07:47 +02:00
xfs_attr_leaf.c xfs: remove duplicate buffer flags 2010-01-21 13:44:36 -06:00
xfs_attr_leaf.h [XFS] Remove macro-to-function indirections in attr code 2009-01-09 15:46:44 +11:00
xfs_attr_sf.h xfs: convert attr to use unsigned names 2010-01-20 10:47:48 +11:00
xfs_attr.c xfs: remove duplicate buffer flags 2010-01-21 13:44:36 -06:00
xfs_attr.h xfs: convert attr to use unsigned names 2010-01-20 10:47:48 +11:00
xfs_bit.c [XFS] Use the generic bitops rather than implementing them ourselves. 2008-08-13 15:41:12 +10:00
xfs_bit.h [XFS] Remove macro-to-function indirections in the mask code 2009-01-09 15:53:54 +11:00
xfs_bmap_btree.c xfs: make several more functions static 2010-01-15 15:31:38 -06:00
xfs_bmap_btree.h xfs: make several more functions static 2010-01-15 15:31:38 -06:00
xfs_bmap.c xfs: fix reservation release commit flag in xfs_bmap_add_attrfork() 2010-05-19 09:58:08 -05:00
xfs_bmap.h xfs: event tracing support 2009-12-14 23:08:16 -06:00
xfs_btree_trace.c [XFS] make btree tracing generic 2008-10-30 16:58:50 +11:00
xfs_btree_trace.h xfs: event tracing support 2009-12-14 23:08:16 -06:00
xfs_btree.c xfs: remove duplicate buffer flags 2010-01-21 13:44:36 -06:00
xfs_btree.h xfs: add more statics & drop some unused functions 2009-08-31 14:46:20 -05:00
xfs_buf_item.c xfs: Ensure inode allocation buffers are fully replayed 2010-05-24 10:41:22 -05:00
xfs_buf_item.h xfs: Ensure inode allocation buffers are fully replayed 2010-05-24 10:41:22 -05:00
xfs_da_btree.c xfs: convert dirnameops to unsigned char names 2010-01-20 10:47:17 +11:00
xfs_da_btree.h xfs: convert dirnameops to unsigned char names 2010-01-20 10:47:17 +11:00
xfs_dfrag.c xfs: more swap extent fixes for dynamic fork offsets 2010-04-26 12:38:51 -05:00
xfs_dfrag.h xfs: clean up inconsistent variable naming in xfs_swap_extent 2010-01-15 15:31:23 -06:00
xfs_dinode.h xfs: remove m_litino 2009-03-29 09:51:14 +02:00
xfs_dir2_block.c xfs: clean up sign warnings in dir2 code 2010-01-20 10:48:05 +11:00
xfs_dir2_block.h
xfs_dir2_data.c
xfs_dir2_data.h xfs: fix various typos 2009-03-29 09:55:42 +02:00
xfs_dir2_leaf.c xfs: clean up sign warnings in dir2 code 2010-01-20 10:48:05 +11:00
xfs_dir2_leaf.h
xfs_dir2_node.c xfs: make several more functions static 2010-01-15 15:31:38 -06:00
xfs_dir2_node.h xfs: make several more functions static 2010-01-15 15:31:38 -06:00
xfs_dir2_sf.c xfs: clean up sign warnings in dir2 code 2010-01-20 10:48:05 +11:00
xfs_dir2_sf.h [XFS] kill xfs_dinode_core_t 2008-12-01 11:37:35 +11:00
xfs_dir2.c xfs: clean up sign warnings in dir2 code 2010-01-20 10:48:05 +11:00
xfs_dir2.h xfs: make xfs_dir_cilookup_result use unsigned char 2010-01-20 10:47:25 +11:00
xfs_dmapi.h removed unused #include <linux/version.h>'s 2008-08-23 12:14:12 -07:00
xfs_dmops.c [XFS] kill struct xfs_mount_args 2008-10-30 17:53:24 +11:00
xfs_error.c xfs: clean up log ticket overrun debug output 2010-05-24 10:33:46 -05:00
xfs_error.h xfs: add const qualifiers to xfs error function args 2010-05-19 09:58:11 -05:00
xfs_extfree_item.c xfs: remove stale parameter from ->iop_unpin method 2010-05-19 09:58:08 -05:00
xfs_extfree_item.h [XFS] remove always-true #ifndef HAVE_FORMAT32 tests 2009-01-22 14:07:31 +11:00
xfs_filestream.c xfs: Kill filestreams cache flush 2010-01-15 15:34:22 -06:00
xfs_filestream.h xfs: Kill filestreams cache flush 2010-01-15 15:34:22 -06:00
xfs_fs.h xfs: return inode fork offset in bulkstat for fsr 2010-03-05 11:02:07 -06:00
xfs_fsops.c xfs: Replace per-ag array with a radix tree 2010-01-15 15:33:52 -06:00
xfs_fsops.h filesystem freeze: add error handling of write_super_lockfs/unlockfs 2009-01-09 16:54:42 -08:00
xfs_ialloc_btree.c xfs: fix various typos 2009-03-29 09:55:42 +02:00
xfs_ialloc_btree.h xfs: remove superflous inobt macros 2009-02-09 08:37:14 +01:00
xfs_ialloc.c xfs: remove duplicate buffer flags 2010-01-21 13:44:36 -06:00
xfs_ialloc.h xfs: rationalize xfs_inobt_lookup* 2009-09-01 12:45:39 -05:00
xfs_iget.c xfs: fix access to upper inodes without inode64 2010-05-28 15:19:56 -05:00
xfs_inode_item.c xfs: remove stale parameter from ->iop_unpin method 2010-05-19 09:58:08 -05:00
xfs_inode_item.h xfs: Don't issue buffer IO direct from AIL push V2 2010-02-02 10:13:42 +11:00
xfs_inode.c xfs: fix race in inode cluster freeing failing to stale inodes 2010-06-03 16:22:29 +10:00
xfs_inode.h xfs: remove xfs_ipin/xfs_iunpin 2010-03-01 16:35:56 -06:00
xfs_inum.h xfs: remove XFS_INO64_OFFSET 2009-08-31 14:46:22 -05:00
xfs_iomap.c xfs: mark xfs_iomap_write_ helpers static 2010-05-19 09:58:20 -05:00
xfs_iomap.h xfs: mark xfs_iomap_write_ helpers static 2010-05-19 09:58:20 -05:00
xfs_itable.c xfs: return inode fork offset in bulkstat for fsr 2010-03-05 11:02:07 -06:00
xfs_itable.h xfs: add more statics & drop some unused functions 2009-08-31 14:46:20 -05:00
xfs_log_cil.c xfs: Ensure inode allocation buffers are fully replayed 2010-05-24 10:41:22 -05:00
xfs_log_priv.h xfs: enable background pushing of the CIL 2010-05-24 10:38:20 -05:00
xfs_log_recover.c xfs: clean up xlog_align 2010-05-28 14:58:36 -05:00
xfs_log_recover.h xfs: Clean up XFS_BLI_* flag namespace 2010-05-24 10:33:39 -05:00
xfs_log.c xfs: forced unmounts need to push the CIL 2010-05-24 10:38:14 -05:00
xfs_log.h xfs: Ensure inode allocation buffers are fully replayed 2010-05-24 10:41:22 -05:00
xfs_mount.c xfs: fix access to upper inodes without inode64 2010-05-28 15:19:56 -05:00
xfs_mount.h xfs: Introduce delayed logging core code 2010-05-24 10:38:03 -05:00
xfs_mru_cache.c xfs: Kill filestreams cache flush 2010-01-15 15:34:22 -06:00
xfs_mru_cache.h xfs: Kill filestreams cache flush 2010-01-15 15:34:22 -06:00
xfs_quota.h xfs: removed unused XFS_QMOPT_ flags 2010-05-19 09:58:15 -05:00
xfs_refcache.h
xfs_rename.c xfs: event tracing support 2009-12-14 23:08:16 -06:00
xfs_rtalloc.c xfs: replace E2BIG with EFBIG where appropriate 2010-05-28 14:58:16 -05:00
xfs_rtalloc.h xfs: be more explicit if RT mount fails due to config 2010-05-28 14:58:24 -05:00
xfs_rw.c xfs: only clear the suid bit once in xfs_write 2010-02-12 13:43:57 -06:00
xfs_rw.h xfs: only clear the suid bit once in xfs_write 2010-02-12 13:43:57 -06:00
xfs_sb.h [XFS] Remove the rest of the macro-to-function indirections. 2009-01-19 14:45:55 +11:00
xfs_trans_ail.c xfs: Don't issue buffer IO direct from AIL push V2 2010-02-02 10:13:42 +11:00
xfs_trans_buf.c xfs: Ensure inode allocation buffers are fully replayed 2010-05-24 10:41:22 -05:00
xfs_trans_extfree.c
xfs_trans_inode.c xfs: simplify xfs_trans_iget 2009-09-01 12:46:16 -05:00
xfs_trans_item.c xfs: Introduce delayed logging core code 2010-05-24 10:38:03 -05:00
xfs_trans_priv.h xfs: Introduce delayed logging core code 2010-05-24 10:38:03 -05:00
xfs_trans_space.h xfs: remove superflous inobt macros 2009-02-09 08:37:14 +01:00
xfs_trans.c xfs: cleanup log reservation calculactions 2010-05-28 14:58:30 -05:00
xfs_trans.h xfs: cleanup log reservation calculactions 2010-05-28 14:58:30 -05:00
xfs_types.h xfs: make the log ticket ID available outside the log infrastructure 2010-05-24 10:33:52 -05:00
xfs_utils.c xfs: kill xfs_qmops 2009-06-08 15:33:32 +02:00
xfs_utils.h [XFS] implement IHOLD/IRELE directly 2008-08-13 16:13:45 +10:00
xfs_vnodeops.c xfs: Check new inode size is OK before preallocating 2010-05-28 15:19:12 -05:00
xfs_vnodeops.h xfs: kill xfs_lrw.h 2010-03-01 16:35:44 -06:00
xfs.h xfs: event tracing support 2009-12-14 23:08:16 -06:00