linux/fs/ocfs2
Ryan Ding c15471f795 ocfs2: fix sparse file & data ordering issue in direct io
The direct io code path has three main issues after commit
24c40b329e ("ocfs2: implement ocfs2_direct_IO_write"):

  * Sparse files are not supported.
  * Data ordering is not supported.  E.g. when writing to a file hole,
    the extent is allocated first.  If the system crashes before the io
    has finished, the data will be corrupted.
  * Potential risk when doing aio+dio.  The -EIOCBQUEUED return value is
    likely to be ignored by ocfs2_direct_IO_write().
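
The hole-write scenario from the second item can be set up from user space
roughly as follows.  This is only a sketch of the access pattern, not a
crash test: the helper name write_into_hole, the 1 MiB file size and the
64 KiB offset are assumptions made for illustration, and the commands rely
on GNU coreutils (truncate, dd).

```shell
#!/bin/sh
# Hypothetical helper: create a sparse file and overwrite 4 KiB inside its hole.
#   $1 = file path
#   $2 = value for dd's oflag= (e.g. "direct"); empty means a buffered write
write_into_hole() {
    file=$1
    oflag=$2
    rm -f "$file"
    # 1 MiB sparse file: the whole range is a hole, no data allocated yet.
    truncate -s 1M "$file"
    # Write one 4 KiB block at offset 64 KiB (seek=16 blocks of 4 KiB).
    # conv=notrunc keeps the rest of the file, and its size, untouched.
    if [ -n "$oflag" ]; then
        dd if=/dev/zero of="$file" bs=4K count=1 seek=16 \
           conv=notrunc oflag="$oflag" 2>/dev/null
    else
        dd if=/dev/zero of="$file" bs=4K count=1 seek=16 \
           conv=notrunc 2>/dev/null
    fi
}

# Example (assumes an ocfs2 volume mounted on /mnt/, as in the test below):
#   write_into_hole /mnt/hole.img direct
```

With oflag=direct on an ocfs2 volume, this is the kind of write that has to
allocate an extent for the hole before the data reaches the disk.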

To resolve the above problems, redesign the direct io code along the
following lines:
  * Use buffered io to fill in holes.  This also improves performance.
  * Clear the unwritten flag after the direct write has finished, so
    that meta data changes only after the data has been written to disk.
    (An unwritten extent is invisible to the user; from the user's view,
    meta data is not changed when an unwritten extent is allocated.)
  * Clean up ocfs2_direct_IO_write().  Do all ending work in end_io.

This patch has passed the fs, dio, ltp-aiodio.part1, ltp-aiodio.part2 and
ltp-aiodio.part4 test cases of ltp.

For the performance improvement, see the following test results.
The ocfs2 cluster size is 1MB and the ocfs2 volume is mounted on /mnt/.
Before this patch:
  + rm /mnt/test.img -f
  + dd if=/dev/zero of=/mnt/test.img bs=4K count=1048576 oflag=direct
  1048576+0 records in
  1048576+0 records out
  4294967296 bytes (4.3 GB) copied, 1707.83 s, 2.5 MB/s
  + rm /mnt/test.img -f
  + dd if=/dev/zero of=/mnt/test.img bs=256K count=16384 oflag=direct
  16384+0 records in
  16384+0 records out
  4294967296 bytes (4.3 GB) copied, 582.705 s, 7.4 MB/s

After this patch:
  + rm /mnt/test.img -f
  + dd if=/dev/zero of=/mnt/test.img bs=4K count=1048576 oflag=direct
  1048576+0 records in
  1048576+0 records out
  4294967296 bytes (4.3 GB) copied, 64.6412 s, 66.4 MB/s
  + rm /mnt/test.img -f
  + dd if=/dev/zero of=/mnt/test.img bs=256K count=16384 oflag=direct
  16384+0 records in
  16384+0 records out
  4294967296 bytes (4.3 GB) copied, 34.7611 s, 124 MB/s
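
The throughput figures and the resulting speedup can be re-derived from the
byte count and elapsed times that dd printed above.  The helper names
throughput_mbs and speedup are made up for this sketch; the numbers are
copied from the dd output, and MB means 10^6 bytes, as dd reports it.

```shell
#!/bin/sh
# throughput_mbs BYTES SECONDS: print throughput in MB/s (1 MB = 10^6 bytes).
throughput_mbs() {
    awk -v b="$1" -v s="$2" 'BEGIN { printf "%.1f\n", b / s / 1000000 }'
}

# speedup OLD_SECONDS NEW_SECONDS: print how many times faster the new run is.
speedup() {
    awk -v o="$1" -v n="$2" 'BEGIN { printf "%.1f\n", o / n }'
}

BYTES=4294967296   # 4 GiB written in every run

echo "bs=4K   before: $(throughput_mbs $BYTES 1707.83) MB/s"
echo "bs=4K   after:  $(throughput_mbs $BYTES 64.6412) MB/s"
echo "bs=4K   speedup: $(speedup 1707.83 64.6412)x"
echo "bs=256K before: $(throughput_mbs $BYTES 582.705) MB/s"
echo "bs=256K after:  $(throughput_mbs $BYTES 34.7611) MB/s"
echo "bs=256K speedup: $(speedup 582.705 34.7611)x"
```

By this arithmetic the patched code is roughly 26 times faster for 4K
writes and roughly 17 times faster for 256K writes on this setup.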

Signed-off-by: Ryan Ding <ryan.ding@oracle.com>
Reviewed-by: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Joseph Qi <joseph.qi@huawei.com>
Cc: Mark Fasheh <mfasheh@suse.de>
Cc: Joel Becker <jlbec@evilplan.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2016-03-25 16:37:42 -07:00
..
cluster ocfs2: o2hb: fix double free bug 2016-03-25 16:37:42 -07:00
dlm ocfs2/dlm: fix a variable overflow problem in dlmdomain.c 2016-03-15 16:55:16 -07:00
dlmfs kmemcg: account certain kmem allocations to memcg 2016-01-14 16:00:49 -08:00
acl.c ocfs2: take inode lock in ocfs2_iop_set/get_acl() 2015-09-04 16:54:41 -07:00
acl.h ocfs2: use generic posix ACL infrastructure 2014-01-25 23:58:21 -05:00
alloc.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
alloc.h ocfs2: constify ocfs2_extent_tree_operations structures 2016-01-14 16:00:49 -08:00
aops.c ocfs2: fix sparse file & data ordering issue in direct io 2016-03-25 16:37:42 -07:00
aops.h ocfs2: add ocfs2_write_type_t type to identify the caller of write 2016-03-25 16:37:42 -07:00
blockcheck.c
blockcheck.h
buffer_head_io.c ocfs2: clear the rest of the buffers on error 2015-09-04 16:54:41 -07:00
buffer_head_io.h
dcache.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
dcache.h ocfs2: revert iput deferring code in ocfs2_drop_dentry_lock 2014-04-03 16:20:55 -07:00
dir.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
dir.h VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
dlmglue.c ocfs2: NFS hangs in __ocfs2_cluster_lock due to race with ocfs2_unblock_lock 2016-01-21 17:20:51 -08:00
dlmglue.h ocfs2: avoid blocking in ocfs2_mark_lockres_freeing() in downconvert thread 2014-04-03 16:20:55 -07:00
export.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-26 17:22:07 -07:00
export.h
extent_map.c ocfs2: neaten do_error, ocfs2_error and ocfs2_abort 2015-09-04 16:54:41 -07:00
extent_map.h
file.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
file.h ocfs2: prepare some interfaces used in append direct io 2015-02-16 17:56:04 -08:00
filecheck.c ocfs2: sysfile interfaces for online file check 2016-03-22 15:36:02 -07:00
filecheck.h ocfs2: sysfile interfaces for online file check 2016-03-22 15:36:02 -07:00
heartbeat.c
heartbeat.h
inode.c ocfs2: record UNWRITTEN extents when populate write desc 2016-03-25 16:37:42 -07:00
inode.h ocfs2: record UNWRITTEN extents when populate write desc 2016-03-25 16:37:42 -07:00
ioctl.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
ioctl.h
journal.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
journal.h ocfs2: add functions to add and remove inode in orphan dir 2015-02-16 17:56:04 -08:00
Kconfig
localalloc.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
localalloc.h ocfs2: free allocated clusters if error occurs after ocfs2_claim_clusters 2014-02-06 13:48:51 -08:00
locks.c ocfs2: fix flock panic issue 2015-12-29 17:45:49 -08:00
locks.h
Makefile ocfs2: sysfile interfaces for online file check 2016-03-22 15:36:02 -07:00
mmap.c ocfs2: add ocfs2_write_type_t type to identify the caller of write 2016-03-25 16:37:42 -07:00
mmap.h
move_extents.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
move_extents.h
namei.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
namei.h ocfs2: do not include dio entry in case of orphan scan 2015-11-05 19:34:48 -08:00
ocfs1_fs_compat.h
ocfs2_fs.h treewide: fix typos in comment blocks 2015-08-07 14:46:24 +02:00
ocfs2_ioctl.h
ocfs2_lockid.h
ocfs2_lockingver.h
ocfs2_trace.h ocfs2: check/fix inode block for online file check 2016-03-22 15:36:02 -07:00
ocfs2.h ocfs2: add errors=continue 2015-09-04 16:54:41 -07:00
quota_global.c ocfs2: Implement get_next_id() 2016-02-09 13:05:23 +01:00
quota_local.c ocfs2: neaten do_error, ocfs2_error and ocfs2_abort 2015-09-04 16:54:41 -07:00
quota.h quota: constify qtree_fmt_operations structures 2016-01-04 10:58:35 +01:00
refcounttree.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
refcounttree.h ocfs2: fix NULL pointer dereference in ocfs2_duplicate_clusters_by_page 2013-08-13 17:57:49 -07:00
reservations.c ocfs2: make resv_lock spinlock static 2015-02-10 14:30:29 -08:00
reservations.h
resize.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
resize.h
slot_map.c ocfs2: fix slot overwritten if storage link down during mount 2016-01-14 16:00:49 -08:00
slot_map.h
stack_o2cb.c ocfs2: avoid a pointless delay in o2cb_cluster_check() 2015-04-14 16:48:57 -07:00
stack_user.c char: make misc_deregister a void function 2015-08-05 10:35:49 -07:00
stackglue.c ocfs2: export ocfs2_kset for online file check 2016-03-22 15:36:02 -07:00
stackglue.h ocfs2: export ocfs2_kset for online file check 2016-03-22 15:36:02 -07:00
suballoc.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
suballoc.h ocfs2: rollback alloc_dinode counts when ocfs2_block_group_set_bits() failed 2014-04-03 16:20:56 -07:00
super.c ocfs2: record UNWRITTEN extents when populate write desc 2016-03-25 16:37:42 -07:00
super.h ocfs2: neaten do_error, ocfs2_error and ocfs2_abort 2015-09-04 16:54:41 -07:00
symlink.c switch ->get_link() to delayed_call, kill ->put_link() 2015-12-30 13:01:03 -05:00
symlink.h
sysfile.c ocfs2: avoid system inode ref confusion by adding mutex lock 2014-04-03 16:20:57 -07:00
sysfile.h
uptodate.c ocfs2: remove NULL assignments on static 2014-06-04 16:53:53 -07:00
uptodate.h
xattr.c wrappers for ->i_mutex access 2016-01-22 18:04:28 -05:00
xattr.h ocfs2: use generic posix ACL infrastructure 2014-01-25 23:58:21 -05:00