linux/fs
Stefan Hajnoczi a62a8ef9d9 virtio-fs: add virtiofs filesystem
Add a basic file system module for virtio-fs.  This does not yet contain
shared data support between host and guest or metadata coherency speedups.
However it is already significantly faster than virtio-9p.

Design Overview
===============

With the goal of designing something with better performance and local file
system semantics, a bunch of ideas were proposed.

 - Use fuse protocol (instead of 9p) for communication between guest and
   host.  Guest kernel will be fuse client and a fuse server will run on
   host to serve the requests.

 - For data access inside guest, mmap portion of file in QEMU address space
   and guest accesses this memory using dax.  That way guest page cache is
   bypassed and there is only one copy of data (on host).  This will also
   enable mmap(MAP_SHARED) between guests.

 - For metadata coherency, there is a shared memory region which contains
   version number associated with metadata and any guest changing metadata
   updates version number and other guests refresh metadata on next access.
   This is yet to be implemented.

How virtio-fs differs from existing approaches
==============================================

The unique idea behind virtio-fs is to take advantage of the co-location of
the virtual machine and hypervisor to avoid communication (vmexits).

DAX allows file contents to be accessed without communication with the
hypervisor.  The shared memory region for metadata avoids communication in
the common case where metadata is unchanged.

By replacing expensive communication with cheaper shared memory accesses,
we expect to achieve better performance than approaches based on network
file system protocols.  In addition, this also makes it easier to achieve
local file system semantics (coherency).

These techniques are not applicable to network file system protocols since
the communications channel is bypassed by taking advantage of shared memory
on a local machine.  This is why we decided to build virtio-fs rather than
focus on 9P or NFS.

Caching Modes
=============

Like virtio-9p, different caching modes are supported which determine the
coherency level as well.  The “cache=FOO” and “writeback” options control
the level of coherence between the guest and host filesystems.

 - cache=none
   metadata, data and pathname lookup are not cached in guest.  They are
   always fetched from host and any changes are immediately pushed to host.

 - cache=always
   metadata, data and pathname lookup are cached in guest and never expire.

 - cache=auto
   metadata and pathname lookup cache expires after a configured amount of
   time (default is 1 second).  Data is cached while the file is open
   (close to open consistency).

 - writeback/no_writeback
   These options control the writeback strategy.  If writeback is disabled,
   then normal writes will immediately be synchronized with the host fs.
   If writeback is enabled, then writes may be cached in the guest until
   the file is closed or an fsync(2) performed.  This option has no effect
   on mmap-ed writes or writes going through the DAX mechanism.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-09-18 20:17:50 +02:00
..
9p 9p: pass the correct prototype to read_cache_page 2019-07-12 11:05:43 -07:00
adfs Merge branch 'work.adfs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 11:33:22 -07:00
affs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
afs afs: use correct afs_call_type in yfs_fs_store_opaque_acl2 2019-08-22 13:33:27 +01:00
autofs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 83 2019-05-24 17:37:52 +02:00
befs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
bfs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
btrfs for-5.3-rc4-tag 2019-08-18 09:51:48 -07:00
cachefiles treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 36 2019-05-24 17:27:11 +02:00
ceph ceph: don't try fill file_lock on unsuccessful GETFILELOCK reply 2019-08-22 10:47:41 +02:00
cifs signal: Allow cifs and drbd to receive their terminating signals 2019-08-19 06:34:13 -05:00
coda coda: add hinting support for partial file caching 2019-07-16 19:23:23 -07:00
configfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
cramfs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
crypto Revert "Merge tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs" 2019-07-10 18:43:43 -07:00
debugfs Driver Core and debugfs changes for 5.3-rc1 2019-07-12 12:24:03 -07:00
devpts devpts: call fsnotify_unlink() hook 2019-06-20 14:46:34 +02:00
dlm dlm for 5.3 2019-07-12 17:37:53 -07:00
ecryptfs - Fix error handling when ecryptfs_read_lower() encounters an error 2019-07-14 19:29:04 -07:00
efivarfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
efs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
exportfs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
ext2 New for 5.3: 2019-07-12 16:54:37 -07:00
ext4 - virtio_pmem: The new virtio_pmem facility introduces a paravirtualized 2019-07-18 10:52:08 -07:00
f2fs f2fs-for-5.4-rc3 2019-07-30 13:15:39 -07:00
fat treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 282 2019-06-05 17:36:37 +02:00
freevxfs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
fscache Revert "Merge tag 'keys-acl-20190703' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs" 2019-07-10 18:43:43 -07:00
fuse virtio-fs: add virtiofs filesystem 2019-09-18 20:17:50 +02:00
gfs2 gfs2: gfs2_walk_metadata fix 2019-08-09 16:56:12 +01:00
hfs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
hfsplus fs/hfsplus/xattr.c: replace strncpy with memcpy 2019-07-16 19:23:23 -07:00
hostfs This pull request contains the following changes for UML: 2019-05-12 17:52:13 -04:00
hpfs treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
hugetlbfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
iomap iomap: fix Invalid License ID 2019-07-25 11:05:11 +02:00
isofs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 142 2019-05-30 11:25:17 -07:00
jbd2 jbd2: drop declaration of journal_sync_buffer() 2019-06-20 17:32:21 -04:00
jffs2 jffs2: pass the correct prototype to read_cache_page 2019-07-12 11:05:43 -07:00
jfs vfs: create a generic checking and prep function for FS_IOC_SETFLAGS 2019-07-01 08:25:34 -07:00
kernfs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 428 2019-06-05 17:37:16 +02:00
lockd lockd: Make two symbols static 2019-07-03 17:52:09 -04:00
minix treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
nfs NFSv4: Ensure state recovery handles ETIMEDOUT correctly 2019-08-07 12:55:11 -04:00
nfs_common treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
nfsd Merge branch 'work.mount-base' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs into HEAD 2019-09-06 21:22:58 +02:00
nilfs2 vfs: create a generic checking and prep function for FS_IOC_SETFLAGS 2019-07-01 08:25:34 -07:00
nls treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
notify proc/sysctl: add shared variables for range check 2019-07-18 17:08:07 -07:00
ntfs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 97 2019-05-24 17:37:53 +02:00
ocfs2 ocfs2: remove set but not used variable 'last_hash' 2019-08-03 07:02:00 -07:00
omfs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 209 2019-05-30 11:29:53 -07:00
openpromfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
orangefs orangefs: This simple pull request is just a fix for an 2019-07-16 15:15:29 -07:00
overlayfs SPDX update for 5.2-rc6 2019-06-21 09:58:42 -07:00
proc new helper: get_tree_keyed() 2019-09-05 14:34:22 -04:00
pstore pstore: Fix double-free in pstore_mkfile() failure path 2019-07-08 21:04:42 -07:00
qnx4 treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
qnx6 treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
quota \n 2019-07-10 20:27:07 -07:00
ramfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
reiserfs fs/reiserfs/journal.c: change return type of dirty_one_transaction 2019-07-16 19:23:24 -07:00
romfs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
squashfs treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 499 2019-06-19 17:09:53 +02:00
sysfs Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
sysv treewide: Add SPDX license identifier - Makefile/Kconfig 2019-05-21 10:50:46 +02:00
tracefs \n 2019-07-10 20:09:17 -07:00
ubifs ubifs: Limit the number of pages in shrink_liability 2019-08-22 17:25:33 +02:00
udf \n 2019-07-10 20:27:07 -07:00
ufs fs/ufs/super.c: remove set but not used variable 'usb3' 2019-07-16 19:23:23 -07:00
unicode Many bug fixes and cleanups, and an optimization for case-insensitive 2019-07-10 21:06:01 -07:00
xfs xfs: fix missing ILOCK unlock when xfs_setattr_nonsize fails due to EDQUOT 2019-08-22 20:55:54 -07:00
aio.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
anon_inodes.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
attr.c
bad_inode.c
binfmt_aout.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
binfmt_elf_fdpic.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
binfmt_elf.c fs/binfmt_elf.c: delete stale comment 2019-07-16 19:23:22 -07:00
binfmt_em86.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
binfmt_flat.c fs/binfmt_flat.c: remove set but not used variable 'inode' 2019-07-16 19:23:22 -07:00
binfmt_misc.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
binfmt_script.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
block_dev.c block: remove REQ_NOWAIT_INLINE 2019-08-15 11:09:16 -06:00
buffer.c for-linus-20190715 2019-07-15 21:20:52 -07:00
char_dev.c chardev: set variable ret to -EBUSY before checking minor range overlap 2019-05-24 20:50:36 +02:00
compat_binfmt_elf.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 193 2019-05-30 11:29:21 -07:00
compat_ioctl.c compat_ioctl: pppoe: fix PPPOEIOCSFWD handling 2019-07-30 14:42:13 -07:00
compat.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 500 2019-06-19 17:09:55 +02:00
coredump.c coredump: split pipe command whitespace before expanding template 2019-08-03 07:02:01 -07:00
d_path.c unexport simple_dname() 2019-05-21 08:23:41 +01:00
dax.c dax: dax_layout_busy_page() should not unmap cow pages 2019-08-05 14:59:05 -07:00
dcache.c Merge branch 'work.dcache2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-20 09:15:51 -07:00
dcookies.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
direct-io.c direct-io: use bio_release_pages in dio_bio_complete 2019-06-29 09:47:31 -06:00
drop_caches.c
eventfd.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
eventpoll.c proc/sysctl: add shared variables for range check 2019-07-18 17:08:07 -07:00
exec.c sched/fair: Don't free p->numa_faults with concurrent readers 2019-07-25 15:37:04 +02:00
fcntl.c fs: mark expected switch fall-throughs 2019-04-08 18:21:02 -05:00
fhandle.c
file_table.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
file.c io_uring-2019-03-06 2019-03-08 14:48:40 -08:00
filesystems.c
fs_context.c vfs: subtype handling moved to fuse 2019-09-06 21:28:49 +02:00
fs_parser.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
fs_pin.c switch the remnants of releasing the mountpoint away from fs_pin 2019-07-16 22:52:37 -04:00
fs_struct.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
fs_types.c
fs-writeback.c blkcg, writeback: Add wbc->no_cgroup_owner 2019-07-10 09:00:57 -06:00
fsopen.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
inode.c New for 5.3: 2019-07-12 16:54:37 -07:00
internal.h Merge branch 'work.dcache2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-20 09:15:51 -07:00
io_uring.c io_uring: add need_resched() check in inner poll loop 2019-08-22 15:32:28 -06:00
ioctl.c
Kconfig fs: VALIDATE_FS_PARSER should default to n 2019-07-05 11:22:11 -04:00
Kconfig.binfmt binfmt_flat: make support for old format binaries optional 2019-06-24 09:16:47 +10:00
libfs.c Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-07-19 10:42:02 -07:00
locks.c Highlights: 2019-07-10 21:22:43 -07:00
Makefile iomap: move the main iteration code into a separate file 2019-07-17 07:20:43 -07:00
mbcache.c treewide: Add SPDX license identifier for more missed files 2019-05-21 10:50:45 +02:00
mount.h switch the remnants of releasing the mountpoint away from fs_pin 2019-07-16 22:52:37 -04:00
mpage.c blkcg, writeback: Rename wbc_account_io() to wbc_account_cgroup_owner() 2019-07-10 09:00:57 -06:00
namei.c fsnotify: add empty fsnotify_{unlink,rmdir}() hooks 2019-06-20 14:44:55 +02:00
namespace.c vfs: subtype handling moved to fuse 2019-09-06 21:28:49 +02:00
no-block.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
nsfs.c vfs: Convert nsfs to use the new mount API 2019-05-25 18:00:06 -04:00
open.c access: avoid the RCU grace period for the temporary subjective credentials 2019-07-24 10:12:09 -07:00
pipe.c vfs: Convert pipe to use the new mount API 2019-05-25 18:00:07 -04:00
pnode.c fs/namespace: fix unprivileged mount propagation 2019-06-17 17:36:09 -04:00
pnode.h treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 209 2019-05-30 11:29:53 -07:00
posix_acl.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
proc_namespace.c vfs: subtype handling moved to fuse 2019-09-06 21:28:49 +02:00
read_write.c vfs: fix page locking deadlocks when deduping files 2019-08-16 18:43:24 -07:00
readdir.c
select.c fs/select.c: use struct_size() in kmalloc() 2019-07-16 19:23:25 -07:00
seq_file.c seq_file: fix problem when seeking mid-record 2019-08-13 16:06:52 -07:00
signalfd.c fs: mark expected switch fall-throughs 2019-04-08 18:21:02 -05:00
splice.c uio: make import_iovec()/compat_import_iovec() return bytes on success 2019-05-31 15:30:03 -06:00
stack.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00
stat.c
statfs.c
super.c vfs: subtype handling moved to fuse 2019-09-06 21:28:49 +02:00
sync.c fs/sync.c: sync_file_range(2) may use WB_SYNC_ALL writeback 2019-05-14 09:47:50 -07:00
timerfd.c
userfaultfd.c userfaultfd_release: always remove uffd flags and clear vm_userfaultfd_ctx 2019-08-24 19:48:42 -07:00
utimes.c
xattr.c treewide: Add SPDX license identifier for missed files 2019-05-21 10:50:45 +02:00