Go to file
Kirill Smelkov ad2ba64dd4 fuse: allow filesystems to have precise control over data cache
On networked filesystems file data can be changed externally.  FUSE
provides notification messages for filesystem to inform kernel that
metadata or data region of a file needs to be invalidated in local page
cache. That provides the basis for filesystem implementations to invalidate
kernel cache explicitly based on observed filesystem-specific events.

FUSE has also "automatic" invalidation mode(*) when the kernel
automatically invalidates data cache of a file if it sees mtime change.  It
also automatically invalidates whole data cache of a file if it sees file
size being changed.

The automatic mode has corresponding capability - FUSE_AUTO_INVAL_DATA.
However, due to probably historical reason, that capability controls only
whether mtime change should be resulting in automatic invalidation or
not. A change in file size always results in invalidating whole data cache
of a file irregardless of whether FUSE_AUTO_INVAL_DATA was negotiated(+).

The filesystem I write[1] represents data arrays stored in networked
database as local files suitable for mmap. It is read-only filesystem -
changes to data are committed externally via database interfaces and the
filesystem only glues data into contiguous file streams suitable for mmap
and traditional array processing. The files are big - starting from
hundreds gigabytes and more. The files change regularly, and frequently by
data being appended to their end. The size of files thus changes
frequently.

If a file was accessed locally and some part of its data got into page
cache, we want that data to stay cached unless there is memory pressure, or
unless corresponding part of the file was actually changed. However current
FUSE behaviour - when it sees file size change - is to invalidate the whole
file. The data cache of the file is thus completely lost even on small size
change, and despite that the filesystem server is careful to accurately
translate database changes into FUSE invalidation messages to kernel.

Let's fix it: if a filesystem, through new FUSE_EXPLICIT_INVAL_DATA
capability, indicates to kernel that it is fully responsible for data cache
invalidation, then the kernel won't invalidate files data cache on size
change and only truncate that cache to new size in case the size decreased.

(*) see 72d0d248ca "fuse: add FUSE_AUTO_INVAL_DATA init flag",
eed2179efe "fuse: invalidate inode mapping if mtime changes"

(+) in writeback mode the kernel does not invalidate data cache on file
size change, but neither it allows the filesystem to set the size due to
external event (see 8373200b12 "fuse: Trust kernel i_size only")

[1] https://lab.nexedi.com/kirr/wendelin.core/blob/a50f1d9f/wcfs/wcfs.go#L20

Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
2019-04-24 17:05:06 +02:00
arch Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-04-20 10:05:02 -07:00
block bfq: update internal depth state when queue depth changes 2019-04-13 19:08:22 -06:00
certs kexec, KEYS: Make use of platform keyring for signature verify 2019-02-04 17:34:07 -05:00
crypto crypto: x86/poly1305 - fix overflow during partial reduction 2019-04-08 14:43:06 +08:00
Documentation Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input 2019-04-19 10:28:27 -07:00
drivers SCSI fixes on 20190420 2019-04-20 12:52:23 -07:00
fs fuse: allow filesystems to have precise control over data cache 2019-04-24 17:05:06 +02:00
include fuse: allow filesystems to have precise control over data cache 2019-04-24 17:05:06 +02:00
init init: initialize jump labels before command line option parsing 2019-04-19 09:46:05 -07:00
ipc Merge branch 'work.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2019-03-12 14:08:19 -07:00
kernel Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-04-20 10:10:49 -07:00
lib kcov: improve CONFIG_ARCH_HAS_KCOV help text 2019-04-19 09:46:05 -07:00
LICENSES LICENSES: Add GCC runtime library exception text 2019-01-16 14:54:15 -07:00
mm Merge branch 'for-5.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu 2019-04-19 15:37:22 -07:00
net NFS client bugfixes for Linux 5.1 2019-04-20 12:55:23 -07:00
samples Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2019-03-11 08:54:01 -07:00
scripts locking/atomics: Don't assume that scripts are executable 2019-04-19 14:21:43 +02:00
security Merge branch 'for-5.1-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2019-04-19 18:03:55 -07:00
sound ALSA: hda/realtek - add two more pin configuration sets to quirk table 2019-04-17 10:41:38 +02:00
tools Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2019-04-20 10:05:02 -07:00
usr user/Makefile: Fix typo and capitalization in comment section 2018-12-11 00:18:03 +09:00
virt KVM: fix spectrev1 gadgets 2019-04-16 15:38:07 +02:00
.clang-format clang-format: Update with the latest for_each macro list 2019-04-12 12:49:54 +02:00
.cocciconfig scripts: add Linux .cocciconfig for coccinelle 2016-07-22 12:13:39 +02:00
.get_maintainer.ignore
.gitattributes .gitattributes: set git diff driver for C source code files 2016-10-07 18:46:30 -07:00
.gitignore kbuild: Add support for DT binding schema checks 2018-12-13 09:41:32 -06:00
.mailmap Update Nicolas Pitre's email address 2019-04-02 18:12:44 -10:00
COPYING COPYING: use the new text with points to the license files 2018-03-23 12:41:45 -06:00
CREDITS Char/Misc driver patches for 5.1-rc1 2019-03-06 14:18:59 -08:00
Kbuild Kbuild updates for v5.1 2019-03-10 17:48:21 -07:00
Kconfig kconfig: move the "Executable file formats" menu to fs/Kconfig.binfmt 2018-08-02 08:06:55 +09:00
MAINTAINERS - Fix the random PID check 2019-04-20 10:43:37 -07:00
Makefile Linux 5.1-rc6 2019-04-21 10:45:57 -07:00
README Drop all 00-INDEX files from Documentation/ 2018-09-09 15:08:58 -06:00

Linux kernel
============

There are several guides for kernel developers and users. These guides can
be rendered in a number of formats, like HTML and PDF. Please read
Documentation/admin-guide/README.rst first.

In order to build the documentation, use ``make htmldocs`` or
``make pdfdocs``.  The formatted documentation can also be read online at:

    https://www.kernel.org/doc/html/latest/

There are various text files in the Documentation/ subdirectory,
several of them using the Restructured Text markup notation.

Please read the Documentation/process/changes.rst file, as it contains the
requirements for building and running the kernel, and information about
the problems which may result by upgrading your kernel.