The I/O for FUSE requests and responses can now be customized further by supplying custom functions for reading requests and writing responses. This includes overriding the splice I/O.
The reason for this addition is that a custom file descriptor alone is not sufficient to allow custom I/O. Different types of file descriptor require different I/O mechanisms. For example, some file descriptor communication preserves message boundaries (SOCK_DGRAM, EOF, etc.), while other types of file descriptor are unbounded (SOCK_STREAM, ...). For unbounded communication, you have to read the header of the FUSE request first and then read the remaining packet data. Furthermore, a single read call does not necessarily return all the expected data, so further
calls in a loop are required.
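A minimal sketch of the header-then-payload read loop described above, for a stream-like fd. The `demo_hdr` layout and the function names `read_full()` and `read_request()` are hypothetical and only illustrate the framing; they are not the libfuse API.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical, simplified request header; only the leading
 * total-length field matters for framing in this sketch. */
struct demo_hdr {
    uint32_t len;    /* total message length, header included */
    uint32_t opcode; /* unused here */
};

/* Read exactly len bytes, looping over short reads and EINTR.
 * Returns 0 on success, -1 on error or premature EOF. */
static int read_full(int fd, void *buf, size_t len)
{
    unsigned char *p = buf;

    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0) {
            errno = EIO; /* stream ended in the middle of a message */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read one framed message into buf: header first, then the payload.
 * Returns the total message length, or -1 with errno set. */
static ssize_t read_request(int fd, void *buf, size_t bufsize)
{
    struct demo_hdr hdr;

    assert(buf != NULL);
    if (bufsize < sizeof(hdr)) {
        errno = EINVAL;
        return -1;
    }
    if (read_full(fd, buf, sizeof(hdr)) < 0)
        return -1;
    memcpy(&hdr, buf, sizeof(hdr));
    if (hdr.len < sizeof(hdr) || hdr.len > bufsize) {
        errno = EINVAL;
        return -1;
    }
    if (read_full(fd, (unsigned char *)buf + sizeof(hdr),
                  hdr.len - sizeof(hdr)) < 0)
        return -1;
    return (ssize_t)hdr.len;
}
```

For a message-bounded fd (e.g. SOCK_DGRAM) a single read would normally return the whole request, so a loop like this is only needed for the unbounded case.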
This test is too simple to check all functionality of notify_expire: it always passes when libfuse supports the function, even if the kernel does not support it (in that case the request is simply handled as notify_inval).
While benchmarking metadata operations with a single-threaded bonnie++
and "max_idle_threads" limited to 1, 'top' showed a suspicious
160% CPU usage.
Profiling the system with flame graphs showed that an astonishing
amount of CPU time was spent in thread creation and destruction.
Inspecting the code showed that fuse_do_work() was
creating a new thread every time all existing threads
were busy. Then, just a few lines later, after processing
the current request, it noticed that it had created too many threads
and destroyed the current thread, i.e. there was a thread
creation/destruction ping-pong.
Code is changed to only create new threads if the max number of
threads is not reached.
Furthermore, thread destruction is disabled, as creation/destruction
is expensive in general.
With this change, CPU usage of passthrough_hp went from ~160% to
~80% (with different values of max_idle_threads), and bonnie++
results improved by approximately 90%. This was measured with a
single-threaded bonnie++ run:
bonnie++ -x 4 -q -s0 -d <path> -n 30:1:1:10 -r 0
Without this patch, using the default max_idle_threads=10 and just
a single bonnie++, the thread creation/destruction code path is not
triggered. One libfuse thread and one application thread is merely
a corner case; the only requirement for the issue was
n-application-threads >= max_idle_threads.
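The changed policy can be sketched as a small decision function. This is an illustration under assumed bookkeeping, not the fuse_do_work() code itself; `struct pool_state` and its field names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping for a worker pool; the field names are
 * illustrative and not taken from libfuse. */
struct pool_state {
    int num_workers; /* threads currently alive */
    int num_idle;    /* threads waiting for a request */
    int max_threads; /* upper bound on live threads */
};

/* Called when a worker picks up a request: start another thread only
 * if nobody is idle AND the limit has not been reached yet. Threads
 * are never destroyed afterwards, so there is no create/destroy cycle. */
static bool should_start_worker(const struct pool_state *s)
{
    assert(s->num_idle <= s->num_workers);
    return s->num_idle == 0 && s->num_workers < s->max_threads;
}
```

With a limit of 1 and one busy worker, the function returns false, which is the case that previously triggered the ping-pong.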
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
sfs_unlink may call do_lookup(), which increases the inode ref count,
but since that function does not return attributes, that lookup ref
count won't be decreased automatically.
Allow requesting, at file open time, that the kernel skip the flush
on close. If the kernel does not support the FOPEN_NOFLUSH flag, the
request will be ignored.
For the passthrough_hp example, request to avoid flush on close when
writeback cache is disabled and the file is opened O_RDONLY.
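The condition used for the example can be sketched as follows. The helper name `want_noflush()` is hypothetical; in a real open handler its result would decide whether the no-flush request is set on the open reply.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/* Request "no flush on close" only for read-only opens and only when
 * the writeback cache is disabled. */
static bool want_noflush(int open_flags, bool writeback_cache)
{
    return !writeback_cache && (open_flags & O_ACCMODE) == O_RDONLY;
}
```

Masking with O_ACCMODE matters because O_RDONLY is usually 0, so a plain bit test against it would not work.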
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Before the last unlink(), release the reference on inode.fd to allow reuse
of the underlying fs inode number, mark the server inode "deleted" and bump
its generation counter.
When the same inode number is found on lookup(), the server inode object will
be reused as well.
Skip this when the inode has an open file and when writeback cache is enabled.
This will be used to verify the inode reuse bug fix in the kernel.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
When not using the readdir_plus mode, the d_type was not returned,
and the use_ino flag was not used for returning d_ino.
This patch fixes the values returned for d_ino and d_type by readdir(3).
The test for the returned value of d_ino has been adjusted to also
take the d_type into consideration and to check the returned values in
both basic readdir and readdir_plus modes. This is done by executing
the passthrough test twice.
Co-authored-by: Jean-Pierre André <jpandre@users.sourceforge.net>
- Test added for all passthrough examples.
- passthrough.c uses offset==0 mode. The others don't.
- passthrough.c changed to set FUSE_FILL_DIR_PLUS to make the test pass.
- This fixes #583.
The compiler warns about close(fd); add the missing include file to fix it.
Signed-off-by: haoyixing <haoyixing@kuaishou.com>
Co-authored-by: haoyixing <haoyixing@kuaishou.com>
In cuse_client.c, fd should be closed before returning.
Otherwise, the fd is leaked.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Signed-off-by: Haotian Li <lihaotian9@huawei.com>
In ioctl_client.c, fd is not closed before returning, so
the fd is leaked.
Signed-off-by: Zhiqiang Liu <liuzhiqiang26@huawei.com>
Signed-off-by: Haotian Li <lihaotian9@huawei.com>
This commit defines a new capability called `FUSE_CAP_CACHE_SYMLINKS`.
It is off by default, but you can now enable it by setting this flag
in the `want` field of the `fuse_conn_info` structure.
When enabled, the kernel will save symlinks in its page cache,
by making use of the feature introduced in kernel 4.20:
5571f1e654
* passthrough_ll/hp: remove symlink fallbacks
Path lookup in the kernel has special rules for looking up magic symlinks
under /proc. If a filesystem operation is instructed to follow symlinks
(e.g. via AT_SYMLINK_FOLLOW or lack of AT_SYMLINK_NOFOLLOW), and the final
component is such a proc symlink, then the target of the magic symlink is
used for the operation, even if the target itself is a symlink. I.e. path
lookup is always terminated after following a final magic symlink.
I was erroneously assuming that in the above case the target symlink would
also be followed, and so workarounds were added for a couple of operations
to handle the symlink case. Since the symlink can be handled simply by
following the proc symlink, these workarounds are not needed.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Co-authored-by: Miklos Szeredi <mszeredi@redhat.com>
In a bunch of comments we say 'under the terms of the GNU GPL'; make
it clear this is GPLv2 (as LICENSE says).
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Upstreamed from:
https://www.redhat.com/archives/virtio-fs/2020-January/msg00106.html
Since keep_cache (FOPEN_KEEP_CACHE) has no effect for directories, as
described in fuse_common.h, use cache_readdir (FOPEN_CACHE_DIR) for
directory open in cache=always mode.
Signed-off-by: Misono Tomohiro <misono.tomohiro@jp.fujitsu.com>
fuse_reply_err() expects the error code, not its negative.
Upstreamed from https://www.redhat.com/archives/virtio-fs/2020-January/msg00000.html. Original commit message:
lo_copy_file_range() passes -errno to fuse_reply_err() and then fuse_reply_err()
changes it to errno again, so that subsequent fuse_send_reply_iov_nofree() catches
the wrong errno (i.e. reports "fuse: bad error value: ...").
Make fuse_send_reply_iov_nofree() accept the correct -errno by passing errno
directly in lo_copy_file_range().
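The sign convention can be illustrated with a stand-in for the reply path. `demo_reply_err()` is hypothetical; the range check is an assumption modelled on the "bad error value" report mentioned above, not a copy of the library code.

```c
#include <assert.h>
#include <errno.h>

/* Stand-in for the reply path: the caller passes a positive errno
 * value, the reply path negates it for the wire and validates it.
 * Returns the value that would be sent. */
static int demo_reply_err(int err)
{
    int wire = -err;

    if (wire <= -1000 || wire > 0)
        return -ERANGE; /* corresponds to "bad error value" */
    return wire;
}
```

Passing `errno` yields the expected negative wire value, while passing `-errno` ends up positive after negation and is rejected.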
Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com>
Reviewed-by: Eryu Guan <eguan@linux.alibaba.com>
Co-authored-by: Xiao Yang <ice_yangxiao@163.com>
Define FUSE_USE_VERSION < 35 to get old ioctl prototype
with int commands; define FUSE_USE_VERSION >= 35 to get
new ioctl prototype with unsigned int commands.
Fixes #463.
fdopendir(3) takes ownership of the file descriptor. The presence of
the lo_dirp->fd field could lead to someone incorrectly adding a
close(d->fd) cleanup call in the future.
Do not store the file descriptor in struct lo_dirp since it is unused.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Make use of fuse_log() instead of printing directly to stderr. This
demonstrates unified logging and also caught the fact that I forgot to
add fuse_log APIs to lib/fuse_versionscript. So it's basically a test
case :).
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If fallocate isn't available, we incorrectly check the value of
HAVE_POSIX_FALLOCATE rather than whether it is defined.
We also fail to initialise 'err' in the case where neither are defined.
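A sketch of the corrected preprocessor logic, assuming the build system defines `HAVE_FALLOCATE` and/or `HAVE_POSIX_FALLOCATE` without guaranteeing a value. The wrapper name `demo_fallocate()` is hypothetical; it returns an errno-style value instead of sending a reply.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for fallocate(2) on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>

/* Returns 0 on success or a positive errno value. */
static int demo_fallocate(int fd, int mode, off_t offset, off_t length)
{
    int err = EOPNOTSUPP; /* initialised for the "neither defined" case */

#ifdef HAVE_FALLOCATE
    if (fallocate(fd, mode, offset, length) < 0)
        err = errno;
    else
        err = 0;
#elif defined(HAVE_POSIX_FALLOCATE) /* test for definition, not value */
    if (mode != 0)
        return EOPNOTSUPP; /* posix_fallocate() has no mode argument */
    err = posix_fallocate(fd, offset, length); /* returns the error number */
#else
    (void)fd;
    (void)mode;
    (void)offset;
    (void)length;
#endif
    return err;
}
```

Using `#elif HAVE_POSIX_FALLOCATE` instead would evaluate the macro's value, which breaks when the macro is defined as empty.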
Fixes: 5fc562c90d ("Add fallocate and use it instead of ...")
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
FreeBSD doesn't allow creating sockets using mknod(2). Instead, one has to use socket(2)
and bind(2). Add appropriate logic to the examples and add a test case.
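A minimal sketch of creating a socket node with socket(2) and bind(2) instead of mknod(2). The helper name `make_socket_node()` is hypothetical, and the sketch binds by absolute or relative path rather than relative to a directory fd.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Create a socket node at path without mknod(2).
 * Returns 0 on success, -1 with errno set on failure. */
static int make_socket_node(const char *path)
{
    struct sockaddr_un su;
    int fd, res, saved;

    assert(path != NULL);
    if (strlen(path) >= sizeof(su.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&su, 0, sizeof(su));
    su.sun_family = AF_UNIX;
    strcpy(su.sun_path, path); /* length checked above */

    /* bind() creates the file system entry; we never listen on it. */
    res = bind(fd, (struct sockaddr *)&su, sizeof(su));
    saved = errno;
    close(fd); /* the node stays behind after the fd is closed */
    errno = saved;
    return res;
}
```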
fuse.ko currently supports FALLOC_FL_KEEP_SIZE and FALLOC_FL_PUNCH_HOLE,
and more modes may be supported in the future.
fallocate(2) supports modes while posix_fallocate(3) does not, so this
makes lo_fallocate use fallocate(2) instead.
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
Instead of the POSIX ioctl(2) command, Linux uses its own variant of ioctl()
in which the commands are requested as "unsigned long" and truncated to
32 bits by the fuse kernel module. Transmitting the commands to user space
file systems as "unsigned int" is a workaround for processing ioctl()
commands which do not fit into a signed int.
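The effect can be shown with a small sketch, assuming a 32-bit int. Both helper names are hypothetical; the first mimics the truncation to 32 bits, the second checks whether a command would survive a round trip through a signed int.

```c
#include <assert.h>
#include <limits.h>

/* Keep only the low 32 bits of an "unsigned long" command. */
static unsigned int truncate_ioctl_cmd(unsigned long cmd)
{
    return (unsigned int)(cmd & 0xffffffffUL);
}

/* Commands with bit 31 set do not fit into a signed int. */
static int fits_in_signed_int(unsigned int cmd)
{
    return cmd <= (unsigned int)INT_MAX;
}
```

A command value such as 0x80086601 has bit 31 set, so it is representable as unsigned int but not as a non-negative int, which is why handlers comparing against such constants need the unsigned prototype.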
If hello_ll is invoked without a mountpoint, it will try to call
fuse_session_mount anyway with the NULL mountpoint (which then causes a
segfault). Print out a short help message instead (taken from
passthrough_ll.c).
lo_create() did not honour CACHE_NEVER, which has an effect
on how I/O is performed after the open.
The value of CACHE_ALWAYS, which results in setting fi->keep_cache, only
has an effect for the state of the cache at open, and since the file was
just created the cache is always empty. Hence setting this doesn't have an
effect on lo_create(), but keep it for symmetry with lo_open().
If do_readdir() calls do_lookup(), but the latter fails, we still have
to return any entries that we already stored in the readdir buffer to
avoid leaking inodes.
do_lookup() may fail if e.g. we reach the file descriptor limit.
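The decision at the end of such a readdir handler can be sketched as a pure function. The type and function names are hypothetical; `size` is the buffer size requested and `rem` is the space still unused when the loop stopped.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum reply_kind { REPLY_ERROR, REPLY_BUFFER };

struct readdir_reply {
    enum reply_kind kind;
    int err;     /* valid when kind == REPLY_ERROR */
    size_t used; /* bytes to return when kind == REPLY_BUFFER */
};

/* An error may only be reported if no entry has been stored yet;
 * otherwise the entries collected so far are returned so that their
 * lookup counts stay balanced. */
static struct readdir_reply finish_readdir(int err, size_t size, size_t rem)
{
    struct readdir_reply r = { REPLY_BUFFER, 0, 0 };

    assert(rem <= size);
    if (err && rem == size) {
        r.kind = REPLY_ERROR;
        r.err = err;
    } else {
        r.used = size - rem;
    }
    return r;
}
```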
For '.' and '..' entries only the file type in e.attr.st_mode and the inode
number in e.attr.st_ino are used. But it's prudent to at least initialize
the other fields of struct fuse_entry_param as well, instead of using
random values from the stack.
Caching can be controlled with the following options:
"cache=never": disable caching
"cache=normal": enable caching but also refresh after the timeout
"cache=always": never refresh cache
The timeout can be controlled with the "timeout=SEC" option, where SEC is
the number of seconds and can be an arbitrary non-negative floating point
number.
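A minimal sketch of validating the two option values. The parser functions are hypothetical and stand apart from the option-parsing machinery actually used; the enum names follow the CACHE_NEVER and CACHE_ALWAYS constants mentioned elsewhere in this log, with CACHE_NORMAL assumed by analogy.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

enum cache_mode { CACHE_NEVER, CACHE_NORMAL, CACHE_ALWAYS };

/* Parse the value of "cache=". Returns 0 on success, -1 if unknown. */
static int parse_cache_mode(const char *value, enum cache_mode *out)
{
    assert(value != NULL && out != NULL);
    if (strcmp(value, "never") == 0)
        *out = CACHE_NEVER;
    else if (strcmp(value, "normal") == 0)
        *out = CACHE_NORMAL;
    else if (strcmp(value, "always") == 0)
        *out = CACHE_ALWAYS;
    else
        return -1;
    return 0;
}

/* Parse the value of "timeout=SEC": a non-negative floating point
 * number with no trailing characters. Returns 0 on success. */
static int parse_timeout(const char *value, double *out)
{
    char *end;
    double t;

    assert(value != NULL && out != NULL);
    errno = 0;
    t = strtod(value, &end);
    /* !(t >= 0.0) rejects negative values and NaN alike */
    if (errno != 0 || end == value || *end != '\0' || !(t >= 0.0))
        return -1;
    *out = t;
    return 0;
}
```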
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
The extended attribute functionality is enabled with the "xattr" option
(default) and disabled with the "no_xattr" option.
New operations added:
- getxattr
- listxattr
- setxattr
- removexattr
Caveat: none of these operations will work on a symbolic link, because it's
difficult to implement that without races that can result in incorrect
operation.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Conditionally enable flock() locking on underlying filesystem, based on the
flock/no_flock options. Default is "no_flock", meaning locking will be
local to the fuse filesystem and won't be propagated to the filesystem
passed through.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Add method forget_multi() to forget multiple inodes in a single message.
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Right now, passthrough_ll will use "/" as source directory for passthrough.
We need more flexibility where user can specify path of directory to be
passed through. Hence add an option "source=<source-dir>".
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
New operations added:
- mkdir
- mknod
- symlink
- link
- unlink
- rmdir
- rename
- setattr
- fsyncdir
- flush
- fsync
- statfs
- fallocate
Caveats:
- The utimes(2) family of syscalls will fail on symlinks on 4.18 and
earlier kernels. Hoping to add support to later kernels.
- The link(2) and linkat(2) system calls will fail on symlinks unless running
with privileges (CAP_DAC_READ_SEARCH).
Signed-off-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
The kernel does not expect an elevated lookup count for the "." and ".."
entries when doing READDIRPLUS.
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Multiple meson build script improvements, including:
* Bump meson requirement to 0.40.1 (0.40 already required)
* Declare a dependency object for main library
* Stop using add_global_arguments()
* Various minor style fixes
DragonFlyBSD has no "bsd" in uname, so add 'dragonfly' to conditionals.
-- e.g. uname(1) in DragonFlyBSD
[root@ ~]# uname
DragonFly
[root@ ~]# python -c "import sys; print(sys.platform)"
dragonfly5
We re-introduce the functionality of invalidating the caches for an
inode specified by path by adding a new routine
fuse_invalidate_path. This is useful for network-based file systems
which use the high-level API, enabling them to notify the kernel about
external changes.
This is a revival of Miklos Szeredi's original code for the
fuse_invalidate routine.
As the comment says, this made it compile but not work. If there is a
need, we can add these checks to meson.build to only build this file
if the prerequisites are satisfied.
This allows calls like open(file, O_CREAT|O_RDONLY, 0200) which would
otherwise fail because we cannot open the file after mknod() has
created it with 0200 permissions.
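The difference can be sketched as creating and opening the file in a single open(2) call, where the access mode of the new descriptor is not checked against the permissions of the file being created. The helper name `create_and_open()` is hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create and open in one step. A separate mknod() followed by open()
 * would fail for an unprivileged caller when mode denies the
 * requested access, e.g. O_RDONLY on a 0200 file. */
static int create_and_open(const char *path, int flags, mode_t mode)
{
    assert(path != NULL);
    return open(path, flags | O_CREAT, mode);
}
```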