A previous PR supported extended max writes (eg write requests larger than 1 MB)
by initializing the fuse session buffer size to use the max_pages_limit set in
/proc/sys/fs/fuse. However, this is a huge problem for machines where multiple
fuse servers may be running but only one server needs large writes. In this case,
a lot of memory will be wasted and will lead to OOM issues.
This PR does a reallocation of the session buffer transparently if the server set
"se->conn.max_write" to a value larger than 1 MiB. This is only for buffers that
are "owned" by libfuse - if the server wishes to provide its own allocated buffer
for receiving/processing requests, then it should ensure that buffer is allocated
to the proper size from the start.
Local testing showed:
echo 65535 | sudo tee /proc/sys/fs/fuse/max_pages_limit
dd if=/dev/urandom of=hello_file bs=6M count=2
write requests:
write request size is 5242880
write request size is 1048576
write request size is 5242880
write request size is 1048576
I see random ENOTCONN failures in xfstest generic/013 and generic/014
in my branch, but earliest on the 2nd run - takes ~12hours to get
the issue, but then there are no further information logged.
ENOTCONN points to a daemon crash - I need backtraces and a core dump.
This adds optional handling of fatal signals to print a core dump
and optional syslog logging with these new public functions:
fuse_set_fail_signal_handlers()
In addition to the existing fuse_set_signal_handlers(). This is not
enabled together with fuse_set_signal_handlers(), as it is change
in behavior and file systems might already have their own fatal
handlers.
fuse_log_enable_syslog
Print logs to syslog instead of stderr
fuse_log_close_syslog
Close syslog (for now just does closelog())
Code in fuse_signals.c is also updated, to be an array of signals,
and setting signal handlers is now down with a for-loop instead
of one hand coded set_one_signal_handler() per signal.
The function fuse_session_process_buf_int() would do much things
for FUSE_INTERRUPT requests, even there are no FUSE_INTERRUPT requests:
1. check every non-FUSE_INTERRUPT request and add these requests to the
linked list(se->list) under a big lock(se->lock).
2. the function fuse_free_req() frees every request and remove them from
the linked list(se->list) under a bing lock(se->lock).
These operations are not meaningful when there are no FUSE_INTERRUPT requests,
and have a great impact on the performance of fuse filesystem because the big
lock for each request.
In some cases, FUSE_INTERRUPT requests are infrequent, even none at all.
Besides, the user-defined filesystem may do nothing for FUSE_INTERRUPT requests.
And the kernel side has the option "no_interrupt" in struct fuse_conn. This kernel option
can be enabled by return ENOSYS in libfuse for the reply of FUSE_INTERRUPT request.
But I don't find the code to enable the "no_interrupt" kernel option in libfuse.
So add the no_interrupt support, and when this operaion is enabled:
1. remove the useless locking operaions and list operations.
2. return ENOSYS for the reply of FUSE_INTERRUPT request to inform the kernel to disable
FUSE_INTERRUPT request.
Add support for filesystem passthrough read/write of files.
When the FUSE_PASSTHROUGH capability is enabled, the FUSE server may
decide, while handling the "open" or "create" requests, if the given
file can be accessed by that process in "passthrough" mode, meaning that
all the further read and write operations would be forwarded by the
kernel directly to the backing file rather than to the FUSE server.
All requests other than read or write are still handled by the server.
This allows for an improved performance on reads and writes, especially
in the case of reads at random offsets, for which no (readahead)
caching mechanism would help, reducing the performance gap between FUSE
and native filesystem access.
Extend also the passthrough_hp example with the new passthrough feature.
This example opens a kernel backing file per FUSE inode on the first
FUSE file open of that inode and closes the backing file on the release
of the last FUSE file on that inode.
All opens of the same inode passthrough to the same backing file.
A combination of fi->direct_io and fi->passthrough is allowed.
It means that read/write operations go directly to the server, but mmap
is done on the backing file.
This allows to open some fds of the inode in passthrough mode and some
fd of the same inode in direct_io/passthrough_mmap mode.
Signed-off-by: Alessio Balsini <balsini@android.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
The API stays the same, the libfuse version comes from
inlined functions, which are defined fuse_lowlevel.h
and fuse.h. As these inlined functions are defined in the header
files they get added into the application, similar as if these
were preprocessor macros.
Macro vs inlined function is then just a style issue - I personally
prefer the latter.
fuse_session_new() -> static inlinei, in the application
_fuse_session_new -> inside of libfuse
fuse_new() -> static inline, in the application
_fuse_new() -> inside of libfuse
Note: Entirely untested is the fuse 30 api - we need a test
for it. And we do not have any ABI tests at all.
Signed-off-by: Bernd Schubert <bernd.schubert@fastmail.fm>
If the file system doesn't provide a ->open or an ->opendir, and the
kernel supports FUSE_CAP_NO_OPEN_SUPPORT or FUSE_CAP_NO_OPENDIR_SUPPORT,
allow the implementation to set FUSE_CAP_NO_OPEN*_SUPPORT on conn->want
in order to automatically get this behavior. Expand the documentation
to be more explicit about the behavior of libfuse in the different cases
WRT this capability.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
* Use single place to define the version
Defining the version in fuse_common.h, is removed, it is defined
through meson and provided by "libfuse_config.h". I.e. it avoids
to define the version twice - once in meson and once in
fuse_common.h.
Ideal would be to set integers in the meson file and create the version
string from these integers. However, meson requires that "project"
is the first meson.build keyword - with that it requires to
set the version from a string and then major/minor/hotfix integers
are created from string split.
Signed-off-by: Bernd Schubert <bernd.schubert@fastmail.fm>
* Increase the version to 3.17.0
This is to prepare the branch for the next release.
Signed-off-by: Bernd Schubert <bernd.schubert@fastmail.fm>
---------
Signed-off-by: Bernd Schubert <bernd.schubert@fastmail.fm>
Add more documentation for FUSE_CAP_EXPORT_SUPPORT
Also remove the flag from passthrough_ll.c and passthrough_hp.cc
as these implementations do _not_ handle that flag. They just
cast fuse_ino_t to an inode and cause a heap buffer overflow
for unknown objects (simplest reproducer are the examples
in "man 2 open_by_handle_at", but to unmount/mount the file
system after name_to_handle_at and before open_by_handle_at).
Fixes https://github.com/libfuse/libfuse/issues/838
---------
Co-authored-by: Nikolaus Rath <Nikolaus@rath.org>
'FUSE_CAP_HANDLE_KILLPRIV' is not enabled by default anymore, as that
would be a sudden security issue introduced by a new ABI and API
compatible libfuse version.
Allowing parallel dir operations could result in a crash in a filesystem
implementation that is not prepared for this.
To be safe keep this flag off by default (this is not a regression, since
there was no public release where this flag wasn't ignored).
If the filesystem wants better performance, then it should set this flag
explicitly.
Fixes: c9905341ea ("Pass FUSE_PARALLEL_DIROPS to kernel (#861)")
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
This is not called FUSE_CAP_DIRECT_IO_RELAX, as the kernel flag
FUSE_DIRECT_IO_RELAX is supposed to be renamed to
FUSE_DIRECT_IO_ALLOW_MMAP. The corresponding kernel patches just
did not land yet.
This syncs fuse_kernel.h to <linux-6.3>/include/uapi/linux/fuse.h
Special handling is done for setxattr as in linux commit
52a4c95f4d24b struct fuse_setxattr_in was extended. Extended
struct is only used when FUSE_SETXATTR_EXT is passed in FUSE_INIT
reply.
Right now fuse kernel serializes direct writes on the
same file. This serialization is good for such FUSE
implementations which rely on the inode lock to
avoid any data inconsistency issues but it hurts badly
such FUSE implementations which have their own mechanism
of dealing with cache/data integrity and can handle
parallel direct writes on the same file.
This patch allows parallel direct writes on the same file to be
enabled with the help of a flag FOPEN_PARALLEL_DIRECT_WRITES.
FUSE implementations which want to use this feature can
set this flag during fuse init. Default behaviour remains
same i.e no parallel direct writes on the same file.
Corresponding fuse kernel patch(Merged).
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.2&id=153524053bbb0d27bb2e0be36d1b46862e9ce74c
This addresses https://github.com/libfuse/libfuse/issues/729
commit db35a37def introduced a public
config.h (rename to fuse_config.h to avoid conflicts) that
was installed with the package and included by libfuse users
through fuse_common.h. Probablem is that this file does not have
unique defines so that they are unique to libfuse - on including
the file conflicts with libfuse users came up.
In principle all defines could be prefixed, but then most of them
are internal for libfuse compilation only. So this splits out
publically required defines to a new file 'libfuse_config.h'
and changes back to include of "fuse_config.h" only when
HAVE_LIBFUSE_PRIVATE_CONFIG_H is defined.
This also renames HAVE_LIBC_VERSIONED_SYMBOLS to
LIBFUSE_BUILT_WITH_VERSIONED_SYMBOLS, as it actually
better explains for libfuse users what that variable
is for.
This addresses: https://github.com/libfuse/libfuse/issues/724
HAVE_LIBC_VERSIONED_SYMBOLS configures the library if to use
versioned symbols and is set at meson configuration time.
External filesystems (the main target, actually)
include fuse headers and the preprocessor
then acts on HAVE_LIBC_VERSIONED_SYMBOLS. Problem was now that
'config.h' was not distributed with libfuse and so
HAVE_LIBC_VERSIONED_SYMBOLS was never defined with external
tools and the preprocessor did the wrong decision.
This commit also increases the the minimal meson version,
as this depends on meson feature only available in 0.50
<quote 'meson' >
WARNING: Project specifies a minimum meson_
version '>= 0.42' but uses features which were added
in newer versions:
* 0.50.0: {'install arg in configure_file'}
</quote>
Additionally the config file has been renamed to "fuse_config.h"
to avoid clashes - 'config.h' is not very specific.
On benchmarking metadata operations with a single threaded bonnie++
and "max_idle_threads" limited to 1, 'top' was showing suspicious
160% cpu usage.
Profiling the system with flame graphs showed that an astonishing
amount of CPU time was spent in thread creation and destruction.
After verifying the code it turned out that fuse_do_work() was
creating a new thread every time all existing idle threads
were already busy. And then just a few lines later after processing
the current request it noticed that it had created too many threads
and destructed the current thread. I.e. there was a thread
creation/destruction ping-pong.
Code is changed to only create new threads if the max number of
threads is not reached.
Furthermore, thread destruction is disabled, as creation/destruction
is expensive in general.
With this change cpu usage of passthrough_hp went from ~160% to
~80% (with different values of max_idle_threads). And bonnie
values got approximately faster by 90%. This is a with single
threaded bonnie++
bonnie++ -x 4 -q -s0 -d <path> -n 30:1:1:10 -r 0
Without this patch, using the default max_idle_threads=10 and just
a single bonnie++ the thread creation/destruction code path is not
triggered. Just one libfuse and one application thread is just
a corner case - the requirement for the issue was just
n-application-threads >= max_idle_threads.
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
struct fuse_loop_config was passed as a plain struct, without any
version identifer. This had two implications
1) Any addition of new parameters required a FUSE_SYMVER for
fuse_session_loop_mt() and fuse_loop_mt() as otherwise a read
beyond end-of previous struct size might have happened.
2) Filesystems also might have been recompiled and the developer
might not have noticed the struct extensions and unexpected for
the developer (or people recomliling the code) uninitialized
parameters would have been passed.
Code is updated to have struct fuse_loop_config as an opaque/private
data type for file systems that want version 312
(FUSE_MAKE_VERSION(3, 12)). The deprecated fuse_loop_config_v1
is visible, but should not be used outside of internal
conversion functions
File systems that want version >= 32 < 312 get the previous
struct (through ifdefs) and the #define of fuse_loop_mt
and fuse_session_loop_mt ensures that these recompiled file
systems call into the previous API, which then converts
the struct. This is similar to existing compiled applications
when just libfuse updated, but binaries it is solved with
the FUSE_SYMVER ABI compact declarations.
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Allow requesting from kernel to avoid flush on close at file open
time. If kernel does not support FOPEN_NOFLUSH flag, the request
will be ignored.
For passthrough_hp example, request to avoid flush on close when
writeback cache is disabled and file is opened O_RDONLY.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
This commit defines a new capability called `FUSE_CAP_CACHE_SYMLINKS`.
It is off by default but you can now enable it by setting this flag in
in the `want` field of the `fuse_conn_info` structure.
When enabled, the kernel will save symlinks in its page cache,
by making use of the feature introduced in kernel 4.20:
5571f1e654
Introduce an API for custom log handler functions. This allows libfuse
applications to send messages to syslog(3) or other logging systems.
See include/fuse_log.h for details.
Convert libfuse from fprintf(stderr, ...) to log_fuse(level, ...). Most
messages are error messages with FUSE_LOG_ERR log level. There are also
some debug messages which now use the FUSE_LOG_DEBUG log level.
Note that lib/mount_util.c is used by both libfuse and fusermount3.
Since fusermount3 does not link against libfuse, we cannot call
fuse_log() from lib/mount_util.c. This file will continue to use
fprintf(stderr, ...) until someone figures out how to split it up.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>