The io for FUSE requests and responses can now be further customized by allowing to write custom functions for reading/writing the responses. This includes overriding the splice io.
The reason for this addition is that having a custom file descriptor is not sufficient to allow custom io. Different types of file descriptor require different mechanisms of io interaction. For example, some file descriptor communication has boundaries (SOCK_DGRAM, EOF, etc...), while other types of fd:s might be unbounded (SOCK_STREAMS, ...). For unbounded communication, you have to read the header of the FUSE request first, and then read the remaining packet data. Furthermore, the one read call does not necessarily return all the data expected, requiring further
calls in a loop.
This test is too simple to check for all functionalities of notify_expire as it always successfully passes when libfuse supports the function (even if kernel does not support it - it just takes it as notify_inval)
fuse_loop_mt and fuse_new had not been defined when
HAVE_LIBC_VERSIONED_SYMBOLS had not been set and additionally,
fuse_new_31 was missing in the version script and was therefore
an unusable symbol.
This also adds a test for unset HAVE_LIBC_VERSIONED_SYMBOLS.
In fact only gnu-libc fully supports symbol versioning, so it is
better to have a generic macro for it. This also allows to manually
disable symbol version and allows to run tests with that
configuration on gnu-libc. That testing will still not catch compat
issues, but least ensures the code can compile.
Testing for __APPLE__ and __ULIBC__ is now done by meson. More of such
checks can be added by people using other libcs.
For __APPLE__ and __ULIBC__, which are assumed to not support
versioned symbols, helper.c has a compat ABI symbol for
fuse_parse_cmdline(). However that ABI symbol was conflicting
with the API macro (which redirects to the right API function
for recompilations against current libfuse).
Additionally the parameter 'opts' had a typo and was called
'out_opts'.
As described in https://github.com/libfuse/libfuse/issues/695 and below, partial
locking of paths can cause a deadlock. Partial locking was added to prevent
starvation, but it's unclear what specific cases of starvation were of concern.
As far as I was able to determine, since we support reader locks that give
priority to writers (to prevent starvation), this means that to starve the queue
element, we'd need a constant stream of queued requests that lock the path for
write. Write locks are used when the element is being (potentially) removed, so
this stream of requests that starve the `rename` or `lock` operations seems
unlikely.
### Summarizing issue #695
The high-level API handles locking of the node structures it maintains to
prevent concurrent requests from deleting nodes that are in use by other
requests. This means that requests that might remove these structs (`rmdir`,
`rename`, `unlink`, `link`) need to acquire an (internally managed - not
pthread) exclusive lock before doing so. In the case where the lock is already
held (for read or for write), the operation is placed onto a queue of waiters.
On every unlock, the queue is reinspected for any element that might now be able
to make progress.
Since `rename` and `link` involve two paths, when added to the queue, a single
queue entry requires that we lock two different paths. There was, prior to this
change, support for partially locking the first queue element if it had two
paths to lock. This partial locking can cause a deadlock:
- set up a situation where the first element in the queue is partially locked
(such as by holding a reader lock on one of the paths being renamed, but not
the other). For example: `/rmthis/foo/foo.txt` [not-yet-locked] and
`/rmthis/bar/bar.txt` [locked]
- add an `rmdir` for an ancestor directory of the not-yet-locked path to the
queue. In this example: `/rmthis`
After getting into this situation, we have the following `treelock` values:
- `/rmthis`: 1 current reader (due to the locked `/rmthis/bar/bar.txt`), one
waiting writer (`rmdir`): no new readers will acquire a read lock here.
- `/rmthis/bar`: 1 reader (the locked `/rmthis/bar/bar.txt`)
- `/rmthis/bar/bar.txt`: 1 writer (the locked `/rmthis/bar/bar.txt`)
This is deadlocked, because the partial lock will never be able to be completely
locked, as doing so would require adding a reader lock on `/rmthis`, and that
will be rejected due to write lock requests having priority -- until the writer
succeeds in locking it, no new readers can be added. However, the writer (the
`rmdir`) will never be able to acquire its write lock, as the reader lock will
never be dropped -- there's no support for downgrading a partially locked
element to be unlocked, the only state change that's allowed involves it
becoming completely locked.
autofs uses automount, which calls fuse, during an sshfs call. fuse complains about -n being an unknown option (ref. https://github.com/libfuse/libfuse/issues/715)
this one line edit provides the command to be accepted, and pass through, allowing autofs-automount to operate on the mount, even though it is already in the mtab, given the nature of autofs/automount.
It may happen that none of the worker threads are running
if max_idle_threads is set to 0 although few people will do this.
Adding a limit of keeping at least one worker thread will make
our code more rigorous.
Signed-off-by: Zhansong Gao <zhsgao@hotmail.com>
There was a simple typo and sym1 didn't match the function name
with the older __asm__(".symver " sym1 "," sym2) way to define
ABI compatibility.
Witht the newer "__attribute__ ((symver (sym2)))" sym1 is not used
at all and in manual testing the issue didn't come up therefore.
If we get the interrupt before the fuse op, the fuse_req is deleted without
decrementing the refcount on the cloned file descriptor. This leads to a
leak of the cloned /dev/fuse file descriptor.
On benchmarking metadata operations with a single threaded bonnie++
and "max_idle_threads" limited to 1, 'top' was showing suspicious
160% cpu usage.
Profiling the system with flame graphs showed that an astonishing
amount of CPU time was spent in thread creation and destruction.
After verifying the code it turned out that fuse_do_work() was
creating a new thread every time all existing idle threads
were already busy. And then just a few lines later after processing
the current request it noticed that it had created too many threads
and destructed the current thread. I.e. there was a thread
creation/destruction ping-pong.
Code is changed to only create new threads if the max number of
threads is not reached.
Furthermore, thread destruction is disabled, as creation/destruction
is expensive in general.
With this change cpu usage of passthrough_hp went from ~160% to
~80% (with different values of max_idle_threads). And bonnie
values got approximately faster by 90%. This is a with single
threaded bonnie++
bonnie++ -x 4 -q -s0 -d <path> -n 30:1:1:10 -r 0
Without this patch, using the default max_idle_threads=10 and just
a single bonnie++ the thread creation/destruction code path is not
triggered. Just one libfuse and one application thread is just
a corner case - the requirement for the issue was just
n-application-threads >= max_idle_threads.
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
struct fuse_loop_config was passed as a plain struct, without any
version identifer. This had two implications
1) Any addition of new parameters required a FUSE_SYMVER for
fuse_session_loop_mt() and fuse_loop_mt() as otherwise a read
beyond end-of previous struct size might have happened.
2) Filesystems also might have been recompiled and the developer
might not have noticed the struct extensions and unexpected for
the developer (or people recomliling the code) uninitialized
parameters would have been passed.
Code is updated to have struct fuse_loop_config as an opaque/private
data type for file systems that want version 312
(FUSE_MAKE_VERSION(3, 12)). The deprecated fuse_loop_config_v1
is visible, but should not be used outside of internal
conversion functions
File systems that want version >= 32 < 312 get the previous
struct (through ifdefs) and the #define of fuse_loop_mt
and fuse_session_loop_mt ensures that these recompiled file
systems call into the previous API, which then converts
the struct. This is similar to existing compiled applications
when just libfuse updated, but binaries it is solved with
the FUSE_SYMVER ABI compact declarations.
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
meson was complaining:
Build targets in project: 27
NOTICE: Future-deprecated features used:
* 0.56.0: {'Dependency.get_pkgconfig_variable'}
So change to .get_variable(pkgconfig : 'type' and also increase
the meson minimal version to be able to handle it.
Co-authored-by: Bernd Schubert <bschubert@ddn.com>
meson configure -D buildtype=debugoptimized
meson configure -D b_sanitize=address,undefined
Results in '-fsanitize=address,undefined ... -O2 -g' that made
compilation to give errors with recent gcc versions.
bernd@t1700bs build-ubuntu>ninja -v
[1/2] ccache gcc -Itest/test_syscalls.p -Itest -I../test -Iinclude -I../include -Ilib -I../lib -I. -I.. -fdiagnostics-color=always -fsanitize=address,undefined -fno-omit-frame-pointer -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Werror -O2 -g -D_REENTRANT -DHAVE_CONFIG_H -Wno-sign-compare -Wstrict-prototypes -Wmissing-declarations -Wwrite-strings -fno-strict-aliasing -Wno-unused-result -DHAVE_SYMVER_ATTRIBUTE -MD -MQ test/test_syscalls.p/test_syscalls.c.o -MF test/test_syscalls.p/test_syscalls.c.o.d -o test/test_syscalls.p/test_syscalls.c.o -c ../test/test_syscalls.c
FAILED: test/test_syscalls.p/test_syscalls.c.o
ccache gcc -Itest/test_syscalls.p -Itest -I../test -Iinclude -I../include -Ilib -I../lib -I. -I.. -fdiagnostics-color=always -fsanitize=address,undefined -fno-omit-frame-pointer -pipe -D_FILE_OFFSET_BITS=64 -Wall -Winvalid-pch -Wextra -Werror -O2 -g -D_REENTRANT -DHAVE_CONFIG_H -Wno-sign-compare -Wstrict-prototypes -Wmissing-declarations -Wwrite-strings -fno-strict-aliasing -Wno-unused-result -DHAVE_SYMVER_ATTRIBUTE -MD -MQ test/test_syscalls.p/test_syscalls.c.o -MF test/test_syscalls.p/test_syscalls.c.o.d -o test/test_syscalls.p/test_syscalls.c.o -c ../test/test_syscalls.c
In file included from /usr/include/string.h:519,
from ../test/test_syscalls.c:7:
In function ‘strncpy’,
inlined from ‘test_socket’ at ../test/test_syscalls.c:1885:2,
inlined from ‘main’ at ../test/test_syscalls.c:2030:9:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:95:10: error: ‘__builtin_strncpy’ output may be truncated copying 107 bytes from a string of length 1023 [-Werror=stringop-truncation]
95 | return __builtin___strncpy_chk (__dest, __src, __len,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
96 | __glibc_objsize (__dest));
| ~~~~~~~~~~~~~~~~~~~~~~~~~
I disagree a bit on the gcc sanity here, as the code was behaving
correctly and even already checked the string length. But sice
the string length is already verified, that length can be used
for the final strncpy.
In fuse kernel, 'commit 53db28933e95 ("fuse: extend init flags")'
made the changes to handle flags going beyond 32 bits but i think
changes were not done in libfuse to handle the same.
This patch prepares the ground in libfuse for incoming FUSE kernel
patches (Atomic open + lookup) where flags went beyond 32 bits.
It makes struct same as in fuse kernel resulting in name change of
few fields.
sfs_unlink may call do_lookup(), which increases the inode ref count,
but since that function does not return attributes that lookup ref
count won't get automatically decreased.
do_interrupt would destroy_req on the request without decrementing the
channel's refcount. With clone_fd this could leak file descriptors if
the worker thread holding the cloned fd was destroyed. (Only
max_idle_threads are kept).
When returning a negative error code by ->ioctl() to the high level
interface, no error is propagated to the low level, and the reply
message to the kernel is shown as successful.
A negative result is however returned to kernel, so the kernel can
detect the bad condition, but this appears to not be the case since
kernel 5.15.
The proposed fix is more in line with the usual processing of errors
in fuse, taking into account that ioctl(2) always returns a non-negative
value in the absence of errors.
Co-authored-by: Jean-Pierre André <jpandre@users.sourceforge.net>
Simulate write() delay and verify that close(rofd) does not
block waiting on pending writes.
The support for the flag was added in kernel v5.16-rc1.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Allow requesting from kernel to avoid flush on close at file open
time. If kernel does not support FOPEN_NOFLUSH flag, the request
will be ignored.
For passthrough_hp example, request to avoid flush on close when
writeback cache is disabled and file is opened O_RDONLY.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
When directories with open handles are removed, the releasedir and
fsyncdir operations might be called with a NULL path. That is because
there is no hiding behavior like for regular files and the nodes get
removed immediately.