A previous PR added support for extended max writes (e.g. write requests larger
than 1 MiB) by initializing the fuse session buffer size from the max_pages_limit
set in /proc/sys/fs/fuse. However, this is a problem on machines where multiple
fuse servers may be running but only one of them needs large writes. In that case
every server allocates the large buffer, which wastes a lot of memory and can
lead to OOM issues.
This PR transparently reallocates the session buffer if the server sets
"se->conn.max_write" to a value larger than 1 MiB. This only applies to buffers that
are "owned" by libfuse - if the server wishes to provide its own allocated buffer
for receiving/processing requests, then it should ensure that buffer is allocated
to the proper size from the start.
Local testing showed:
echo 65535 | sudo tee /proc/sys/fs/fuse/max_pages_limit
dd if=/dev/urandom of=hello_file bs=6M count=2
write requests:
write request size is 5242880
write request size is 1048576
write request size is 5242880
write request size is 1048576
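The reallocation described above can be sketched roughly as follows. This is
only an illustration, not the libfuse implementation: struct owned_buf and
owned_buf_ensure() are hypothetical names, and the sketch assumes the buffer is
grown with realloc() and never shrunk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a receive buffer that is owned by the library. */
struct owned_buf {
	void *mem;
	size_t size;
};

/*
 * Grow the buffer to at least `needed` bytes; never shrink it.
 * Returns 0 on success, -1 if the allocation fails (the old buffer
 * stays valid in that case).
 */
static int owned_buf_ensure(struct owned_buf *buf, size_t needed)
{
	void *mem;

	assert(buf != NULL);
	if (buf->size >= needed)
		return 0;

	mem = realloc(buf->mem, needed);
	if (mem == NULL)
		return -1;

	buf->mem = mem;
	buf->size = needed;
	return 0;
}
```

A server that never raises max_write keeps its small buffer; only a server
that asks for more pays for the larger allocation.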
On some runs:
Run pip install --break-system-packages -r requirements.txt
....
no such option: --break-system-packages
Error: Process completed with exit code 2.
On other runs it refuses to install and asks for that option
as it refuses to override system packages.
Also require ubuntu-latest only, as macOS is not supported at all
by libfuse.
Updating the mtab on Android fails due to differences in toybox's mount
command. Setting IGNORE_MTAB avoids that issue.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
max_write can be limited by se->op.init() and by the buffer size,
we use the minimum of these two.
Required se->bufsize is then set according to the determined
max_write. The current thread will still have the old buffer size,
though, as it already had to do the allocation to handle the
FUSE_INIT call (unless splice is used, in which case this variable
and the related buffer are not used at all).
The given bufsize is just a hint for the minimum size; the allocation
could actually be larger (for example to get huge pages).
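The relation between the two limits and the resulting buffer size can be
sketched as below. The function names and SKETCH_HEADER_SIZE are assumptions
made for this illustration (a fixed amount of space reserved for the request
header in front of the payload); they are not taken from the libfuse sources.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed space reserved for the request header in front of the payload. */
#define SKETCH_HEADER_SIZE 4096u

static size_t min_size(size_t a, size_t b)
{
	return a < b ? a : b;
}

/*
 * max_write is bounded by what init() asked for and by the payload
 * that fits into the buffer; use the minimum of the two.
 */
static size_t effective_max_write(size_t init_max_write, size_t buf_limit)
{
	assert(buf_limit > SKETCH_HEADER_SIZE);
	return min_size(init_max_write, buf_limit - SKETCH_HEADER_SIZE);
}

/* Minimum buffer size needed to receive a request with that payload. */
static size_t required_bufsize(size_t max_write)
{
	return max_write + SKETCH_HEADER_SIZE;
}
```

The value returned by required_bufsize() is a lower bound only; an allocator
is free to round it up.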
Currently in libfuse, the buffer size for a fuse session is
capped at 1 MiB on a 4k page system. A patch [1] was recently
merged upstream that allows the max number of pages
per fuse request to be dynamically configurable through the
/proc/sys interface (/proc/sys/fs/fuse/max_pages_limit).
This commit adds support for this on the libfuse side to set
the fuse session buffer to take into account the max pages
limit set in /proc/sys/fs/fuse/max_pages_limit. If this
sysctl does not exist (e.g. on older kernels), it will default to
the old behavior (using FUSE_MAX_MAX_PAGES (256) as the max pages
limit). This allows for things like bigger write buffers per
request.
[1] https://lore.kernel.org/linux-fsdevel/20240923171311.1561917-1-joannelkoong@gmail.com/T/#t
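Reading the limit with a fallback for older kernels could look roughly like
this. It is a minimal sketch: read_max_pages_limit() and
SKETCH_DEFAULT_MAX_PAGES are hypothetical names, and the path is passed in as
a parameter so the fallback can be exercised without the real sysctl.

```c
#include <assert.h>
#include <stdio.h>

/* Default taken from the text: FUSE_MAX_MAX_PAGES (256). */
#define SKETCH_DEFAULT_MAX_PAGES 256u

/*
 * Return the page limit stored in `path`, or the default if the file
 * is missing, unreadable, or does not contain a positive number.
 */
static unsigned int read_max_pages_limit(const char *path)
{
	unsigned int val;
	FILE *f;

	assert(path != NULL);
	f = fopen(path, "r");
	if (f == NULL)
		return SKETCH_DEFAULT_MAX_PAGES;

	if (fscanf(f, "%u", &val) != 1 || val == 0)
		val = SKETCH_DEFAULT_MAX_PAGES;

	fclose(f);
	return val;
}
```

A caller would pass "/proc/sys/fs/fuse/max_pages_limit" and multiply the
result by the page size to get the largest payload per request.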
close_range() is much more efficient.
Also remove the lower limit of 3 and set it to 0, as 0 to 1
might have been closed by the application and might be valid.
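A sketch of closing a range of descriptors with close_range() is shown below.
It is not the fusermount code: close_fds_from() is a hypothetical helper, the
raw syscall is used so the sketch also builds against C libraries without a
close_range() wrapper, and the loop fallback with an assumed limit of 1024 is
an addition for kernels that lack the syscall.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Close every file descriptor >= first. */
static void close_fds_from(int first)
{
	long max_fd;
	int fd;

	assert(first >= 0);
#ifdef SYS_close_range
	/* One syscall closes the whole range; ~0U means "up to the highest fd". */
	if (syscall(SYS_close_range, (unsigned int)first, ~0U, 0U) == 0)
		return;
#endif
	/* Fallback: kernel without close_range() or headers without its number. */
	max_fd = sysconf(_SC_OPEN_MAX);
	if (max_fd < 0)
		max_fd = 1024; /* assumed limit if sysconf() cannot tell */
	for (fd = first; fd < max_fd; fd++)
		close(fd);
}
```

Calling close_fds_from(0) matches the lower limit of 0 described above;
descriptors that must survive have to be handled separately by the caller.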
When using the auto_unmount option, the fusermount3
watchdog process retains any inherited file descriptors.
This PR ensures they are closed.
The reason is that FDs that are kept open for a long time might
cause issues for applications using libfuse, for example
if these expect a pipe to be closed, but the pipe is kept open
through the inherited file descriptor.
See for example: https://github.com/cvmfs/cvmfs/issues/3645
Signed-off-by: MJ Harvey <mharvey@jumptrading.com>
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
sfs_open and sfs_create set fi->direct_io (FOPEN_DIRECT_IO) when
O_DIRECT is given, in order to benefit from a shared inode lock
in the kernel, i.e. to get parallel DIO writes. However, the kernel
disables passthrough when FOPEN_DIRECT_IO is set. Reads/writes
were failing entirely in this case for O_DIRECT, as
sfs_write_buf() and sfs_read() have a sanity check. That sanity
check could be modified, but passthrough performs better than
parallel DIO. Hence, we only want automatic
FOPEN_DIRECT_IO for O_DIRECT when passthrough is not enabled.
Fixes: https://github.com/libfuse/libfuse/issues/1027
This also fixes automatically switching to FOPEN_DIRECT_IO
for O_DIRECT in sfs_create().
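The resulting rule is small enough to write down as a predicate. This is a
sketch of the decision only, not the passthrough_hp code: want_direct_io() is
a hypothetical helper, and how the example tracks whether passthrough is
enabled is assumed to be available as a boolean.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for O_DIRECT */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>

/*
 * Decide whether FOPEN_DIRECT_IO should be set automatically:
 * only for O_DIRECT opens, and only when passthrough is not enabled,
 * because the two do not work together.
 */
static bool want_direct_io(int open_flags, bool passthrough_enabled)
{
	if (passthrough_enabled)
		return false;
	return (open_flags & O_DIRECT) != 0;
}
```

Both the open and the create handler would apply the same predicate to the
flags they receive, so the two paths cannot drift apart.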
umount2 is called with privs dropped, not raised. This
works around a clash with NFS permissions: if FUSE is mounted
on an NFS client directory with root_squash in effect, and
some directory in the path leading to the mount point denies
permissions to others, umount2 will fail because userid 0
cannot search it. Since drop_privs merely sets the filesystem
user- and group-ID without changing the CAP_SYS_ADMIN
capability needed to unmount a file system (which fusermount
has because it is set-user-ID root), umount2 works fine.
This is an addition to commit e75d2c54a3. This example sets
FUSE_USE_VERSION = 34 but uses fuse_loop_cfg_* APIs, which is
not allowed since these APIs are not available in version 34.
The option for the path of the binary had been accidentally removed
from the scripts that parse backtraces.
The dump_stack() function still had leftover debug messages.
This is an addition to commit a8f1ae35af, which
introduced the 312 API, but didn't set the right
API version in all examples.
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
Commit 170edc6a8e added dot and dotdot (. and ..) to readdir
results, but introduced an issue when max number of entries
was reached - lookup count must not be decreased without
doing the lookup.
With ext4 as the underlying file system, readdir seems to return . and ..
at random offsets, which randomly failed xfstests for me.
This also fixes indentation, as passthrough_hp.cc does not follow
the linux indentation style (if we decide to fix this, it needs
to be done for the entire file).
Signed-off-by: Bernd Schubert <bschubert@ddn.com>
I see random ENOTCONN failures in xfstest generic/013 and generic/014
in my branch, but on the 2nd run at the earliest. It takes ~12 hours to hit
the issue, and then no further information is logged.
ENOTCONN points to a daemon crash - I need backtraces and a core dump.
This adds optional handling of fatal signals to print a core dump
and optional syslog logging with these new public functions:
fuse_set_fail_signal_handlers()
This is in addition to the existing fuse_set_signal_handlers(). It is not
enabled together with fuse_set_signal_handlers(), as it is a change
in behavior and file systems might already have their own fatal
handlers.
fuse_log_enable_syslog
Print logs to syslog instead of stderr
fuse_log_close_syslog
Close syslog (for now just does closelog())
Code in fuse_signals.c is also updated to use an array of signals,
and setting signal handlers is now done with a for-loop instead
of one hand-coded set_one_signal_handler() call per signal.
generic/401 fails currently because it checks that "." and ".." are
listed as directory entries.
Include "." and ".." as listed directory entries in passthrough_hp's
readdir implementation.
Signed-off-by: Joanne Koong <joannelkoong@gmail.com>