closefrom(3) has joined us in glibc-land from *BSD and Solaris. Since
it's available in glibc 2.34+, we want to detect it and only define our
fallback if the libc doesn't provide it.
Bug: https://bugs.gentoo.org/803923
Signed-off-by: Sam James <sam@gentoo.org>
The bug occurs when a filesystem client reads a directory until the end,
seeks using seekdir() to some valid non-zero position and calls
readdir(). A valid 'struct dirent *' is expected, but NULL is returned
instead. Pseudocode demonstrating the bug:
DIR *dp = opendir("some_dir");
struct dirent *de = readdir(dp);
/* Get offset of the second entry */
long offset = telldir(dp);
/* Read directory until the end */
while (de)
de = readdir(de);
seekdir(dp, offset);
de = readdir(dp);
/* de must contain the second entry, but NULL is returned instead */
The reason of the bug is that when the end of directory is reached, the
kernel calls FUSE_READDIR op with an offset at the end of directory, so
the filesystem's .readdir callback never calls the filler function, and
we end up with dh->filled set to 1. After seekdir(), FUSE_READDIR is
called again with a new offset, but this time the filesystem's .readdir
callback is never called, and an empty reply is returned.
Fix by setting dh->filled to 1 only when zero offsets are given to
filler function.
This commit is backported from the following commit in 'master' branch:
commit 5f125c5e6b
Author: Rostislav <rostislav@users.noreply.github.com>
Date: Sat Jul 21 12:57:09 2018 +0300
Fix readdir() bug when a non-zero offset is specified in filler (#269)
Before:
$ _FUSE_COMMFD=1 priv_strace -s8000 -e trace=mount util/fusermount3 /proc/self/fd
mount("/dev/fuse", ".", "fuse", MS_NOSUID|MS_NODEV, "fd=3,rootmode=40000,user_id=379777,group_id=5001") = 0
sending file descriptor: Socket operation on non-socket
+++ exited with 1 +++
After:
$ _FUSE_COMMFD=1 priv_strace -s8000 -e trace=mount util/fusermount3 /proc/self/fd
util/fusermount3: mounting over filesystem type 0x009fa0 is forbidden
+++ exited with 1 +++
This patch could potentially have security
impact on some systems that are configured with allow_other;
see https://launchpad.net/bugs/1530566 for an example of how a similar
issue in the ecryptfs mount helper was exploitable. However, the FUSE
mount helper performs slightly different security checks, so that exact
attack doesn't work with fusermount; I don't know of any specific attack
you could perform using this, apart from faking the SELinux context of your
process when someone's looking at a process listing. Potential targets for
overwrite are (looking on a system with a 4.9 kernel):
writable only for the current process:
/proc/self/{fd,map_files}
(Yes, "ls -l" claims that you don't have write access, but that's not true;
"find -writable" will show you what access you really have.)
writable also for other owned processes:
/proc/$pid/{sched,autogroup,comm,mem,clear_refs,attr/*,oom_adj,
oom_score_adj,loginuid,coredump_filter,uid_map,gid_map,projid_map,
setgroups,timerslack_ns}
Blacklists are notoriously fragile; especially if the kernel wishes to add
some security-critical mount option at a later date, all existing systems
with older versions of fusermount installed will suddenly have a security
problem.
Additionally, if the kernel's option parsing became a tiny bit laxer, the
blacklist could probably be bypassed.
Whitelist known-harmless flags instead, even if it's slightly more
inconvenient.
If an attacker wishes to use the default configuration instead of the
system's actual configuration, they can attempt to trigger a failure in
read_conf(). This only permits increasing mount_max if it is lower than the
default, so it's not particularly interesting. Still, this should probably
be prevented robustly; bail out if funny stuff happens when we're trying to
read the config.
Note that the classic attack trick of opening so many files that the
system-wide limit is reached won't work here - because fusermount only
drops the fsuid, not the euid, the process is running with euid=0 and
CAP_SYS_ADMIN, so it bypasses the number-of-globally-open-files check in
get_empty_filp() (unless you're inside a user namespace).
The old code permits the following behavior:
$ _FUSE_COMMFD=10000 priv_strace -etrace=mount -s200 fusermount -o 'foobar=\,allow_other' mount
mount("/dev/fuse", ".", "fuse", MS_NOSUID|MS_NODEV, "foobar=\\,allow_other,fd=3,rootmode=40000,user_id=1000,group_id=1000") = -1 EINVAL (Invalid argument)
However, backslashes do not have any special meaning for the kernel here.
As it happens, you can't abuse this because there is no FUSE mount option
that takes a string value that can contain backslashes; but this is very
brittle. Don't interpret "escape characters" in places where they don't
work.
Linux performs the dir loop check (rename(a, a/b/c)
or rename(a/b/c, a), etc.) in kernel. Unfortunately
other systems do not perform this check (e.g. FreeBSD).
This results in a deadlock in get_path2, because libfuse
did not expect to handle such cases.
We add a check_dir_loop function that performs the dir
loop check in user mode and enable it on systems that
need it.
Mounting a FUSE file system remotely using SSH in combination with
pseudo-terminal allocation (-t), results in "Transport endpoint is
not connected" errors when trying to access the file system contents.
For example:
# ssh -t root@localhost "cmsfs-fuse /dev/disk/by-path/ccw-0.0.0190 /CMSFS"
Connection to localhost closed.
# ls /CMSFS
ls: cannot access '/CMSFS': Transport endpoint is not connected
The cmsfs-fuse main program (which can also be any other FUSE file
system) calls into the fuse_main() libfuse library function.
The fuse_main() function later calls fuse_daemonize() to fork the
daemon process to handle the FUSE file system I/O.
The fuse_daemonize() function calls fork() as usual. The child
proceeds with setsid() and then redirecting its file descriptors
to /dev/null etc. The parent process, simply exits.
The child's functions and the parent's exit creates a subtle race.
This is seen with an SSH connection. The SSH command above calls
cmsfs-fuse on an allocated pseudo-terminal device (-t option).
If the parent exits, SSH receives the command completion and closes
the connection, that means, it closes the master side of the
pseudo-terminal. This causes a HUP signal being sent to the process
group on the pseudo-terminal. At this point in time, the child might
not have completed the setsid() call and, hence, becomes terminated.
Note that fuse daemon sets up its signal handlers after fuse_daemonize()
has completed.
Even if the child has the chance to disassociate from its parent process
group to become it's own process group with setsid(), the child still
has the pseudo-terminal opened as stdin, stdout, and stderr. So the
pseudo-terminal still behave as controlling terminal and might cause a
SIGHUP at closing the the master side.
To solve the problem, the parent has to wait until the child (the fuse
daemon process) has completed its processing, that means, has become
its own process group with setsid() and closed any file descriptors
pointing to the pseudo-terminal.
Closes: #27
Reported-by: Ofer Baruch <oferba@il.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Mount can be used with an "-o context=" option in order to specify a
mountpoint-wide SELinux security context different from the default context
provided by the active SELinux policy.
This is useful in order to enable users to mount multiple sshfs targets under
distinct contexts, which is my main motivation for getting this patch mainlined.
Closes: #36
Up to now, the Changelog has essentially been a (manually maintained)
copy of the git commit history. This doesn't seem to have any point
other than following the GNU coding standards. I believe it's much
better to use the Changelog to summarize the release-to-release
changes that are most important for users, so this is what we'll do
from now on.
Describe why manual copying of config.rpath is necessary, and fail
with a more helpful message if it can't be found.
Remove code for systems without autoreconf - it's apparently not used
by anyone since it has been broken for quite some time (there is no
`kernel` directory anymore).