fs: properly document __lookup_mnt()

The comment on top of __lookup_mnt() states that it finds the first
mount, implying that there could be multiple mounts mounted at the same
dentry with the same parent.

On older kernels "shadow mounts" could be created during mount
propagation. So if a mount @m in the destination propagation tree
already had a child mount @p mounted at @mp then any mount @n we
propagated to @m at the same @mp would be appended after the preexisting
mount @p in @mount_hashtable. This was a completely direct way of
creating shadow mounts.

That direct way is gone but there are still subtle ways to create shadow
mounts. For example, consider attaching a source mount @mnt to a shared
mount. The root of @mnt might be overmounted by a mount @o after we
finished path lookup but before we acquired the namespace semaphore to
copy the source mount tree @mnt.

After we acquired the namespace lock, @mnt is copied including @o
covering it. After we attach @mnt to a shared mount @dest_mnt we end up
propagating it to all its peers and slaves @d. If @d already has a mount
@n mounted on top of it we tuck @mnt beneath @n. That is, we mount
@mnt at @d and mount @n on @mnt. Now we have both @o and @n mounted on
the same mountpoint at @mnt.
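
The end state described above can be modelled with a small user-space
sketch. This is not kernel code: struct toy_mount, struct toy_dentry and
the toy_* helpers are hypothetical stand-ins invented for illustration,
a single linked chain stands in for one @mount_hashtable bucket, and the
relative order of @n and @o on that chain is an arbitrary choice of the
sketch, not a claim about the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a mountpoint dentry; identity is by address. */
struct toy_dentry {
	const char *name;
};

/* Hypothetical stand-in for struct mount with only the fields needed here. */
struct toy_mount {
	const char *name;
	const struct toy_mount *parent;       /* mount this one is attached to */
	const struct toy_dentry *mountpoint;  /* dentry in parent it sits on */
	struct toy_mount *next;               /* next entry on the same chain */
};

/* Return the first chain entry mounted at (parent, mountpoint), or NULL. */
static struct toy_mount *toy_lookup_first(struct toy_mount *chain,
					  const struct toy_mount *parent,
					  const struct toy_dentry *mountpoint)
{
	for (struct toy_mount *m = chain; m; m = m->next)
		if (m->parent == parent && m->mountpoint == mountpoint)
			return m;
	return NULL;
}

/* Count all chain entries at (parent, mountpoint); > 1 means shadow mounts. */
static int toy_count_children(struct toy_mount *chain,
			      const struct toy_mount *parent,
			      const struct toy_dentry *mountpoint)
{
	int count = 0;

	for (struct toy_mount *m = chain; m; m = m->next)
		if (m->parent == parent && m->mountpoint == mountpoint)
			count++;
	return count;
}

/*
 * Rebuild the end state from the text: @mnt is mounted on @d, @o was
 * copied along with @mnt and covers its root, and @n was moved on top
 * of @mnt when @mnt was tucked beneath it.
 * Returns the number of children of @mnt at the root of @mnt.
 */
static int toy_shadow_demo(void)
{
	struct toy_dentry d_mp = { "mountpoint in d" };
	struct toy_dentry mnt_root = { "root of mnt" };

	struct toy_mount d = { "d", NULL, NULL, NULL };
	struct toy_mount o = { "o", NULL, &mnt_root, NULL };
	struct toy_mount n = { "n", NULL, &mnt_root, &o };
	struct toy_mount mnt = { "mnt", &d, &d_mp, &n };

	/* Both @n and @o have @mnt as their parent. */
	n.parent = &mnt;
	o.parent = &mnt;

	/* A "find first" lookup succeeds but sees only one of the two. */
	assert(toy_lookup_first(&mnt, &mnt, &mnt_root) != NULL);

	return toy_count_children(&mnt, &mnt, &mnt_root);
}
```

In this model a first-match lookup still returns a single child, which
is why the documentation has to say "first" rather than "the" child
mount.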

Explain this in the documentation as this is pretty subtle.

Reviewed-by: Seth Forshee (DigitalOcean) <sforshee@kernel.org>
Message-Id: <20230202-fs-move-mount-replace-v4-2-98f3d80d7eaa@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
Christian Brauner 2023-05-03 13:18:40 +02:00
parent 78aa08a8ca
commit 104026c2e4


@@ -658,9 +658,25 @@ static bool legitimize_mnt(struct vfsmount *bastard, unsigned seq)
 	return false;
 }
 
-/*
- * find the first mount at @dentry on vfsmount @mnt.
- * call under rcu_read_lock()
+/**
+ * __lookup_mnt - find first child mount
+ * @mnt: parent mount
+ * @dentry: mountpoint
+ *
+ * If @mnt has a child mount @c mounted @dentry find and return it.
+ *
+ * Note that the child mount @c need not be unique. There are cases
+ * where shadow mounts are created. For example, during mount
+ * propagation when a source mount @mnt whose root got overmounted by a
+ * mount @o after path lookup but before @namespace_sem could be
+ * acquired gets copied and propagated. So @mnt gets copied including
+ * @o. When @mnt is propagated to a destination mount @d that already
+ * has another mount @n mounted at the same mountpoint then the source
+ * mount @mnt will be tucked beneath @n, i.e., @n will be mounted on
+ * @mnt and @mnt mounted on @d. Now both @n and @o are mounted at @mnt
+ * on @dentry.
+ *
+ * Return: The first child of @mnt mounted @dentry or NULL.
  */
 struct mount *__lookup_mnt(struct vfsmount *mnt, struct dentry *dentry)
 {