Follow-up for ade0789fab
The change in behavior was partly intentional, as I think
if both --wait and --pty are used, manually disconnecting
from PTY forwarder should not result in systemd-run exiting
with "Finished with ..." log. But we should check for
--wait here.
Closes#32953
Fixes https://github.com/systemd/systemd/issues/28514.
Quoting https://github.com/systemd/systemd/issues/28514#issuecomment-1831781486:
> Whenever PAM is enabled for a service, we set up the PAM session and then
> fork off a process whose only job is to eventually close the PAM session when
> the service dies. That services we run with service privileges, both to
> minimize attack surface and because we want to use PR_SET_DEATHSIG to be get
> a notification via signal whenever the main process dies. But that only works
> if we have the same credentials as that main process.
>
> Now, if pam_systemd runs inside the PAM stack (which it normally does) it's
> session close hook will ask logind to synchronously end the session via a bus
> call. Currently that call is not accessible to unprivileged clients. And
> that's the part we need to relax: allow users to end their own sessions.
The check is implemented in a way that allows the kill if the sender is in
the target session.
I found 'sudo systemctl --user -M "zbyszek@" is-system-running' to
be a convenient reproducer.
Before:
May 16 16:25:26 x1c systemd[1]: run-u24754.service: Deactivated successfully.
May 16 16:25:26 x1c dbus-broker[1489]: A security policy denied :1.24757 to send method call /org/freedesktop/login1:org.freedesktop.login1.Manager.ReleaseSession to org.freedesktop.login1.
May 16 16:25:26 x1c (sd-pam)[3036470]: pam_systemd(login:session): Failed to release session: Access denied
May 16 16:25:26 x1c systemd[1]: Stopping session-114.scope...
May 16 16:25:26 x1c systemd[1]: session-114.scope: Deactivated successfully.
May 16 16:25:26 x1c systemd[1]: Stopped session-114.scope.
May 16 16:25:26 x1c systemd[1]: session-c151.scope: Deactivated successfully.
May 16 16:25:26 x1c systemd-logind[1513]: Session c151 logged out. Waiting for processes to exit.
May 16 16:25:26 x1c systemd-logind[1513]: Removed session c151.
After:
May 16 17:02:15 x1c systemd[1]: run-u24770.service: Deactivated successfully.
May 16 17:02:15 x1c systemd[1]: Stopping session-115.scope...
May 16 17:02:15 x1c systemd[1]: session-c153.scope: Deactivated successfully.
May 16 17:02:15 x1c systemd[1]: session-115.scope: Deactivated successfully.
May 16 17:02:15 x1c systemd[1]: Stopped session-115.scope.
May 16 17:02:15 x1c systemd-logind[1513]: Session c153 logged out. Waiting for processes to exit.
May 16 17:02:15 x1c systemd-logind[1513]: Removed session c153.
Edit: this seems to also fix https://github.com/systemd/systemd/issues/8598.
It seems that with the call to ReleaseSession, we wait for the pam session
close hooks to finish. I inserted a 'sleep(10)' after the call to ReleaseSession
in pam_systemd, and things block on that, nothing is killed prematurely.
We have the following timestamp status:
$ systemctl show systemd-fsck-root.service | grep InactiveExitTimestamp
InactiveExitTimestamp=Thu 2023-11-02 12:27:24 CET
InactiveExitTimestampMonotonic=15143158
$ systemctl show | grep UserspaceTimestamp
UserspaceTimestamp=Thu 2023-11-02 12:27:25 CET
UserspaceTimestampMonotonic=15804273
i.e. UserspaceTimestamp is before InactiveExit of systemd-fsck-root.service.
This is fine, but on display, we'd subtract those values and print a huge
negative value bogusly:
$ build/systemd-analyze critical-chain systemd-remount-fs.service
The time when unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.
systemd-remount-fs.service +137ms
└─systemd-fsck-root.service @584542y 2w 2d 20h 1min 48.890s +45ms
└─systemd-journald.socket
└─system.slice
└─-.slice
In fact, list_dependencies_print() already had a branch where the check that
'times->activating > boot->userspace_time', but it didn't cover all cases. So
make it cover both branches, and also change to '>=', since it's fine if
something happened with the same timestamp.
With the patch:
$ build/systemd-analyze critical-chain systemd-remount-fs.service
The time when unit became active or started is printed after the "@" character.
The time the unit took to start is printed after the "+" character.
systemd-remount-fs.service +42ms
└─systemd-fsck-root.service
└─systemd-journald.socket
└─system.slice
└─-.slice
Fixes https://github.com/systemd/systemd/issues/17191.
When running inside an LXC container the 'su' process will not be part of
any unit or slice.
manager_get_user_by_pid() which was used until v255 (included) does not fail
if it cannot find a unit/slice, but simply returns 'not found'. Do the same
in manager_get_session_by_pidref().
This was not detected as Semaphore CI does not reboot the testbed before
the logind test, so the session is started by the old logind from the base
distro, instead of the one being tested.
Follow-up for 8494f562c8
Follow-up for 5099a50d43
Fixes https://github.com/systemd/systemd/issues/32929
Due to the bug in kernel 6.9 caused by
8debcf5832,
the net_id udev builtin does not work for netdevsim interface.
So, eni99np1 cannot be used with kernel 6.9 anymore.
Workaround for #32910.
If machine ID is previously stored at /run/machine-id, then let's reuse
it. This is important on switching root and /etc/machine-id was previously
a mount point.
Fixes#32908.
Fixes a bug introduced by 1ddb263d21.
Note, this requires the previous two commits, and cannot backport without them.
Note, before the previous commit, the use-after-free could be triggered
only by Rename() DBus method, and could not by RenameImage(), as we did not
cache Image object when RenameImage() method is called. And machinectl
always uses RenameImage(). Hence, the issue could be triggered only when
Rename() DBus method is explicitly called by e.g. busctl.
With the previous commit, the Image object passed to the function is
always cached. Hence, the issue could be triggered even with machinectl
command, and this fix is important.
Previously, Image objects were only cached when reading properties or
methods in the org.freedesktop.machine1.Image interface are called.
This makes that, when a method in the main interface (org.freedesktop.machine1)
for an image is called, also acquire the Image object from the cache,
and if not cached, create Image object and put into the cache, like we
do for org.freedesktop.machine1.Image.
Otherwise, if some properties of an image are updated by methods in the main
interface, e.g. MarkImageReadOnly(), the changes do not applied to the cached
Image object, and subsequent read of proerties through the interface for the
image, e.g. ReadOnly property, may provide outdated values.
Follow-up for 1ddb263d21.
Fixes#32888.
Otherwise, ReadOnly DBus property in org.freedesktop.machine1.Image or
org.freedesktop.portable1.Image will not be updated by MarkReadOnly DBus
method.
The rationale is similar to 40e1f4ea74.
Currently, we only pass TTYPath=/dev/pts/... to
the transient service spawned by systemd-run.
This is a bit problematic though, when ExecStartPre=
or ExecStopPost= is used. Since when these control
processes get to run, the main process is not yet
started/has already exited, hence the slave suffers
from the same vhangup problem as the mentioned commit.
By passing the slave fd in, the service manager will
hold the fd open as long as the service is alive.
Fixes#32916
Follow-up for 6c2d47d6d3.
Fixes the following unexpected skip:
```
[ 6.163670] TEST-64-UDEV-STORAGE.sh[596]: + modinfo btrfs
[ 6.164102] TEST-64-UDEV-STORAGE.sh[726]: /usr/lib/systemd/tests/testdata/units/TEST-64-UDEV-STORAGE.sh: line 726: modinfo: command not found
[ 6.164683] TEST-64-UDEV-STORAGE.sh[727]: + echo 'This test requires the btrfs kernel module but it is not installed, skipping the test'
[ 6.165069] TEST-64-UDEV-STORAGE.sh[728]: + tee --append /skipped
[ 6.166801] TEST-64-UDEV-STORAGE.sh[728]: This test requires the btrfs kernel module but it is not installed, skipping the test
[ 6.167177] TEST-64-UDEV-STORAGE.sh[596]: + exit 77
```