Since d8f9686c0f we use the chattr +i flag
for marking containers in directories as reead-only. But to do so we
need the cap for it, hence grant it.
Fixes: #19115
I'm working on building initramfs images directly from normal packages, and it
doesn't make sense for those units to be started. Pristine system rpms need to
behave correctly as much as possible also in the initrd, and those units are
enabled by the rpms. There usually isn't enough time for the timer to actually
fire, but starting it gives a line on the console and generally looks confusing
and sloppy. Flushing the journal means that its actually lost, since the real
/var is not available yet.
Another approach would be not enable those units, but right now they are
statically enabled, and changing that would be more work, and doesn't really
seem necessary, since the condition checks are very quick.
Checking for /etc/initrd-release is the standard condition that the initrd
units use, so let's do the same here.
The comment talks about upstream development steps and doesn't make
sense for users. We used special '## ' syntax to strip it out during
build, but it got inadvertently reformatted as a normal comment
in 3982becc92.
We don't need two (and half) templating systems anymore, yay!
I'm keeping the changes minimal, to make the diff manageable. Some enhancements
due to a better templating system might be possible in the future.
For handling of '## ' — see the next commit.
Old meson fails with:
Element not a string: [<Holder: <ExternalProgram 'sh' -> ['/bin/sh']>>, '-c', 'test -n "$DESTDIR" || /bin/journalctl --update-catalog']
I'm doing it as a revert so that it's easy to undo the revert when we require
newer meson. The effect is not so bad, maybe a dozen or so lines about finding
'sh'.
Meson 0.58 has gotten quite bad with emitting a message every time
a quoted command is used:
Program /home/zbyszek/src/systemd-work/tools/meson-make-symlink.sh found: YES (/home/zbyszek/src/systemd-work/tools/meson-make-symlink.sh)
Program sh found: YES (/usr/bin/sh)
Program sh found: YES (/usr/bin/sh)
Program sh found: YES (/usr/bin/sh)
Program sh found: YES (/usr/bin/sh)
Program sh found: YES (/usr/bin/sh)
Program sh found: YES (/usr/bin/sh)
Program xsltproc found: YES (/usr/bin/xsltproc)
Configuring custom-entities.ent using configuration
Message: Skipping bootctl.1 because ENABLE_EFI is false
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Message: Skipping journal-remote.conf.5 because HAVE_MICROHTTPD is false
Message: Skipping journal-upload.conf.5 because HAVE_MICROHTTPD is false
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Message: Skipping loader.conf.5 because ENABLE_EFI is false
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
...
Let's suffer one message only for each command. Hopefully we can silence
even this when https://github.com/mesonbuild/meson/issues/8642 is
resolved.
This reverts commit 7c20dd4b6e.
Debian has now been updated to patch the issue, so SemaphoreCI should
no longer fail. The fix has also been backported to the affected
stable branches.
Otherwise a coredump started at the inconvinient moment can stop
shutdown.target leaving the system in a halfway-down state:
Pulling in shutdown.target/start from systemd-poweroff.service/start
Added job shutdown.target/start to transaction.
...
Keeping job shutdown.target/start because of systemd-poweroff.service/start
...
[ OK ] Stopped target Remote File Systems.
shutdown.target: starting held back, waiting for: systemd-networkd.socket
sysinit.target: stopping held back, waiting for: remount_tmp.service
systemd-coredump.socket: Incoming traffic
...
systemd-coredump@0-243-0.service: Trying to enqueue job systemd-coredump@0-243-0.service/start/replace
Added job systemd-coredump@0-243-0.service/start to transaction.
Pulling in systemd-journald.socket/start from systemd-coredump@0-243-0.service/start
Added job systemd-journald.socket/start to transaction.
Pulling in system.slice/start from systemd-journald.socket/start
Added job system.slice/start to transaction.
Pulling in -.slice/start from system.slice/start
Added job -.slice/start to transaction.
Pulling in system-systemd\x2dcoredump.slice/start from systemd-coredump@0-243-0.service/start
Added job system-systemd\x2dcoredump.slice/start to transaction.
Pulling in system.slice/start from system-systemd\x2dcoredump.slice/start
Pulling in shutdown.target/stop from system-systemd\x2dcoredump.slice/start
Added job shutdown.target/stop to transaction.
...
Keeping job systemd-poweroff.service/stop because of umount.target/stop
Keeping job shutdown.target/stop because of systemd-coredump@0-243-0.service/start
This changes the fstab-generator to handle mounting of /usr/ a bit
differently than before. Instead of immediately mounting the fs to
/sysroot/usr/ we'll first mount it to /sysusr/usr/ and then add a
separate bind mount that mounts it from /sysusr/usr/ to /sysroot/usr/.
This way we can access /usr independently of the root fs, without for
waiting to be mounted via the /sysusr/ hierarchy. This is useful for
invoking systemd-repart while a root fs doesn't exist yet and for
creating it, with partition data read from the /usr/ hierarchy.
This introduces a new generic target initrd-usr-fs.target that may be
used to generically order services against /sysusr/ to become available.
systemd-networkd.socket can re-start systemd-networkd.service in
shutdown and by doing this even stop shutdown.target leaving the
system in halfway-down state.
Fixes#4955.
Single-param LoadCredential= in units causes systemd v247/v248 to
assert when parsing. Disable it for now, until the fix is merged
in the stable trees, released and available (eg: in Debian
for the CI)
See: https://github.com/systemd/systemd/issues/19178
With 8f20232fcb systemd-localed supports
generating locales when required. This fails if the locale directory is
read-only, so make it writable.
Closes#19138
Let's make use of our own credentials infrastructure in our tools: let's
hook up systemd-sysusers with the credentials logic, so that the root
password can be provisioned this way. This is really useful when working
with stateless systems, in particular nspawn's "--volatile=yes" switch,
as this works now:
# systemd-nspawn -i foo.raw --volatile=yes --set-credential=passwd.plaintext-password:foo
For the first time we have a nice, non-interactive way to provision the
root password for a fully stateless system from the container manager.
Yay!
We have a chicken and egg problem: validation of DNSSEC signatures
doesn't work without a correct clock, but to set the correct clock we
need to contact NTP servers which requires resolving a hostname, which
would normally require DNSSEC validation.
Let's break the cycle by excluding NTP hostname resolution from
validation for now.
Of course, this leaves NTP traffic unprotected. To cover that we need
NTPSEC support, which we can add later.
Fixes: #5873#15607
Even though many of those scripts are very simple, it is easier to include
the header than to try to say whether each of those files is trivial enough
not to require one.
We'll leave this as opt-in (i.e. a unit that must be enabled
explicitly), since this is supposed to be a debug/developer feature
primarily, and thus no be around in regular production systems.
This adds the support for veritytab.
The veritytab file contains at most five fields, the first four are
mandatory, the last one is optional:
- The first field contains the name of the resulting verity volume; its
block device is set up /dev/mapper/</filename>.
- The second field contains a path to the underlying block data device,
or a specification of a block device via UUID= followed by the UUID.
- The third field contains a path to the underlying block hash device,
or a specification of a block device via UUID= followed by the UUID.
- The fourth field is the roothash in hexadecimal.
- The fifth field, if present, is a comma-delimited list of options.
The following options are recognized only: ignore-corruption,
restart-on-corruption, panic-on-corruption, ignore-zero-blocks,
check-at-most-once and root-hash-signature. The others options will
be implemented later.
Also, this adds support for the new kernel verity command line boolean
option "veritytab" which enables the read for veritytab, and the new
environment variable SYSTEMD_VERITYTAB which sets the path to the file
veritytab to read.
Instead of invoking meson-add-wants.sh once for each wants that has
to be added, we pass all wants to a single invocation of
meson-add-wants.sh and in meson-add-wants.sh, loop over the
arguments.
This saves about 300ms on the install step.
Before:
```
‣ Running build script...
[1/418] Generating version.h with a custom command
Installing /root/build/po/be.gmo to /root/dest/usr/share/locale/be/LC_MESSAGES/systemd.mo
Installing /root/build/po/be@latin.gmo to /root/dest/usr/share/locale/be@latin/LC_MESSAGES/systemd.mo
Installing /root/build/po/bg.gmo to /root/dest/usr/share/locale/bg/LC_MESSAGES/systemd.mo
Installing /root/build/po/ca.gmo to /root/dest/usr/share/locale/ca/LC_MESSAGES/systemd.mo
Installing /root/build/po/cs.gmo to /root/dest/usr/share/locale/cs/LC_MESSAGES/systemd.mo
Installing /root/build/po/da.gmo to /root/dest/usr/share/locale/da/LC_MESSAGES/systemd.mo
Installing /root/build/po/de.gmo to /root/dest/usr/share/locale/de/LC_MESSAGES/systemd.mo
Installing /root/build/po/el.gmo to /root/dest/usr/share/locale/el/LC_MESSAGES/systemd.mo
Installing /root/build/po/es.gmo to /root/dest/usr/share/locale/es/LC_MESSAGES/systemd.mo
Installing /root/build/po/fr.gmo to /root/dest/usr/share/locale/fr/LC_MESSAGES/systemd.mo
Installing /root/build/po/gl.gmo to /root/dest/usr/share/locale/gl/LC_MESSAGES/systemd.mo
Installing /root/build/po/hr.gmo to /root/dest/usr/share/locale/hr/LC_MESSAGES/systemd.mo
Installing /root/build/po/hu.gmo to /root/dest/usr/share/locale/hu/LC_MESSAGES/systemd.mo
Installing /root/build/po/id.gmo to /root/dest/usr/share/locale/id/LC_MESSAGES/systemd.mo
Installing /root/build/po/it.gmo to /root/dest/usr/share/locale/it/LC_MESSAGES/systemd.mo
Installing /root/build/po/ja.gmo to /root/dest/usr/share/locale/ja/LC_MESSAGES/systemd.mo
Installing /root/build/po/ko.gmo to /root/dest/usr/share/locale/ko/LC_MESSAGES/systemd.mo
Installing /root/build/po/lt.gmo to /root/dest/usr/share/locale/lt/LC_MESSAGES/systemd.mo
Installing /root/build/po/pl.gmo to /root/dest/usr/share/locale/pl/LC_MESSAGES/systemd.mo
Installing /root/build/po/pt_BR.gmo to /root/dest/usr/share/locale/pt_BR/LC_MESSAGES/systemd.mo
Installing /root/build/po/ro.gmo to /root/dest/usr/share/locale/ro/LC_MESSAGES/systemd.mo
Installing /root/build/po/ru.gmo to /root/dest/usr/share/locale/ru/LC_MESSAGES/systemd.mo
Installing /root/build/po/sk.gmo to /root/dest/usr/share/locale/sk/LC_MESSAGES/systemd.mo
Installing /root/build/po/sr.gmo to /root/dest/usr/share/locale/sr/LC_MESSAGES/systemd.mo
Installing /root/build/po/sv.gmo to /root/dest/usr/share/locale/sv/LC_MESSAGES/systemd.mo
Installing /root/build/po/tr.gmo to /root/dest/usr/share/locale/tr/LC_MESSAGES/systemd.mo
Installing /root/build/po/uk.gmo to /root/dest/usr/share/locale/uk/LC_MESSAGES/systemd.mo
Installing /root/build/po/zh_CN.gmo to /root/dest/usr/share/locale/zh_CN/LC_MESSAGES/systemd.mo
Installing /root/build/po/zh_TW.gmo to /root/dest/usr/share/locale/zh_TW/LC_MESSAGES/systemd.mo
Installing /root/build/po/pa.gmo to /root/dest/usr/share/locale/pa/LC_MESSAGES/systemd.mo
real 0m1.465s
user 0m1.025s
sys 0m0.426s
```
After:
```
‣ Running build script...
[1/418] Generating version.h with a custom command
Installing /root/build/po/be.gmo to /root/dest/usr/share/locale/be/LC_MESSAGES/systemd.mo
Installing /root/build/po/be@latin.gmo to /root/dest/usr/share/locale/be@latin/LC_MESSAGES/systemd.mo
Installing /root/build/po/bg.gmo to /root/dest/usr/share/locale/bg/LC_MESSAGES/systemd.mo
Installing /root/build/po/ca.gmo to /root/dest/usr/share/locale/ca/LC_MESSAGES/systemd.mo
Installing /root/build/po/cs.gmo to /root/dest/usr/share/locale/cs/LC_MESSAGES/systemd.mo
Installing /root/build/po/da.gmo to /root/dest/usr/share/locale/da/LC_MESSAGES/systemd.mo
Installing /root/build/po/de.gmo to /root/dest/usr/share/locale/de/LC_MESSAGES/systemd.mo
Installing /root/build/po/el.gmo to /root/dest/usr/share/locale/el/LC_MESSAGES/systemd.mo
Installing /root/build/po/es.gmo to /root/dest/usr/share/locale/es/LC_MESSAGES/systemd.mo
Installing /root/build/po/fr.gmo to /root/dest/usr/share/locale/fr/LC_MESSAGES/systemd.mo
Installing /root/build/po/gl.gmo to /root/dest/usr/share/locale/gl/LC_MESSAGES/systemd.mo
Installing /root/build/po/hr.gmo to /root/dest/usr/share/locale/hr/LC_MESSAGES/systemd.mo
Installing /root/build/po/hu.gmo to /root/dest/usr/share/locale/hu/LC_MESSAGES/systemd.mo
Installing /root/build/po/id.gmo to /root/dest/usr/share/locale/id/LC_MESSAGES/systemd.mo
Installing /root/build/po/it.gmo to /root/dest/usr/share/locale/it/LC_MESSAGES/systemd.mo
Installing /root/build/po/ja.gmo to /root/dest/usr/share/locale/ja/LC_MESSAGES/systemd.mo
Installing /root/build/po/ko.gmo to /root/dest/usr/share/locale/ko/LC_MESSAGES/systemd.mo
Installing /root/build/po/lt.gmo to /root/dest/usr/share/locale/lt/LC_MESSAGES/systemd.mo
Installing /root/build/po/pl.gmo to /root/dest/usr/share/locale/pl/LC_MESSAGES/systemd.mo
Installing /root/build/po/pt_BR.gmo to /root/dest/usr/share/locale/pt_BR/LC_MESSAGES/systemd.mo
Installing /root/build/po/ro.gmo to /root/dest/usr/share/locale/ro/LC_MESSAGES/systemd.mo
Installing /root/build/po/ru.gmo to /root/dest/usr/share/locale/ru/LC_MESSAGES/systemd.mo
Installing /root/build/po/sk.gmo to /root/dest/usr/share/locale/sk/LC_MESSAGES/systemd.mo
Installing /root/build/po/sr.gmo to /root/dest/usr/share/locale/sr/LC_MESSAGES/systemd.mo
Installing /root/build/po/sv.gmo to /root/dest/usr/share/locale/sv/LC_MESSAGES/systemd.mo
Installing /root/build/po/tr.gmo to /root/dest/usr/share/locale/tr/LC_MESSAGES/systemd.mo
Installing /root/build/po/uk.gmo to /root/dest/usr/share/locale/uk/LC_MESSAGES/systemd.mo
Installing /root/build/po/zh_CN.gmo to /root/dest/usr/share/locale/zh_CN/LC_MESSAGES/systemd.mo
Installing /root/build/po/zh_TW.gmo to /root/dest/usr/share/locale/zh_TW/LC_MESSAGES/systemd.mo
Installing /root/build/po/pa.gmo to /root/dest/usr/share/locale/pa/LC_MESSAGES/systemd.mo
real 0m1.162s
user 0m0.803s
sys 0m0.338s
```
systemd-timesyncd.service only applies the much weaker monotonic clock
from file logic, i.e should pull in and order itself before
time-set.target. The strong time-sync.target unit is pulled in by
systemd-time-wait-sync.service.
In hostnamed this is exposed as a dbus property, and in the logs in both
places.
This is of interest to network management software and such: if the fallback
hostname is used, it's not as useful as the real configured thing. Right now
various programs try to guess the source of hostname by looking at the string.
E.g. "localhost" is assumed to be not the real hostname, but "fedora" is. Any
such attempts are bound to fail, because we cannot distinguish "fedora" (a
fallback value set by a distro), from "fedora" (received from reverse dns),
from "fedora" read from /etc/hostname.
/run/systemd/fallback-hostname is written with the fallback hostname when
either pid1 or hostnamed sets the kernel hostname to the fallback value. Why
remember the fallback value and not the transient hostname in /run/hostname
instead?
We have three hostname types: "static", "transient", fallback".
– Distinguishing "static" is easy: the hostname that is set matches what
is in /etc/hostname.
– Distingiushing "transient" and "fallback" is not easy. And the
"transient" hostname may be set outside of pid1+hostnamed. In particular,
it may be set by container manager, some non-systemd tool in the initramfs,
or even by a direct call. All those mechanisms count as "transient". Trying
to get those cases to write /run/hostname is futile. It is much easier to
isolate the "fallback" case which is mostly under our control.
And since the file is only used as a flag to mark the hostname as fallback,
it can be hidden inside of our /run/systemd directory.
For https://bugzilla.redhat.com/show_bug.cgi?id=1892235.
MESON_INSTALL_QUIET is set when --quiet is passed to meson install.
Make sure we check the variable in our custom install scripts and
don't output anything if it is set.
Commit 42cc2855ba incorrectly removed the condition on sysfs in both
sys-fs-fuse-connections.mount and sys-kernel-config.mount. However there are
still needed in case modprobe of one of these modules is intentionally skipped
(due to lack of privs for example).
This patch restores the 2 conditions which should be safe for the common case,
since all conditions are only checked after all deps ordered before are
complete.
Follow-up for 42cc2855ba.
udev requests to start the fs mount units when their respective module is
loaded. For that it monitors uevents of type "ADD" for the relevant fs modules.
However the uevent is sent by the kernel too early, ie before the init() of the
module is called hence before directories in /sys/fs/ are created.
This patch workarounds adds "Requires/After=modprobe@<fs-module>.service" to
the mount unit, which means that modprobe(8) will be called once the fs module
is announced to be loaded. This sounds pointless, but given that modprobe only
returns after the initialization of the module is complete, it should
workaround the issue.
As a side effect, the module will be automatically loaded if the mount unit is
started manually.
Fixes#17586.
This reverts commit 9cbf1e58f9.
The presence of /sys/module/%I directory can't be used to assert that the load
of a given module is complete and therefore the call to modprobe(8) can be
skipped. Indeed this directory is created before the init() function of the
module is called.
Users of modprobe@.service needs to be sure that once this service returns the
module is fully operational.
This is useful for development where overwriting files out side
the configured prefix will affect the host as well as stateless
systems such as NixOS that don't let packages install to /etc but handle
configuration on their own.
Alternative to https://github.com/systemd/systemd/pull/17501
tested with:
$ mkdir inst build && cd build
$ meson \
-Dcreate-log-dirs=false \
-Dsysvrcnd-path=$(realpath ../inst)/etc/rc.d \
-Dsysvinit-path=$(realpath ../inst)/etc/init.d \
-Drootprefix=$(realpath ../inst) \
-Dinstall-sysconfdir=false \
--prefix=$(realpath ../inst) ..
$ ninja install
To make things simple and robust when debugging journald, we'll leave
the SO_TIMESTAMP invocations in the C code in place, even if they are
now typically redundant, given that the sockets are already passed into
the process with SO_TIMESTAMP turned on now.
[zjs: Replaces #17149.
I took half of the patch in
https://github.com/systemd/systemd/pull/17149#issuecomment-698399194,
hence I'm keeping Jonathan's authorship.
The original reasoning for 6c5496c492 was that we
enable remote-cryptsetup.target via presets, and since presets are not used for
the initrd, we need a different target. But since parts of the unit and target
tree are shared between the initramfs and the main system, we can't just create
a separate target for the initramfs. All the targets that depend on this one
would need to be split also. That condition is true for initrd-fs.target, but
not for sysinit.target.
So let's instead just uncoditionally pull in remote-cryptsetup.target in the
initramfs. It should normally be empty, so there should be no impact on boots
that don't have units in the target.
Jonathan's patch used initrd-root-fs.target, this version instead uses
initrd-root-device.target. initrd-root-device.target is ordered before
sysroot.mount, which means that the decrypted devices will be available earlier
too.]
This reverts commit 6c5496c492.
sysinit.target is shared between the initrd and the host system. Pulling in
initrd-cryptsetup.target into sysinit.target causes the following warning at
boot:
Oct 27 10:42:30 workstation-uefi systemd[1]: initrd-cryptsetup.target: Starting requested but asserts failed.
Oct 27 10:42:30 workstation-uefi systemd[1]: Assertion failed for initrd-cryptsetup.target.
For encrypted block devices that we need to unlock from the initramfs,
we currently rely on dracut shipping `cryptsetup.target`. This works,
but doesn't cover the case where the encrypted block device requires
networking (i.e. the `remote-cryptsetup.target` version). That target
however is traditionally dynamically enabled.
Instead, let's rework things here by adding a `initrd-cryptsetup.target`
specifically for initramfs encrypted block device setup. This plays the
role of both `cryptsetup.target` and `remote-cryptsetup.target` in the
initramfs.
Then, adapt `systemd-cryptsetup-generator` to hook all generated
services to this new unit when running from the initrd. This is
analogous to `systemd-fstab-generator` hooking all mounts to
`initrd-fs.target`, regardless of whether they're network-backed or not.
Ensure that systemd-random-seed.service has completed before marking
a first boot as completed to guarantee that a saved seed will only be
used after it has been initialized at least once.
Make sure systemd-firstboot completes before reaching first-boot-complete.target
and thus marking the first boot as completed. This way, it is
guaranteed that systemd-firstboot has a chance to complete provisioning
at least once, even in cases of the first boot getting aborted early.
Add a new target for synchronizing units that wish to run once during
the first boot of the system. The machine-id will be committed to disk
only after the target has been reached, thus ensuring that all units
ordered before it had a chance to complete.
Let's explicitly deactivate all home dirs on shutdown, in order to
properly synchronizing unmounting and avoiding blocking devices.
Previously, we'd rely on automatic deactivation when home directories
become unused. However, that scheme is asynchronous, and ongoing
deactviations might conflicts with attempts to unmount /home. Let's fix
that by providing an explicit service systemd-homed-activate.service
whose only job is to have a ExecStop= line that explicitly deactivates
all home directories on shutdown. This service can the be ordered after
home.mount and similar, ensuring that we'll first deactivate all homes
before deactivating /home itself during shutdown.
This is kept separate from systemd-homed.service so that it is possible
to restart systemd-homed.service without deactivating all home
directories.
Fixes: #16842
RC_LOCAL_SCRIPT_PATH_START and RC_LOCAL_SCRIPT_PATH_STOP were was originally
added in the conversion to meson based on the autotools name. In
4450894653 RC_LOCAL_SCRIPT_PATH_STOP was dropped.
We don't need to use such a long name.
Add After=systemd-networkd.socket to avoid a race condition and networkd
falling back to the non-socket activation code.
Also add Wants=systemd-networkd.socket, so the socket is started when
networkd is started via `systemctl start systemd-networkd.service`.
A Requires is not strictly necessary, as networkd still ships the
non-socket activation code. Should this code be removed one day, the
Wants should be bumped to Requires accordingly.
See also 5544ee8516.
Fixes: #16809
This should make /home as automount work reasonably well.
If /home is an automount this has little effect at boot, because if the
automount is not triggered it doesn't matter how the associated mount is
ordered.
It does matter at shutdown however, where home.mount is likely active
now. There the ordering means we'll end sessions first, and only then
deactivate home.mount.
Fixes: #16291
Combining Requires= with Before= doesn't really make sense, since this
means we are requiring something that runs after us, which logically
cannot be fulfilled.
Let's hence downgrade Requires= to Wants= so that the ordering is kept
but no failure propagation implied.
This should be enough to fix https://bugzilla.redhat.com/show_bug.cgi?id=1856514.
But the limit should be significantly higher than 10% anyway. By setting a
limit on /tmp at 10% we'll break many reasonable use cases, even though the
machine would deal fine with a much larger fraction devoted to /tmp.
(In the first version of this patch I made it 25% with the comment that
"Even 25% might be too low.". The kernel default is 50%, and we have been using
that seemingly without trouble since https://fedoraproject.org/wiki/Features/tmp-on-tmpfs.
So let's just make it 50% again.)
See 7d85383edb.
(Another consideration is that we learned from from the whole initiative with
zram in Fedora that a reasonable size for zram is 0.5-1.5 of RAM, and that pretty
much all systems benefit from having zram or zswap enabled. Thus it is reasonable
to assume that it'll become widely used. Taking the usual compression effectiveness
of 0.2 into account, machines have effective memory available of between
1.0 - 0.2*0.5 + 0.5 = 1.4 (for zram sized to 0.5 of RAM) and
1.0 - 0.2*1.5 + 1.5 = 2.2 (for zram 1.5 sized to 1.5 of RAM) times RAM size.
This means that the 10% was really like 7-4% of effective memory.)
A comment is added to mount-util.h to clarify that tmp.mount is separate.
This reverts commit c7220ca802.
The removal was done as a reaction to the messages from systemd:
initrd-root-fs.target: Requested dependency OnFailure=emergency.target ignored (target units cannot fail).
initrd.target: Requested dependency OnFailure=emergency.target ignored (target units cannot fail).
initrd-root-device.target: Requested dependency OnFailure=emergency.target ignored (target units cannot fail).
initrd-fs.target: Requested dependency OnFailure=emergency.target ignored (target units cannot fail).
local-fs.target: Requested dependency OnFailure=emergency.target ignored (target units cannot fail).
...
But it seems that the messages themselves are wrong, and the units were OK.
Strictly speaking you can run homed without userdb. But it doesn't
really make much sense: they go hand in hand and implement the same
concepts, just for different sets of users. Let's hence disable both
automatically by default if homed is requested.
(We don't do the reverse: opting into userdbd shouldn't mean that you
are OK with homed.)
And of course, users can always deviate from our defaults easily, and
turn off userbd again right-away if they don't like it, and things will
generally work.
This generator can be used by desktop environments to launch autostart
applications and services. The feature is an opt-in, triggered by
xdg-desktop-autostart.target being activated.
Also included is the new binary xdg-autostart-condition. This binary is
used as an ExecCondition to test the OnlyShowIn and NotShowIn XDG
desktop file keys. These need to be evaluated against the
XDG_CURRENT_DESKTOP environment variable which may not be known at
generation time.
Co-authored-by: Henri Chain <henri.chain@enioka.com>
In our regular gettys the actual shell commands live the the session
scope anyway (as long as logind is used). Hence, let's avoid
KillMode=process, it serves no purpose and is simply unsafe since it
disables systemd's own process lifecycle management.
We want to watch USB sticks being plugged in, and that requires
AF_NETLINK to work correctly and get the host's events. But if we live
in a network namespace AF_NETLINK is disconnected too and we'll not get
the host udev events.
Fixes: #15287
Limit size of various tmpfs mounts to 10% of RAM, except volatile root and /var
to 25%. Another exception is made for /dev (also /devs for PrivateDevices) and
/sys/fs/cgroup since no (or very few) regular files are expected to be used.
In addition, since directories, symbolic links, device specials and xattrs are
not counted towards the size= limit, number of inodes is also limited
correspondingly: 4MB size translates to 1k of inodes (assuming 4k each), 10% of
RAM (using 16GB of RAM as baseline) translates to 400k and 25% to 1M inodes.
Because nr_inodes option can't use ratios like size option, there's an
unfortunate side effect that with small memory systems the limit may be on the
too large side. Also, on an extremely small device with only 256MB of RAM, 10%
of RAM for /run may not be enough for re-exec of PID1 because 16MB of free
space is required.
"Login Service" doesn''t explain much, esp. considering that logind is actually is
for logins. I think "User Login Management" is better, but not that great either.
Suggestions welcome.
We unregister binfmt_misc twice during shutdown with this change:
1. A previous commit added support for doing that in the final shutdown
phase, i.e. when we do the aggressive umount loop. This is the robust
thing to do, in case the earlier ("clean") shutdown phase didn't work
for some reason.
2. This commit adds support for doing that when systemd-binfmt.service
is stopped. This is a good idea so that people can order mounts
before the service if they want to register binaries from such
mounts, as in that case we'll undo the registration on shutdown
again, before unmounting those mounts.
And all that, just because of that weird "F" flag the kernel introduced
that can pin files...
Fixes: #14981
This doesn't really matter, since in non-/usr-merged systems plymouth
needs to be in /bin and on merged ones it doesn't matter, but it is
still prettier to insert the right path, and avoid /bin on merged
systems, since it's just a compat symlink.
Replaces: #15351
This dependency is now generated automatically given we use
StateDirectory=. Moreover the combination of Wants= and After= was too
strong anway, as whether remount-fs is pulled in or not should not be up
to systemd-pstore.service, and in fact is part of the initial
transaction anyway.
sysinit.target is the target our early boot services are generally
pulled in from, make systemd-pstore.service not an exception of that.
Effectively this doesn't mean much, either way our unit is part of the
initial transaction.
Add `ProtectClock=yes` to systemd units. Since it implies certain
`DeviceAllow=` rules, make sure that the units have `DeviceAllow=` rules so
they are still able to access other devices. Exclude timesyncd and timedated.
This reverts commit 7e1ed1f3b2.
systemd-repart is not a user service that should be something people
enable/disable, instead it should just work if there's configuration for
it. It's like systemd-tmpfiles, systemd-sysusers, systemd-load-modules,
systemd-binfmt, systemd-systemd-sysctl which are NOPs if they have no
configuration, and thus don't hurt, but cannot be disabled since they
are too deep part of the OS.
This doesn't mean people couldn't disable the service if they really
want to, there's after all "systemctl mask" and build-time disabling,
but those are OS developer facing instead of admin facing, that's how it
should be.
Note that systemd-repart is in particular an initrd service, and so far
enable/disable state of those is not managed anyway via "systemctl
enable/disable" but more what dracut decides to package up and what not.
/home is posibly a remote file system. it makes sense to order homed
after it, so that we can properly enumerate users in it, but we probably
shouldn't pull it in ourselves, and leave that to users to configure
otherwise.
Fixes: #15102
It's lightweight and generally useful, so it should be enabled by default. But
users might want to disable it for whatever reason, and things should be fine
without it, so let's make it installable so it can be disabled if wanted.
Fixes#15175.
This essentially adds another layer of configurability:
build disable, this, presence of configuration. The default is
set to enabled, because the service does nothing w/o config.
Possible alternative to #14819.
For me, setting RemainAfterExit=yes would be OK, but if people think that it
might cause issues, then this could be a reasonable alternative that still
let's us skip the invocation of the separate binary.
This reverts the second part of 8125e8d38e.
The first part was reverted in 750e550eba.
The problem starts when s-v-s.s is pulled in by something that is then pulled
in by sysinit.target. Every time a unit is started, systemd recursively checks
all dependencies, and since sysinit.target is pull in by almost anything, we'll
start s-v-s.s over and over. In particular, plymouth-start.service currently
has Wants=s-v-s.s and After=s-v-s.s.
This minus has been there since the unit was added in
d42d27ead9. I think the idea was not cause things
to fail if the user instance doesn't work. But ignoring the return value
doesn't seem to be the right way to approach the problem. In particular, if
the program fails to run, we'll get a bogus fail state, see
https://bugzilla.redhat.com/show_bug.cgi?id=1727895#c1:
with the minus:
$ systemctl start user@1002
Job for user@1002.service failed because the service did not take the steps required by its unit configuration.
See "systemctl status user@1002.service" and "journalctl -xe" for details.
without the minus:
$ systemctl start user@1002
Job for user@1002.service failed because the control process exited with error code.
See "systemctl status user@1002.service" and "journalctl -xe" for details.
This patch modifies the RequireMountsFor setting in systemd-nspawn@.service to wait for the machine instance directory to be mounted, not just /var/lib/machines.
Closes#14931
machined needs access to the host mount namespace to propagate bind
mounts created with the "machinectl bind" command. However, the
"ProtectKernelLogs" directive relies on mount namespaces to make the
kernel ring buffer inaccessible. This commit removes the
"ProtectKernelLogs=yes" directive from machined service file introduced
in 6168ae5.
Closes#14559.
Kernel 4.1 separated the tracing system from the debugfs,
actual documentation already points to a different path
that needs this new mount to exist.
the old sysfs path will still be an automount in the debugfs,
created by the kernel (for now).
Signed-off-by: Norbert Lange <nolange79@gmail.com>
See c80a9a33d0, target units can't fail.
I guess we need to figure out some replacement functionality, but at least
let's avoid the warning from systemd for now.
If we have exit on idle, then operations such as "journalctl
--namespace=foo --rotate" should work even if the journal daemon is
currently not running.
(Note that we don't do activation by varlink for the main instance of
journald, I am not sure the deadlocks it might introduce are worth it)
This makes things a bit simpler and the build a bit faster, because we don't
have to rewrite files to do the trivial substitution. @rootbindir@ is always in
our internal $PATH that we use for non-absolute paths, so there should be no
functional change.
Let's use uppercase wording in the description string, like we usually
do.
Let's allow using this service in early boot.
If it's pulled into the initial transaction it's better to finish
loading this before sysinit.target.
Don't bother with this in containers that lack CAP_SYS_MODULE
Devices referred to by `DeviceAllow=` sandboxing are resolved into their
corresponding major numbers when the unit is loaded by looking at
`/proc/devices`. If a reference is made to a device which is not yet
available, the `DeviceAllow` is ignored and the unit's processes cannot
access that device.
In both logind and nspawn, we have `DeviceAllow=` lines, and `modprobe`
in `ExecStartPre=` to load some kernel modules. Those kernel modules
cause device nodes to become available when they are loaded: the device
nodes may not exist when the unit itself is loaded. This means that the
unit's processes will not be able to access the device since the
`DeviceAllow=` will have been resolved earlier and denied it.
One way to fix this would be to re-evaluate the available devices and
re-apply the policy to the cgroup, but this cannot work atomically on
cgroupsv1. So we fall back to a second approach: instead of running
`modprobe` via `ExecStartPre`, we move this out to a separate unit and
order it before the units which want the module.
Closes#14322.
Fixes: #13943.
This reverts commit 07125d24ee.
In contrast to what is claimed in #13396 dbus-broker apparently does
care for the service file to be around, and otherwise will claim
"Service Not Activatable" in the time between systemd starting up the
broker and connecting to it, which the stub service file is supposed to
make go away.
Reverting this makes the integration test suite pass again on host with
dbus-broker (i.e. current Fedora desktop).
Tested with dbus-broker-21-6.fc31.x86_64.
This reverts commit 362c378291.
This commit introduced an ordering loop: remote-cryptsetup.target was both
before and after remote-fs-pre.target. It also globally ordered all cryptsetup
volumes before all mounts. Such global ordering is problematic if people have
stacked storage. Let's look for a different solution.
See https://github.com/systemd/systemd/pull/14378#discussion_r359460109.
Otherwise, systemd-udev-trigger|settle.service that ran in the initrd may
ramain active, and never re-run again from the system root.
This is observed by forexample examining ESP with udevadm info, which in the
initrd has all the ID_* variables, and none of them in fully booted system.
This option is an indication for PID1 that the entry in crypttab is handled by
initrd only and therefore it shouldn't interfer during the usual start-up and
shutdown process.
It should be primarily used with the encrypted device containing the root FS as
we want to keep it (and thus its encrypted device) until the very end of the
shutdown process, i.e. when initrd takes over.
This option is the counterpart of "x-initrd.mount" used in fstab.
Note that the slice containing the cryptsetup services also needs to drop the
usual shutdown dependencies as it's required by the cryptsetup services.
Fixes: #14224
Apparently some firmwares don't allow us to write this token, and refuse
it with EINVAL. We should normally consider that a fatal error, but not
really in the case of "bootctl random-seed" when called from the
systemd-boot-system-token.service since it's called as "best effort"
service after boot on various systems, and hence we shouldn't fail
loudly.
Similar, when we cannot find the ESP don't fail either, since there are
systems (arch install ISOs) that carry a boot loader capable of the
random seed logic but don't mount it after boot.
Fixes: #13603
We set ProtectKernelLogs=yes on all long running services except for
udevd, since it accesses /dev/kmsg, and journald, since it calls syslog
and accesses /dev/kmsg.
As discussed on systemd-devel [1], in Fedora we get lots of abrt reports
about the watchdog firing [2], but 100% of them seem to be caused by resource
starvation in the machine, and never actual deadlocks in the services being
monitored. Killing the services not only does not improve anything, but it
makes the resource starvation worse, because the service needs cycles to restart,
and coredump processing is also fairly expensive. This adds a configuration option
to allow the value to be changed. If the setting is not set, there is no change.
My plan is to set it to some ridiculusly high value, maybe 1h, to catch cases
where a service is actually hanging.
[1] https://lists.freedesktop.org/archives/systemd-devel/2019-October/043618.html
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1300212
The code supports SIGTERM and SIGINT to termiante the process. It would
be possible to reporpose one of those signals for the restart operation,
but I think it's better to use a completely different signal to avoid
misunderstandings.
See https://bugzilla.redhat.com/show_bug.cgi?id=1731772:
when autofs4 is disabled in the kernel,
proc-sys-fs-binfmt_misc.automount is not started, so the binfmt_misc module is
never loaded. If we added a dependency on proc-sys-fs-binfmt_misc.mount
to systemd-binfmt.service, things would work even if autofs4 was disabled, but
we would unconditionally pull in the module and mount, which we don't want to do.
(Right now we ony load the module if some binfmt is configured.)
But let's make it easier to handle this case by doing two changes:
1. order systemd-binfmt.service after the .mount unit (so that the .service
can count on the mount if both units are pulled in, even if .automount
is skipped)
2. add [Install] section to the service unit. This way the user can do
'systemctl enable proc-sys-fs-binfmt_misc.mount' to get the appropriate behaviour.
This fixes the following problem:
> At the very end of the boot, just after the first user logs in
> (usually using sddm / X) I get the following messages in my logs:
> Nov 18 07:02:33 samd dbus-daemon[2879]: [session uid=1000 pid=2877] Activated service 'org.freedesktop.systemd1' failed: Process org.freedesktop.systemd1 exited with status 1
> Nov 18 07:02:33 samd dbus-daemon[2879]: [session uid=1000 pid=2877] Activated service 'org.freedesktop.systemd1' failed: Process org.freedesktop.systemd1 exited with status 1
These messages are caused by the "stub" service files that systemd
installs. It installed them because early versions of systemd activation
required them to exist.
Since dbus 1.11.0, a dbus-daemon that is run with --systemd-activation
automatically assumes that o.fd.systemd1 is an activatable
service. As a result, with a new enough dbus version,
/usr/share/dbus-1/services/org.freedesktop.systemd1.service and
/usr/share/dbus-1/system-services/org.freedesktop.systemd1.service should
become unnecessary, and they can be removed.
dbus 1.11.0 was released 2015-12-02.
Bug-Debian: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=914015
If logging disappears issues are hard to debug, hence let's give
journald a slight edge over other services when the OOM killer hits.
Here are the special adjustments we now make:
systemd-coredump@.service.in OOMScoreAdjust=500
systemd-journald.service.in OOMScoreAdjust=-250
systemd-udevd.service.in OOMScoreAdjust=-1000
(i.e. the coredump processing is made more likely to be killed on OOM,
and udevd and journald are less likely to be killed)
Follow-up for 26ded55709.
The commit says,
> Note that with this change sysinit.target (and thus early boot) is NOT
systematically delayed until the entropy pool is initialized,
But the dependency was not dropped.
This was found by David Seifert (@SoapGentoo).
This makes two major changes to the way systemd-random-seed operates:
1. We now optionally credit entropy if this is configured (via an env
var). Previously we never would do that, with this change we still don't
by default, but it's possible to enable this if people acknowledge that
they shouldn't replicate an image with a contained random seed to
multiple systems. Note that in this patch crediting entropy is a boolean
thing (unlike in previous attempts such as #1062), where only a relative
amount of bits was credited. The simpler scheme implemented here should
be OK though as the random seeds saved to disk are now written only with
data from the kernel's entropy pool retrieved after the pool is fully
initialized. Specifically:
2. This makes systemd-random-seed.service a synchronization point for
kernel entropy pool initialization. It was already used like this, for
example by systemd-cryptsetup-generator's /dev/urandom passphrase
handling, with this change it explicitly operates like that (at least
systems which provide getrandom(), where we can support this). This
means services that rely on an initialized random pool should now place
After=systemd-random-seed.service and everything should be fine. Note
that with this change sysinit.target (and thus early boot) is NOT
systematically delayed until the entropy pool is initialized, i.e.
regular services need to add explicit ordering deps on this service if
they require an initialized random pool.
Fixes: #4271
Replaces: #10621#4513
This reverts commit 971a7a1526.
These unit names are typically different on distributions, let's not
hardcode those. Stuff like this should probably live in the distro
RPM/.deb, but not upstream, where we should be distro agnostic and
agnostic to other higher level packages like this.
Users might end up with more than one of those service enabled, through
admin mistake, or broken installation scriptlets, or whatever. On my machine,
I had both chronyd and timesyncd happilly running at the same time. If
more than one is enabled, it's better to have just one running. Adding
Conflicts will make the issue more visible in logs too.
This patch introduces the systemd pstore service which will archive the
contents of the Linux persistent storage filesystem, pstore, to other storage,
thus preserving the existing information contained in the pstore, and clearing
pstore storage for future error events.
Linux provides a persistent storage file system, pstore[1], that can store
error records when the kernel dies (or reboots or powers-off). These records in
turn can be referenced to debug kernel problems (currently the kernel stuffs
the tail of the dmesg, which also contains a stack backtrace, into pstore).
The pstore file system supports a variety of backends that map onto persistent
storage, such as the ACPI ERST[2, Section 18.5 Error Serialization] and UEFI
variables[3 Appendix N Common Platform Error Record]. The pstore backends
typically offer a relatively small amount of persistent storage, e.g. 64KiB,
which can quickly fill up and thus prevent subsequent kernel crashes from
recording errors. Thus there is a need to monitor and extract the pstore
contents so that future kernel problems can also record information in the
pstore.
The pstore service is independent of the kdump service. In cloud environments
specifically, host and guest filesystems are on remote filesystems (eg. iSCSI
or NFS), thus kdump relies [implicitly and/or explicitly] upon proper operation
of networking software *and* hardware *and* infrastructure. Thus it may not be
possible to capture a kernel coredump to a file since writes over the network
may not be possible.
The pstore backend, on the other hand, is completely local and provides a path
to store error records which will survive a reboot and aid in post-mortem
debugging.
Usage Notes:
This tool moves files from /sys/fs/pstore into /var/lib/systemd/pstore.
To enable kernel recording of error records into pstore, one must either pass
crash_kexec_post_notifiers[4] to the kernel command line or enable via 'echo Y
> /sys/module/kernel/parameters/crash_kexec_post_notifiers'. This option
invokes the recording of errors into pstore *before* an attempt to kexec/kdump
on a kernel crash.
Optionally, to record reboots and shutdowns in the pstore, one can either pass
the printk.always_kmsg_dump[4] to the kernel command line or enable via 'echo Y >
/sys/module/printk/parameters/always_kmsg_dump'. This option enables code on the
shutdown path to record information via pstore.
This pstore service is a oneshot service. When run, the service invokes
systemd-pstore which is a tool that performs the following:
- reads the pstore.conf configuration file
- collects the lists of files in the pstore (eg. /sys/fs/pstore)
- for certain file types (eg. dmesg) a handler is invoked
- for all other files, the file is moved from pstore
- In the case of dmesg handler, final processing occurs as such:
- files processed in reverse lexigraphical order to faciliate
reconstruction of original dmesg
- the filename is examined to determine which dmesg it is a part
- the file is appended to the reconstructed dmesg
For example, the following pstore contents:
root@vm356:~# ls -al /sys/fs/pstore
total 0
drwxr-x--- 2 root root 0 May 9 09:50 .
drwxr-xr-x 7 root root 0 May 9 09:50 ..
-r--r--r-- 1 root root 1610 May 9 09:49 dmesg-efi-155741337601001
-r--r--r-- 1 root root 1778 May 9 09:49 dmesg-efi-155741337602001
-r--r--r-- 1 root root 1726 May 9 09:49 dmesg-efi-155741337603001
-r--r--r-- 1 root root 1746 May 9 09:49 dmesg-efi-155741337604001
-r--r--r-- 1 root root 1686 May 9 09:49 dmesg-efi-155741337605001
-r--r--r-- 1 root root 1690 May 9 09:49 dmesg-efi-155741337606001
-r--r--r-- 1 root root 1775 May 9 09:49 dmesg-efi-155741337607001
-r--r--r-- 1 root root 1811 May 9 09:49 dmesg-efi-155741337608001
-r--r--r-- 1 root root 1817 May 9 09:49 dmesg-efi-155741337609001
-r--r--r-- 1 root root 1795 May 9 09:49 dmesg-efi-155741337710001
-r--r--r-- 1 root root 1770 May 9 09:49 dmesg-efi-155741337711001
-r--r--r-- 1 root root 1796 May 9 09:49 dmesg-efi-155741337712001
-r--r--r-- 1 root root 1787 May 9 09:49 dmesg-efi-155741337713001
-r--r--r-- 1 root root 1808 May 9 09:49 dmesg-efi-155741337714001
-r--r--r-- 1 root root 1754 May 9 09:49 dmesg-efi-155741337715001
results in the following:
root@vm356:~# ls -al /var/lib/systemd/pstore/155741337/
total 92
drwxr-xr-x 2 root root 4096 May 9 09:50 .
drwxr-xr-x 4 root root 40 May 9 09:50 ..
-rw-r--r-- 1 root root 1610 May 9 09:50 dmesg-efi-155741337601001
-rw-r--r-- 1 root root 1778 May 9 09:50 dmesg-efi-155741337602001
-rw-r--r-- 1 root root 1726 May 9 09:50 dmesg-efi-155741337603001
-rw-r--r-- 1 root root 1746 May 9 09:50 dmesg-efi-155741337604001
-rw-r--r-- 1 root root 1686 May 9 09:50 dmesg-efi-155741337605001
-rw-r--r-- 1 root root 1690 May 9 09:50 dmesg-efi-155741337606001
-rw-r--r-- 1 root root 1775 May 9 09:50 dmesg-efi-155741337607001
-rw-r--r-- 1 root root 1811 May 9 09:50 dmesg-efi-155741337608001
-rw-r--r-- 1 root root 1817 May 9 09:50 dmesg-efi-155741337609001
-rw-r--r-- 1 root root 1795 May 9 09:50 dmesg-efi-155741337710001
-rw-r--r-- 1 root root 1770 May 9 09:50 dmesg-efi-155741337711001
-rw-r--r-- 1 root root 1796 May 9 09:50 dmesg-efi-155741337712001
-rw-r--r-- 1 root root 1787 May 9 09:50 dmesg-efi-155741337713001
-rw-r--r-- 1 root root 1808 May 9 09:50 dmesg-efi-155741337714001
-rw-r--r-- 1 root root 1754 May 9 09:50 dmesg-efi-155741337715001
-rw-r--r-- 1 root root 26754 May 9 09:50 dmesg.txt
where dmesg.txt is reconstructed from the group of related
dmesg-efi-155741337* files.
Configuration file:
The pstore.conf configuration file has four settings, described below.
- Storage : one of "none", "external", or "journal". With "none", this
tool leaves the contents of pstore untouched. With "external", the
contents of the pstore are moved into the /var/lib/systemd/pstore,
as well as logged into the journal. With "journal", the contents of
the pstore are recorded only in the systemd journal. The default is
"external".
- Unlink : is a boolean. When "true", the default, then files in the
pstore are removed once processed. When "false", processing of the
pstore occurs normally, but the pstore files remain.
References:
[1] "Persistent storage for a kernel's dying breath",
March 23, 2011.
https://lwn.net/Articles/434821/
[2] "Advanced Configuration and Power Interface Specification",
version 6.2, May 2017.
https://www.uefi.org/sites/default/files/resources/ACPI_6_2.pdf
[3] "Unified Extensible Firmware Interface Specification",
version 2.8, March 2019.
https://uefi.org/sites/default/files/resources/UEFI_Spec_2_8_final.pdf
[4] "The kernel’s command-line parameters",
https://static.lwn.net/kerneldoc/admin-guide/kernel-parameters.html
We use that on all other services, and hence should here too. Otherwise
the service will be killed with SIGSYS when doing something not
whitelisted, which is a bit crass.
While the need for access to character devices can be tricky to determine for
the general case, it's obvious that most of our services have no need to access
block devices. For logind and timedated this can be tightened further.
We had all kinds of indentation: 2 sp, 3 sp, 4 sp, 8 sp, and mixed.
4 sp was the most common, in particular the majority of scripts under test/
used that. Let's standarize on 4 sp, because many commandlines are long and
there's a lot of nesting, and with 8sp indentation less stuff fits. 4 sp
also seems to be the default indentation, so this will make it less likely
that people will mess up if they don't load the editor config. (I think people
often use vi, and vi has no support to load project-wide configuration
automatically. We distribute a .vimrc file, but it is not loaded by default,
and even the instructions in it seem to discourage its use for security
reasons.)
Also remove the few vim config lines that were left. We should either have them
on all files, or none.
Also remove some strange stuff like '#!/bin/env bash', yikes.
time-sync.target is supposed to indicate system clock is synchronized
with a remote clock, but as used through 241 it only provided a system
clock that was updated based on a locally-maintained timestamp. Systems
that are powered off for extended periods would not come up with
accurate time.
Retain the existing behavior using a new time-set.target leaving
time-sync.target for cases where accuracy is required.
Closes#8861
Let's be safe, rather than sorry. This way DynamicUser=yes services can
neither take benefit of, nor create SUID/SGID binaries.
Given that DynamicUser= is a recent addition only we should be able to
get away with turning this on, even though this is strictly speaking a
binary compatibility breakage.
This patch was initially prompted by a report on a Fedora update [1], that the
upgrade causes systemd-resolved.service and systemd-networkd.service to be
re-enabled. We generally want to preserve the enablement of all services during
upgrades, so a reset like this is not expected.
Both services declare two symlinks in their [Install] sections, for their dbus
names and for multi-user.target.wants/. It turns out that both services were
only partially enabled, because their dbus unit symlinks
/etc/systemd/system/dbus-org.freedesktop.{resolve1,network1}.service were
created, by the symlinks in /etc/systemd/system/multi-user.target.wants/ were
not. This means that the units could be activated by dbus, but not in usual
fashion using systemctl start. Our tools make it rather hard to figure out when
something like this happens, and it is definitely an area for improvement on its
own. The symlink in .wants/ was filtered out by during packaging, but the dbus
symlink was left in (I assume by mistake).
Let's simplify things by not creating the symlinks statically during 'ninja
install'. This means that the units shipped by systemd have to be enabled in
the usual fashion, which in turns means that [Install] section and presets
become the "single source of truth" and we don't have two sets of conflicting
configuration.
Let's consider a few cases:
- developer: a developer installs systemd from git on a running system, and they
don't want the installation to reset enablement of anything. So this change is
either positive for them, or has no effect (if they have everything at
defaults).
- package creation: we want to create symlinks using 'preset-all' and 'preset'
on upgraded packages, we don't want to have any static symlinks. This change
will remove the need to filter out symlinks in packaging and of course fix
the original report.
- installation of systemd from scratch: this change means that without
'preset-all' the system will not be functional. This case could be affected
negatively by this change, but I think it's enough of a corner case to accept
this. In practice I expect people to build a package, not installl directly
into the file system, so this might not even matter in practice.
Creating those symlinks was probably the right thing in the beginning, but
nowadays the preset system is very well established and people expect it to
be honoured. Ignoring the presets and doing static configuration is not welcome
anymore.
Note: during package installation, either 'preset-all' or 'preset getty@.service
machines.target remote-cryptsetup.target remote-fs.target
systemd-networkd.service systemd-resolved.service
systemd-networkd-wait-online.service systemd-timesyncd.service' should be called.
[1] https://bodhi.fedoraproject.org/updates/FEDORA-2019-616045ca76
A couple of API VFS we mount via .mount units. Let's set the three flags
for those too, just in case.
This is just paranoia, nothing else, but shouldn't hurt.
This service uses PAM anyway, hence let pam_keyring set things up for
us. Moreover, this way we ensure that the invocation ID is not set for
this service as key, and thus can't confuse the user service's
invocation ID.
Fixes: #11649
`systemd-journal-catalog-update.service` writes to `/var`. However, it's
not explicitly ordered wrt `systemd-tmpfiles-setup.service`, which means
that it may run before or after.
This is an issue for Fedora CoreOS, which uses Ignition. We want to be
able to prepare `/var` on first boot from the initrd, where the SELinux
policy is not loaded yet. This means that the hierarchy under `/var` is
not correctly labeled. We add a `Z /var - - -` tmpfiles entry so that it
gets relabeled once `/var` gets mounted post-switchroot.
So any service that tries to access `/var` before `systemd-tmpfiles`
relabels it is likely to hit `EACCES`.
Fix this by simply ordering `systemd-journal-catalog-update.service`
after `systemd-tmpfiles-setup.service`. This is also clearer since the
tmpfiles entries are the canonical source of how `/var` should be
populated.
For more context on this, see:
https://github.com/coreos/ignition/issues/635#issuecomment-446620297
ProtectHostname= turns off hostname change propagation from host to
service. This means for services that care about the hostname and need
to be able to notice changes to it it's not suitable (though it is
useful for most other cases still).
Let's turn it off hence for journald (which logs the current hostname)
for networkd (which optionally sends the current hostname to dhcp
servers) and resolved (which announces the current hostname via
llmnr/mdns).
This behaves similar to the "boot into firmware" logic, and also allows
either direct EFI operation (which sd-boot supports and others might
support eventually too) or override through env var.
This was an overzealous setting from commit 99894b867f. Without this,
`hostnamectl set-hostname` fails with
Could not set property: Access denied
as `sethostname()` fails with `EPERM`.
Linux can be run on a device meant to act as a USB peripheral. In order
for a machine to act as such a USB device it has to be equipped with
a UDC - USB Device Controller.
This patch adds a target reached when UDC becomes available. It can be used
for activating e.g. a service unit which composes a USB gadget with
configfs and activates it.
A follow-up for commit a8cb1dc3e0.
Commit a8cb1dc3e0 made sure that initrd-cleanup.service won't be stopped
when initrd-switch-root.target is isolated.
However even with this change, it might happen that initrd-cleanup.service
survives the switch to rootfs (since it has no ordering constraints against
initrd-switch-root.target) and is stopped right after when default.target is
isolated. This led to initrd-cleanup.service entering in failed state as it
happens when oneshot services are stopped.
This patch along with a8cb1dc3e0 should fix issue #4343.
Fixes: #4343
Currently, tmpfiles runs in two separate services at boot. /dev is
populated by systemd-tmpfiles-setup-dev.service and everything else by
systemd-tmpfiles-setup.service. The former was so far conditionalized by
CAP_SYS_MODULES. The reasoning was that the primary purpose of
populating /dev was to create device nodes based on the static device
node info exported in kernel modules through MODALIAS. And without the
privs to load kernel modules doing so is unnecessary. That thinking is
incomplete however, as there might be reason to create stuff in /dev
outside of the static modalias usecase. Thus, let's drop the
conditionalization to ensure that tmpfiles.d rules are always executed
at least once under all conditions.
Fixes: #11544
Instead of enabling it unconditionally and then using ConditionPathExists=/etc/fstab,
and possibly masking this condition if it should be enabled for auto gpt stuff,
just pull it in explicitly when required.
We already *install* those as real files since de78fa9ba0.
Meson will start to copy symlinks as-is, so we would get dangling symlinks in
/usr/lib/systemd/user/.
I considered the layout in our sources to match the layout in the installation
filesystem (i.e. creating units/system/ and moving all files from units/ to
units/system/), but that seems overkill. By using normal files for both we get
some duplication, but those files change rarely, so it's not a big downside in
practice.
Fixes#9906.
Let's simplify things and drop the logic that /var/lib/machines is setup
as auto-growing btrfs loopback file /var/lib/machines.raw.
THis was done in order to make quota available for machine management,
but quite frankly never really worked properly, as we couldn't grow the
file system in sync with its use properly. Moreover philosophically it's
problematic overriding the admin's choice of file system like this.
Let's hence drop this, and simplify things. Deleting code is a good
feeling.
Now that regular file systems provide project quota we could probably
add per-machine quota support based on that, hence the btrfs quota
argument is not that interesting anymore (though btrfs quota is a bit
more powerful as it allows recursive quota, i.e. that the machine pool
gets an overall quota in addition to per-machine quota).
Otherwise we might install the socket unit early, but the service
backing it late, and then end up in strange loops when we enter rescue
mode, because we saw an event on /dev/rfkill but really can't dispatch
it nor flush it.
Fixes: #9171
now that logind doesn't mount $XDG_RUNTIME_DIR anymore we can lock down
the service using fs namespacing (as we don't need the mount to
propagate to the host namespace anymore).
Previously, setting this option by default was problematic due to
SELinux (as this would also prohibit the transition from PID1's label to
the service's label). However, this restriction has since been lifted,
hence let's start making use of this universally in our services.
On SELinux system this change should be synchronized with a policy
update that ensures that NNP-ful transitions from init_t to service
labels is permitted.
An while we are at it: sort the settings in the unit files this touches.
This might increase the size of the change in this case, but hopefully
should result in stabler patches later on.
Fixes: #1219
I found zero references to busnames.target, using git grep "busnames".
(And we do not install using a wildcard units/*.*. There is no
busnames.target installed on my Fedora 28 system).
THis dep existed since the unit was introduced, but I cannot see what
good it would do. Hence in the interest of simplifying things, let's
drop it. If breakages appear later we can certainly revert this again.
Fixes: #10469
This is might be useful in some cases, but it's primarily an example for
a boot check service that can be plugged before boot-complete.target.
It's disabled by default.
All it does is check whether the failed unit count is zero
This is the counterpiece to the boot counting implemented in
systemd-boot: if a boot is detected as successful we mark drop the
counter again from the booted snippet or kernel image.
C.f. 287419c119: 'systemctl exit 42' can be
used to set an exit value and pulls in exit.target, which pulls in systemd-exit.service,
which calls org.fdo.Manager.Exit, which calls method_exit(), which sets the objective
to MANAGER_EXIT. Allow the same to happen through SuccessAction=exit.
v2: update for 'exit' and 'exit-force'
Explicit systemctl calls remain in systemd-halt.service and the system
systemd-exit.service. To convert systemd-halt, we'd need to add
SuccessAction=halt-force. Halting doesn't make much sense, so let's just
leave that is. systemd-exit.service will be converted in the next commit.
This updates the unit files of all our serviecs that deal with journal
stuff to use a higher RLIMIT_NOFILE soft limit by default. The new value
is the same as used for the new HIGH_RLIMIT_NOFILE we just added.
With this we ensure all code that access the journal has higher
RLIMIT_NOFILE. The code that runs as daemon via the unit files, the code
that is run from the user's command line via C code internal to the
relevant tools. In some cases this means we'll redundantly bump the
limits as there are tools run both from the command line and as service.
So far we always used "yes" instead of "true" in all our unit files,
except for one outlier. Let's do this here too. No change in behaviour
whatsoever, except that it looks prettier ;-)
I think this is a slightly cleaner approach than parsing the
configuration file at multiple places, as this way there's only a single
reload cycle for logind.conf, and that's systemd-logind.service's
runtime.
This means that logind and dbus become a requirement of
user-runtime-dir, but given that XDG_RUNTIME_DIR is not set anyway
without logind and dbus around this isn't really any limitation.
This also simplifies linking a bit as this means user-runtime-dir
doesn't have to link against any code of logind itself.
Let's not use the word "wrapper", as it's not clear what that is, and in
some way any unit file is a "wrapper"... let's simply say that it's
about the runtime directory.
If for any reason local-fs.target fails at startup while a password is
requested by systemd-cryptsetup@.service, we end up with the emergency shell
competing with systemd-ask-password-console.service for the console.
This patch makes sure that:
- systemd-ask-password-console.service is stopped before entering in emergency
mode so it won't make any access to the console while the emergency shell is
running.
- systemd-ask-password-console.path is also stopped so any attempts to restart
systemd-cryptsetup in the emergency shell won't restart
systemd-ask-password-console.service and kill the emergency shell.
- systemd-ask-password-wall.path is stopped so
systemd-ask-password-wall.service won't be started as this service pulls
the default dependencies in.
Fixes: #10131
This reverts commit d4e9e574ea.
(systemd.conf.m4 part was already reverted in 5b5d82615011b9827466b7cd5756da35627a1608.)
Together those reverts should "fix" #10025 and #10011. ("fix" is in quotes
because this doesn't really fix the underlying issue, which is combining
DynamicUser= with strict container sandbox, but it avoids the problem by not
using that feature in our default installation.)
Dynamic users don't work well if the service requires matching configuration in
other places, for example dbus policy. This is true for those three services.
In effect, distros create the user statically [1, 2]. Dynamic users make more
sense for "add-on" services where not creating the user, or more precisely,
creating the user lazily, can save resources. For "basic" services, if we are
going to create the user on package installation anyway, setting DynamicUser=
just creates unneeded confusion. The only case where it is actually used is
when somebody forgets to do system configuration. But it's better to have the
service fail cleanly in this case too. If we want to turn on some side-effect
of DynamicUser=yes for those services, we should just do that directly through
fine-grained options. By not using DynamicUser= we also avoid the need to
restart dbus.
[1] bd9bf30727
[2] 48ac1cebde/f/systemd.spec (_473)
(Fedora does not create systemd-timesync user.)
systemd-tmpfiles-setup.service needs to be ordered after
systemd-journald.service, so entries in /run/log/journal are already
created when systemd-tmpfiles tries to adjust its permissions.
This is specially problematic for setups using a volatile journal where
the initrd does not ship a machine-id (i.e. OSTree-based systems), where
logs from the initrd will be inaccessible for users in the
systemd-journal group. It also has a side effect of `journalctl --user`
failing with "No journal files were opened due to insufficient
permissions".
Fixes#10128.
We would create a useless empty directory under build/.
It seems we were lucky and all symlinks were installed into directories
which were alredy created because we installed something into the same
location earlier.
While at it, also add '-v' to 'mkdir -p'. This will print the names of
directories as they are created (just once), making it easier to see all of
what the install script is doing.
This reverts commit 48d3e88c18.
I kept the follow-symlink=false → follow-symlink=true change instact, since
we're likely to have existing installations with a symlink now.
Followup to commit 13cf422e04 ("user@.service: don't kill user manager at runlevel switch")
I think there's a general rule that units with `StopWhenUnneeded=yes` need
`IgnoreOnIsolate=yes`... But it doesn't apply to `suspend.target` and friends.
`printer.target` and friends break on isolate even if we apply the rule[1].
That just leaves `graphical-session.target`, which is a user service.
"isolate" is *mostly* a weird attempt to emulate runlevels, so I decided
not to worry about it for user services.
[1] https://github.com/systemd/systemd/issues/6505#issuecomment-320644819
Loggin in as root user and then switching the runlevel results in a
stop of the user manager, even though the user ist still logged in.
That leaves a broken user session.
Adding "IgnoreOnIsolate=true" to user@.service fixes this.
This service won't use much resources, but it's certainly nicer to see
it attached th the user's slice along with user@.service, so that
everything we run for a specific user is properly bound into one unit.
We use systemd-user-sessions.service as barrier when to allow login
sessions. With this patch user@.service is ordered after that too, so
that any login related code (which user-runtime-dir@.service is) is
guaranteed to run after the barrier, and never before.