Commit Graph

1648 Commits

Author SHA1 Message Date
Mike Yuan
9d5b690100
logind: serialize session leader pidfd to fdstore 2024-01-04 16:19:20 +08:00
Lennart Poettering
4e1f0037b8 units: add a tpm2.target synchronization point and small generator that pulls in
Distributions apparently only compile a subset of TPM2 drivers into the
kernel. For those not compiled it but provided as kmod we need a
synchronization point: we must wait before the first TPM2 interaction
until the driver is available and accessible.

This adds a tpm2.target unit as such a synchronization point. It's
ordered after /dev/tpmrm0, and is pulled in by a generator whenever we
detect that the kernel reported a TPM2 to exist but we have no device
for it yet.

This should solve the issue, but might create problems: if there are TPM
devices supported by firmware that we don't have Linux drivers for we'll
hang for a bit. Hence let's add a kernel cmdline switch to disable (or
alternatively force) this logic.

Fixes: #30164
2024-01-03 13:49:02 +01:00
Lennart Poettering
caef0bc3dc creds: open up access to clients via Polkit
Use auth_admin_keep, so that users don't have to re-auth interactively
again and again when encrypting/decrypting batches of credentials.
2024-01-03 11:53:52 +01:00
Mike Yuan
f6ce1ad033
Merge pull request #30686 from poettering/uki-measured-check-imply-tpm2
efi-loader: when detecting if we are booted in UKI measured boot mode, imply a check for TPM2
2024-01-03 18:39:22 +08:00
Yu Watanabe
6e6b59ed00 unit: order systemd-resolved after systemd-sysctl
Otherwise, IPv6 enable/disable setting may be changed after resolved is
started.
2024-01-03 04:07:15 +09:00
Lennart Poettering
9f32bb927c Revert "units: add ConditionSecurity=tpm2 to systemd-tpm2-setup units"
Now that the ConditionSecurity=uki-measured check is tighter we can drop
the explicit TPM2 check again.

This reverts commit aa735b0219.
2024-01-02 17:49:04 +01:00
Luca Boccassi
aa735b0219 units: add ConditionSecurity=tpm2 to systemd-tpm2-setup units
ConditionSecurity=measured-uki can be true even with TPM 1.2 which we
don't support, so add an explicit check for TPM 2.0.

Fixes https://github.com/systemd/systemd/issues/30650

Follow-up for 2e64cb71b9
2023-12-29 03:14:34 +09:00
Lennart Poettering
644f19c75c creds: add varlink API for encrypting/decrypting credentials 2023-12-21 19:19:12 +01:00
Lennart Poettering
3ccadbce33 homectl: add "firstboot" command
This extends what systemd-firstboot does and runs on first boots only
and either processes user records passed in via credentials to create,
or asks the user interactively to create one (only if no regular user
exists yet).
2023-12-18 11:10:53 +01:00
Neil Wilson
627966ab01 systemd-homed.service.in: add quotactl to SystemCallFilter
Standard directories make a call to the quotactl system call to enforce disk size limits.

Fixes #30287
2023-12-01 22:43:31 +00:00
Yu Watanabe
f89985ca49 unit: make journald stopped on soft-reboot before broadcasting SIGKILL
Workaround for #30195.
2023-11-28 18:28:17 +09:00
Zbigniew Jędrzejewski-Szmek
4704176795 units: disable start rate limit for systemd-vconsole-setup.service
The unit will be started or restarted a few times during boot, but but it has
StartLimitBurst = DefaultStartLimitBurst = 5, which means that the fifth
restart will already fail. On my laptop, I have exactly 4 restarts, so I don't
hit the limit, but on a slightly different system we will easily hit the limit.
In https://bugzilla.redhat.com/show_bug.cgi?id=2251394, there are five reloads
and we hit the limit.

Since 6ef512c0bb we propagate the start counter
over switch-root and daemon reloads, so it's easier to hit the limit during
boot.

In principle there might be systems with lots of vtcon devices, so let's just
allow the unit to be restarted without a limit.

Fixes https://bugzilla.redhat.com/show_bug.cgi?id=2251394.
2023-11-25 13:27:17 +01:00
Lennart Poettering
4134f47de2 units: pull in plymouth when booting into storagetm mode 2023-11-13 15:45:16 +01:00
Lennart Poettering
809def1940 units: add units that put together and install a TPM2 PCR policy at boot
(This is disabled by default, for now)
2023-11-03 11:24:45 +01:00
Lennart Poettering
1761066b13 storagetm: add new systemd-storagetm component
This implements a "storage target mode", similar to what MacOS provides
since a long time as "Target Disk Mode":

        https://en.wikipedia.org/wiki/Target_Disk_Mode

This implementation is relatively simple:

1. a new generic target "storage-target-mode.target" is added, which
   when booted into defines the target mode.

2. a small tool and service "systemd-storagetm.service" is added which
   exposes a specific device or all devices as NVMe-TCP devices over the
   network.  NVMe-TCP appears to be hot shit right now how to expose
   block devices over the network. And it's really simple to set up via
   configs, hence our code is relatively short and neat.

The idea is that systemd-storagetm.target can be extended sooner or
later, for example to expose block devices also as USB mass storage
devices and similar, in case the system has "dual mode" USB controller
that can also work as device, not just as host. (And people could also
plug in sharing as NBD, iSCSI, whatever they want.)

How to use this? Boot into your system with a kernel cmdline of
"rd.systemd.unit=storage-target-mode.target ip=link-local", and you'll see on
screen the precise "nvme connect" command line to make the relevant
block devices available locally on some other machine. This all requires
that the target mode stuff is included in the initrd of course. And the
system will the stay in the initrd forever.

Why bother? Primarily three use-cases:

1. Debug a broken system: with very few dependencies during boot get
   access to the raw block device of a broken machine.

2. Migrate from system to another system, by dd'ing the old to the new
   directly.

3. Installing an OS remotely on some device (for example via Thunderbolt
   networking)

(And there might be more, for example the ability to boot from a
laptop's disk on another system)

Limitations:

1. There's no authentication/encryption. Hence: use this on local links
   only.

2. NVMe target mode on Linux supports r/w operation only. Ideally, we'd
   have a read-only mode, for security reasons, and default to it.

Future love:

1. We should have another mode, where we simply expose the homed LUKS
   home dirs like that.

2. Some lightweight hookup with plymouth, to display a (shortened)
   version of the info we write to the console.

To test all this, just run:

    mkosi --kernel-command-line-extra="rd.systemd.unit=storage-target-mode.target" qemu
2023-11-02 14:19:32 +01:00
Martin Wilck
bf25cf6c49 units: modprobe@.service: don't unescape instance name
modprobe treats "-" and "_" interchangeably, thereby avoiding frequent
errors because some module names contain dashes and others underscores.

Because modprobe@.service unescapes the instance name, an attempt to
start "modprobe@dm-crypt.service" will run "modprobe -abq dm/crypt",
which is doomed to fail. "modprobe@dm_crypt.service" will work as
expected. Thus unescaping the instance name has surprising side effects.
Use "%i" instead.
2023-10-21 11:41:22 +01:00
Lennart Poettering
cde8cc946b
Merge pull request #29272 from enr0n/coredump-container
coredump: support forwarding coredumps to containers
2023-10-16 16:13:16 +02:00
Lennart Poettering
f5151fb459 sysext: make some calls available via varlink 2023-10-16 12:08:39 +02:00
Nick Rosbrook
411d8c72ec nspawn: set CoredumpReceive=yes on container's scope when --boot is set
When --boot is set, and --keep-unit is not, set CoredumpReceive=yes on
the scope allocated for the container. When --keep-unit is set, nspawn
does not allocate the container's unit, so the existing unit needs to
configure this setting itself.

Since systemd-nspawn@.service sets --boot and --keep-unit, add
CoredumpReceives=yes to that unit.
2023-10-13 15:28:50 -04:00
Priit Laes
c08bec1587 systemd-journal-upload: Increase failure tolerance (#19426, #2877)
As systemd-journal-upload deals mostly with remote servers, add
some failsafes to its unit to restart on failures.

```
[Service]
Restart=on-failure
RestartSteps=10
RestartMaxDelaySec=60
```
2023-10-12 23:10:59 +01:00
Lennart Poettering
4e16d5c69e pcrextend: make pcrextend tool acccessible via varlink
This is primarily supposed to be a 1st step with varlinkifying our
various command line tools, and excercise in how this might look like
across our codebase one day. However, at AllSystemsGo! 2023 it was
requested that we provide an API to do a PCR measurement along with a
matching event log record, and this provides that.
2023-10-06 11:49:38 +02:00
Lennart Poettering
2e64cb71b9 tpm2-setup: add new early boot tool for initializing the SRK
This adds an explicit service for initializing the TPM2 SRK. This is
implicitly also done by systemd-cryptsetup, hence strictly speaking
redundant, but doing this early has the benefit that we can parallelize
this in a nicer way. This also write a copy of the SRK public key in PEM
format to /run/ + /var/lib/, thus pinning the disk image to the TPM.
Making the SRK public key is also useful for allowing easy offline
encryption for a specific TPM.

Sooner or later we should probably grow what this service does, the
above is just the first step. For example, the service should probably
offer the ability to reset the TPM (clear the owner hierarchy?) on a
factory reset, if such a policy is needed. And we might want to install
some default AK (?).

Fixes: #27986
Also see: #22637
2023-09-29 19:36:04 +02:00
Lennart Poettering
174e8e9897
Merge pull request #29345 from poettering/measured-uki-condition
pid1: introduce ConditionSecurity=measured-uki
2023-09-27 16:39:46 +02:00
Mike Yuan
99f360a46b units/blockdev@.target: conflict with umount.target
Follow-up for d120ce478d

blockdev@.target is used as a synchronization point between
the mount unit and corresponding systemd-cryptsetup@.service.
After the mentioned commit, it doesn't get a stop job enqueued
during shutdown, and thus the stop job for systemd-cryptsetup@.service
could be run before the mount unit is stopped.

Therefore, let's make blockdev@.target conflict with umount.target,
which is also what systemd-cryptsetup@.service does.

Fixes #29336
2023-09-27 12:33:40 +02:00
Lennart Poettering
8506bf494d units: move units over to ConditionSecurity=measured-uki 2023-09-27 12:13:26 +02:00
Lennart Poettering
c8cb548f0b Revert "userdbd: Order systemd-userdbd.service after systemd-remount-fs.service"
This reverts commit 9dd8858281.
2023-09-27 11:02:06 +02:00
Lennart Poettering
0869e1326a oomd: correct listening sockets
So, unfortunately oomd uses "io.system." rather than "io.systemd." as
prefix for its sockets. This is a mistake, and doesn't match the
Varlink interface naming or anything else in oomd.

hence, let's fix that.

Given that this is an internal protocol between PID1 and oomd let's
simply change this without retaining compat.
2023-09-25 23:27:18 +02:00
Lennart Poettering
32295fa08f pcrphase: rename binary to pcrextend
The tool initially just measured the boot phase, but was subsequently
extended to measure file system and machine IDs, too. At AllSystemsGo
there were request to add more, and make the tool generically
accessible.

Hence, let's rename the binary (but not the pcrphase services), to make
clear the tool is not just measureing the boot phase, but a lot of other
things too.

The tool is located in /usr/lib/ and still relatively new, hence let's
just rename the binary and be done with it, while keeping the unit names
stable.

While we are at it, also move the tool out of src/boot/ and into its own
src/pcrextend/ dir, since it's not really doing boot related stuff
anymore.
2023-09-25 17:17:20 +02:00
Daan De Meyer
021b0ff405 repart: Don't fail on boot if we can't find the root block device
When booting from virtiofs, we won't be able to find a root block
device. Let's gracefully handle this similar to how we don't fail
if we can't find a GPT partition table.
2023-09-22 16:01:12 +01:00
Joerg Behrmann
7227dd816f treewide: fix typos
- mostly: usecase -> use case
- continously -> continuously
- single typos in docs/FILE_DESCRIPTOR_STORE.md
2023-09-19 10:05:38 +02:00
Mike Yuan
89a1bb9012
units: order battery-check before hibernate-resume 2023-09-07 20:21:16 +08:00
Mike Yuan
a628d933cc
hibernate-resume: split out the logic of finding hibernate location
Before this commit, the hibernate location logic only exists in
the generator. Also, we compare device nodes (devnode_same()) and
clear EFI variable HibernateLocation in the generator too. This is
not ideal though: when the generator gets to run, udev hasn't yet
started, so effectively devnode_same() always fails. Moreover, if
the boot process is interrupted by e.g. battery-check, the hibernate
information is lost.

Therefore, let's split out the logic of finding hibernate location.
The generator only does the initial validation of system info and
enables systemd-hibernate-resume.service, and when the service
actually runs we validate everything again, which includes comparing
the device nodes and clearing the EFI variable. This should make
things more robust, plus systems that don't utilize a systemd-enabled
initrd can use the exact same logic to resume using the EFI variable.
I.e., systemd-hibernate-resume can be used standalone.
2023-09-07 20:21:16 +08:00
Victor Westerhuis
9dd8858281 userdbd: Order systemd-userdbd.service after systemd-remount-fs.service
Otherwise the root filesystem might still be readonly and
systemd-userdbd fails to start.

Explicitly pick systemd-remount-fs.service instead of local-fs-pre.target
to prevent a dependency cycle.
2023-09-04 09:47:05 +08:00
Yu Watanabe
c3c885a771 bsod: several cleanups
- add reference to the service unit in the man page,
- fix several indentation and typos,
- replace '(uint64_t) -1' with 'UINT64_MAX',
- drop unnecessary 'continue'.
2023-08-22 23:20:14 +09:00
Luca Boccassi
b24d10e35a
Merge pull request #28697 from 1awesomeJ/new_bsod
systemd-bsod: Add "--continuous" option
2023-08-18 00:20:04 +01:00
OMOJOLA JOSHUA
77d0917ea3 systemd-bsod: Add "--continuous" option 2023-08-17 13:13:54 +01:00
Yu Watanabe
bb7f485f4b units: introduce systemd-tmpfiles-setup-dev-early.service
This makes tmpfiles, sysusers, and udevd invoked in the following order:
1. systemd-tmpfiles-setup-dev-early.service
   Create device nodes gracefully, that is, create device nodes anyway
   by ignoring unknown users and groups.
2. systemd-sysusers.service
   Create users and groups, to make later invocations of tmpfiles and
   udevd can resolve necessary users and groups.
3. systemd-tmpfiles-setup-dev.service
   Adjust owners of previously created device nodes.
4. systemd-udevd.service
   Process all devices. Especially to make block devices active and can
   be mountable.
5. systemd-tmpfiles-setup.service
   Setup basic filesystem.

Follow-up for b42482af90.

Fixes #28653.
Replaces #28681 and #28732.
2023-08-12 07:55:20 +09:00
Yu Watanabe
12aac8ea45 Revert "unit: make udev rules really take precedence over tmpfiles"
This reverts commits 112a41b6ec,
3178698bb5, and
b768379e8b.

The commit 112a41b6ec introduces #28765,
as systemd-tmpfiles-setup.service has ordering after local-fs.target,
but usually the target requires block devices processed by udevd.
Hence, the service can only start after the block devices timed out.

Fixes #28765.
2023-08-12 07:55:20 +09:00
Yu Watanabe
112a41b6ec unit: make udev rules really take precedence over tmpfiles
Follow-up for b42482af90.

The commit makes systemd-tmpfiles-setup.service also updates the
permission or owner of device nodes. However, the service does not have
ordering for systemd-udevd.service. So, the service may set different
permission from the one udevd already set.

Fixes #28653.
Replaces #28681.
2023-08-09 07:24:02 +09:00
Yu Watanabe
41521e3a21 Revert "unit: make udev rules take precesence over tmpfiles"
This reverts commit 31845ef554.

systemd-tmpfiles-setup-dev.service has Before=systemd-udevd.service.
So the commit does not change anything.
2023-08-09 07:13:09 +09:00
Yu Watanabe
9289e093ae meson: use install_emptydir() and drop meson-make-symlink.sh
The script is mostly equivalent to 'mkdir -p' and 'ln -sfr'.
Let's replace it with install_emptydir() builtin function and
inline meson call.
2023-08-08 22:11:34 +01:00
Fabian Vogt
327cd2d3db units/initrd-parse-etc.service: Conflict with emergency.target
If emergency.target is started while initrd-parse-etc.service/start is queued,
the initrd-parse-etc job did not get canceled. In parallel to the emergency
units, it eventually runs the service, which starts initrd-cleanup.service,
which in turn isolates initrd-switch-root.target. This stops the emergency
units and effectively starts the initrd boot process again, which likely
fails again like the initial attempt. The system is thus stuck in an endless
loop, never really reaching emergency.target.

With this conflict added, starting emergency.target automatically cancels
initrd-parse-etc.service/start, avoiding the loop.
2023-08-08 20:24:39 +01:00
Yu Watanabe
31845ef554 unit: make udev rules take precesence over tmpfiles
Without this change, there are no ordering between udevd and tmpfiles,
and if tmpfiles is invoked later it may discard the permission set by
udevd.

Fixes an issue introduced by b42482af90.

Fixes #28588 and #28653.
2023-08-05 04:38:39 +09:00
Daan De Meyer
e2e20b6d3c Revert "units: Import all repart credentials in systemd-repart.service"
This reverts commit ed6b99dbf1.
2023-08-01 15:10:02 +02:00
Daan De Meyer
ed6b99dbf1 units: Import all repart credentials in systemd-repart.service 2023-08-01 07:53:59 +02:00
Luca Boccassi
b0d3095fd6 Drop split-usr and unmerged-usr support
As previously announced, execute order 66:

https://lists.freedesktop.org/archives/systemd-devel/2022-September/048352.html

The meson options split-usr, rootlibdir and rootprefix become no-ops
that print a warning if they are set to anything other than the
default values. We can remove them in a future release.
2023-07-28 19:34:03 +01:00
Daan De Meyer
9e72aa1832 units: Load agetty credentials in all getty units
In it's latest release, agetty will support reading the agetty.autologin
and login.noauth credentials, so let's make sure we import those in our
getty units so they're available to agetty to read.
2023-07-27 16:50:58 +02:00
Daan De Meyer
f2aaa14d37 units: Add --graceful flag to pcrphase units
Some of the new units using systemd-pcrphase are missing the --graceful
flag which causes them to error if the tpm libraries are not installed.
Add --graceful just like in the other pcrphase units to make systemd-pcrphase
exit gracefully if the tpm libraries are missing.
2023-07-17 14:24:46 +02:00
Luca Boccassi
9027aff9d4
Merge pull request #27867 from keszybz/vconsole-reload-again
Restore ordering between vconsole-setup and firstboot services
2023-07-14 23:06:18 +01:00
Yu Watanabe
7cfef4bb48 battery-check: allow to skip by passing systemd.battery-check=0 2023-07-14 15:56:29 +01:00
Zbigniew Jędrzejewski-Szmek
8623dab880 units/systemd-vconsole-setup: suppress error when service is restarted
The service has Type=oneshot, which means that the default value of SuccessExitStatus=0.
When multiple vtcon devices are detected, udev will restart the service after each
one. If this happens quickly enough, the old instance will get SIGTERM while it is
still running:

[    5.357341] (udev-worker)[593]: vtcon1: /usr/lib/udev/rules.d/90-vconsole.rules:12 RUN '/usr/bin/systemctl --no-block restart systemd-vconsole-setup.service
[    5.357439] (udev-worker)[593]: vtcon1: Running command "/usr/bin/systemctl --no-block restart systemd-vconsole-setup.service"
[    5.357485] (udev-worker)[593]: vtcon1: Starting '/usr/bin/systemctl --no-block restart systemd-vconsole-setup.service'
[    5.357537] (udev-worker)[609]: vtcon0: /usr/lib/udev/rules.d/90-vconsole.rules:12 RUN '/usr/bin/systemctl --no-block restart systemd-vconsole-setup.service
[    5.357587] (udev-worker)[609]: vtcon0: Running command "/usr/bin/systemctl --no-block restart systemd-vconsole-setup.service"
[    5.357634] (udev-worker)[609]: vtcon0: Starting '/usr/bin/systemctl --no-block restart systemd-vconsole-setup.service'
...
[    5.680529] systemd[1]: systemd-vconsole-setup.service: Trying to enqueue job systemd-vconsole-setup.service/restart/replace
[    5.680565] systemd[1]: systemd-vconsole-setup.service: Merged into running job, re-running: systemd-vconsole-setup.service/restart as 557
[    5.680600] systemd[1]: systemd-vconsole-setup.service: Enqueued job systemd-vconsole-setup.service/restart as 557
...
[    5.682334] systemd[1]: Received SIGCHLD from PID 744 ((le-setup)).
[    5.682377] systemd[1]: Child 744 ((le-setup)) died (code=killed, status=15/TERM)
[    5.682407] systemd[1]: systemd-vconsole-setup.service: Child 744 belongs to systemd-vconsole-setup.service.
[    5.682436] systemd[1]: systemd-vconsole-setup.service: Main process exited, code=killed, status=15/TERM
[    5.682471] systemd[1]: systemd-vconsole-setup.service: Failed with result 'signal'.
[    5.682518] systemd[1]: systemd-vconsole-setup.service: Service will not restart (manual stop)
[    5.682552] systemd[1]: systemd-vconsole-setup.service: Changed stop-sigterm -> failed

This is expected and not a problem. Let's treat SIGTERM as success so we don't
get this spurious "failure".
2023-07-13 10:56:52 +02:00
Zbigniew Jędrzejewski-Szmek
6cfb3ebc60 units/systemd-firstboot: start the service after systemd-vconsole-setup.service
This way, we don't start user interaction before (or while) the configured
fonts are loading.

Tweak the comments a bit while at it.
2023-07-12 15:54:33 +02:00
Zbigniew Jędrzejewski-Szmek
3b2321f6ea units/systemd-vconsole-setup.service: improve title
"Setup" is a noun, and the expected order is "<adjective> <noun>".
("Set up" is the verb. But we want a noun here, so that we can say
e.g. "Starting Virtual Console Setup".)
2023-07-12 15:54:33 +02:00
Yu Watanabe
6750c1af24 unit: also condition out systemd-backlight in initrd
Follow-up for 9173d31dfea5c2b05ff08480972c499cb7aac940.

The systemd-backlight@.service also save/restore state but the data
is in /var/.
2023-07-05 09:01:27 +02:00
Lennart Poettering
49c55abcbe units: condition out a few services in the initrd
Let's make our units more robust to being added to an initrd:

1. systemd-boot-update only makes sense if sd-boot is available in /usr/
   to copy into the ESP. This is generally not the case in initrds, and
   even if it was, we shouldn't update the ESP from the initrd, but from
   the host instead.

2. The rfkill services save/restore rfkill state, but that information
   is only available once /var/ is mounted, which generally happens
   after the initrd transition.

3. utmp management is partly in /var/, and legacy anyway, hence don't
   bother with it in the initrd.
2023-07-05 10:58:47 +09:00
Lennart Poettering
c65e3d7a9b units: skip systemd-battery-check in environments where it doesn't make sense
Let's condition the service so that it doesn't run where we aren't
directly run on baremetal, or where no power sources are discovered at
all.
2023-07-03 16:38:42 +01:00
Lennart Poettering
95dafd30da battery-check: rework unit
Let's rename the unit to systemd-battery-check.service. We usually want
to name our own unit files like our tools they wrap, in particular if
they are entirely defined by us (i.e. not just wrappers of foreign
concepts)

While we are at it, also hook this in from initrd.target, and order it
against initrd-root-device.target so that it runs before the root device
is possibly written to (i.e. mounted or fsck'ed).

This is heavily inspired by @aafeijoo-suse's PR #28208, but quite
different ;-)
2023-07-01 03:19:16 +08:00
Yu Watanabe
be994c2640 battery-check: several follow-ups
Follow-ups for e3d4148d50.

- add reference to initrd-battery-check.service in man page, and move
  its section from 1 to 8,
- add link to man page in help message,
- introduce ERRNO_IS_NO_PLYMOUTH(),
- propagate error in battery_check_send_plymouth_message(),
- rename battery_check_send_plymouth_message() -> plymouth_send_message(),
- return earlier when the first battery level check passed to reduce
  indentation,
- fix potential use of invalid fd on battery restored,
- do not use emoji for /dev/console,
- add simple test (mostly for coverity),

etc, etc...
2023-06-29 15:41:00 +09:00
OMOJOLA JOSHUA
e3d4148d50 PID1: detect battery level in initrd and if low refuse continuing to boot, print message and shut down. 2023-06-28 14:48:54 +01:00
Mike Yuan
760e99bb52
hibernate-resume: rework to follow the logic of sleep.c and use
main-func.h

Preparation for #27247
2023-06-23 23:57:49 +08:00
Frantisek Sumsal
dc7e580e64 tree-wide: use https for the 0pointer.de doc links 2023-06-23 13:46:56 +01:00
Yu Watanabe
742aebc5a7 meson: merge two similar loops for unit files
This also merges two arrays units and in_units, and uses dictionary
for declaring units.

This also fixes the condition handling, that previously only two
conditions were handled and rests were ignored.
2023-06-22 10:19:51 -06:00
Daan De Meyer
9a0eade760 units: Use built-in halt and kexec features instead of systemctl 2023-06-22 10:33:18 +01:00
Daan De Meyer
1ab6ae1957 units: Use ImportCredential= where applicable 2023-06-08 14:09:36 +02:00
Lennart Poettering
1775872679 units: change TimeoutSec=0 to TimeoutSec=infinity
Follow-up for #27936

Let's also update a bunch of static unit files, matching what we just
did for the generators.
2023-06-06 18:23:43 +01:00
Lennart Poettering
13ffc60749 pid1: add "soft-reboot" reboot method
This adds a new mechanism for rebooting, a form of "userspace reboot"
hereby dubbed "soft-reboot". It will stop all services as in a usual
shutdown, possibly transition into a new root fs and then issue a fresh
initial transaction. The kernel is not replaced.

File descriptors can be passed over, thus opening the door for leaving
certain resources around between such reboots.

Usecase: this is an extremely quick way to reset userspace fully when
updating image based systems, without going through a full
hardware/firmware/boot loader/kernel/initrd cycle. It minimizes "grayout time"
for OS updates. (In particular when combined with kernel live patching)
2023-06-02 16:49:38 +02:00
Lennart Poettering
d120ce478d units: don't stop blockdev@.target unit at shutdown
We want that cryptsetup/veritysetup devices can stick around until the
very end, as well as the users of them which might depend on
blockdev@.target for the devices. Hence leave the targets around till
the very end.

Note that their runtime is managed via StopWhenUnneeded= anyway, hence
unless their are volumes that actually survive still the very end they
target units will still be stopped.
2023-06-01 18:49:43 +02:00
Lennart Poettering
7a2f3194ff units: set DefaultDependencies=no for veritysetup slice
This mimics what we already have for cryptsetup services: the slice they
are placed in (they have their own slice since that's what we do by
default for instantiated services) shouldn't conflict with
shutdown.target, so that veritysetup services can stay around until the
very end (which is what we want for the root and usr verity volumes).

It's literally just a copy of the same unit we already have for
cryptsetup, just with an updated description string.
2023-06-01 18:49:43 +02:00
Zbigniew Jędrzejewski-Szmek
bec89355c5 units: pull in local-fs-pre.target from systemd-tmpfiles-setup-dev.service
local-fs-pre.target is a passive unit, which means that it is supposed to be
pulled in by everything that is ordered before it. We had
Before=local-fs-pre.target, so add Wants= too.

I don't expect this to change anything. Instead, just make things follow the
docs so it's easier to reason about the dependency set.
2023-05-31 15:44:44 +02:00
Mike Yuan
97d822abac
Merge pull request #27787 from keszybz/firstboot-synchronous-restart
firstboot: make restart of vconsole-setup synchronuous
2023-05-27 02:30:45 +08:00
Zbigniew Jędrzejewski-Szmek
b2ce20aa0c units: order systemd-firstboot after systemd-tmpfiles-setup
We may copy files from factory to /etc. The default mkosi config has
factory/etc/vconsole.conf. systemd-firstboot would race with tmpfiles-setup,
and sometimes ask for the keymap, and sometimes not.

I guess that if there are files in factory, we shouldn't ask the user for
the same configuration.
2023-05-26 15:07:01 +02:00
Zbigniew Jędrzejewski-Szmek
8eb668b9ab firstboot: synchronously wait for systemd-vconsole-setup.service/restart job
Requested in https://github.com/systemd/systemd/pull/27755#pullrequestreview-1443489520.

I dropped the info message about the job being requested, because we get
fairly verbose logs from starting the unit, and the additional message isn't
useful.

In the unit, the ordering before systemd-vconsole-setup.service is dropped,
because now it needs to happen in parallel, while systemd-firstboot.service
is running. This means that we may potentially execute vconsole-setup twice,
but it's fairly quick, so this doesn't matter much.
2023-05-26 15:07:01 +02:00
Daan De Meyer
75efd16fb0 units: Shut down networkd and resolved on switch-root
Let's explicitly order these against initrd-switch-root.target, so
that they are properly shut down before we switch root. Otherwise,
there's a race condition where networkd might only shut down after
switching root and after we've already we've loaded the unit graph,
meaning it won't be restarted in the rootfs.

Fixes #27718
2023-05-26 06:54:56 +08:00
Zbigniew Jędrzejewski-Szmek
a777a59243 firstboot: process the root account after sysusers created it
We would create root account from sysusers or from firstboot, depending on
which one ran earlier. Since firstboot offers more options, in particular can
set the root password, we needed to order it earlier. This created an ugly
ordering requirement:

systemd-sysusers.service > systemd-firstboot.service > ... >
  systemd-remount-fs.service > systemd-tmpfiles-setup-dev.service >
  systemd-sysusers.service

We want sysusers.service to create basic users, so we can create nodes in dev,
so we can operate on block devices and such, so that we can resize and remount
things. But at the same time, systemd-firstboot.service can only work if it is
run early, before systemd-sysusers.service has created /etc/passwd. We can't
have it both ways: the units that want to have a fully writable root file
system cannot be ordered before units which are required to do file system
preparation.

Instead of trying to order firstboot very early, let's let it do its thing even
if it is started later. Instead of refusing to create to the root account if
/etc/passwd and /etc/shadow exist, actually check if the account is configured.
Now sysusers writes root account with password PASSWORD_UNPROVISIONED
("!unprovisioned"), and then firstboot checks for this, and will configure root
in this case.

This allows sysusers to be executed earlier (or accounts to be set up earlier
in another way).

This effectively reverts b825ab1a99.
2023-05-23 15:09:39 +02:00
Zbigniew Jędrzejewski-Szmek
b42482af90 units: create /dev with --graceful first, allow sysusers to run later
We want to call systemd-tmpfiles-setup-dev.service to create /dev/fuse and
other device nodes so that module probing will work. But it is possible that
when we're in first boot, some users or groups need to be created by
systemd-sysusers first. But it is also possible that systemd-sysusers cannot
actually execute configuration because the root partition is not fully writable
yet. So let systemd-tmpfiles-setup-dev.service run earlier, possibly without
all users and groups in place. Since systemd-tmpfiles-setup-dev.service writes
to /dev only, it doesn't care how the root partition is mounted. In this early
run, some some nodes might be created with default permissions (i.e. not
accessible to non-root users or groups). This should be OK for the early boot
phase. Afterwards, we let systemd-tmpfiles-setup.service execute full
configuration. We will configure any files in /dev twice, but considering that
there's only a few of them and that the second run should only adjust ownership
and permissions, this should be OK. This way, we avoid the dependency loop.
2023-05-23 15:09:39 +02:00
Zbigniew Jędrzejewski-Szmek
b93562a1a1 units: make sure proc-sys-binfmt_misc.automount is actually stopped
As with other units, stopping of the automount requires actual work,
and without the ordering dependency systemd might not execute the stop
job before shutdown.target is reached and units ordered after that are
executed.
2023-05-23 12:39:34 +02:00
Zbigniew Jędrzejewski-Szmek
d6f6846464 units/systemd-repart: stop pretending that root config is executed in the initrd
I have a system with /usr/lib/repart.d/50-root.conf with GrowFileSystem=yes.
The partition wouldn't be resized in the initrd, because
ConditionDirectoryNotEmpty=|/sysusr/usr/lib/repart.d was evaluated very early,
before /sysroot was mounted. There was no ordering dependency between
systemd-repart.service and sysroot.mount. (There was After=initrd-usr-fs.target,
but it seems to be only referred to by systemd-fstab-generator, which in my
case doesn't even run, because there's no fstab.)

But in fact, we neeed to run systemd-repart in the initrd only in limited
circumstances: when we need to create the root device based on config under
sysusr.mount. If there is config on the root device, it can be executed in
the host system, early during boot. Thus, let's remove the condition on
/sysroot/…. Without an ordering dependency on sysroot.mount, it was subject to
a race condition anyway. (A race condition with a low probability of "winning",
because systemd-repart.service has no dependencies, but sysroot.mount requires
a device to be detected and the mount to happen.)

The other problem was that systemd-repart.service didn't have the ordering wrt.
initrd-switch-root.target, so it was subject to the same race condition that
was fixed for other units in 7c0e2b5559. (If the
systemd-repart.service/stop job is slow, we could end up not restarting
systemd-repart.service in the host system.)

With the changes here, I see systemd-repart.service/start running twice:
in the initrd it is skipped because the conditions fail, and then in the
host system it runs normally.

Note: support for /sysroot is retained in systemd-repart code. I don't see a
strong reason to remove it, since it may still be useful to people invoking
repart in the initrd in other circumstances.
2023-05-23 12:39:33 +02:00
Zbigniew Jędrzejewski-Szmek
4e66876dfc units: do more reordering of ordering config
No functional change, just a cleanup to make the subsequent changes easier to
see. This is a continuation of 9810e41942

> The block is reordered and split to have:
>    1. description + documentation
>    2. (optionally) conditions
>    3. all the dependencies

The dependencies for shutdown.target are listed separately because they are the
other deps are for startup, and shutdown.target only matter much later.
2023-05-23 12:39:16 +02:00
Zbigniew Jędrzejewski-Szmek
a6f3a7eb8a units: order sysinit.target, debug-shell.service after systemd-vconsole-setup
Previous patch to add an implicit dependency effectively orders various getty
services after systemd-vconsole-setup.service. But I think it's cleaner to also
order the service before sysinit.target, like it was before
8125e8d38e. There might be units which don't do
use TTYVHangup= but would like to have the console fully initialized.

Also, add a manual ordering to debug-shell.service, because it has
ImplicitDependencies=no. This might delay debug-shell.service a bit, but
systemd-vconsole-setup.service has no dependencies and should be very quick, so
this should not be noticable in practice. Without the ordering, the terminal
might not have a key map loaded, making debug-shell.service hard to use.
2023-05-19 17:47:14 +02:00
Zbigniew Jędrzejewski-Szmek
ce0fe01f22 units: order getty units after getty-pre.target unconditionally
Those two units had this ordering conditionalized on HAVE_SYSV_COMPAT. This
seems strange. 45e2753297 added the ordering
differently for those two files without any comment, and I think it was just
pasted or scripted erroneously.
2023-05-19 15:22:45 +02:00
Yu Watanabe
d0e3ae838f unit: add conditions and deps to make oomd.socket and .service consistent
Fixes #27690.
2023-05-19 08:58:56 +02:00
Daan De Meyer
4340e5b6df Revert "units: Add missing dependencies on initrd-switch-root.target"
This reverts commit f0ad3e6b96.
2023-05-15 15:42:21 +02:00
Daan De Meyer
f0ad3e6b96 units: Add missing dependencies on initrd-switch-root.target
These are all services that valid to be run in the initrd, so let's
make sure they have the appropriate dependencies on
initrd-switch-root.target so that they are stopped when we're about
to switch root.
2023-05-13 02:14:02 +08:00
Daan De Meyer
97211510b0 units: Add CAP_NET_ADMIN condition to systemd-networkd-wait-online@.service as well
It was added to CAP_NET_ADMIN but we forgot to add it to the template
service as well.
2023-05-09 17:59:55 +02:00
Yu Watanabe
d421db6e8b units: add/fix Documentation= about bus interface 2023-05-09 06:10:23 +09:00
Lennart Poettering
5ae89ef347 core/systemctl: when switching root default to /sysroot/
We hardcode the path the initrd uses to prepare the final mount point at
so many places, let's also imply it in "systemctl switch-root" if not
specified.

This adds the fallback both to systemctl and to PID 1 (this is because
both to — different – checks on the path).
2023-04-28 23:26:20 +01:00
Lennart Poettering
1a3704dcc3 nspawn: port over to /supervisor/ subcgroup being delegated to nspawn
Let's make use of the new DelegateSubgroup= feature and delegate the
/supervisor/ subcgroup already to nspawn, so that moving the supervisor
process there is unnecessary.
2023-04-27 12:18:32 +02:00
Lennart Poettering
f8371dbd56 udev: port to DelegateSubgroup= 2023-04-27 12:18:32 +02:00
Lennart Poettering
3975e3f8ae units: make system service manager create init.scope subcgroup for user service manager
This one is basically for free, since the service manager is already
prepared for being invoked in init.scope. Hence let's start it in the
right cgroup right-away.
2023-04-27 12:18:32 +02:00
Lennart Poettering
e76b3d4ed2 units: restrict hugepages fs a bit
suid binaries and device nodes should not be placed there, hence forbid
it.

Of all the API VFS we mount from PID 1 or via a unit file this one is
the only one where we didn't add MS_NODEV/MS_NOSUID. Let's address that,
since there's really no reason why device nodes or suid binaries would
be placed in hugetlbfs.
2023-04-27 12:28:50 +09:00
Eric Curtin
b9dac41837 Support /etc/system-update for OSTree systems
This is required when / is immutable and cannot be written at runtime.

Co-authored-by: Richard Hughes <richard@hughsie.com>
2023-04-25 17:40:41 +02:00
Lennart Poettering
3af48a86d9
Merge pull request #25608 from poettering/dissect-moar
dissect: add dissection policies
2023-04-12 13:46:08 +02:00
Kai Lueke
721412ac98 systemd-sysext/confext.service: Refresh on start/reload
When adding a sysext image to the system and manuall merging it, a
later "systemctl (re)start systemd-sysext" won't work because "merge"
refuses to work when something is merged already. Another problem with
"merge" at start plus "unmerge" at stop is that a service restart can't
make use of the new MOVE_MOUNT_BENEATH in the future even which would
only be available in "refresh". It also prepares us for setting up the
merged overlay for the sysroot from the initrd already, which also
would lead to the mentioned start problem of the service (One
optimization could be to skip the loading but only if we are sure that
all images were loaded and weren't modified since - this assumption is
hard because early services could want to inject a sysext, too).

Use "refresh" on service start to fix the problem that the service
can't start as soon as a manual merge was done. Also add a reload
action that allows to issue "systemctl reload systemd-sysext" and it
will make use of MOVE_MOUNT_BENEATH once we implement this in
systemd-sysext refresh (and it's available from the kernel).
2023-04-06 20:47:26 +09:00
maanyagoenka
1f839f48e0 confext: add the systemd-confext.service file 2023-04-05 21:50:04 +00:00
Lennart Poettering
73740c9f84 discover-image: automaticaly pick up sysext images from /.extra/sysext 2023-04-05 20:52:21 +02:00
Luca Boccassi
de862276ed sysext: stop storing under /usr/lib[/local]/extensions/
sysexts are meant to extend /usr. All extension images and directories are opened and merged in a
single, read-only overlayfs layer, mounted on /usr.
So far, we had fallback storage directories in /usr/lib/extensions and /usr/local/lib/extensions.
This is problematic for three reasons.

Firstly, technically, for directory-based extensions the kernel will reject
creating such an overlay, as there is a recursion problem. It actively
validates that a lowerdir is not a child of another lowerdir, and fails with
-ELOOP if it is. So having a sysext /usr/lib/extensions/myextdir/ would result
in an overlayfs config lowerdir=/usr/lib/extensions/myextdir/usr/:/usr which is
not allowed, as indicated by Christian the kernel performs this check:

/*
 * Check if this layer root is a descendant of:
 * - another layer of this overlayfs instance
 * - upper/work dir of any overlayfs instance
 */

<...>

/* Walk back ancestors to root (inclusive) looking for traps */
while (!err && parent != next) {
        if (is_lower && ovl_lookup_trap_inode(sb, parent)) {
                err = -ELOOP;
                pr_err("overlapping %s path\n", name);

Secondly, there's a confusing aspect to this recursive storage. If you
have /usr/lib/extensions/myext.raw which contains /usr/lib/extensions/mynested.raw
'systemd-sysext merge' will only pick up the first one, but both will appear in
the merged root under /usr/lib/extensions/. So you have two extension images, both
appear in your merged filesystem, but only one is actually in use.

Finally, there's a conceptual aspect: the idea behind sysexts and hermetic /usr
is that the /usr tree is not modified locally, but owned by the vendor. Dropping
extensions in /usr thus goes contrary to this foundational concept.
2023-03-30 11:25:17 +01:00
Lennart Poettering
62c72c60b5 units: let's establish the coredump socket before writting core_pattern sysctl
It's a bit nicer if we only write the sysctl core_pattern once the
coredump socket is established, since it's the backend for the handler.

Given the systemd-coredump.socket basically has no dependencies that run
before it this should not really make things slower or so, it just
removes the tiny window where core pattern is in effect that wants to
connect to the backend socket but cannot.

The status quo isn't terrible, and not too different in effect: either
way, until the socket unit is up we won't process coredumps. It's mostly
what kind of behaviour you get then: an error due to /bin/false being
invoked, or an error because systemd-coredump can't connect to its
socket. After this patch we'll exclusively see the former.
2023-03-30 08:53:52 +09:00
Mike Yuan
23c4c03406
unit: sysext: update unit name for sd-tmpfiles-setup
Fixes #26882
2023-03-19 01:29:48 +08:00
Daan De Meyer
cafd2c0be4 units: Order user@.service after systemd-oomd.service
The user manager connects to oomd over varlink. Currently, during
shutdown, if oomd is stopped before any user manager, the user
manager will try to reconnect to the socket, leading to a warning
from pid 1 about a conflicting transaction.

Let's fix this by ordering user@.service after systemd-oomd.service,
so that user sessions are stopped before systemd-oomd is stopped,
which makes sure that the user sessions won't try to start oomd via
its socket after systemd-oomd is stopped.
2023-03-18 15:05:43 +09:00
Daan De Meyer
a4180c0fb3 journald-console: Add colors when forwarding to console
Let's color output when we're forwarding to the console. To make this
work, we inherit TERM from pid 1 and use it to decide whether we should
output colors or not.
2023-03-16 11:22:58 +01:00
Jan Janssen
dfca5587cf tree-wide: Drop gnu-efi
This drops all mentions of gnu-efi and its manual build machinery. A
future commit will bring bootloader builds back. A new bootloader meson
option is now used to control whether to build sd-boot and its userspace
tooling.
2023-03-10 11:41:03 +01:00
Jan Engelhardt
18fe76eba5 doc: correct wrong use "'s" contractions 2023-03-07 13:39:31 +01:00
Lennart Poettering
9d03637404 units: let systemd --user manage its own memory pressure handling
Let's make things systematic: the per-user and the per-system manager
should manage their own memory pressure, as they are, well, managers of
things.

This is particularly relevant and the per-user service manager should
watch its own "init.scope" subcgroup, instead of the main service unit
cgroup, and hence $MEMORY_PRESSURE_WATCH as set by the per-system
service manager would simply be wrong.
2023-03-01 09:43:24 +01:00
Luca Boccassi
7ef09e2099 units: change assert to condition to skip running in initrd/os
These units are also present in the initrd, so instead of an assert,
just use a condition so they are skipped where they need to be skipped.

Fixes https://github.com/systemd/systemd/issues/26358
2023-02-09 12:04:21 +00:00
Zbigniew Jędrzejewski-Szmek
e4c7b5f517 core: split system/user job timeouts and make them configurable
Config options are -Ddefault-timeout-sec= and -Ddefault-user-timeout-sec=.
Existing -Dupdate-helper-user-timeout= is renamed to -Dupdate-helper-user-timeout-sec=
for consistency. All three options take an integer value in seconds. The
renaming and type-change of the option is a small compat break, but it's just
at compile time and result in a clear error message. I also doubt that anyone was
actually using the option.

This commit separates the user manager timeouts, but keeps them unchanged at 90 s.
The timeout for the user manager is set to 4/3*user-timeout, which means that it
is still 120 s.

Fedora wants to experiment with lower timeouts, but doing this via a patch would
be annoying and more work than necessary. Let's make this easy to configure.
2023-02-01 11:52:29 +00:00
Frantisek Sumsal
0eb635ef4b units: don't install pcrphase-related units without gnu-efi
since we don't have systemd-pcrphase built anyway, which breaks the tests:

...
I: Attempting to install /usr/lib/systemd/systemd-networkd-wait-online (based on unit file reference)
I: Attempting to install /usr/lib/systemd/systemd-network-generator (based on unit file reference)
I: Attempting to install /usr/lib/systemd/systemd-oomd (based on unit file reference)
I: Attempting to install /usr/lib/systemd/systemd-pcrphase (based on unit file reference)
W: Failed to install '/usr/lib/systemd/systemd-pcrphase'
make: *** [Makefile:4: setup] Error 1
make: Leaving directory '/root/systemd/test/TEST-01-BASIC'

Follow-up to 04959faa63.
2023-01-17 14:30:02 +01:00
Lennart Poettering
04959faa63 generators: optionally, measure file systems at boot
If we use gpt-auto-generator, automatically measure root fs and /var.

Otherwise, add x-systemd.measure option to request this.
2023-01-17 09:42:16 +01:00
Lennart Poettering
50072ccf1b units: rework growfs units to be just a regular unit that is instantiated
The systemd-growfs@.service units are currently written in full for each
file system to grow. Which is kinda pointless given that (besides an
optional ordering dep) they contain always the same definition. Let's
fix that and add a static template for this logic, that the generator
simply instantiates (and adds an ordering dep for).

This mimics how systemd-fsck@.service is handled. Similar to the wait
that for root fs there's a special instance systemd-fsck-root.service
we also add a special instance systemd-growfs-root.service for the root
fs, since it has slightly different deps.

Fixes: #20788
See: #10014
2023-01-17 09:42:16 +01:00
Lennart Poettering
072c8f6505 units: measure /etc/machine-id into PCR 15 during early boot
We want PCR 15 to be useful for binding per-system policy to. Let's
measure the machine ID into it, to ensure that every OS we can
distinguish will get a different PCR (even if the root disk encryption
key is already measured into it).
2023-01-17 09:42:16 +01:00
Franck Bui
2aba77057e journal: give the ability to enable/disable systemd-journald-audit.socket
Before this patch the only way to prevent journald from reading the audit
messages was to mask systemd-journald-audit.socket. However this had main
drawback that downstream couldn't ship the socket disabled by default (beside
the fact that masking units is not supposed to be the usual way to disable
them).

Fixes #15777
2023-01-11 17:18:57 +01:00
Lennart Poettering
5d71e463f4 logind: implement Type=notify-reload protocol properly
So close already. Let's add the two missing notifications too.

Fixes: #18484
2023-01-10 18:28:38 +01:00
Lennart Poettering
f84331539d udevd: implement the full Type=notify-reload protocol
We are basically already there, just need to add MONOTONIC_USEC= to the
RELOADING=1 message, and make sure the message is generated in really
all cases.
2023-01-10 18:28:38 +01:00
Lennart Poettering
0e07cdb0e7 networkd: implement Type=notify-reload protocol 2023-01-10 18:28:38 +01:00
Lennart Poettering
dd0ab174c3 pid1: make sure we send our calling service manager RELOADING=1 when reloading
And send READY=1 again when we are done with it.

We do this not only for "daemon-reload" but also for "daemon-reexec" and
"switch-root", since from the perspective of an encapsulating service
manager these three operations are not that different.
2023-01-10 18:28:38 +01:00
Daan De Meyer
e0ff0ee8f9
Merge pull request #25947 from poettering/resolved-dns-creds
resolved: add support for reading DNS config from kernel cmdline + service credentials
2023-01-06 14:11:57 +01:00
Lennart Poettering
882b011277 units: condition systemd-networkd-wait-online.service like systemd-networkd.service
This adds the same condition that systemd-networkd.service already
carries also to systemd-networkd-wait-online.service. Otherwise we'll
potentially see some logs we'd rather not see about a service we BindTo=
not running. Or in other words, if service X binds to Y then X should be
at least as conditioned as Y.
2023-01-05 21:44:45 +01:00
Lennart Poettering
116687f267 resolved: read DNS conf also from creds and kernel cmdline
Note that this drops ProtectProc=invisible from
systemd-resolved.service.

This is done because othewise access to the booted "kernel" command line is not
necessarily available. That's because in containers we want to read
/proc/1/cmdline for that.

Fixes: #24103
2023-01-05 18:52:15 +01:00
Lennart Poettering
ea575e176a vconsole: permit configuration of vconsole settings via credentials 2023-01-05 18:24:21 +01:00
Lennart Poettering
921fc451cb units: rename/rework systemd-boot-system-token.service → systemd-boot-random-seed.service
This renames systemd-boot-system-token.service to
systemd-boot-random-seed.service and conditions it less strictly.

Previously, the job of the service was to write a "system token" EFI
variable if it was missing. It called "bootctl --graceful random-seed"
for that. With this change we condition it more liberally: instead of
calling it only when the "system token" EFI variable isn't set, we call
it whenever a boot loader interface compatible boot loader is used. This
means, previously it was invoked on the first boot only: now it is
invoked at every boot.

This doesn#t change the command that is invoked. That's because
previously already the "bootctl --graceful random-seed" did two things:
set the system token if not set yet *and* refresh the random seed in the
ESP. Previousy we put the focus on the former, now we shift the focus to
the latter.

With this simple change we can replace the logic
f913c784ad added, but from a service that
can run much later and doesn't keep the ESP pinned.
2023-01-04 15:18:10 +01:00
Lennart Poettering
5019b0cb15 bootctl: downgrade graceful messages to LOG_NOTICE 2023-01-04 15:18:10 +01:00
Lennart Poettering
ce7dcfd6b0 units: pull in loop.ko and dm-mod.ko before repart
We want to make use of that when formatting file systems, hence let's
pull in these modules explicitly.

(This is necessary because we are an early boot service that might run
before systemd-tmpfiles-dev.service, which creates /dev/loop-control and
/dev/mapper/control.)

Alternatively we could just order ourselves after
systemd-tmpfiles-dev.service, but I think there's value in adding an
explicit minimal ordering here, since we know what we'll need.

Fixes: #25775
2022-12-23 17:26:57 +01:00
Lennart Poettering
143a1f1039 units: change modprobe@dm-mod.service → modprobe@dm_mod.service
Follow-up for 8f1359bf85
2022-12-23 17:26:48 +01:00
Michal Sekletar
d5e5bc2fe9 units: allow systemd-userdbd to change process name
rename_process() requires CAP_SYS_RESOURCE so let's make sure it is in
our permitted set after execve() by adding in to the bounding set.

Previously,
systemd-userdbd.service - User Database Manager
     Loaded: loaded (/usr/lib/systemd/system/systemd-userdbd.service; indirect; preset: disabled)
     Active: active (running) since Mon 2022-12-19 17:07:21 CET; 17min ago
TriggeredBy: ● systemd-userdbd.socket
       Docs: man:systemd-userdbd.service(8)
   Main PID: 1880 (systemd-userdbd)
     Status: "Processing requests..."
      Tasks: 4 (limit: 2272)
     Memory: 5.2M
        CPU: 244ms
     CGroup: /system.slice/systemd-userdbd.service
             ├─1880 /usr/lib/systemd/systemd-userdbd
             ├─2270 systemd-userwork
             ├─2271 systemd-userwork
             └─2272 systemd-userwork

Now,
    Loaded: loaded (/usr/lib/systemd/system/systemd-userdbd.service; indirect; preset: disabled)
     Active: active (running) since Mon 2022-12-19 17:27:02 CET; 15s ago
TriggeredBy: ● systemd-userdbd.socket
       Docs: man:systemd-userdbd.service(8)
   Main PID: 2404 (systemd-userdbd)
     Status: "Processing requests..."
      Tasks: 4 (limit: 2272)
     Memory: 5.5M
        CPU: 89ms
     CGroup: /system.slice/systemd-userdbd.service
             ├─2404 /usr/lib/systemd/systemd-userdbd
             ├─2407 "systemd-userwork: waiting..."
             ├─2408 "systemd-userwork: waiting..."
             └─2409 "systemd-userwork: waiting..."
2022-12-19 18:33:24 +01:00
Yu Watanabe
8f1359bf85 unit: use underbar for module name
For consistency with src/core/unit.c.
2022-12-19 12:12:02 +01:00
Lennart Poettering
0318d54539 pcrphase: gracefully exit if TPM2 support is incomplete
If everything points to the fact that TPM2 should work, but then the
driver fails to initialize we should handle this gracefully and not
cause failing services all over the place.

Fixes: #25700
2022-12-15 22:20:54 +01:00
Yu Watanabe
f74a7cb45c unit: check more specific path to be written by systemd-binfmt
Follow-up for 41807efb15.
Replaces #25690.
2022-12-15 03:36:27 +09:00
Lennart Poettering
51f3dc2234 units: change Requires=systemd-networkd.service → BindsTo= one more time
Follow-up for da15f8406e which did the
change for systemd-networkd-wait-online.service, let's also do this for
systemd-networkd-wait-online@.service
2022-11-29 16:56:07 +01:00
Daan De Meyer
da15f8406e units: Use BindsTo=systemd-networkd in systemd-networkd-wait-online.service
We don't want systemd-networkd-wait-online to start if systemd-networkd
is skipped due to condition failures. This is only guaranteed by BindsTo=
and not Requires=, so let's use BindsTo=
2022-11-26 07:35:05 +09:00
Luca Boccassi
0f6d54ca47 units: fix typo in Condition in systemd-boot-system-token
/lib/systemd/system/systemd-boot-system-token.service:20: Unknown key name 'ConditionPathExists|' in section 'Unit', ignoring

Follow-up for 0a1d8ac77a
2022-11-24 19:17:50 +01:00
Jason A. Donenfeld
0a1d8ac77a stub: handle random seed like sd-boot does
sd-stub has an opportunity to handle the seed the same way sd-boot does,
which would have benefits for UKIs when sd-boot is not in use. This
commit wires that up.

It refactors the XBOOTLDR partition discovery to also find the ESP
partition, so that it access the random seed there.
2022-11-23 00:56:45 +01:00
Jason A. Donenfeld
a4eea6038c bootctl: install system token on virtualized systems
Removing the virtualization check might not be the worst thing in the
world, and would potentially get many, many more systems properly seeded
rather than not seeded. There are a few reasons to consider this:

- In most QEMU setups and most guides on how to setup QEMU, a separate
  pflash file is used for nvram variables, and this generally isn't
  copied around.

- We're now hashing in a timestamp, which should provide some level of
  differentiation, given that EFI_TIME has a nanoseconds field.

- The kernel itself will additionally hash in: a high resolution time
  stamp, a cycle counter, RDRAND output, the VMGENID uniquely
  identifying the virtual machine, any other seeds from the hypervisor
  (like from FDT or setup_data).

- During early boot, the RNG is reseeded quite frequently to account for
  the importance of early differentiation.

So maybe the mitigating factors make the actual feared problem
significantly less likely and therefore the pros of having file-based
seeding might outweigh the cons of weird misconfigured setups having a
hypothetical problem on first boot.
2022-11-21 15:13:26 +01:00
Jason A. Donenfeld
0be72218f1 boot: implement kernel EFI RNG seed protocol with proper hashing
Rather than passing seeds up to userspace via EFI variables, pass seeds
directly to the kernel's EFI stub loader, via LINUX_EFI_RANDOM_SEED_TABLE_GUID.
EFI variables can potentially leak and suffer from forward secrecy
issues, and processing these with userspace means that they are
initialized much too late in boot to be useful. In contrast,
LINUX_EFI_RANDOM_SEED_TABLE_GUID uses EFI configuration tables, and so
is hidden from userspace entirely, and is parsed extremely early on by
the kernel, so that every single call to get_random_bytes() by the
kernel is seeded.

In order to do this properly, we use a bit more robust hashing scheme,
and make sure that each input is properly memzeroed out after use. The
scheme is:

    key = HASH(LABEL || sizeof(input1) || input1 || ... || sizeof(inputN) || inputN)
    new_disk_seed = HASH(key || 0)
    seed_for_linux = HASH(key || 1)

The various inputs are:
- LINUX_EFI_RANDOM_SEED_TABLE_GUID from prior bootloaders
- 256 bits of seed from EFI's RNG
- The (immutable) system token, from its EFI variable
- The prior on-disk seed
- The UEFI monotonic counter
- A timestamp

This also adjusts the secure boot semantics, so that the operation is
only aborted if it's not possible to get random bytes from EFI's RNG or
a prior boot stage. With the proper hashing scheme, this should make
boot seeds safe even on secure boot.

There is currently a bug in Linux's EFI stub in which if the EFI stub
manages to generate random bytes on its own using EFI's RNG, it will
ignore what the bootloader passes. That's annoying, but it means that
either way, via systemd-boot or via EFI stub's mechanism, the RNG *does*
get initialized in a good safe way. And this bug is now fixed in the
efi.git tree, and will hopefully be backported to older kernels.

As the kernel recommends, the resultant seeds are 256 bits and are
allocated using pool memory of type EfiACPIReclaimMemory, so that it
gets freed at the right moment in boot.
2022-11-14 15:21:58 +01:00
Yu Watanabe
403ca5b8b4 unit: also prioritize input devices when triggering devices
As in most cases, tty device without input devices is meaningless.

This also swaps the priority of tty and net:
- input devices are often connected under USB bus, hence may take
  slightly much time to be initialized. As, described in the above,
  in most cases it is allowed that tty devices are initialized just
  before input devices,
- network configuration usually requires much time, e.g. DHCP or RA,
  hence it is better that network interfaces initialized. Then,
  network services can start DHCP client or friends earlier.

Fixes #24026.
2022-10-26 10:49:09 +02:00
Lennart Poettering
047273e6e8 pcrphase: add two additional phases
This adds two more phases to the PCR boot phase logic: "sysinit" +
"final".

The "sysinit" one is placed between sysinit.target and basic.target.
It's good to have a milestone in this place, since this is after all
file systems/LUKS volumes are in place (which sooner or later should
result in measurements of their own) and before services are started
(where we should be able to rely on them to be complete).

This is particularly useful to make certain secrets available for
mounting secondary file systems, but making them unavailable later.

This breaks API in a way (as measurements during runtime will change),
but given that the pcrphase stuff wasn't realeased yet should be OK.
2022-10-17 12:09:43 +02:00
Daan De Meyer
9377e53f4f meson: Fix pcrphase unit conditions 2022-10-11 15:29:08 +02:00
Topi Miettinen
75723d31a6 units: udev: partially emulate ProtectClock=
Drop CAP_SYS_TIME and CAP_WAKE_ALARM capabilities and block clock-related
system calls. Update TODO.
2022-09-26 11:40:28 +02:00
Lennart Poettering
4cebd207d1 tmpfiles: add lines for provisioning ssh keys for root by default
With this, I can now easily do:

    systemd-nspawn --load-credential=ssh.authorized_keys.root:/home/lennart/.ssh/authorized_keys --image=… --boot

To boot into an image with my SSH key copied in. Yay!
2022-09-23 09:30:00 +02:00
Lennart Poettering
40f1856791 units: add pcrphase units 2022-09-22 16:53:34 +02:00
Zbigniew Jędrzejewski-Szmek
15b3f7e309
Merge pull request #24670 from keszybz/early-boot-ordering
Early boot ordering
2022-09-17 13:26:51 +02:00
Dan Streetman
137d162c42 add CAP_LINUX_IMMUTABLE to systemd-machined, so it can handle machinectl read-only requests
Without this, the 'machinectl read-only ...' command always fails.
2022-09-16 19:50:52 +01:00
Yu Watanabe
f562abe296 unit: drop ProtectClock=yes from systemd-udevd.service
This partially reverts cabc1c6d7a.

The setting ProtectClock= implies DeviceAllow=, which is not suitable
for udevd. Although we are slowly removing cgropsv1 support, but
DeviceAllow= with cgroupsv1 is necessarily racy, and reloading PID1
during the early boot process may cause issues like #24668.

Let's disable ProtectClock= for udevd. And, if necessary, let's
explicitly drop CAP_SYS_TIME and CAP_WAKE_ALARM (and possibly others)
by using CapabilityBoundingSet= later.

Fixes #24668.
2022-09-16 03:41:29 +09:00
Zbigniew Jędrzejewski-Szmek
89c4dc52b3 units: drop path to executable in $PATH
We don't have it other places, so let's make things a bit simpler.
2022-09-15 14:59:11 +02:00
Zbigniew Jędrzejewski-Szmek
5b5ec138c6 units: make sure that initrd-switch-root.service pulls in .target
Normally we queue initrd-switch-root.target/isolate, which pulls in the
service via Wants= in the .target unit file. But if the service is instead
started directly, there may be nothing pulling in the target. Let's make
sure that the reference exists.
2022-09-15 14:59:11 +02:00
Zbigniew Jędrzejewski-Szmek
3449814b8b units: add dependency ordering for emergency.service conflicts
If we want to stop those services which would compete for access to
the console, we need to have an ordering so that they are actually
stopped before the other things starts, not asynchronously.
2022-09-15 14:59:11 +02:00
Zbigniew Jędrzejewski-Szmek
7c0e2b5559 units: add ordering dependencies on initrd-switch-root.target
For shutdown, we queue shutdown.target/start, so in every unit which should be
stopped *before* shutdown, we need both Conflicts and an ordering dependency
with shutdown.target (either Before= or After= would work, because stop jobs
are always ordered before start jobs).

For initrd transition, we queue initrd-switch-root.service/isolate. This
automatically creates a /stop job for every running unit without
IgnoreOnIsolate. But no ordering dependency is created, unless the unit has a
(possibly transitive) ordering dependency on initrd-switch-root.service.
Since most units must stop before the transition, we should add the ordering
dependency. It is nicer to use Before=initrd-switch-root.target for this.
initrd-switch-root.target is ordered before initrd-switch-root.service, so
the effect it the same when both are in a transaction.

Fixes #23745.

To also cover the case where somebody is emergency mode in the initrd and
queues initrd-switch-root.service/start (not isolate), also add
Conflicts=initrd-switch-root.target, so various units are stopped properly.
This extends 2525682565 to cover all the other
services that are touched. It could be consider "operator error", but it's
easy to make and it's nicer if we can make this more foolproof.
2022-09-15 14:59:11 +02:00
Zbigniew Jędrzejewski-Szmek
d5fd07cdee units/systemd-network-generator.service: add forgotten ordering for shutdown 2022-09-15 14:59:11 +02:00
Zbigniew Jędrzejewski-Szmek
9810e41942 units: reorder/split unit dependency blocks
The block is reordered and split to have:
  1. description + documentation
  2. (optionally) conditions
  3. all the dependencies
I think it's easier to read the units this way.
Also, the Conflicts+Before is seperated out to separate lines.
The ordering dependency is "fake", because it could just as well be
After=, we are adding it to force ordering wrt. shutdown.target, and
it plays a different role than the other Before=, which are about a
real ordering on boot.
2022-09-15 14:59:11 +02:00
Nick Rosbrook
8b8bd621e1 pstore: do not try to load all known pstore modules
Commit 70e74a5997 ("pstore: Run after modules are loaded") added After=
and Wants= entries for all known kernel modules providing a pstore.

While adding these dependencies on systems where one of the modules is
not present, or not configured, should not have a real affect on the
system, it can produce annoying error messages in the kernel log. E.g.
"mtd device must be supplied (device name is empty)" when the mtdpstore
module is not configured correctly.

Since dependencies cannot be removed with drop-ins, if a distro wants to
remove some of these modules from systemd-pstore.service, they need to
patch units/systemd-pstore.service.in. On the other hand, if they want
to append to the dependencies this can be done by shipping a drop-in.

Since the original intent of the previous commit was to fix [1], which
only requires the efi_pstore module, remove all other kernel module
dependencies from systemd-pstore.service, and let distros ship drop-ins
to add dependencies if needed.

[1] https://github.com/systemd/systemd/issues/18540
2022-09-14 05:30:03 +09:00
Lennart Poettering
d3d2dd5e4f units: prolong the stop timeout for homed
Let's give IO/resizing/… more time then usual.

Fixes: #22901
2022-09-05 15:22:53 +02:00
Frantisek Sumsal
cd7ad0cbde
Merge pull request #24054 from keszybz/initrd-no-reload
Don't do daemon-reload in the initrd
2022-08-18 13:15:14 +00:00