Commit Graph

41 Commits

Author SHA1 Message Date
Dan Streetman
137d162c42 add CAP_LINUX_IMMUTABLE to systemd-machined, so it can handle machinectl read-only requests
Without this, the 'machinectl read-only ...' command always fails.
2022-09-16 19:50:52 +01:00
Zbigniew Jędrzejewski-Szmek
059cc610b7 meson: use jinja2 for unit templates
We don't need two (and half) templating systems anymore, yay!

I'm keeping the changes minimal, to make the diff manageable. Some enhancements
due to a better templating system might be possible in the future.

For handling of '## ' — see the next commit.
2021-05-19 10:24:43 +09:00
Yu Watanabe
db9ecf0501 license: LGPL-2.1+ -> LGPL-2.1-or-later 2020-11-09 13:23:58 +09:00
Zbigniew Jędrzejewski-Szmek
21006e0e3e man,units: link to the new dbus-api man pages 2020-09-30 10:30:03 +02:00
Guillaume Douézan-Grard
f4665664c4 units: disable ProtectKernelLogs for machined
machined needs access to the host mount namespace to propagate bind
mounts created with the "machinectl bind" command. However, the
"ProtectKernelLogs" directive relies on mount namespaces to make the
kernel ring buffer inaccessible. This commit removes the
"ProtectKernelLogs=yes" directive from machined service file introduced
in 6168ae5.

Closes #14559.
2020-03-02 14:49:14 +09:00
Kevin Kuehler
6168ae5840 units: set ProtectKernelLogs=yes on relevant units
We set ProtectKernelLogs=yes on all long running services except for
udevd, since it accesses /dev/kmsg, and journald, since it calls syslog
and accesses /dev/kmsg.
2019-11-15 00:59:54 -08:00
Zbigniew Jędrzejewski-Szmek
21d0dd5a89 meson: allow WatchdogSec= in services to be configured
As discussed on systemd-devel [1], in Fedora we get lots of abrt reports
about the watchdog firing [2], but 100% of them seem to be caused by resource
starvation in the machine, and never actual deadlocks in the services being
monitored. Killing the services not only does not improve anything, but it
makes the resource starvation worse, because the service needs cycles to restart,
and coredump processing is also fairly expensive. This adds a configuration option
to allow the value to be changed. If the setting is not set, there is no change.

My plan is to set it to some ridiculusly high value, maybe 1h, to catch cases
where a service is actually hanging.

[1] https://lists.freedesktop.org/archives/systemd-devel/2019-October/043618.html
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1300212
2019-10-25 17:20:24 +02:00
Topi Miettinen
99894b867f units: enable ProtectHostname=yes 2019-02-20 10:50:44 +02:00
Lennart Poettering
3ca9940cb9 units: set NoNewPrivileges= for all long-running services
Previously, setting this option by default was problematic due to
SELinux (as this would also prohibit the transition from PID1's label to
the service's label). However, this restriction has since been lifted,
hence let's start making use of this universally in our services.

On SELinux system this change should be synchronized with a policy
update that ensures that NNP-ful transitions from init_t to service
labels is permitted.

An while we are at it: sort the settings in the unit files this touches.
This might increase the size of the change in this case, but hopefully
should result in stabler patches later on.

Fixes: #1219
2018-11-12 19:02:55 +01:00
Lennart Poettering
ee8f26180d units: switch from system call blacklist to whitelist
This is generally the safer approach, and is what container managers
(including nspawn) do, hence let's move to this too for our own
services. This is particularly useful as this this means the new
@system-service system call filter group will get serious real-life
testing quickly.

This also switches from firing SIGSYS on unexpected syscalls to
returning EPERM. This would have probably been a better default anyway,
but it's hard to change that these days. When whitelisting system calls
SIGSYS is highly problematic as system calls that are newly introduced
to Linux become minefields for services otherwise.

Note that this enables a system call filter for udev for the first time,
and will block @clock, @mount and @swap from it. Some downstream
distributions might want to revert this locally if they want to permit
unsafe operations on udev rules, but in general this shiuld be mostly
safe, as we already set MountFlags=shared for udevd, hence at least
@mount won't change anything.
2018-06-14 17:44:20 +02:00
Zbigniew Jędrzejewski-Szmek
a7df2d1e43 Add SPDX license headers to unit files 2017-11-19 19:08:15 +01:00
Lennart Poettering
0a9b166b43 units: prohibit all IP traffic on all our long-running services (#6921)
Let's lock things down further.
2017-10-04 14:16:28 +02:00
Lennart Poettering
bff8f2543b units: set LockPersonality= for all our long-running services (#6819)
Let's lock things down. Also, using it is the only way how to properly
test this to the fullest extent.
2017-09-14 19:45:40 +02:00
AsciiWolf
16a5d4128f units: use https for the freedesktop url (#6227) 2017-06-28 22:54:12 -04:00
Felipe Sateler
2221c17afb machined: add RequiresMountsFor=/var/lib/machines
Since any part of the path could be remote mounted, make sure they are
before starting machined
2017-06-21 16:20:11 -04:00
Lennart Poettering
6489ccfe48 units: make use of @reboot and @swap in our long-running service SystemCallFilter= settings
Tighten security up a bit more.
2017-02-09 16:12:03 +01:00
Lennart Poettering
7f396e5f66 units: set SystemCallArchitectures=native on all our long-running services 2017-02-09 16:12:03 +01:00
Lennart Poettering
0c28d51ac8 units: further lock down our long-running services
Let's make this an excercise in dogfooding: let's turn on more security
features for all our long-running services.

Specifically:

- Turn on RestrictRealtime=yes for all of them

- Turn on ProtectKernelTunables=yes and ProtectControlGroups=yes for most of
  them

- Turn on RestrictAddressFamilies= for all of them, but different sets of
  address families for each

Also, always order settings in the unit files, that the various sandboxing
features are close together.

Add a couple of missing, older settings for a numbre of unit files.

Note that this change turns off AF_INET/AF_INET6 from udevd, thus effectively
turning of networking from udev rule commands. Since this might break stuff
(that is already broken I'd argue) this is documented in NEWS.
2016-09-25 10:52:57 +02:00
Lennart Poettering
5b566d2475 units: machined needs mount-related syscalls for its namespacing operations
Specifically "machinectl shell" (or its OpenShell() bus call) is implemented by
entering the file system namespace of the container  and opening a TTY there.
In order to enter the file system namespace, chroot() is required, which is
filtered by SystemCallFilter='s @mount group. Hence, let's make this work again
and drop @mount from the filter list.
2016-06-21 21:32:17 +02:00
Lennart Poettering
4e069746fe units: tighten system call filters a bit
Take away kernel keyring access, CPU emulation system calls and various debug
system calls from the various daemons we have.
2016-06-13 16:25:54 +02:00
Topi Miettinen
40093ce5dd units: add a basic SystemCallFilter (#3471)
Add a line
SystemCallFilter=~@clock @module @mount @obsolete @raw-io ptrace
for daemons shipped by systemd. As an exception, systemd-timesyncd
needs @clock system calls and systemd-localed is not privileged.
ptrace(2) is blocked to prevent seccomp escapes.
2016-06-09 09:32:04 +02:00
Topi Miettinen
40652ca479 units: enable MemoryDenyWriteExecute (#3459)
Secure daemons shipped by systemd by enabling MemoryDenyWriteExecute.

Closes: #3459
2016-06-08 14:23:37 +02:00
Lennart Poettering
c34b73d050 machined: add CAP_MKNOD to capabilities to run with (#3116)
Container images from Debian or suchlike contain device nodes in /dev. Let's
make sure we can clone them properly, hence pass CAP_MKNOD to machined.

Fixes: #2867 #465
2016-04-25 15:38:56 -04:00
Lennart Poettering
c2fc2c2560 units: increase watchdog timeout to 3min for all our services
Apparently, disk IO issues are more frequent than we hope, and 1min
waiting for disk IO happens, so let's increase the watchdog timeout a
bit, for all our services.

See #1353 for an example where this triggers.
2015-09-29 21:55:51 +02:00
Lennart Poettering
b242faae06 units: add more caps to machined
Otherwise copying full directory trees between container and host won't
work, as we cannot access some fiels and cannot adjust the ownership
properly on the destination.

Of course, adding these many caps to the daemon kinda defeats the
purpose of the caps lock-down... but well...

Fixes #433
2015-07-27 17:45:45 +02:00
Lennart Poettering
90adaa25e8 machined: move logic for bind mounting into containers from machinectl to machined
This extends the bus interface, adding BindMountMachine() for bind
mounting directories from the host into the container.
2015-02-17 17:49:21 +01:00
Lennart Poettering
a24111cea6 Revert "units: add SecureBits"
This reverts commit 6a716208b3.

Apparently this doesn't work.

http://lists.freedesktop.org/archives/systemd-devel/2015-February/028212.html
2015-02-11 18:28:06 +01:00
Topi Miettinen
6a716208b3 units: add SecureBits
No setuid programs are expected to be executed, so add
SecureBits=noroot noroot-locked
to unit files.
2015-02-11 17:33:36 +01:00
Lennart Poettering
cd61c3bfd7 machined/machinectl: add logic to show list of available images
This adds a new bus call to machined that enumerates /var/lib/container
and returns all trees stored in it, distuingishing three types:

        - GPT disk images, which are files suffixed with ".gpt"
        - directory trees
        - btrfs subvolumes
2014-12-19 19:19:29 +01:00
Lennart Poettering
717603e391 machinectl: show /etc/os-release information of container in status output 2014-07-03 17:54:24 +02:00
Lennart Poettering
a55954297d units: add missing caps so that GetAddresses() can work 2014-06-19 19:53:16 +02:00
Lennart Poettering
3c52ad9237 units: fix minor typo 2014-06-06 14:38:04 +02:00
Lennart Poettering
1b8689f949 core: rename ReadOnlySystem= to ProtectSystem= and add a third value for also mounting /etc read-only
Also, rename ProtectedHome= to ProtectHome=, to simplify things a bit.

With this in place we now have two neat options ProtectSystem= and
ProtectHome= for protecting the OS itself (and optionally its
configuration), and for protecting the user's data.
2014-06-04 18:12:55 +02:00
Lennart Poettering
417116f234 core: add new ReadOnlySystem= and ProtectedHome= settings for service units
ReadOnlySystem= uses fs namespaces to mount /usr and /boot read-only for
a service.

ProtectedHome= uses fs namespaces to mount /home and /run/user
inaccessible or read-only for a service.

This patch also enables these settings for all our long-running services.

Together they should be good building block for a minimal service
sandbox, removing the ability for services to modify the operating
system or access the user's private data.
2014-06-03 23:57:51 +02:00
Lennart Poettering
f21a71a907 core: enable PrivateNetwork= for a number of our long running services where this is useful 2014-03-19 23:25:28 +01:00
Lennart Poettering
d99a705296 units: make use of PrivateTmp=yes and PrivateDevices=yes for all our long-running daemons 2014-03-19 19:09:00 +01:00
Lennart Poettering
9a8112f5e9 units: systemd-machined now exits on idle and we shouldn't try to restart it then 2013-12-23 20:37:03 +01:00
Lennart Poettering
cde93897cd event: hook up sd-event with the service watchdog logic
Adds a new call sd_event_set_watchdog() that can be used to hook up the
event loop with the watchdog supervision logic of systemd. If enabled
and $WATCHDOG_USEC is set the event loop will ping the invoking systemd
daemon right after coming back from epoll_wait() but not more often than
$WATCHDOG_USEC/4. The epoll_wait() will sleep no longer than
$WATCHDOG_USEC/4*3, to make sure the service manager is called in time.

This means that setting WatchdogSec= in a .service file and calling
sd_event_set_watchdog() in your daemon is enough to hook it up with the
watchdog logic.
2013-12-11 18:20:09 +01:00
Lennart Poettering
bc5cb1d525 machined: run machined at minimal capabilities 2013-07-19 03:49:24 +02:00
Lennart Poettering
085b90af43 units: add references to bus API documentation to logind+machined 2013-07-19 03:49:07 +02:00
Lennart Poettering
1ee306e124 machined: split out machine registration stuff from logind
Embedded folks don't need the machine registration stuff, hence it's
nice to make this optional. Also, I'd expect that machinectl will grow
additional commands quickly, for example to join existing containers and
suchlike, hence it's better keeping that separate from loginctl.
2013-07-02 03:47:23 +02:00