mirror of
https://github.com/systemd/systemd.git
synced 2024-11-30 22:03:41 +08:00
755fdfffa0
In 2014, systemd started choosing fq_codel as the default_qdisc in order to fight internet bufferbloat.e6c253e363
fa98c99ea7
While the subsequent change made this change no longer trigger warnings if fq_codel wasn't present, it is still recommended to have this enabled. Add the necessary kernel configuration to the documentation.
451 lines
21 KiB
Plaintext
451 lines
21 KiB
Plaintext
systemd System and Service Manager
|
|
|
|
WEB SITE:
|
|
https://systemd.io
|
|
|
|
GIT:
|
|
git@github.com:systemd/systemd.git
|
|
https://github.com/systemd/systemd
|
|
|
|
MAILING LIST:
|
|
https://lists.freedesktop.org/mailman/listinfo/systemd-devel
|
|
|
|
IRC:
|
|
#systemd on irc.libera.chat
|
|
|
|
BUG REPORTS:
|
|
https://github.com/systemd/systemd/issues
|
|
|
|
OLDER DOCUMENTATION:
|
|
https://0pointer.de/blog/projects/systemd.html
|
|
https://www.freedesktop.org/wiki/Software/systemd
|
|
|
|
AUTHOR:
|
|
Lennart Poettering
|
|
Kay Sievers
|
|
...and many others
|
|
|
|
LICENSE:
|
|
LGPL-2.1-or-later for all code, exceptions noted in LICENSES/README.md
|
|
|
|
REQUIREMENTS:
|
|
Linux kernel ≥ 3.15
|
|
≥ 4.3 for ambient capabilities
|
|
≥ 4.5 for pids controller in cgroup v2
|
|
≥ 4.6 for cgroup namespaces
|
|
≥ 4.9 for RENAME_NOREPLACE support in vfat
|
|
≥ 4.10 for cgroup-bpf egress and ingress hooks
|
|
≥ 4.15 for cgroup-bpf device hook and cpu controller in cgroup v2
|
|
≥ 4.17 for cgroup-bpf socket address hooks
|
|
≥ 4.20 for PSI (used by systemd-oomd)
|
|
≥ 5.3 for bounded loops in BPF program
|
|
≥ 5.4 for signed Verity images
|
|
≥ 5.7 for BPF links and the BPF LSM hook
|
|
|
|
⛔ Kernel versions below 3.15 ("minimum baseline") are not supported at
|
|
all, and are missing required functionality (e.g. CLOCK_BOOTTIME
|
|
support for timerfd_create()).
|
|
|
|
⚠️ Kernel versions below 4.15 ("recommended baseline") have significant
|
|
gaps in functionality and are not recommended for use with this version
|
|
of systemd (e.g. lack sufficiently comprehensive and working cgroupv2
|
|
support). Taint flag 'old-kernel' will be set. systemd will most likely
|
|
still function, but upstream support and testing are limited.
|
|
|
|
Kernel Config Options:
|
|
CONFIG_DEVTMPFS
|
|
CONFIG_CGROUPS (it is OK to disable all controllers)
|
|
CONFIG_INOTIFY_USER
|
|
CONFIG_SIGNALFD
|
|
CONFIG_TIMERFD
|
|
CONFIG_EPOLL
|
|
CONFIG_UNIX (it requires CONFIG_NET, but every other flag in it is not necessary)
|
|
CONFIG_SYSFS
|
|
CONFIG_PROC_FS
|
|
CONFIG_FHANDLE (libudev, mount and bind mount handling)
|
|
|
|
udev will fail to work with the legacy sysfs layout:
|
|
CONFIG_SYSFS_DEPRECATED=n
|
|
|
|
Legacy hotplug slows down the system and confuses udev:
|
|
CONFIG_UEVENT_HELPER_PATH=""
|
|
|
|
Userspace firmware loading is not supported and should be disabled in
|
|
the kernel:
|
|
CONFIG_FW_LOADER_USER_HELPER=n
|
|
|
|
Some udev rules and virtualization detection relies on it:
|
|
CONFIG_DMIID
|
|
|
|
Support for some SCSI devices serial number retrieval, to create
|
|
additional symlinks in /dev/disk/ and /dev/tape:
|
|
CONFIG_BLK_DEV_BSG
|
|
|
|
Required for PrivateNetwork= in service units:
|
|
CONFIG_NET_NS
|
|
Note that systemd-localed.service and other systemd units use
|
|
PrivateNetwork so this is effectively required.
|
|
|
|
Required for PrivateUsers= in service units:
|
|
CONFIG_USER_NS
|
|
|
|
Optional but strongly recommended:
|
|
CONFIG_IPV6
|
|
CONFIG_AUTOFS_FS
|
|
CONFIG_TMPFS_XATTR
|
|
CONFIG_{TMPFS,EXT4_FS,XFS,BTRFS_FS,...}_POSIX_ACL
|
|
CONFIG_SECCOMP
|
|
CONFIG_SECCOMP_FILTER (required for seccomp support)
|
|
CONFIG_KCMP (for the kcmp() syscall, used to be under
|
|
CONFIG_CHECKPOINT_RESTORE before ~5.12)
|
|
CONFIG_NET_SCHED
|
|
CONFIG_NET_SCH_FQ_CODEL
|
|
|
|
Required for CPUShares= in resource control unit settings:
|
|
CONFIG_CGROUP_SCHED
|
|
CONFIG_FAIR_GROUP_SCHED
|
|
|
|
Required for CPUQuota= in resource control unit settings:
|
|
CONFIG_CFS_BANDWIDTH
|
|
|
|
Required for IPAddressDeny=, IPAddressAllow=, IPIngressFilterPath=,
|
|
IPEgressFilterPath= in resource control unit settings unit settings:
|
|
CONFIG_BPF
|
|
CONFIG_BPF_SYSCALL
|
|
CONFIG_BPF_JIT
|
|
CONFIG_HAVE_EBPF_JIT
|
|
CONFIG_CGROUP_BPF
|
|
|
|
Required for SocketBind{Allow|Deny}=, RestrictNetworkInterfaces= in
|
|
resource control unit settings:
|
|
CONFIG_BPF
|
|
CONFIG_BPF_SYSCALL
|
|
CONFIG_BPF_JIT
|
|
CONFIG_HAVE_EBPF_JIT
|
|
CONFIG_CGROUP_BPF
|
|
|
|
For UEFI systems:
|
|
CONFIG_EFIVAR_FS
|
|
CONFIG_EFI_PARTITION
|
|
|
|
Required for signed Verity images support:
|
|
CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG
|
|
Required to verify signed Verity images using keys enrolled in the MoK
|
|
(Machine-Owner Key) keyring:
|
|
CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG_SECONDARY_KEYRING
|
|
CONFIG_IMA_ARCH_POLICY
|
|
CONFIG_INTEGRITY_MACHINE_KEYRING
|
|
|
|
Required for reading credentials from SMBIOS:
|
|
CONFIG_DMI
|
|
CONFIG_DMI_SYSFS
|
|
|
|
Required for RestrictFileSystems= in service units:
|
|
CONFIG_BPF
|
|
CONFIG_BPF_SYSCALL
|
|
CONFIG_BPF_LSM
|
|
CONFIG_DEBUG_INFO_BTF
|
|
CONFIG_LSM="...,bpf" or kernel booted with lsm="...,bpf".
|
|
|
|
We recommend to turn off Real-Time group scheduling in the kernel when
|
|
using systemd. RT group scheduling effectively makes RT scheduling
|
|
unavailable for most userspace, since it requires explicit assignment of
|
|
RT budgets to each unit whose processes making use of RT. As there's no
|
|
sensible way to assign these budgets automatically this cannot really be
|
|
fixed, and it's best to disable group scheduling hence:
|
|
CONFIG_RT_GROUP_SCHED=n
|
|
|
|
It's a good idea to disable the implicit creation of networking bonding
|
|
devices by the kernel networking bonding module, so that the
|
|
automatically created "bond0" interface doesn't conflict with any such
|
|
device created by systemd-networkd (or other tools). Ideally there would
|
|
be a kernel compile-time option for this, but there currently isn't. The
|
|
next best thing is to make this change through a modprobe.d drop-in.
|
|
This is shipped by default, see modprobe.d/systemd.conf.
|
|
|
|
Required for systemd-nspawn:
|
|
CONFIG_DEVPTS_MULTIPLE_INSTANCES or Linux kernel >= 4.7
|
|
|
|
Required for systemd-oomd:
|
|
CONFIG_PSI
|
|
|
|
Note that kernel auditing is broken when used with systemd's container
|
|
code. When using systemd in conjunction with containers, please make
|
|
sure to either turn off auditing at runtime using the kernel command
|
|
line option "audit=0", or turn it off at kernel compile time using:
|
|
CONFIG_AUDIT=n
|
|
|
|
If systemd is compiled with libseccomp support on architectures which do
|
|
not use socketcall() and where seccomp is supported (this effectively
|
|
means x86-64 and ARM, but excludes 32-bit x86!), then nspawn will now
|
|
install a work-around seccomp filter that makes containers boot even
|
|
with audit being enabled. This works correctly only on kernels 3.14 and
|
|
newer though. TL;DR: turn audit off, still.
|
|
|
|
glibc >= 2.16
|
|
libcap
|
|
libmount >= 2.30 (from util-linux)
|
|
(util-linux *must* be built without --enable-libmount-support-mtab)
|
|
libseccomp >= 2.3.1 (optional)
|
|
libblkid >= 2.24 (from util-linux) (optional)
|
|
libkmod >= 15 (optional)
|
|
PAM >= 1.1.2 (optional)
|
|
libcryptsetup (optional), >= 2.3.0 required for signed Verity images support
|
|
libaudit (optional)
|
|
libacl (optional)
|
|
libbpf >= 0.1.0 (optional)
|
|
libfdisk >= 2.32 (from util-linux) (optional)
|
|
libselinux (optional)
|
|
liblzma (optional)
|
|
liblz4 >= 1.3.0 / 130 (optional)
|
|
libzstd >= 1.4.0 (optional)
|
|
libgcrypt (optional)
|
|
libqrencode (optional)
|
|
libmicrohttpd (optional)
|
|
libidn2 or libidn (optional)
|
|
gnutls >= 3.1.4 (optional, >= 3.6.0 is required to support DNS-over-TLS with gnutls)
|
|
openssl >= 1.1.0 (optional, required to support DNS-over-TLS with openssl)
|
|
elfutils >= 158 (optional)
|
|
polkit (optional)
|
|
tzdata >= 2014f (optional)
|
|
pkg-config
|
|
gperf
|
|
docbook-xsl (optional, required for documentation)
|
|
xsltproc (optional, required for documentation)
|
|
python >= 3.7 (required by meson too, >= 3.9 is required for ukify)
|
|
python-jinja2
|
|
python-pefile (optional, required for ukify)
|
|
python-lxml (optional, required to build the indices)
|
|
pyelftools (optional, required for systemd-boot)
|
|
meson >= 0.60.0
|
|
ninja
|
|
gcc >= 8.4
|
|
awk, sed, grep, and similar tools
|
|
clang >= 10.0, llvm >= 10.0 (optional, required to build BPF programs
|
|
from source code in C)
|
|
|
|
During runtime, you need the following additional
|
|
dependencies:
|
|
|
|
util-linux >= v2.27.1 required (including but not limited to: mount,
|
|
umount, swapon, swapoff, sulogin,
|
|
agetty, fsck)
|
|
dbus >= 1.4.0 (strictly speaking optional, but recommended)
|
|
NOTE: If using dbus < 1.9.18, you should override the default
|
|
policy directory (--with-dbuspolicydir=/etc/dbus-1/system.d).
|
|
polkit (optional)
|
|
|
|
To build in directory build/:
|
|
meson setup build/ && ninja -C build/
|
|
|
|
Any configuration options can be specified as -Darg=value... arguments
|
|
to meson. After the build directory is initially configured, meson will
|
|
refuse to run again, and options must be changed with:
|
|
meson configure -Darg=value build/
|
|
meson configure without any arguments will print out available options and
|
|
their current values.
|
|
|
|
Useful commands:
|
|
ninja -C build -v some/target
|
|
meson test -C build/
|
|
sudo meson install -C build/ --no-rebuild
|
|
DESTDIR=... meson install -C build/
|
|
|
|
A tarball can be created with:
|
|
v=250 && git archive --prefix=systemd-$v/ v$v | zstd >systemd-$v.tar.zstd
|
|
|
|
When systemd-hostnamed is used, it is strongly recommended to install
|
|
nss-myhostname to ensure that, in a world of dynamically changing
|
|
hostnames, the hostname stays resolvable under all circumstances. In
|
|
fact, systemd-hostnamed will warn if nss-myhostname is not installed.
|
|
|
|
nss-systemd must be enabled on systemd systems, as that's required for
|
|
DynamicUser= to work. Note that we ship services out-of-the-box that
|
|
make use of DynamicUser= now, hence enabling nss-systemd is not
|
|
optional.
|
|
|
|
Note that the build prefix for systemd must be /usr/. (Moreover, packages
|
|
systemd relies on — such as D-Bus — really should use the same prefix,
|
|
otherwise you are on your own.) Split-usr and unmerged-usr systems are no
|
|
longer supported, and moving everything under /usr/ is required. Systems
|
|
with a separate /usr/ partition must mount it before transitioning into it
|
|
(i.e.: from the initrd). For more information see:
|
|
https://www.freedesktop.org/wiki/Software/systemd/separate-usr-is-broken
|
|
https://www.freedesktop.org/wiki/Software/systemd/TheCaseForTheUsrMerge
|
|
|
|
Additional packages are necessary to run some tests:
|
|
- nc (used by test/TEST-12-ISSUE-3171)
|
|
- python (test-udev which is installed is in python)
|
|
- python-pyparsing
|
|
- python-evdev (used by hwdb parsing tests)
|
|
- strace (used by test/test-functions)
|
|
- capsh (optional, used by test-execute)
|
|
|
|
POLICY FOR SUPPORT OF DISTRIBUTIONS AND ARCHITECTURES:
|
|
systemd main branch and latest major or stable releases are generally
|
|
expected to compile on current versions of popular distributions (at
|
|
least all non-EOL versions of Fedora, Debian unstable/testing/stable,
|
|
latest Ubuntu LTS and non-LTS releases, openSUSE Tumbleweed/Leap,
|
|
CentOS Stream 8 and 9, up-to-date Arch, etc.) We will generally
|
|
attempt to support also other non-EOL versions of various distros.
|
|
Features which would break compilation on slightly older distributions
|
|
will only be introduced if there are significant reasons for this
|
|
(i.e. supporting them interferes with development or requires too many
|
|
resources to support). In some cases backports of specific libraries or
|
|
tools might be required.
|
|
|
|
The policy is similar for architecture support. systemd is regularly
|
|
tested on popular architectures (currently amd64, i386, arm64, ppc64el,
|
|
and s390x), but should compile and work also on other architectures, for
|
|
which support has been added. systemd will emit warnings when
|
|
architecture-specific constants are not defined.
|
|
|
|
STATIC COMPILATION AND "STANDALONE" BINARIES:
|
|
systemd provides a public shared libraries libsystemd.so and
|
|
libudev.so. The latter is deprecated, and the sd-device APIs in
|
|
libsystemd should be used instead for new code. In addition, systemd is
|
|
built with a private shared library, libsystemd-shared-<suffix>.so,
|
|
that also includes the libsystemd code, and by default most systemd
|
|
binaries are linked to it. Using shared libraries saves disk space and
|
|
memory at runtime, because only one copy of the code is needed.
|
|
|
|
It is possible to build static versions of systemd public shared
|
|
libraries (via the configuration options '-Dstatic-libsystemd' and
|
|
'-Dstatic-libudev'). This allows the libsystemd and libudev code to be
|
|
linked statically into programs. Note that mixing & matching different
|
|
versions of libsystemd and systemd is generally not recommended, since
|
|
various of its APIs wrap internal state and protocols of systemd
|
|
(e.g. logind and udev databases), which are not considered
|
|
stable. Hence, using static libraries is not recommended since it
|
|
generally means that version of the static libsystemd linked into
|
|
applications and the host systemd are not in sync, and will thus create
|
|
compatibility problems.
|
|
|
|
In addition, it is possible to disable the use of
|
|
libsystemd-shared-<suffix>.so for various components (via the
|
|
configuration options '-Dlink-*-shared'). In this mode, the libsystemd
|
|
and libsystemd-shared code is linked statically into selected
|
|
binaries. This option is intended for systems where some of the
|
|
components are intended to be delivered independently of the main
|
|
systemd package. Finally, some binaries can be compiled in a second
|
|
version (via the configuration option '-Dstandalone-binaries'). The
|
|
version suffixed with ".standalone" has the libsystemd and
|
|
libsystemd-shared code linked statically. Those binaries are intended
|
|
as replacements to be used in limited installations where the full
|
|
systemd suite is not installed. Yet another option is to rebuild
|
|
systemd with a different '-Dshared-lib-tag' setting, allowing different
|
|
systemd binaries to be linked to instances of the private shared
|
|
library that can be installed in parallel.
|
|
|
|
Again: Using the default shared linking is recommended, building static
|
|
or "standalone" versions is not. Mixing versions of systemd components
|
|
that would normally be built and used together (in particular various
|
|
daemons and the manager) is not recommended: we do not test such
|
|
combinations upstream and cannot provide support. Distributors making
|
|
use of those options are responsible if things do not work as expected.
|
|
|
|
USERS AND GROUPS:
|
|
Default udev rules use the following standard system group names, which
|
|
need to be resolvable by getgrnam() at any time, even in the very early
|
|
boot stages, where no other databases and network are available:
|
|
|
|
audio, cdrom, dialout, disk, input, kmem, kvm, lp, render, tape, tty, video
|
|
|
|
During runtime, the journal daemon requires the "systemd-journal" system
|
|
group to exist. New journal files will be readable by this group (but
|
|
not writable), which may be used to grant specific users read access. In
|
|
addition, system groups "wheel" and "adm" will be given read-only access
|
|
to journal files using systemd-tmpfiles-setup.service.
|
|
|
|
The journal remote daemon requires the "systemd-journal-remote" system
|
|
user and group to exist. During execution this network facing service
|
|
will drop privileges and assume this uid/gid for security reasons.
|
|
|
|
Similarly, the network management daemon requires the "systemd-network"
|
|
system user and group to exist.
|
|
|
|
Similarly, the name resolution daemon requires the "systemd-resolve"
|
|
system user and group to exist.
|
|
|
|
Similarly, the coredump support requires the "systemd-coredump" system
|
|
user and group to exist.
|
|
|
|
GLIBC NSS:
|
|
systemd ships with four glibc NSS modules:
|
|
|
|
nss-myhostname resolves the local hostname to locally configured IP
|
|
addresses, as well as "localhost" to 127.0.0.1/::1.
|
|
|
|
nss-resolve enables DNS resolution via the systemd-resolved DNS/LLMNR
|
|
caching stub resolver "systemd-resolved".
|
|
|
|
nss-mymachines enables resolution of all local containers registered
|
|
with machined to their respective IP addresses.
|
|
|
|
nss-systemd enables resolution of users/group registered via the
|
|
User/Group Record Lookup API (https://systemd.io/USER_GROUP_API),
|
|
including all dynamically allocated service users. (See the
|
|
DynamicUser= setting in unit files.)
|
|
|
|
To make use of these NSS modules, please add them to the "hosts:",
|
|
"passwd:", "group:", "shadow:" and "gshadow:" lines in
|
|
/etc/nsswitch.conf.
|
|
|
|
The four modules should be used in the following order:
|
|
|
|
passwd: files systemd
|
|
group: files [SUCCESS=merge] systemd
|
|
shadow: files systemd
|
|
gshadow: files systemd
|
|
hosts: mymachines resolve [!UNAVAIL=return] files myhostname dns
|
|
|
|
SYSV INIT.D SCRIPTS:
|
|
When calling "systemctl enable/disable/is-enabled" on a unit which is a
|
|
SysV init.d script, it calls /usr/lib/systemd/systemd-sysv-install;
|
|
this needs to translate the action into the distribution specific
|
|
mechanism such as chkconfig or update-rc.d. Packagers need to provide
|
|
this script if you need this functionality (you don't if you disabled
|
|
SysV init support).
|
|
|
|
Please see src/systemctl/systemd-sysv-install.SKELETON for how this
|
|
needs to look like, and provide an implementation at the marked places.
|
|
|
|
WARNINGS and TAINT FLAGS:
|
|
systemd requires that the /run mount point exists. systemd also
|
|
requires that /var/run is a symlink to /run. Taint flag 'var-run-bad'
|
|
will be set when this condition is detected.
|
|
|
|
Systemd will also warn when the cgroup support is unavailable in the
|
|
kernel (taint flag 'cgroups-missing'), the system is using the old
|
|
cgroup hierarchy (taint flag 'cgroupsv1'), the hardware clock is
|
|
running in non-UTC mode (taint flag 'local-hwclock'), the kernel
|
|
overflow UID or GID are not 65534 (taint flags 'overflowuid-not-65534'
|
|
and 'overflowgid-not-65534'), the UID or GID range assigned to the
|
|
running systemd instance covers less than 0…65534 (taint flags
|
|
'short-uid-range' and 'short-gid-range').
|
|
|
|
Taint conditions are logged during boot, but may also be checked at any
|
|
time with:
|
|
|
|
busctl get-property org.freedesktop.systemd1 /org/freedesktop/systemd1 org.freedesktop.systemd1.Manager Tainted
|
|
|
|
See org.freedesktop.systemd1(5) for more information.
|
|
|
|
VALGRIND:
|
|
To run systemd under valgrind, compile systemd with the valgrind
|
|
development headers available (i.e. valgrind-devel or equivalent).
|
|
Otherwise, false positives will be triggered by code which violates
|
|
some rules but is actually safe. Note that valgrind generates nice
|
|
output only on exit(), hence on shutdown we don't execve()
|
|
systemd-shutdown.
|
|
|
|
STABLE BRANCHES AND BACKPORTS:
|
|
Stable branches with backported patches are available in the
|
|
systemd-stable repo at https://github.com/systemd/systemd-stable.
|
|
|
|
Stable branches are started for certain releases of systemd and named
|
|
after them, e.g. v238-stable. Stable branches are managed by
|
|
distribution maintainers on an as needed basis. See
|
|
https://www.freedesktop.org/wiki/Software/systemd/Backports for some
|
|
more information and examples.
|