In the troff output, this doesn't seem to make any difference. But in the
html output, the whitespace is sometimes preserved, creating an additional
gap before the following content. Drop it everywhere to avoid this.
No functional change, but let's print yes/no rather than on/off in systemd-analyze.
Similar to 2e8a581b9c and
edd3f4d9b7.
(Note, the commit messages of those commits are wrong, as
parse_boolean() supports on/off anyway.)
This will allow units (scopes/slices/services) to override the default
systemd-oomd setting DefaultMemoryPressureDurationSec=.
The semantics of ManagedOOMMemoryPressureDurationSec= are:
- If >= 1 second, overrides DefaultMemoryPressureDurationSec= from oomd.conf
- If is empty, uses DefaultMemoryPressureDurationSec= from oomd.conf
- Ignored if ManagedOOMMemoryPressure= is not "kill"
- Disallowed if < 1 second
Note the corresponding dbus property is DefaultMemoryPressureDurationUSec
which is in microseconds. This is consistent with other time-based
dbus properties.
Certainly on systemd 252 at least a configuration of
```
MemorySwapMax=40%
```
is supported but this was missing from the man page.
Only MemoryMax was documented as supporting a %.
Global resource (whole system or root cg's (e.g. in a container)) is
also a well-defined limit for memory and tasks, take it into account
when calculating effective limits.
Users become perplexed when they run their workload in a unit with no
explicit limits configured (moreover, listing the limit property would
even show it's infinity) but they experience unexpected resource
limitation.
The memory and pid limits come as the most visible, therefore add new
unit read-only properties:
- EffectiveMemoryMax=,
- EffectiveMemoryHigh=,
- EffectiveTasksMax=.
These properties represent the most stringent limit systemd is aware of
for the given unit -- and that is typically(*) the effective value.
Implement the properties by simply traversing all parents in the
leaf-slice tree and picking the minimum value. Note that effective
limits are thus defined even for units that don't enable explicit
accounting (because of the hierarchy).
(*) The evasive case is when systemd runs in a cgroupns and cannot
reason about outer setup. Complete solution would need kernel support.
The benefit of using this setting is that user and group IDs, especially dynamic and random
IDs used by DynamicUser=, can be used in firewall configuration easily.
Example:
```
[Service]
NFTSet=user:inet:filter:serviceuser
```
Corresponding NFT rules:
```
table inet filter {
set serviceuser {
typeof meta skuid
}
chain service_output {
meta skuid @serviceuser accept
drop
}
}
```
```
$ cat /etc/systemd/system/dunft.service
[Service]
DynamicUser=yes
NFTSet=user:inet:filter:serviceuser
ExecStart=/bin/sleep 1000
[Install]
WantedBy=multi-user.target
$ sudo nft list set inet filter serviceuser
table inet filter {
set serviceuser {
typeof meta skuid
elements = { 64864 }
}
}
$ ps -n --format user,group,pid,command -p `systemctl show dunft.service -P MainPID`
USER GROUP PID COMMAND
64864 64864 55158 /bin/sleep 1000
```
New directive `NFTSet=` provides a method for integrating dynamic cgroup IDs
into firewall rules with NFT sets. The benefit of using this setting is to be
able to use control group as a selector in firewall rules easily and this in
turn allows more fine grained filtering. Also, NFT rules for cgroup matching
use numeric cgroup IDs, which change every time a service is restarted, making
them hard to use in systemd environment.
This option expects a whitespace separated list of NFT set definitions. Each
definition consists of a colon-separated tuple of source type (only "cgroup"),
NFT address family (one of "arp", "bridge", "inet", "ip", "ip6", or "netdev"),
table name and set name. The names of tables and sets must conform to lexical
restrictions of NFT table names. The type of the element used in the NFT filter
must be "cgroupsv2". When a control group for a unit is realized, the cgroup ID
will be appended to the NFT sets and it will be be removed when the control
group is removed. systemd only inserts elements to (or removes from) the sets,
so the related NFT rules, tables and sets must be prepared elsewhere in
advance. Failures to manage the sets will be ignored.
If the firewall rules are reinstalled so that the contents of NFT sets are
destroyed, command systemctl daemon-reload can be used to refill the sets.
Example:
```
table inet filter {
...
set timesyncd {
type cgroupsv2
}
chain ntp_output {
socket cgroupv2 != @timesyncd counter drop
accept
}
...
}
```
/etc/systemd/system/systemd-timesyncd.service.d/override.conf
```
[Service]
NFTSet=cgroup:inet:filter:timesyncd
```
```
$ sudo nft list set inet filter timesyncd
table inet filter {
set timesyncd {
type cgroupsv2
elements = { "system.slice/systemd-timesyncd.service" }
}
}
```
As I noticed a lot of missing information when trying to implement checking
for missing info. I reimplemented the version information script to be more
robust, and here is the result.
Follow up to ec07c3c80b
This tries to add information about when each option was added. It goes
back to version 183.
The version info is included from a separate file to allow generating it,
which would allow more control on the formatting of the final output.
Let's visually separate the options associated with cpu, io, memory, …
in subsections
This patch tries to be minimal. It just adds the section titles, and
does minimal reordering to make sure the options on the same kind of
resource are placed close to each other.
Doesn't really matter since the two unicode symbols are supposedly
equivalent, but let's better follow the unicode recommendations to
prefer greek small letter mu, as per:
https://www.unicode.org/reports/tr25
This implements a minimal subset of #24961, but in a lot more
restrictive way: we only allow one level of subcgroup (as that's enough
to address the no-processes in inner cgroups rule), and does not change
anything about threaded cgroup logic or similar, or make any of this new
behaviour mandatory.
All this does is this: all non-control processes we invoke for a unit
we'll invoke in a subgroup by the specified name.
We'll later port all our current services that use cgroup delegation
over to this, i.e. user@.service, systemd-nspawn@.service and
systemd-udevd.service.
This changes the PSI window duration we default to for watching memory
pressure events from 1s to 2s. This is because apparently the kernel
will soon disallow window durations other than 2s for unprivileged
processes.
Hence, we'll bump the threshold from 100m to 200ms, and the window from
1s to 2s.
For any user on a semi-recent kernel, effectively this setting is pointless.
We should deprecate it once not needed anymore for the v1 hierarchy. For
now, adjust the description.
When cpu controller is disabled, thing would often still behave as if
it was. And since the cpu controller can be enabled "magically" e.g. by
starting user@1000, add a note for users to be careful. Autogrouping
is described well in the man page, incl. how to enable or disable it,
so it should be enough to refer to that.
For CPUWeight=: there is an important distinction between our default of
[not set], and the kernel default of "100". Let's not say that our default
is "100" because then 'systemctl show' output is hard to explain.
For task accounting, it's the kernel that does the accounting, not systemd.
For a user, information which cgroup controllers are enabled based on
the unit configuration is rather important. Not only because it determines
what resource control is peformed by the kernel, but also because controllers
have a non-negligible cost, especially for deep nesting, and users may want
to *not* have controllers enabled.
Our documentation did its best to avoid the topic so far. This was partially
caused by the support for cgroup v1, which meant that any discussion of
controllers had to be conditional and messy. But v1 is deprecated on its way
out, so it should be fine to just describe what happens with v2.
The text is extended with a discussion of how controllers are enabled and
disabled, and an example, and for various settings that enable controllers
the relevant controller is now mentioned.
DeviceAllow= and others are applied to the whole cgroup via bpf, so
using '+' on an Exec line will not bypass them. Explain this in the
manpage.
Fixes https://github.com/systemd/systemd/issues/26035
Fixes#25780.
> Man page: crypttab.5
> Issue 1: Missing fullstop
> Issue 2: I<cipher=>, I<hash=>, I<size=> → B<cipher=>, B<hash=>, B<size=>
>
> "Force LUKS mode\\&. When this mode is used, the following options are "
> "ignored since they are provided by the LUKS header on the device: "
> "I<cipher=>, I<hash=>, I<size=>"
Seems OK to me. The full stop is there and has been for at least a few years. And we use <option> for the markup, which is appropriate here.
> Man page: crypttab.5
> Issue 1: Missing fullstop
> Issue 2: I<cipher=>, I<hash=>, I<keyfile-offset=>, I<keyfile-size=>, I<size=> → B<cipher=>, B<hash=>, B<keyfile-offset=>, B<keyfile-size=>, B<size=>
>
> "Use TrueCrypt encryption mode\\&. When this mode is used, the following "
> "options are ignored since they are provided by the TrueCrypt header on the "
> "device or do not apply: I<cipher=>, I<hash=>, I<keyfile-offset=>, I<keyfile-"
> "size=>, I<size=>"
Same.
> Man page: journalctl.1
> Issue 1: make be → may be
Fixed.
> Issue 2: below\\&. → below:
Fixed.
> Man page: journalctl.1
> Issue: Colon at the end?
>
> "The following commands are understood\\&. If none is specified the default "
> "is to display journal records\\&."
> msgstr ""
> "Die folgenden Befehle werden verstanden\\&. Falls keiner festgelegt ist, ist "
> "die Anzeige von Journal-Datensätzen die Vorgabe\\&."
This is a bit awkward, but I'm not sure how to fix it.
> Man page: kernel-install.8
> Issue: methods a fallback → methods fallback
It was correct, but I added a comma to make the sense clearer.
> Man page: loader.conf.5
> Issue 1: secure boot variables → Secure Boot variables
> Issue 2: one → one for (multiple times)
>
> "Supported secure boot variables are one database for authorized images, one "
> "key exchange key (KEK) and one platform key (PK)\\&. For more information, "
> "refer to the \\m[blue]B<UEFI specification>\\m[]\\&\\s-2\\u[2]\\d\\s+2, "
> "under Secure Boot and Driver Signing\\&. Another resource that describe the "
> "interplay of the different variables is the \\m[blue]B<EDK2 "
> "documentation>\\m[]\\&\\s-2\\u[3]\\d\\s+2\\&."
"one of" would sound strange. "One this and one that" is OK.
> Man page: loader.conf.5
> Issue: systemd-boot → B<systemd-boot>(7)
Fixed.
> Man page: logind.conf.5
> Issue: systemd-logind → B<systemd-logind>(8)
We use <filename>systemd-logind</> on subsequent references… I think that's good enough.
> Man page: nss-myhostname.8
> Issue: B<getent> → B<getent>(1)
Fixed.
> Man page: nss-resolve.8
> Issue: B<systemd-resolved> → B<systemd-resolved>(8)
The first reference does this, subsequent are shorter.
> Man page: os-release.5
> Issue: Portable Services → Portable Services Documentation?
Updated.
> Man page: pam_systemd_home.8
> Issue: auth and account use "reason", while session and password do not?
Reworded.
> Man page: portablectl.1
> Issue: In systemd-portabled.service(8): Portable Services Documentation
Updated.
> Man page: repart.d.5
> Issue: The partition → the partition
Fixed.
> Man page: repart.d.5
> Issue: B<systemd-repart> → B<systemd-repart>(8)
The first reference does this. I also change this one, because it's pretty far down in the text.
> Man page: systemd.1
> Issue: kernel command line twice?
>
> "Takes a boolean argument\\&. If false disables importing credentials from "
> "the kernel command line, qemu_fw_cfg subsystem or the kernel command line\\&."
Apparently this was fixed already.
> Man page: systemd-boot.7
> Issue: enrollement → enrollment
Fixed.
> Man page: systemd-cryptenroll.1
> Issue: multiple cases: any specified → the specified
Reworded.
> Man page: systemd-cryptenroll.1
> Issue: If this this → If this
Fixed tree-wide.
> Man page: systemd-cryptsetup-generator.8
> Issue: and the initrd → and in the initrd
"Is honoured by the initrd" is OK, because we often speak about the initrd as a single unit. But in the same paragraph we also used "in the initrd", which makes the other use look sloppy. I changed it to "in the initrd" everywhere in that file.
> Man page: systemd.directives.7
> Issue: Why are these two quoted (but not others)?
>
> "B<\\*(Aqh\\*(Aq>"
>
> B<\\*(Aqs\\*(Aq>"
>
> "B<\\*(Aqy\\*(Aq>"
This is autogenerated from files… We use slightly different markup in different files, and it's just too hard to make it consistent. We gave up on this.
> Man page: systemd.exec.5
> Issue 1: B<at>(1p) → B<at>(1)
> Issue 2: B<crontab>(1p) → B<crontab>(1)
Fixed.
> Man page: systemd.exec.5
> Issue: B<select()> → B<select>(2)
Fixed.
> Man page: systemd.exec.5
> Issue: qemu → B<qemu>(1)
The man page doesn't seem to be in any of the canonical places on the web.
I added a link to online docs.
> Man page: systemd.exec.5
> Issue: variable → variables
Seems to be fixed already.
> Man page: systemd-integritysetup-generator.8
> Issue: systemd-integritysetup-generator → B<systemd-integritysetup-generator>
I changed <filename> to <command>.
> Man page: systemd-integritysetup-generator.8
> Issue: superfluous comma at the end
Already fixed.
> Man page: systemd-measure.1
> Issue: (see B<--pcr-bank=>) below → (see B<--pcr-bank=> below)
Reworded.
> Man page: systemd-measure.1
> Issue: =PATH> → =>I<PATH>
Fixed.
> Man page: systemd-measure.1.po
> Issue: B<--bank=DIGEST> → B<--bank=>I<DIGEST>
Fixed.
> Man page: systemd.netdev.5
> Issue: os the → on the
Appears to have been fixed already.
> Man page: systemd.netdev.5
> Issue: Onboard → On-board (as in previous string)
Updated.
> Man page: systemd.network.5
> Issue: B<systemd-networkd> -> B<systemd-networkd>(8)
First reference does this, subsequent do not.
> Man page: systemd.network.5
> Issue: B<netlabelctl> → B<netlabelctl>(8)
First reference does this, subsequent do not.
> Man page: systemd.network.5
> Issue: Missing verb (aquired? configured?) in the half sentence starting with "or by a "
I dropped the comma.
> Man page: systemd-nspawn.1
> Issue: All host users outside of that range → All other host users
Reworded.
> # FIXME no effect → no effect\\&.
> #. type: Plain text
> #: archlinux debian-unstable fedora-rawhide mageia-cauldron opensuse-tumbleweed
> msgid ""
> "Whichever ID mapping option is used, the same mapping will be used for users "
> "and groups IDs\\&. If B<rootidmap> is used, the group owning the bind "
> "mounted directory will have no effect"
A period is added. Not sure if there's some other issue.
> Man page: systemd-oomd.service.8
> Issue: B<systemd> → B<systemd>(1)
Done.
> Man page: systemd.path.5
> Issue 1: B<systemd.exec>(1) → B<systemd.exec>(5)
> Issue 2: This section does not (yet?) exist
Fixed.
> Man page: systemd-pcrphase.service.8
> Issue 1: indicate phases into TPM2 PCR 11 ??
> Issue 2: Colon at the end of the paragraph?
Fixed.
> Man page: systemd-pcrphase.service.8
> Issue: final boot phase → final shutdown phase?
Updated.
> Man page: systemd-pcrphase.service.8
> Issue: for the the → for the
Fixed tree-wide.
> Man page: systemd-portabled.service.8
> Issue: In systemd-portabled.service(8): Portable Services Documentation
Updated.
> Man page: systemd-pstore.service.8
> Issue: Here and the following paragraphs: . → \\&. // Upstream: What does this comment mean? // You normally write \\&. for a full dot (full stop etc.); here you write only "." (i.e. a plain dot).
>
> "and we look up \"localhost\", nss-dns will send the following queries to "
> "systemd-resolved listening on 127.0.0.53:53: first \"localhost.foobar.com\", "
> "then \"localhost.barbar.com\", and finally \"localhost\". If (hopefully) the "
> "first two queries fail, systemd-resolved will synthesize an answer for the "
> "third query."
Looks all OK to me.
> Man page: systemd.resource-control.5
> Issue: Missing closing bracket after link to Control Groups version 1
Fixed.
> Man page: systemd-sysext.8
> Issue: In systemd-portabled.service(8): Portable Services Documentation
Updated.
> Man page: systemd.timer.5
> Issue 1: B<systemd.exec>(1) → B<systemd.exec>(5)
> Issue 2: This section does not (yet?) exist
Fixed.
> Man page: systemd.unit.5
> Issue: that is → that are
Fixed.
> Man page: systemd-veritysetup-generator.8
> Issue: systemd-veritysetup-generator → B<systemd-veritysetup-generator>
>
> "systemd-veritysetup-generator implements B<systemd.generator>(7)\\&."
>
> "systemd-veritysetup-generator understands the following kernel command line "
> "parameters:"
Updated.
> Man page: systemd-volatile-root.service.8
> Issue: initrdyes → Initrd
Fixed.
> Man page: sysupdate.d.5
> Issue: : → \\&. (As above in TRANSFER)
Updated.
> Man page: sysupdate.d.5
> Issue: some → certain
Updated.
> Man page: sysupdate.d.5
> Issue 1: i\\&.e\\& → I\\&.e\\&
Fixed.
> Issue 2: the image → the system
"image" seems correct.
> Man page: tmpfiles.d.5
> Issue: systemd-tmpfiles → B<systemd-tmpfiles>(8)
Updated.
(s) is just ugly with a vibe of DOS. In most cases just using the normal plural
form is more natural and gramatically correct.
There are some log_debug() statements left, and texts in foreign licenses or
headers. Those are not touched on purpose.
The general idea is that users should be able to figure out if some option
that they see in a config file or on some internet page is something that
systemd knows about. Once users know that, yes, this was an option but has
been deprecated and removed from the documentation, it's much easier for them
to find any docs in old versions if they want to. Or to switch to something
different.