The general idea with users and groups created through sysusers is that an
appropriate number is picked when the allocation is made. The number that is
selected will be different on each system based on the order of creation of
users, installed packages, etc. Since system users and groups are not shared
between installations, this generally is not an issue. But it becomes a problem
for initrd: some file systems are shared between the initrd and the host (/run
and /dev are probably the only ones that matter). If the allocations are
different in the host and the initrd, and files survive switch-root, they will
have wrong ownership.
This makes the gids build-time-configurable for all groups and users where
state may survive the switch from initrd to the host.
In particular, all "hardware access" groups are like this: files in /dev will
be owned by them. Eventually the new udev would change ownership, but there
would be a moment where the files were owned by the wrong group. The
allocations are "soft-static" in the language of Fedora packaging guidelines:
the uid/gid will be used if possible, but we'll fall back to a different
one. TTY_GID is the exception, because the number is used directly.
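A minimal sketch of what "soft-static" means here, assuming a hypothetical
allocate_id() helper and a made-up dynamic range (neither is taken from
sysusers itself): the configured number is used when it is free, otherwise
the allocation falls back to a dynamically chosen one.

```python
def allocate_id(preferred, in_use, dynamic_range=range(999, 0, -1)):
    """Return `preferred` if it is free, else the first free dynamic id.

    `preferred` may be None, meaning "no build-time configuration".
    `dynamic_range` is an illustrative assumption, not the real range.
    """
    if preferred is not None and preferred not in in_use:
        return preferred
    for candidate in dynamic_range:
        if candidate not in in_use:
            return candidate
    raise RuntimeError("no free id left in the dynamic range")
```

With the same configured number in the initrd and on the host, both sides
arrive at the same id unless something else already took it.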
Similarly, the possibility to configure "soft-static" uids is added for daemons
which may usefully run in the initramfs: systemd-network (lease information and
interface state is serialized to /run), systemd-resolve (stub files and
interface state), systemd-timesync (/run/systemd/timesync).
Journal files are owned by the group systemd-journal, and acls are granted
for wheel and adm.
systemd-oom and systemd-coredump are excluded from this patch: I assume that
oomd is not useful in the initrd, and coredump leaves no state (it only creates
a pipe in /run?).
The defaults are not changed: if nothing is configured, dynamic allocation will
be used. I looked at a Debian system, and the numbers are all different than
on Fedora.
For Fedora, see the list of uids and gids at https://pagure.io/setup/blob/master/f/uidgid.
In particular, systemd-network and systemd-resolve got soft-static numbers to
make it easy to transition from a non-host-specific initrd to a host system
already a few years back (https://bugzilla.redhat.com/show_bug.cgi?id=1102002).
I also requested static allocations for sgx, input, render in
https://pagure.io/packaging-committee/issue/1078,
https://pagure.io/setup/pull-request/27.
Debugging udev issues especially during the early boot is fairly
difficult. Currently, you need to enable (at least) debug logging and
start monitoring uevents, try to reproduce the issue and then analyze
and correlate two (usually) huge log files. This is not ideal.
This patch aims to provide a much more focused debugging tool:
tracepoints. More often than not we tend to have at least the basic idea
about the issue we are trying to debug further, e.g. we know it is
storage related. Hence all of the debug data generated for network
devices is useless, adds clutter to the log files and generally
slows things down.
Using this set of tracepoints you can start asking very specific
questions related to event processing for a given device or subsystem.
Tracepoints can be used with various tracing tools but I will provide
examples using bpftrace.
Another important aspect to consider is that using tracepoints you can
debug production systems. There is no need to install test packages with
added logging, no debuginfo packages, etc...
Example usage (you might be asking such questions during the debug session):
Q: How can I list all tracepoints?
A: bpftrace -l 'usdt:/usr/lib/systemd/systemd-udevd:udev:*'
Q: What are the arguments for each tracepoint?
A: Look at the code and search for use of DEVICE_TRACE_POINT macro.
Q: How many times have we executed an external binary?
A: bpftrace -e 'usdt:/usr/lib/systemd/systemd-udevd:udev:spawn_exec { @cnt = count(); }'
Q: What binaries were executed while handling events for the "dm-0" device?
A: bpftrace -e 'usdt:/usr/lib/systemd/systemd-udevd:udev:spawn_exec / str(arg1) == "dm-0"/ { @cmds[str(arg4)] = count(); }'
Thanks to Thomas Weißschuh <thomas@t-8ch.de> for reviewing this patch
and contributions that allowed us to drop the dependency on the dtrace tool
and made the resulting code much more concise.
In commit d895e10a a test was introduced to validate that prefix is a
child of rootprefix. However, it only works when rootprefix is "/".
Since the test is ignored when rootprefix is equal to prefix, this is
only noticed if specifying both -Drootprefix= and -Dprefix=, e.g.:
$ meson foo -Drootprefix=/foo -Dprefix=/foo/bar
meson.build:111:8: ERROR: Problem encountered: Prefix is not below
root prefix (now rootprefix=/foo prefix=/foo/bar)
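For illustration, a check that also works for a rootprefix other than "/"
could look like the sketch below. This is not the meson code from this
commit; is_below() is a hypothetical helper, and treating equal paths as
acceptable is an assumption.

```python
import posixpath


def is_below(prefix, rootprefix):
    """Return True if `prefix` equals `rootprefix` or lies underneath it."""
    prefix = posixpath.normpath(prefix)
    rootprefix = posixpath.normpath(rootprefix)
    if rootprefix == "/":
        # Every absolute path is below the root directory.
        return prefix.startswith("/")
    # Compare on a path-component boundary so /foobar is not "below" /foo.
    return prefix == rootprefix or prefix.startswith(rootprefix + "/")
```

The component-boundary comparison is the important part: a plain string
prefix test would wrongly accept /foobar for rootprefix=/foo.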
On Debian, bpftool is installed in /usr/sbin, which is not in $PATH for
non-root users by default, so finding it fails.
Add a secondary, hard-coded '/usr/sbin/bpftool' after 'bpftool' so that
meson can find it.
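The lookup order can be sketched as follows: try each candidate in turn,
searching $PATH for bare names and checking absolute paths directly.
find_program() below is a hypothetical helper written for this note; it
only mimics the idea and is not meson's implementation.

```python
import os
import shutil


def find_program(*candidates):
    """Return the first usable candidate, or None if none is found."""
    for candidate in candidates:
        if os.path.isabs(candidate):
            # Hard-coded location: accept it if it is an executable file.
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                return candidate
        else:
            # Bare name: search the directories listed in $PATH.
            found = shutil.which(candidate)
            if found:
                return found
    return None
```

Called as find_program('bpftool', '/usr/sbin/bpftool'), the second entry is
only consulted when the $PATH search fails.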
https://packages.debian.org/sid/amd64/bpftool/filelist
This ensures that the fuzz test code is also built by default.
It also increases the test coverage a bit. Compiling the tests
*with* sanitizers is painfully slow, so this is not enabled. But
just compiling them without sanitizers is hardly noticeable. Running the tests
increases the test count and runtime:
622 tests, 26 s
to
922 tests, 35 s
I think this is acceptable.
This doesn't matter too much, but makes things a bit more consistent.
A minor advantage is that the file is not a configuration file for meson
anymore, so:
a) It is not built unless pulled in by another target. Since
we don't usually build man pages by default, this saves a tiny
amount of work.
b) When the .in file is updated, meson does not reconfigure everything,
but just rebuilds the dependent targets.
Now that the conversion is finished, time for benchmarking:
a full build with default settings (and -Dstandalonebinaries=true), yields
before this pull request: 1687 targets, 148.13s user 35.17s system 317% cpu 57.697 total
with the full pull request: 1714 targets, 143.07s user 27.87s system 314% cpu 54.369 total
The difference doesn't seem significant. Partial rebuilds might be faster as
mentioned before.
We had two big 'configuration_data' objects in meson config. (There are in fact
more. One is added in this series, and there's one for efi… But those others
have a handful of variables only for specific purposes and don't matter). The two
sets are 'conf' and 'substs', and were inherited from the original autotools
system. In the past there was even a third set ('m4_defines'), but @yuwata
removed it in 348b44372f. And those two/three
systems had very similar data, but with different variable names, because of
historical reasons. They also used subtly different quoting (.set()
vs. .set10() vs. .set_quoted()), which was required because the templating
engines were not flexible enough. This meant we had more work when changing
things, and we needed to search for different variable names, etc.
With a more flexible templating engine we can do with just one
configuration_data object.
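To illustrate why the quoting differences mattered, here is a rough model of
how one value ends up in a generated C header depending on which setter was
used. render_define() is a hypothetical helper, only boolean values are
modelled for .set(), and the output format is written from memory of what
meson generates, so treat the details as an assumption.

```python
def render_define(name, value, style):
    """Model the header line produced for one configuration_data entry."""
    if style == "set":
        # Boolean .set(): the name is either defined or undefined.
        return f"#define {name}" if value else f"#undef {name}"
    if style == "set10":
        # .set10(): always defined, to 1 or 0.
        return f"#define {name} {1 if value else 0}"
    if style == "set_quoted":
        # .set_quoted(): defined to a C string literal.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return f'#define {name} "{escaped}"'
    raise ValueError(f"unknown style: {style}")
```

A template that expects "1"/"0" cannot use a value stored with .set(), and
one that expects a bare string cannot use a value stored with .set_quoted(),
which is how the parallel sets of variables came about.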
The naming of variables is very inconsistent. I tried to use more
modern style naming (UNDERSCORED_TITLE_CASE), but I didn't change existing
names too much. Only SYSTEM_DATA_UNIT_PATH is renamed to SYSTEM_DATA_UNIT_DIR
to match SYSTEM_CONFIG_UNIT_DIR.
I want to stop using 'substs'. But in this case, configure_file() is nicer
than custom_target(), because it causes meson to immediately generate the
helpers after configuration, so it's possible to do
'meson build && build/man/man ...', without building anything first.
We only substitute one variable here, so let's use a custom configuration_data()
object.
m4 was hugely popular in the past, because autotools, automake, flex, bison and
many other things used it. But nowadays it is much less popular, and might not even
be installed in the buildroot. (m4 is small, so it doesn't make a big difference.)
(FWIW, Fedora dropped make from the buildroot now,
https://fedoraproject.org/wiki/Changes/Remove_make_from_BuildRoot. I think it's
reasonable to assume that m4 will be dropped at some point too.)
The main reason to drop m4 is that the syntax is not very nice, and we should
minimize the number of different syntaxes that we use. We still have two
(configure_file() with @FOO@ and jinja2 templates with {{foo}} and the
pythonesque conditional expressions), but at least we don't need m4 (with
m4_dnl and `quotes').
HAVE_SMACK_RUN_LABEL was dropped back in 348b44372f,
so one line in etc.conf was not rendered as expected ;(
Checking if names are defined is paying for itself!
We don't need two (and a half) templating systems anymore, yay!
I'm keeping the changes minimal, to make the diff manageable. Some enhancements
due to a better templating system might be possible in the future.
For handling of '## ' — see the next commit.
m4 was nice in '85, but the syntax feels a bit dated. Since we use python for
meson, let's use a popular python templating engine to replace some m4 usage.
A little nicety is that typos are caught:
FAILED: sysusers.d/systemd-remote.conf
/usr/bin/meson --internal exe --capture sysusers.d/systemd-remote.conf -- /home/zbyszek/src/systemd/tools/meson-render-jinja2.py config.h ../sysusers.d/systemd-remote.conf.j2
Traceback (most recent call last):
File "/home/zbyszek/src/systemd/tools/meson-render-jinja2.py", line 28, in <module>
print(render(sys.argv[2], defines))
File "/home/zbyszek/src/systemd/tools/meson-render-jinja2.py", line 24, in render
return template.render(defines)
File "/usr/lib/python3.9/site-packages/jinja2/environment.py", line 1090, in render
self.environment.handle_exception()
File "/usr/lib/python3.9/site-packages/jinja2/environment.py", line 832, in handle_exception
reraise(*rewrite_traceback_stack(source=source))
File "/usr/lib/python3.9/site-packages/jinja2/_compat.py", line 28, in reraise
raise value.with_traceback(tb)
File "<template>", line 8, in top-level template code
jinja2.exceptions.UndefinedError: 'HAVE_MICROHTTP' is undefined
This checking mirrors what 349cc4a507 did for C defines.
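The same "undefined names are an error" behaviour can be shown with the
standard library alone. The sketch below uses string.Template instead of
jinja2 (so the placeholder syntax is ${NAME}, not {{NAME}}), and render() is
a hypothetical stand-in, not tools/meson-render-jinja2.py.

```python
from string import Template


def render(template_text, defines):
    """Substitute every ${NAME}; raise KeyError if a name is not defined."""
    return Template(template_text).substitute(defines)


if __name__ == "__main__":
    defines = {"HAVE_MICROHTTPD": "1"}
    # The misspelled name from the traceback above is rejected, not
    # silently rendered as an empty string.
    try:
        render("value: ${HAVE_MICROHTTP}", defines)
    except KeyError as error:
        print("undefined:", error)
```

A lenient engine would have rendered the misspelled variable as empty and
the mistake would only show up at runtime, if at all.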
Old meson fails with:
Element not a string: [<Holder: <ExternalProgram 'sh' -> ['/bin/sh']>>, '-c', 'test -n "$DESTDIR" || /bin/journalctl --update-catalog']
I'm doing it as a revert so that it's easy to undo the revert when we require
newer meson. The effect is not so bad, maybe a dozen or so lines about finding
'sh'.
Meson 0.58 has gotten quite bad with emitting a message every time
a quoted command is used:
Program /home/zbyszek/src/systemd-work/tools/meson-make-symlink.sh found: YES (/home/zbyszek/src/systemd-work/tools/meson-make-symlink.sh)
Program sh found: YES (/usr/bin/sh)
Program sh found: YES (/usr/bin/sh)
Program sh found: YES (/usr/bin/sh)
Program sh found: YES (/usr/bin/sh)
Program sh found: YES (/usr/bin/sh)
Program sh found: YES (/usr/bin/sh)
Program xsltproc found: YES (/usr/bin/xsltproc)
Configuring custom-entities.ent using configuration
Message: Skipping bootctl.1 because ENABLE_EFI is false
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Message: Skipping journal-remote.conf.5 because HAVE_MICROHTTPD is false
Message: Skipping journal-upload.conf.5 because HAVE_MICROHTTPD is false
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Message: Skipping loader.conf.5 because ENABLE_EFI is false
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
Program ln found: YES (/usr/bin/ln)
...
Let's suffer one message only for each command. Hopefully we can silence
even this when https://github.com/mesonbuild/meson/issues/8642 is
resolved.
Today this is v248 with 938bdfc0fa, which,
if you don't know about the github webflow key, fails to configure with
meson.build:724:8: ERROR: String "gpg: Signature made Tue 30 Mar 2021 22:59:02 CEST\ngpg: using RSA key 4AEE18F83AFDEB23\ngpg: Can't check signature: No public key\n1617137942\n" cannot be converted to int
or, if you do, with
meson.build:724:8: ERROR: String 'gpg: Signature made Tue 30 Mar 2021 22:59:02 CEST\ngpg: using RSA key 4AEE18F83AFDEB23\ngpg: Good signature from "GitHub (web-flow commit signing) <noreply@github.com>" [unknown]\ngpg: WARNING: This key is not certified with a trusted signature!\ngpg: There is no indication that the signature belongs to the owner.\nPrimary key fingerprint: 5DE3 E050 9C47 EA3C F04A 42D3 4AEE 18F8 3AFD EB23\n1617137942\n' cannot be converted to int
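Both errors come from converting the whole command output to an integer
while gpg's signature report precedes the timestamp. As an illustration
only (this is not necessarily how the commit fixes it), a tolerant parser
could pick the last line that consists solely of digits; parse_timestamp()
is a hypothetical name.

```python
def parse_timestamp(output):
    """Return the last all-digit line of `output` as an integer."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    for line in reversed(lines):
        if line.isdigit():
            return int(line)
    raise ValueError("no timestamp line found in command output")
```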
New glibc deprecated mallinfo(), even newer glibc added mallinfo2()
as replacement. Use it, if it exists.
Follow-up for 4b6f74f5a0 and related
commits.
https://github.com/systemd/systemd/pull/19316 failed with:
[1065/1670] Linking target systemd-hwdb
--- command ---
14:28:29 /root/src/test/hwdb-test.sh
--- stdout ---
./systemd-hwdb does not exist, please build first
I'm not sure what is going on here… In principle meson says that tests may be
called from any directory, but in practice it was always the build directory.
So far we were relying on systemd-hwdb being present in '.', and this worked.
Either way, it's nicer to pass the exact path, so let's do that.
* Add `bpf-framework` feature gate with 'auto', 'true' and 'false' choices
* Add libbpf [0] dependency
* Search for clang, llvm-strip and bpftool binaries at build time to
generate bpf skeleton.
For libbpf [0], make 0.2.0 [1] the minimum required version.
If libbpf is satisfied, set HAVE_LIBBPF config option to 1.
If the `bpf-framework` feature gate is set to 'auto', whether the bpf
feature is enabled or not is determined by the presence of all of
libbpf, clang, llvm and bpftool in the build environment.
With 'auto' all dependencies are optional.
If the gate is set to `true`, make all of the libbpf, clang and llvm
dependencies mandatory.
If it's set to `false`, set `BPF_FRAMEWORK` to false and make libbpf
dependency optional.
The libbpf dependency is dynamic, following the common pattern in systemd.
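The three settings can be summarized with a small sketch.
resolve_bpf_framework() and the dependency names in REQUIRED are
illustrative assumptions (in particular, treating all four tools as
mandatory for 'true' goes slightly beyond the list given above); the real
logic lives in meson.build.

```python
REQUIRED = ("libbpf", "clang", "llvm-strip", "bpftool")  # assumed names


def resolve_bpf_framework(option, available):
    """Return True if the bpf framework should be enabled.

    `option` is 'auto', 'true' or 'false'; `available` maps a dependency
    name to whether it was found in the build environment.
    """
    missing = [dep for dep in REQUIRED if not available.get(dep, False)]
    if option == "false":
        return False
    if option == "auto":
        # Every dependency is optional: enable only if nothing is missing.
        return not missing
    if option == "true":
        # Dependencies are mandatory: a missing one is a hard error.
        if missing:
            raise RuntimeError("missing dependencies: " + ", ".join(missing))
        return True
    raise ValueError(f"unknown option value: {option}")
```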
meson, bpf: add build rule for socket_bind program
Try to make this more manageable by reordering:
- dependencies / inputs
(with subcategory of compression libraries)
- major components / outputs
- optional features / conditionals that don't fit into the two above categories
The division isn't well defined, because libraries often correspond one-to-one
to features, but not always.
Let's assert if we ever happen to pass 0 to one of the log functions.
With the preceding commit to return -EIO from log_*(), passing 0 wouldn't
affect the return value any more, but it is still most likely an error.
The unit test code is an exception: we fairly often pass the return value
to print it, before checking what it is. So let's assert that we're not
passing 0 in non-test code. As with the previous check for %m, this is only
done in developer mode. We are depending on external code setting
errno correctly for us, which might not always be true, and which we can't
test, so we shouldn't assert, but just handle this gracefully.
I did a bunch of greps to try to figure out if there are any places where
we're passing 0 on purpose, and couldn't find any.
The one place that failed in tests is adjusted.
About "zerook" in the name: I wanted the suffix to be unambiguous. It's a
single "word" because each of the words in log_full_errno is also meaningful,
and having one term use two words would be confusing.
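The intended split can be sketched in a few lines. The real code is a set of
C macros; the Python below only models the behaviour described above, the
DEVELOPER_MODE flag is a stand-in for the build-mode conditional, and the
exact return values are an assumption based on the preceding commit.

```python
import errno
import sys

DEVELOPER_MODE = True  # stand-in for the developer build mode


def log_full_errno_zerook(error, message):
    """Log `message`; passing error == 0 is explicitly allowed here."""
    print(f"{message}: error {abs(error)}", file=sys.stderr)
    # A zero error is mapped to -EIO so callers never get 0 back.
    return -abs(error) if error != 0 else -errno.EIO


def log_full_errno(error, message):
    """Like the zerook variant, but treat error == 0 as a programming bug."""
    if DEVELOPER_MODE and error == 0:
        raise AssertionError("log function called with error == 0")
    return log_full_errno_zerook(error, message)
```

Test code that prints a return value before checking it would use the
zerook variant; everything else gets the check in developer builds.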
Using an enum is all nice and generic, but at this point it seems unlikely that
we'll add further build modes. But having an enum means that we need to include
the header file with the enumeration whenever the conditional is used. I want
to use the conditional in log.h, which makes it hard to avoid circular imports.
We intentionally do not inline initializations with definitions for
a bunch of _cleanup_ variables in tests, to ensure valgrind is triggered.
This triggers a lot of maybe-uninitialized false positives when -O2 and
-flto are used. Suppress them.