We need to make sure that our coredump pattern handler manages to read
process metadata from /proc/$PID/ before the kernel reaps the crashed
process. By default the kernel will reap the process as soon as it can.
By setting kernel.core_pipe_limit to a non-zero the kernel will wait for
userspace to finish before reaping.
We'll set the value to 16, which allows 16 crashes to be
processed in parallel. This matches the MaxConnections= setting in
systemd-coredump.socket.
See: #17301
(This doesn't close 17301, since we probably should also gracefully
handle if /proc/$PID/ vanished already while our coredump handler runs,
just in case people loclly set the sysctl back to zero. i.e. we should
collect what we can and rather issue an incomplete log record than
none.)
Right now the kernel will not dump anything that went through setuid or
setgid. But it is routine for daemons to do that, and it makes things hard to
debug.
systemd-coredump saves the coredump readable by the users the process was
running as. This should be enough to avoid information leakage. So let's also
tell the kernel to do the coredump.
For https://bugzilla.redhat.com/show_bug.cgi?id=1790972.
Both patterns are stored in the same file, so they are enabled or disabled
together. (Though suid_dumpable=2 is supposed to be safe even when writing to
plain files.)
I couldn't see any reason why the kernel could provide COMM to the coredump
handler via the core_pattern command line but could not make it available in
/proc. So let's assume that this info is always available in /proc.
For "backtrace" mode (when --backtrace option is passed), I assumed that the
crashing process still exists at the time systemd-coredump is called.
Also changing the core_pattern line is an API breakage for any users of the
backtrace mode but given that systemd-coredump is installed in
/usr/lib/systemd, it's a private tool which has no internal users. At least no
one complained when the hostname was added to the core_pattern line
(f45b801551)...
Indeed it's much easier to get it from /proc since the kernel substitutes '%e'
specifier with multiple strings if the process name contains spaces (!).
This commint adds a new command line parameter to sytemd-coredump. The
parameter should be mappend to core_pattern's placeholder %h - hostname.
The field _HOSTNAME holds the name from the kernel's namespaces which might be
different then the one comming from process' namespaces.
It is true that the real hostname is usually available in the field
COREDUMP_ENVIRON (environment variables) but I believe it is more reliable to
use the value passed by kernel.
----
The length of iovec is no longer static and hence I corrected the declarations
of the functions set_iovec_field and set_iovec_field_free.
Thank you @yuwata and @poettering!
With this change processing/saving of coredumps takes the RLIMIT_CORE resource limit of the crashing process into
account, given the user control whether specific processes shall core dump or not, and how large to make the core dump.
Note that this effectively disables core-dumping for now, as RLIMIT_CORE defaults to 0 (i.e. is disabled) for all
system processes.