We cannot remove CAP_SYS_ADMIN, which basically makes removing all other
capabilities useless. Anyhow, still wouldn't hurt checking whether stuff
like CAP_KILL can be dropped from logind.
This way we have four kinds of properties:
a) those which are constant as long as an object exists
b) those which can change and PropertiesChange messages with contents are generated
c) those which can change and where the PropertesChange merely includes invalidation
d) those which can change but for which no events are generated
Clients (through code generators run on the introspection XML) can thus
aggressively cache a, b, c, with only d excluded.
Uevents are events of the host, which should not leak into a container.
Containers do not support hotplug at the moment, and devices and uevents
are not namespace aware.
The pattern of unreffing an IO event source and then closing its fd is
frequently seen in even source callbacks. Previously this likely
resultet in us removing the fd from the epoll after it was closed which
is problematic, since while we were dispatching we always kept an extra
reference to event source objects because we might still need it later.
With this change a failing event source handler will not cause the
entire event loop to fail. Instead, we just disable the specific event
source, log a message at debug level and go on.
This also introduces a new concept of "exit code" which can be stored in
the event loop and is returned by sd_event_loop(). We also rename "quit"
to "exit" everywhere else.
Altogether this should make things more robus and keep errors local
while still providing a way to return event loop errors in a clear way.
This adds the new library call sd_journal_open_container() and a new
"-M" switch to journalctl. Particular care is taken that journalctl's
"-b" switch resolves to the current boot ID of the container, not the
host.
Adds a new call sd_event_set_watchdog() that can be used to hook up the
event loop with the watchdog supervision logic of systemd. If enabled
and $WATCHDOG_USEC is set the event loop will ping the invoking systemd
daemon right after coming back from epoll_wait() but not more often than
$WATCHDOG_USEC/4. The epoll_wait() will sleep no longer than
$WATCHDOG_USEC/4*3, to make sure the service manager is called in time.
This means that setting WatchdogSec= in a .service file and calling
sd_event_set_watchdog() in your daemon is enough to hook it up with the
watchdog logic.
That way the even source callback is run with the zombie process still
around so that it can access /proc/$PID/ and similar, and so that it can
be sure that the PID has not been reused yet.
Introduces a new concept of "trusted" vs. "untrusted" busses. For the
latter libsystemd-bus will automatically do per-method access control,
for the former all access is automatically granted. Per-method access
control is encoded in the vtables: by default all methods are only
accessible to privileged clients. If the SD_BUS_VTABLE_UNPRIVILEGED flag
is set for a method it is accessible to unprivileged clients too. By
default whether a client is privileged is determined via checking for
its CAP_SYS_ADMIN capability, but this can be altered via the
SD_BUS_VTABLE_CAPABILITY() macro that can be ORed into the flags field
of the method.
Writable properties are also subject to SD_BUS_VTABLE_UNPRIVILEGED and
SD_BUS_VTABLE_CAPABILITY() for controlling write access to them. Note
however that read access is unrestricted, as PropertiesChanged messages
might send out the values anyway as an unrestricted broadcast.
By default the system bus is set to "untrusted" and the user bus is
"trusted" since per-method access control on the latter is unnecessary.
On dbus1 busses we check the UID of the caller rather than the
configured capability since the capability cannot be determined without
race. On kdbus the capability is checked if possible from the attached
meta-data of a message and otherwise queried from the sending peer.
This also decorates the vtables of the various daemons we ship with
these flags.
It tries to find a suitable QEMU binary and will use KVM if present.
We can now configure QEMU from outside with 4 variables :
- $QEMU_BIN : path to QEMU's binary
- $KERNEL_APPEND : arguments appended to kernel cmdline
- $KERNEL_BIN : path to a kernel
Default /boot/vmlinuz-$KERNEL_VER
- $INITRD : path to an initramfs
Default /boot/initramfs-${KERNEL_VER}.img
- $QEMU_SMP : number of CPU simulated by QEMU.
Default 1
(from Alexander Graf's script: http://www.spinics.net/lists/kvm/msg72389.html)
Instead of returning an enum of return codes, make them return error
codes like kdbus does internally.
Also, document this behaviour so that clients can stick to it.
(Also rework bus-control.c to always have to functions for dbus1 vs.
kernel implementation of the various calls.)