This adds the host side of the veth link to the given bridge.
Also refactor the creation of the veth interfaces a bit to set it up
from the host rather than the container. This simplifies the addition
to the bridge, but otherwise the behavior is unchanged.
We cannot remove CAP_SYS_ADMIN, which basically makes removing all other
capabilities useless. Anyhow, still wouldn't hurt checking whether stuff
like CAP_KILL can be dropped from logind.
This way we have four kinds of properties:
a) those which are constant as long as an object exists
b) those which can change and PropertiesChange messages with contents are generated
c) those which can change and where the PropertesChange merely includes invalidation
d) those which can change but for which no events are generated
Clients (through code generators run on the introspection XML) can thus
aggressively cache a, b, c, with only d excluded.
Uevents are events of the host, which should not leak into a container.
Containers do not support hotplug at the moment, and devices and uevents
are not namespace aware.
The pattern of unreffing an IO event source and then closing its fd is
frequently seen in even source callbacks. Previously this likely
resultet in us removing the fd from the epoll after it was closed which
is problematic, since while we were dispatching we always kept an extra
reference to event source objects because we might still need it later.
With this change a failing event source handler will not cause the
entire event loop to fail. Instead, we just disable the specific event
source, log a message at debug level and go on.
This also introduces a new concept of "exit code" which can be stored in
the event loop and is returned by sd_event_loop(). We also rename "quit"
to "exit" everywhere else.
Altogether this should make things more robus and keep errors local
while still providing a way to return event loop errors in a clear way.
This adds the new library call sd_journal_open_container() and a new
"-M" switch to journalctl. Particular care is taken that journalctl's
"-b" switch resolves to the current boot ID of the container, not the
host.
Adds a new call sd_event_set_watchdog() that can be used to hook up the
event loop with the watchdog supervision logic of systemd. If enabled
and $WATCHDOG_USEC is set the event loop will ping the invoking systemd
daemon right after coming back from epoll_wait() but not more often than
$WATCHDOG_USEC/4. The epoll_wait() will sleep no longer than
$WATCHDOG_USEC/4*3, to make sure the service manager is called in time.
This means that setting WatchdogSec= in a .service file and calling
sd_event_set_watchdog() in your daemon is enough to hook it up with the
watchdog logic.
That way the even source callback is run with the zombie process still
around so that it can access /proc/$PID/ and similar, and so that it can
be sure that the PID has not been reused yet.
Introduces a new concept of "trusted" vs. "untrusted" busses. For the
latter libsystemd-bus will automatically do per-method access control,
for the former all access is automatically granted. Per-method access
control is encoded in the vtables: by default all methods are only
accessible to privileged clients. If the SD_BUS_VTABLE_UNPRIVILEGED flag
is set for a method it is accessible to unprivileged clients too. By
default whether a client is privileged is determined via checking for
its CAP_SYS_ADMIN capability, but this can be altered via the
SD_BUS_VTABLE_CAPABILITY() macro that can be ORed into the flags field
of the method.
Writable properties are also subject to SD_BUS_VTABLE_UNPRIVILEGED and
SD_BUS_VTABLE_CAPABILITY() for controlling write access to them. Note
however that read access is unrestricted, as PropertiesChanged messages
might send out the values anyway as an unrestricted broadcast.
By default the system bus is set to "untrusted" and the user bus is
"trusted" since per-method access control on the latter is unnecessary.
On dbus1 busses we check the UID of the caller rather than the
configured capability since the capability cannot be determined without
race. On kdbus the capability is checked if possible from the attached
meta-data of a message and otherwise queried from the sending peer.
This also decorates the vtables of the various daemons we ship with
these flags.
It tries to find a suitable QEMU binary and will use KVM if present.
We can now configure QEMU from outside with 4 variables :
- $QEMU_BIN : path to QEMU's binary
- $KERNEL_APPEND : arguments appended to kernel cmdline
- $KERNEL_BIN : path to a kernel
Default /boot/vmlinuz-$KERNEL_VER
- $INITRD : path to an initramfs
Default /boot/initramfs-${KERNEL_VER}.img
- $QEMU_SMP : number of CPU simulated by QEMU.
Default 1
(from Alexander Graf's script: http://www.spinics.net/lists/kvm/msg72389.html)
Instead of returning an enum of return codes, make them return error
codes like kdbus does internally.
Also, document this behaviour so that clients can stick to it.
(Also rework bus-control.c to always have to functions for dbus1 vs.
kernel implementation of the various calls.)
Since we want to retain the ability to break kernel ←→ userspace ABI
after the next release, let's not make use by default of kdbus, so that
people with future kernels will not suddenly break with current systemd
versions.
kdbus support is left in all builds but must now be explicitly requested
at runtime (for example via setting $DBUS_SESSION_BUS). Via a configure
switch the old behaviour can be restored. In fact, we change autogen.sh
to do this, so that git builds (which run autogen.sh) get kdbus by
default, but tarball builds (which ue the configure defaults) do not get
it, and hence this stays out of the distros by default.
This reverts commit adcf4c81c5.
We have a better solution for the problem of making two processes run in
the same namespace, and --listener is not needed hence and should be
dropped.
Conflicts:
man/systemd-socket-proxyd.xml
* library support for setns() system call was added to glibc
version 2.14 (setns() call is use in src/machine/machinectl.c
and src/libsystemd-bus-container.c)
* utf8 validation call are already exported (via sd-utf8.c file) -
commit - 369c583b3f
Message handler callbacks can be simplified drastically if the
dispatcher automatically replies to method calls if errors are returned.
Thus: add an sd_bus_error argument to all message handlers. When we
dispatch a message handler and it returns negative or a set sd_bus_error
we send this as message error back to the client. This means errors
returned by handlers by default are given back to clients instead of
rippling all the way up to the event loop, which is desirable to make
things robust.
As a side-effect we can now easily turn the SELinux checks into normal
function calls, since the method call dispatcher will generate the right
error replies automatically now.
Also, make sure we always pass the error structure to all property and
method handlers as last argument to follow the usual style of passing
variables for return values as last argument.
This patch converts PID 1 to libsystemd-bus and thus drops the
dependency on libdbus. The only remaining code using libdbus is a test
case that validates our bus marshalling against libdbus' marshalling,
and this dependency can be turned off.
This patch also adds a couple of things to libsystem-bus, that are
necessary to make the port work:
- Synthesizing of "Disconnected" messages when bus connections are
severed.
- Support for attaching multiple vtables for the same interface on the
same path.
This patch also fixes the SetDefaultTarget() and GetDefaultTarget() bus
calls which used an inappropriate signature.
As a side effect we will now generate PropertiesChanged messages which
carry property contents, rather than just invalidation information.
When a service exits succesfully and has RemainAfterExit set, its hold
on the console (in m->n_on_console) wasn't released since the unit state
didn't change.
This tool applies hardware specific settings to network devices before they
are announced via libudev.
Settings that will probably eventually be supported are MTU, Speed,
DuplexMode, WakeOnLan, MACAddress, MACAddressPolicy (e.g., 'hardware',
'synthetic' or 'random'), Name and NamePolicy (replacing our current
interface naming logic). This patch only introduces support for
Description, as a proof of concept.
Some of these settings may later be overriden by a network management
daemon/script. However, these tools should always listen and wait on libudev
before touching a device (listening on netlink is not enough). This is no
different from how things used to be, as we always supported changing the
network interface name from udev rules, which does not work if someone
has already started using it.
The tool is configured by .link files in /etc/net/links/ (with the usual
overriding logic in /run and /lib). The first (in lexicographical order)
matching .link file is applied to a given device, and all others are ignored.
The .link files contain a [Match] section with (currently) the keys
MACAddress, Driver, Type (see DEVTYPE in udevadm info) and Path (this
matches on the stable device path as exposed as ID_PATH, and not the
unstable DEVPATH). A .link file matches a given device if all of the
specified keys do. Currently the keys are treated as plain strings,
but some limited globbing may later be added to the keys where it
makes sense.
Example:
/etc/net/links/50-wireless.link
[Match]
MACAddress=98:f2:e4:42:c6:92
Path=pci-0000:02:00.0-bcma-0
Type=wlan
[Link]
Description=The wireless link
We now treat passno as boleans in the generators, and don't need this any more. fsck itself
is able to sequentialize checks on the same local media, so in the common case the ordering
is redundant.
It is still possible to force an order by using .d fragments, in case that is desired.
rename old versions to ascii_*
Do not take into account zerowidth characters, but do consider double-wide characters.
Import needed utf8 helper code from glib.
v3: rebase ontop of utf8 restructuring work
[zj: tweak the algorithm a bit, move new code to separate file]
This adds a lightweight scheme how to define interfaces in static fixed
arrays which then can be easily registered on a bus connection. This
makes it much easier to write bus services.
This automatically handles implementation of the Properties,
ObjectManager, and Introspection bus interfaces.
The cgroup attribute memory.soft_limit_in_bytes is unlikely to stay
around in the kernel for good, so let's not expose it for now. We can
readd something like it later when the kernel guys decided on a final
API for this.