Device mapper devices are set up in multiple steps. The first step, which
generates the initial "add" event, only creates an empty container, which is
useless for higher layers. SYSTEMD_READY should be set to 0 on this event to
avoid premature device activation.
The event that matters is the "activation" event: the first "change" event on
which DM_UDEV_DISABLE_OTHER_RULES_FLAG=1 is not set. When this event arrives,
the device is ready for being scanned by blkid and similar tools, and for being
activated by systemd.
Intermittent events with DM_UDEV_DISABLE_OTHER_RULES_FLAG=1 should be ignored
as far as systemd or higher-level block layers are concerned. Previous device
properties and symlinks should be preserved: the device shouldn't be scanned or
activated, but shouldn't be deactivated, either. In particular, SYSTEMD_READY
shouldn't be set to 0 if it wasn't set before, because that might cause mounted
file systems to be unmounted. Such intermittent events may occur at any time,
before or after the "activation" event.
DM_UDEV_DISABLE_OTHER_RULES_FLAG=1 can be set for multiple reasons. One possible reason
is that the device is suspended. There are other reasons that depend on the
device-mapper subsystem (LVM, multipath, dm-crypt, etc.).
The current systemd rule set
1) sets SYSTEMD_READY=0 if DM_UDEV_DISABLE_OTHER_RULES_FLAG is set in "add"
events;
2) imports SYSTEMD_READY from the udev db if DM_SUSPENDED is set, and jumps to systemd_end;
3) sets SYSTEMD_READY=1, otherwise.
This logic has several flaws:
* 1) can cause file systems to be unmounted if a coldplug event arrives while
a file system is suspended. This rule shouldn't be applied for coldplug events
or, in general, for "synthetic" add events;
* 2) evaluates DM_SUSPENDED=1, which is a device-mapper internal property.
It's wrong to infer that a device is accessible if DM_SUSPENDED=0.
The jump to systemd_end may cause properties and/or symlinks to be lost;
* 3) is superfluous, because SYSTEMD_READY=1 is equivalent to SYSTEMD_READY
being unset, and can create the wrong impression that the device was explicitly
activated.
This patch fixes the logic as follows:
- apply 1) only if DM_NAME is empty, which is only the case for the first
"genuine add" event;
- change 2) to use DM_UDEV_DISABLE_OTHER_RULES_FLAG instead of DM_SUSPENDED,
and remove the GOTO directive;
- remove 3).
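
The patched logic can be sketched as a small Python model. This is only an
illustration of the three points above, not the actual rule file: the function
names process_dm_event() and is_ready() are hypothetical, and the model assumes
that each event starts with fresh properties and that the udev db holds the
properties stored by the previous event.

```python
from typing import Dict


def process_dm_event(action: str, env: Dict[str, str], db: Dict[str, str]) -> Dict[str, str]:
    """Return the properties stored for the device after one uevent.

    env: properties set by earlier rules for this event (assumption).
    db:  properties stored by the previous event for the same device.
    """
    props = dict(env)
    # 1) (patched) only the first, genuine "add" event has an empty DM_NAME.
    if action == "add" and not props.get("DM_NAME"):
        props["SYSTEMD_READY"] = "0"
    # 2) (patched) while other rules are disabled, keep the previous state
    #    by importing SYSTEMD_READY from the db; no GOTO, processing continues.
    if props.get("DM_UDEV_DISABLE_OTHER_RULES_FLAG") == "1" and "SYSTEMD_READY" in db:
        props["SYSTEMD_READY"] = db["SYSTEMD_READY"]
    # 3) was removed: an unset SYSTEMD_READY already means "ready".
    return props


def is_ready(props: Dict[str, str]) -> bool:
    """SYSTEMD_READY unset is treated the same as SYSTEMD_READY=1."""
    return props.get("SYSTEMD_READY", "1") != "0"
```

In this model, a synthetic "add" event that arrives after activation with
DM_UDEV_DISABLE_OTHER_RULES_FLAG=1 leaves the device ready, because DM_NAME is
already set and the db contains no SYSTEMD_READY=0 to import.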
Fixes: b7cf1b6 ("udev: use SYSTEMD_READY to mask uninitialized DM devices")
Fixes: 35a6750 ("rules: set SYSTEMD_READY=0 on DM_UDEV_DISABLE_OTHER_RULES_FLAG=1 only with ADD event (#2747)")
Signed-off-by: Martin Wilck <mwilck@suse.com>
Add persistent symlinks for media controller ("mediaX") devices, based
on their ID_PATH udev properties.
For example, if the uvcvideo driver creates /dev/media0, a persistent
name may be:
/dev/media/by-path/pci-0000:04:00.3-usb-0:1:1.0-media-controller
Persistent links are a handy tool to make scripts self-documenting
during development or in tests, as well as less error prone in case of
devices changing enumeration order. For media controllers, one can
alternatively scan through all of them and look for a matching bus_info
in their struct media_device_info, but the links are much handier when
drafting something by hand.
A similar pattern already exists for Video4Linux /dev/videoX devices,
see 60-persistent-v4l.rules for those.
If there are name collisions in the leds subsystem, the 2nd device node with the
colliding name gets automatically renamed by appending _1, the third by
appending _2 and so on.
This wildcard change makes sure that systemd-backlight also catches these
renamed nodes for kbd_backlight entries.
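
The effect of the wildcard can be illustrated with Python's fnmatch module,
used here as a stand-in for udev's glob matching. The two patterns and the LED
names are assumptions chosen for illustration; the exact pattern in the rule
file may differ.

```python
from fnmatch import fnmatchcase

# Assumed patterns: without and with the trailing wildcard.
OLD_PATTERN = "*kbd_backlight"
NEW_PATTERN = "*kbd_backlight*"


def matches(pattern: str, led_name: str) -> bool:
    """Case-sensitive glob match of an LED device name against a pattern."""
    return fnmatchcase(led_name, pattern)


# Hypothetical LED names: the original node and a node renamed on collision.
names = ["tpacpi::kbd_backlight", "tpacpi::kbd_backlight_1"]
old_hits = [n for n in names if matches(OLD_PATTERN, n)]
new_hits = [n for n in names if matches(NEW_PATTERN, n)]
```

With these assumed patterns, only the trailing-wildcard form also catches the
node that was renamed with a _1 suffix.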
Distributions apparently only compile a subset of TPM2 drivers into the
kernel. For those not compiled in but provided as kmod we need a
synchronization point: we must wait before the first TPM2 interaction
until the driver is available and accessible.
This adds a tpm2.target unit as such a synchronization point. It's
ordered after /dev/tpmrm0, and is pulled in by a generator whenever we
detect that the kernel reported a TPM2 to exist but we have no device
for it yet.
This should solve the issue, but might create problems: if there are TPM
devices supported by firmware that we don't have Linux drivers for we'll
hang for a bit. Hence let's add a kernel cmdline switch to disable (or
alternatively force) this logic.
Fixes: #30164
The hwdb call for the hidraw subsystem is missing, so AV controller devices defined in hwdb.d/70-av-production.hwdb never get the proper permissions for /dev/hidraw*. This patch runs the hwdb lookup for hidraw devices as well.
Users can currently pick specific versions of NIC naming, but that
does not guarantee that NIC names won't change after the kernel adds
a new sysfs attribute.
This patch allows for an allow/deny list of sysfs attributes
that could be used when composing the name.
These lists can be supplied as an hwdb entry in the form of
/etc/udev/hwdb.d/50-net-naming-allowlist.hwdb
net:naming:drvirtio_net
ID_NET_NAME_ALLOW=0
ID_NET_NAME_ALLOW_ACPI_INDEX=1
ID_NET_NAME_ALLOW_ADDR_ASSIGN_TYPE=1
ID_NET_NAME_ALLOW_ADDRESS=1
ID_NET_NAME_ALLOW_ARI_ENABLED=1
ID_NET_NAME_ALLOW_DEV_PORT=1
ID_NET_NAME_ALLOW_FUNCTION_ID=1
ID_NET_NAME_ALLOW_IFLINK=1
ID_NET_NAME_ALLOW_INDEX=1
ID_NET_NAME_ALLOW_LABEL=1
ID_NET_NAME_ALLOW_PHYS_PORT_NAME=1
ID_NET_NAME_ALLOW_TYPE=1
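
One plausible reading of such an entry, sketched in Python: ID_NET_NAME_ALLOW
acts as the default for all attributes, and ID_NET_NAME_ALLOW_<ATTRIBUTE>
overrides it for a single sysfs attribute. The helper attr_allowed() is
hypothetical, and the assumption that everything is allowed when no property
is present is mine, not taken from the patch.

```python
from typing import Dict


def attr_allowed(props: Dict[str, str], sysattr: str) -> bool:
    """Decide whether a sysfs attribute may be used when composing a name."""
    key = "ID_NET_NAME_ALLOW_" + sysattr.upper()
    if key in props:
        # A per-attribute entry takes precedence over the default.
        return props[key] == "1"
    # Assumption: with no default given, every attribute is allowed.
    return props.get("ID_NET_NAME_ALLOW", "1") == "1"


# A shortened version of the hwdb entry, as properties on the device.
props = {
    "ID_NET_NAME_ALLOW": "0",
    "ID_NET_NAME_ALLOW_ACPI_INDEX": "1",
    "ID_NET_NAME_ALLOW_DEV_PORT": "1",
}
```

Under this reading, a sysfs attribute that a future kernel adds is not listed
explicitly and therefore falls back to the default of 0, so it cannot change
the resulting name.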
When the same disk image is written to multiple storage units, for
example an external SD card and an internal eMMC, the symlinks in
/dev/disk/by-{label,uuid,partlabel,partuuid}/ are no longer unique, and
will point to the device that is probed last.
Addressing partitions via labels and UUIDs is nice to work with, and
depending on the use case, it might also be more robust than using the
symlinks in /dev/disk/by-path/ containing the partition number. Combine
the two approaches to create unique symlinks containing both the device
path as well as the respective UUIDs or labels, and throw in a symlink
using the devpath and the partition number for the sake of completeness.
For an exemplary GPT-partitioned disk at "platform-2198000.mmc" with a
partition containing an ext4 file system, this might create symlinks of
the following form:
/dev/disk/by-path/platform-2198000.mmc-part/by-partnum/1
/dev/disk/by-path/platform-2198000.mmc-part/by-partuuid/e5a75233-3b90-4aec-8075-b4dd7132b48d
/dev/disk/by-path/platform-2198000.mmc-part/by-partlabel/rootfs
/dev/disk/by-path/platform-2198000.mmc-part/by-uuid/b2c92f24-8215-4680-b931-f423aae5f1c9
/dev/disk/by-path/platform-2198000.mmc-part/by-label/rootfs
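
How these names are composed can be sketched in Python. The helper
partition_links() and its parameter names are hypothetical; the layout is
inferred from the example above only, and links are emitted only for values
that are actually known.

```python
from typing import List, Optional


def partition_links(
    id_path: str,
    partnum: Optional[str] = None,
    partuuid: Optional[str] = None,
    partlabel: Optional[str] = None,
    fs_uuid: Optional[str] = None,
    fs_label: Optional[str] = None,
) -> List[str]:
    """Build per-disk symlink names from the device path and partition data."""
    base = f"/dev/disk/by-path/{id_path}-part"
    candidates = [
        ("by-partnum", partnum),
        ("by-partuuid", partuuid),
        ("by-partlabel", partlabel),
        ("by-uuid", fs_uuid),
        ("by-label", fs_label),
    ]
    # Skip identifiers that are missing for this partition.
    return [f"{base}/{kind}/{value}" for kind, value in candidates if value]


links = partition_links(
    "platform-2198000.mmc",
    partnum="1",
    partuuid="e5a75233-3b90-4aec-8075-b4dd7132b48d",
    partlabel="rootfs",
    fs_uuid="b2c92f24-8215-4680-b931-f423aae5f1c9",
    fs_label="rootfs",
)
```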
Signed-off-by: Roland Hieber <rhi@pengutronix.de>
The previous patch 466266c does not work as intended: if SYSTEMD_READY is not recorded in the database, GOTO="systemd_end" is not applied.
IMPORT{db} is actually a matching token; it returns false when no SYSTEMD_READY is recorded in the database.
The previous patch 466266c was meant to inherit the state of SYSTEMD_READY from the database and skip to the end of the current rule file. But when the database does not contain SYSTEMD_READY, e.g. because dm-* is not marked db_persistent in the initrd and the database is cleared after switching root, the subsequent rules are still applied, contrary to expectation.
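
The behavior described above can be modelled in a few lines of Python. This is
a simplified sketch of how a single rule line is evaluated, with hypothetical
helper names (run_rule, import_db, goto); it is not udev's implementation.

```python
from typing import Callable, Dict, List

Token = Callable[[Dict[str, str], Dict[str, str]], bool]


def import_db(key: str) -> Token:
    """Model of IMPORT{db}: acts as a match and fails if the key is absent."""
    def token(env: Dict[str, str], db: Dict[str, str]) -> bool:
        if key not in db:
            return False
        env[key] = db[key]
        return True
    return token


def goto(label: str) -> Token:
    """Model of GOTO: record the jump target (the "_GOTO" key is hypothetical)."""
    def token(env: Dict[str, str], db: Dict[str, str]) -> bool:
        env["_GOTO"] = label
        return True
    return token


def run_rule(tokens: List[Token], env: Dict[str, str], db: Dict[str, str]) -> bool:
    """Evaluate tokens left to right and stop at the first one that fails."""
    for token in tokens:
        if not token(env, db):
            return False
    return True


rule = [import_db("SYSTEMD_READY"), goto("systemd_end")]
env_empty_db: Dict[str, str] = {}
applied_empty = run_rule(rule, env_empty_db, {})
env_with_db: Dict[str, str] = {}
applied_with = run_rule(rule, env_with_db, {"SYSTEMD_READY": "0"})
```

With an empty database the import fails, the jump is never recorded, and
processing would continue with the rules that follow.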
Currently ID_NET_DRIVER is set in the net_setup_link builtin,
but that builtin is called pretty late in the udev processing chain.
Some custom rules have worked around this by calling the ethtool
binary directly, which is ugly.
So let's split this code out into a separate builtin.
This reverts the following two commits:
- "udev: decrease devlink priority for encrypted partitions"
c4521fc17b.
- "udev: decrease devlink priority for iso disks"
df1dccd255.
These commits are workarounds for issues caused by
331aa7aa15.
With the previous commit, these workarounds are not necessary anymore,
as partitions are always processed later than their whole disk, and
a decrypted volume is also processed later than its backing volume.
Add persistent symlinks for MTD devices like SPI-NOR flash, based on the
partition names specified on the cmdline, in a Device Tree, or by other
MTD partitioning parser drivers. Using the persistent name can be
preferable to using the numbered /dev/mtdX device, as the latter can
change depending on probe order or when partitioning has changed.
Chronyd and similar time services, when using PTP devices, may need
the BindsTo/After directives to ensure the devices are available
before starting. Tag PTP devices with systemd to allow for wider
adoption.
Signed-off-by: Chris Patterson <cpatterson@microsoft.com>
But the directories are changed from /dev/loop/by-ref/ -> /dev/disk/by-loop-ref/
and /dev/loop/by-inode/ -> /dev/disk/by-loop-inode/,
as /dev/loop/ is used by the losetup command for other purposes.
See issue #28475.
This effectively reverts commits 9915cc6086,
5022fab15f, and
c0d998248e.
Decrease devlink priority for encrypted partitions, and make the priority for
decrypted DM devices relatively higher. This is for the case that an encrypted
partition and its decrypted DM device have the same label.
Before c43ff248f9, the following line in
60-drm.rules also sets ID_PATH for all pci, usb, and platform devices:
===
ACTION!="remove", SUBSYSTEM=="drm", SUBSYSTEMS=="pci|usb|platform", IMPORT{builtin}="path_id"
===
Unfortunately, some existing rules rely on the unexpected behavior.
To keep the backward compatibility, let's set ID_PATH for them.
Fixes #28411.
Previously, if the priority was the same, devlinks were always replaced by
newer events. The commit 331aa7aa15 changes
that to keep the existing devlink. That should not change any behavior
when the devices that request the same symlink do not have any
dependency, e.g. when /dev/sda1 and /dev/sdb1 request the same
/dev/disk/by-label symlink, as there is no guarantee which device
is processed first.
However, when devices have a dependency, e.g. /dev/sda and /dev/sda1
request the same /dev/disk/by-label symlink, previously the symlink
always pointed to the partition, as the partition is always processed
later. But, 331aa7aa15 makes the symlink
point to the whole disk.
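
The difference between the two tie-breaking behaviors can be sketched in
Python. This is a simplified model with a hypothetical function name; it only
tracks the current owner of one symlink as claims arrive in processing order.

```python
from typing import List, Optional, Tuple


def resolve_devlink(claims: List[Tuple[str, int]], keep_existing_on_tie: bool) -> Optional[str]:
    """Return the device node that owns the symlink after all claims.

    claims: (device node, link priority) pairs in the order they are processed.
    keep_existing_on_tie: False models the old behavior (newer event wins),
                          True models the behavior since 331aa7aa15.
    """
    owner: Optional[str] = None
    owner_prio = 0
    for devnode, prio in claims:
        tie_goes_to_newer = prio == owner_prio and not keep_existing_on_tie
        if owner is None or prio > owner_prio or tie_goes_to_newer:
            owner, owner_prio = devnode, prio
    return owner


# The whole disk is processed before its partition, both with equal priority.
claims = [("/dev/sda", 0), ("/dev/sda1", 0)]
old_owner = resolve_devlink(claims, keep_existing_on_tie=False)
new_owner = resolve_devlink(claims, keep_existing_on_tie=True)
```

In this model the old behavior ends with the partition owning the link, and the
new behavior ends with the whole disk, which matches the case described above.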
The change by 331aa7aa15 is crucial to
improve performance of devlink handling, especially when a system has
a large number of disks with the same label or so. Hence, it cannot and should
not be reverted.
So, let's work around the case, as such a situation should happen only when
the disk is a hybrid ISO image, I guess.
Fixes #28468.
The DMI rules were so far guarded by an ACTION=="add" rule, but that
doesn't really make sense for setting properties (only for setting
access modes/ownership of nodes).
Hence let's move this into its own file, that guards properly on
ACTION!="remove".
Before this change the hardware vendor/model info would be dropped
whenever the device was retriggered.
Linux kernel will, as documented in drivers/video/backlight/backlight.c,
report changes to a backlight's brightness as a uevent (ACTION=change).
systemd-udev will consume the uevent, match on this rule and try to
activate the systemd-backlight service for the backlight. BUT when
systemd is not compiled with backlight support, this will lead to
failure that is reported in the journal.
Since the failure to activate systemd-backlight and subsequent failure
log entry happens on every backlight brightness change, we found the
resulting logspam during regular operation excessive and came up with
this patch to mitigate it.
The conditional is also extended to "*kbd_backlight" match, since
even though we did not investigate to see if the logspam would be
similar, the unconditional match to activate systemd-backlight here
would also not make sense when the feature is not compiled in.
Signed-off-by: Simon Braunschmidt <simon.braunschmidt@iba-group.com>
Accel (Compute Acceleration) devices are new devices for AI/ML computation:
https://docs.kernel.org/accel/introduction.html
They are part of the DRM subsystem. Add them to the 'render' group since
no other appropriate group exists in standard Linux systems. This
can be changed when proper common user-space components emerge
and a new group for access to acceleration devices is established.
Signed-off-by: Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>
xHCI host controller may register two (or more?) USB root hubs for USB
2.0 and USB 3.0, and devices under the hubs may have the same ID_PATH.
So, to avoid the conflict, let's introduce ID_PATH_WITH_USB_REVISION
that includes the USB revision.
Closes #19406.
We started systemd-vconsole-setup in two ways: via a dbus call from localed to
do systemd-vconsole-setup.service/restart, and from udev, calling the binary
directly. This patch makes udev call
systemctl restart systemd-vconsole-setup.service
effectively implementing the same method as localed.
Ordering is implemented at the unit level, so we can use --no-block to not
block here.
Meta's resource control demo project[0] includes a benchmark tool that can
be used to calculate the best iocost solutions for a given SSD.
[0]: https://github.com/facebookexperimental/resctl-demo
A project[1] has now been started to create a publicly available database
of results that can be used to apply them automatically.
[1]: https://github.com/iocost-benchmark/iocost-benchmarks
This change adds a new tool that gets triggered by a udev rule for any
block device and queries the hwdb for known solutions. The format for
the hwdb file that is currently generated by the github action looks like
this:
# This file was auto-generated on Tue, 23 Aug 2022 13:03:57 +0000.
# From the following commit:
# ca82acfe93
#
# Match key format:
# block:<devpath>:name:<model name>:
# 12 points, MOF=[1.346,1.346], aMOF=[1.249,1.249]
block:*:name:HFS256GD9TNG-62A0A:fwver:*:
IOCOST_SOLUTIONS=isolation isolated-bandwidth bandwidth naive
IOCOST_MODEL_ISOLATION=rbps=1091439492 rseqiops=52286 rrandiops=63784 wbps=192329466 wseqiops=12309 wrandiops=16119
IOCOST_QOS_ISOLATION=rpct=0.00 rlat=8807 wpct=0.00 wlat=59023 min=100.00 max=100.00
IOCOST_MODEL_ISOLATED_BANDWIDTH=rbps=1091439492 rseqiops=52286 rrandiops=63784 wbps=192329466 wseqiops=12309 wrandiops=16119
IOCOST_QOS_ISOLATED_BANDWIDTH=rpct=0.00 rlat=8807 wpct=0.00 wlat=59023 min=100.00 max=100.00
IOCOST_MODEL_BANDWIDTH=rbps=1091439492 rseqiops=52286 rrandiops=63784 wbps=192329466 wseqiops=12309 wrandiops=16119
IOCOST_QOS_BANDWIDTH=rpct=0.00 rlat=8807 wpct=0.00 wlat=59023 min=100.00 max=100.00
IOCOST_MODEL_NAIVE=rbps=1091439492 rseqiops=52286 rrandiops=63784 wbps=192329466 wseqiops=12309 wrandiops=16119
IOCOST_QOS_NAIVE=rpct=99.00 rlat=8807 wpct=99.00 wlat=59023 min=75.00 max=100.00
The IOCOST_SOLUTIONS key lists the solutions available for that device
in the preferred order for higher isolation, which is a reasonable
default for most client systems. This can be overridden to choose better
defaults for custom use cases, like the various data center workloads.
The tool can also be used to query the known solutions for a specific
device or to apply a non-default solution (say, isolation or bandwidth).
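
How a consumer might select a solution from such an entry can be sketched in
Python. The functions parse_params() and pick_solution() are hypothetical and
do not describe the tool added by this change; the key naming (upper case,
"-" replaced by "_") is inferred from the example above.

```python
from typing import Dict, Optional, Tuple


def parse_params(value: str) -> Dict[str, str]:
    """Split a 'key=value key=value ...' string into a dictionary."""
    return dict(item.split("=", 1) for item in value.split())


def pick_solution(
    props: Dict[str, str], requested: Optional[str] = None
) -> Optional[Tuple[str, Dict[str, str], Dict[str, str]]]:
    """Return (name, model parameters, qos parameters) for one solution.

    Without an explicit request, the first entry of IOCOST_SOLUTIONS is used.
    """
    solutions = props.get("IOCOST_SOLUTIONS", "").split()
    if not solutions:
        return None
    name = requested if requested is not None else solutions[0]
    if name not in solutions:
        return None
    suffix = name.upper().replace("-", "_")
    model = parse_params(props[f"IOCOST_MODEL_{suffix}"])
    qos = parse_params(props[f"IOCOST_QOS_{suffix}"])
    return name, model, qos


# A shortened copy of the hwdb properties shown above.
props = {
    "IOCOST_SOLUTIONS": "isolation isolated-bandwidth bandwidth naive",
    "IOCOST_MODEL_ISOLATION": "rbps=1091439492 rseqiops=52286 rrandiops=63784",
    "IOCOST_QOS_ISOLATION": "rpct=0.00 rlat=8807 wpct=0.00 wlat=59023",
    "IOCOST_MODEL_NAIVE": "rbps=1091439492 rseqiops=52286 rrandiops=63784",
    "IOCOST_QOS_NAIVE": "rpct=99.00 rlat=8807 wpct=99.00 wlat=59023",
}
default_choice = pick_solution(props)
naive_choice = pick_solution(props, "naive")
```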
Co-authored-by: Santosh Mahto <santosh.mahto@collabora.com>
In 5118e8e71d, the rules were changed to add
OPTIONS="string_escape=replace" to creation of
ENV{ID_SERIAL}="$env{ID_MODEL}_$env{ID_SERIAL_SHORT}", so that "/" would be
escaped. But this also changes how the symlink looks for devices that do not
have "/". This adds back the old symlink for compat, except when a slash
is present.
In the meantime, we changed the symlink format to include ${ND_NSID}. Since
the symlinks with unescaped characters are older than that, for compat we
only need to cover the older type. (Symlinks without escaping and with ${ND_NSID}
were never created.) This makes it slightly easier on users: the non-deprecated
symlinks are with "_${ND_NSID}", so they are easier to distinguish.
Fixes #27155.
Mostly untested :( I only have a boring nvme device with no special characters
in the id, and the symlinks are unchanged for it by this patch.
The nvme by-id symlink changes to the latest namespace when a new namespace gets
added, for example by connecting multiple NVMe/TCP host controllers via nvme
connect-all.
That is incorrect for persistent device links.
The persistent symbolic device link should continue to point to the same NVMe
namespace throughout the lifetime of the current boot.
Therefore the namespace id needs to be added to the link name.
This adds symlinks that allow accessing loopback block devices via stable
names that reference their backing block devices, making the unpredictable
naming of loopback devices less of an issue.
Example:
1. Create a loopback block device for a file $F
losetup --find $F
2. Reference the backing block device via its inode:
L="$(stat -c '/dev/loop/by-inode/%Hd:%Ld-%i' $F)"
fdisk $L
In the above the loop device name (which might be /dev/loop47 or any
other name) is not used at all.
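
The same name can be derived without stat(1). The following Python sketch
assumes the layout used in the example above (major:minor of the file system
holding the backing file, then its inode number); the function name and the
base parameter are hypothetical.

```python
import os


def loop_by_inode_link(backing_file: str, base: str = "/dev/loop/by-inode") -> str:
    """Compose the by-inode symlink name for a loop device's backing file."""
    st = os.stat(backing_file)
    # st_dev identifies the device that holds the backing file;
    # os.major()/os.minor() split it like stat's %Hd and %Ld.
    return f"{base}/{os.major(st.st_dev)}:{os.minor(st.st_dev)}-{st.st_ino}"
```

A caller would pass the same path that was given to losetup and open the
returned link instead of a numbered loop device.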
When systemd is built without blkid, udev-builtin-blkid is not built,
and the verifier warns about the unknown builtin:
60-persistent-storage.rules:114 Unknown builtin command: blkid --hint=session_offset=$env{ID_CDROM_MEDIA_SESSION_LAST_OFFSET}
60-persistent-storage.rules:117 Unknown builtin command: blkid --noraid
60-persistent-storage.rules:120 Unknown builtin command: blkid
60-persistent-storage.rules: udev rules check failed