mirror of
https://mirrors.bfsu.edu.cn/git/linux.git
synced 2024-12-15 06:55:13 +08:00
688f118e31
- Add a SPDX header; - Mark some literals as such; - Add it to filesystems/index.rst. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/0c36091b6660cd372f994bd98e1264491d766c22.1581955849.git.mchehab+huawei@kernel.org Signed-off-by: Jonathan Corbet <corbet@lwn.net>
447 lines
22 KiB
ReStructuredText
447 lines
22 KiB
ReStructuredText
.. SPDX-License-Identifier: GPL-2.0
|
|
|
|
:orphan:
|
|
|
|
.. UBIFS Authentication
|
|
.. sigma star gmbh
|
|
.. 2018
|
|
|
|
Introduction
|
|
============
|
|
|
|
UBIFS utilizes the fscrypt framework to provide confidentiality for file
|
|
contents and file names. This prevents attacks where an attacker is able to
|
|
read contents of the filesystem on a single point in time. A classic example
|
|
is a lost smartphone where the attacker is unable to read personal data stored
|
|
on the device without the filesystem decryption key.
|
|
|
|
At the current state, UBIFS encryption however does not prevent attacks where
|
|
the attacker is able to modify the filesystem contents and the user uses the
|
|
device afterwards. In such a scenario an attacker can modify filesystem
|
|
contents arbitrarily without the user noticing. One example is to modify a
|
|
binary to perform a malicious action when executed [DMC-CBC-ATTACK]. Since
|
|
most of the filesystem metadata of UBIFS is stored in plain, this makes it
|
|
fairly easy to swap files and replace their contents.
|
|
|
|
Other full disk encryption systems like dm-crypt cover all filesystem metadata,
|
|
which makes such kinds of attacks more complicated, but not impossible.
|
|
Especially, if the attacker is given access to the device multiple points in
|
|
time. For dm-crypt and other filesystems that build upon the Linux block IO
|
|
layer, the dm-integrity or dm-verity subsystems [DM-INTEGRITY, DM-VERITY]
|
|
can be used to get full data authentication at the block layer.
|
|
These can also be combined with dm-crypt [CRYPTSETUP2].
|
|
|
|
This document describes an approach to get file contents _and_ full metadata
|
|
authentication for UBIFS. Since UBIFS uses fscrypt for file contents and file
|
|
name encryption, the authentication system could be tied into fscrypt such that
|
|
existing features like key derivation can be utilized. It should however also
|
|
be possible to use UBIFS authentication without using encryption.
|
|
|
|
|
|
MTD, UBI & UBIFS
|
|
----------------
|
|
|
|
On Linux, the MTD (Memory Technology Devices) subsystem provides a uniform
|
|
interface to access raw flash devices. One of the more prominent subsystems that
|
|
work on top of MTD is UBI (Unsorted Block Images). It provides volume management
|
|
for flash devices and is thus somewhat similar to LVM for block devices. In
|
|
addition, it deals with flash-specific wear-leveling and transparent I/O error
|
|
handling. UBI offers logical erase blocks (LEBs) to the layers on top of it
|
|
and maps them transparently to physical erase blocks (PEBs) on the flash.
|
|
|
|
UBIFS is a filesystem for raw flash which operates on top of UBI. Thus, wear
|
|
leveling and some flash specifics are left to UBI, while UBIFS focuses on
|
|
scalability, performance and recoverability.
|
|
|
|
::
|
|
|
|
+------------+ +*******+ +-----------+ +-----+
|
|
| | * UBIFS * | UBI-BLOCK | | ... |
|
|
| JFFS/JFFS2 | +*******+ +-----------+ +-----+
|
|
| | +-----------------------------+ +-----------+ +-----+
|
|
| | | UBI | | MTD-BLOCK | | ... |
|
|
+------------+ +-----------------------------+ +-----------+ +-----+
|
|
+------------------------------------------------------------------+
|
|
| MEMORY TECHNOLOGY DEVICES (MTD) |
|
|
+------------------------------------------------------------------+
|
|
+-----------------------------+ +--------------------------+ +-----+
|
|
| NAND DRIVERS | | NOR DRIVERS | | ... |
|
|
+-----------------------------+ +--------------------------+ +-----+
|
|
|
|
Figure 1: Linux kernel subsystems for dealing with raw flash
|
|
|
|
|
|
|
|
Internally, UBIFS maintains multiple data structures which are persisted on
|
|
the flash:
|
|
|
|
- *Index*: an on-flash B+ tree where the leaf nodes contain filesystem data
|
|
- *Journal*: an additional data structure to collect FS changes before updating
|
|
the on-flash index and reduce flash wear.
|
|
- *Tree Node Cache (TNC)*: an in-memory B+ tree that reflects the current FS
|
|
state to avoid frequent flash reads. It is basically the in-memory
|
|
representation of the index, but contains additional attributes.
|
|
- *LEB property tree (LPT)*: an on-flash B+ tree for free space accounting per
|
|
UBI LEB.
|
|
|
|
In the remainder of this section we will cover the on-flash UBIFS data
|
|
structures in more detail. The TNC is of less importance here since it is never
|
|
persisted onto the flash directly. More details on UBIFS can also be found in
|
|
[UBIFS-WP].
|
|
|
|
|
|
UBIFS Index & Tree Node Cache
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Basic on-flash UBIFS entities are called *nodes*. UBIFS knows different types
|
|
of nodes. Eg. data nodes (``struct ubifs_data_node``) which store chunks of file
|
|
contents or inode nodes (``struct ubifs_ino_node``) which represent VFS inodes.
|
|
Almost all types of nodes share a common header (``ubifs_ch``) containing basic
|
|
information like node type, node length, a sequence number, etc. (see
|
|
``fs/ubifs/ubifs-media.h`` in kernel source). Exceptions are entries of the LPT
|
|
and some less important node types like padding nodes which are used to pad
|
|
unusable content at the end of LEBs.
|
|
|
|
To avoid re-writing the whole B+ tree on every single change, it is implemented
|
|
as *wandering tree*, where only the changed nodes are re-written and previous
|
|
versions of them are obsoleted without erasing them right away. As a result,
|
|
the index is not stored in a single place on the flash, but *wanders* around
|
|
and there are obsolete parts on the flash as long as the LEB containing them is
|
|
not reused by UBIFS. To find the most recent version of the index, UBIFS stores
|
|
a special node called *master node* into UBI LEB 1 which always points to the
|
|
most recent root node of the UBIFS index. For recoverability, the master node
|
|
is additionally duplicated to LEB 2. Mounting UBIFS is thus a simple read of
|
|
LEB 1 and 2 to get the current master node and from there get the location of
|
|
the most recent on-flash index.
|
|
|
|
The TNC is the in-memory representation of the on-flash index. It contains some
|
|
additional runtime attributes per node which are not persisted. One of these is
|
|
a dirty-flag which marks nodes that have to be persisted the next time the
|
|
index is written onto the flash. The TNC acts as a write-back cache and all
|
|
modifications of the on-flash index are done through the TNC. Like other caches,
|
|
the TNC does not have to mirror the full index into memory, but reads parts of
|
|
it from flash whenever needed. A *commit* is the UBIFS operation of updating the
|
|
on-flash filesystem structures like the index. On every commit, the TNC nodes
|
|
marked as dirty are written to the flash to update the persisted index.
|
|
|
|
|
|
Journal
|
|
~~~~~~~
|
|
|
|
To avoid wearing out the flash, the index is only persisted (*commited*) when
|
|
certain conditions are met (eg. ``fsync(2)``). The journal is used to record
|
|
any changes (in form of inode nodes, data nodes etc.) between commits
|
|
of the index. During mount, the journal is read from the flash and replayed
|
|
onto the TNC (which will be created on-demand from the on-flash index).
|
|
|
|
UBIFS reserves a bunch of LEBs just for the journal called *log area*. The
|
|
amount of log area LEBs is configured on filesystem creation (using
|
|
``mkfs.ubifs``) and stored in the superblock node. The log area contains only
|
|
two types of nodes: *reference nodes* and *commit start nodes*. A commit start
|
|
node is written whenever an index commit is performed. Reference nodes are
|
|
written on every journal update. Each reference node points to the position of
|
|
other nodes (inode nodes, data nodes etc.) on the flash that are part of this
|
|
journal entry. These nodes are called *buds* and describe the actual filesystem
|
|
changes including their data.
|
|
|
|
The log area is maintained as a ring. Whenever the journal is almost full,
|
|
a commit is initiated. This also writes a commit start node so that during
|
|
mount, UBIFS will seek for the most recent commit start node and just replay
|
|
every reference node after that. Every reference node before the commit start
|
|
node will be ignored as they are already part of the on-flash index.
|
|
|
|
When writing a journal entry, UBIFS first ensures that enough space is
|
|
available to write the reference node and buds part of this entry. Then, the
|
|
reference node is written and afterwards the buds describing the file changes.
|
|
On replay, UBIFS will record every reference node and inspect the location of
|
|
the referenced LEBs to discover the buds. If these are corrupt or missing,
|
|
UBIFS will attempt to recover them by re-reading the LEB. This is however only
|
|
done for the last referenced LEB of the journal. Only this can become corrupt
|
|
because of a power cut. If the recovery fails, UBIFS will not mount. An error
|
|
for every other LEB will directly cause UBIFS to fail the mount operation.
|
|
|
|
::
|
|
|
|
| ---- LOG AREA ---- | ---------- MAIN AREA ------------ |
|
|
|
|
-----+------+-----+--------+---- ------+-----+-----+---------------
|
|
\ | | | | / / | | | \
|
|
/ CS | REF | REF | | \ \ DENT | INO | INO | /
|
|
\ | | | | / / | | | \
|
|
----+------+-----+--------+--- -------+-----+-----+----------------
|
|
| | ^ ^
|
|
| | | |
|
|
+------------------------+ |
|
|
| |
|
|
+-------------------------------+
|
|
|
|
|
|
Figure 2: UBIFS flash layout of log area with commit start nodes
|
|
(CS) and reference nodes (REF) pointing to main area
|
|
containing their buds
|
|
|
|
|
|
LEB Property Tree/Table
|
|
~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
The LEB property tree is used to store per-LEB information. This includes the
|
|
LEB type and amount of free and *dirty* (old, obsolete content) space [1]_ on
|
|
the LEB. The type is important, because UBIFS never mixes index nodes with data
|
|
nodes on a single LEB and thus each LEB has a specific purpose. This again is
|
|
useful for free space calculations. See [UBIFS-WP] for more details.
|
|
|
|
The LEB property tree again is a B+ tree, but it is much smaller than the
|
|
index. Due to its smaller size it is always written as one chunk on every
|
|
commit. Thus, saving the LPT is an atomic operation.
|
|
|
|
|
|
.. [1] Since LEBs can only be appended and never overwritten, there is a
|
|
difference between free space ie. the remaining space left on the LEB to be
|
|
written to without erasing it and previously written content that is obsolete
|
|
but can't be overwritten without erasing the full LEB.
|
|
|
|
|
|
UBIFS Authentication
|
|
====================
|
|
|
|
This chapter introduces UBIFS authentication which enables UBIFS to verify
|
|
the authenticity and integrity of metadata and file contents stored on flash.
|
|
|
|
|
|
Threat Model
|
|
------------
|
|
|
|
UBIFS authentication enables detection of offline data modification. While it
|
|
does not prevent it, it enables (trusted) code to check the integrity and
|
|
authenticity of on-flash file contents and filesystem metadata. This covers
|
|
attacks where file contents are swapped.
|
|
|
|
UBIFS authentication will not protect against rollback of full flash contents.
|
|
Ie. an attacker can still dump the flash and restore it at a later time without
|
|
detection. It will also not protect against partial rollback of individual
|
|
index commits. That means that an attacker is able to partially undo changes.
|
|
This is possible because UBIFS does not immediately overwrites obsolete
|
|
versions of the index tree or the journal, but instead marks them as obsolete
|
|
and garbage collection erases them at a later time. An attacker can use this by
|
|
erasing parts of the current tree and restoring old versions that are still on
|
|
the flash and have not yet been erased. This is possible, because every commit
|
|
will always write a new version of the index root node and the master node
|
|
without overwriting the previous version. This is further helped by the
|
|
wear-leveling operations of UBI which copies contents from one physical
|
|
eraseblock to another and does not atomically erase the first eraseblock.
|
|
|
|
UBIFS authentication does not cover attacks where an attacker is able to
|
|
execute code on the device after the authentication key was provided.
|
|
Additional measures like secure boot and trusted boot have to be taken to
|
|
ensure that only trusted code is executed on a device.
|
|
|
|
|
|
Authentication
|
|
--------------
|
|
|
|
To be able to fully trust data read from flash, all UBIFS data structures
|
|
stored on flash are authenticated. That is:
|
|
|
|
- The index which includes file contents, file metadata like extended
|
|
attributes, file length etc.
|
|
- The journal which also contains file contents and metadata by recording changes
|
|
to the filesystem
|
|
- The LPT which stores UBI LEB metadata which UBIFS uses for free space accounting
|
|
|
|
|
|
Index Authentication
|
|
~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Through UBIFS' concept of a wandering tree, it already takes care of only
|
|
updating and persisting changed parts from leaf node up to the root node
|
|
of the full B+ tree. This enables us to augment the index nodes of the tree
|
|
with a hash over each node's child nodes. As a result, the index basically also
|
|
a Merkle tree. Since the leaf nodes of the index contain the actual filesystem
|
|
data, the hashes of their parent index nodes thus cover all the file contents
|
|
and file metadata. When a file changes, the UBIFS index is updated accordingly
|
|
from the leaf nodes up to the root node including the master node. This process
|
|
can be hooked to recompute the hash only for each changed node at the same time.
|
|
Whenever a file is read, UBIFS can verify the hashes from each leaf node up to
|
|
the root node to ensure the node's integrity.
|
|
|
|
To ensure the authenticity of the whole index, the UBIFS master node stores a
|
|
keyed hash (HMAC) over its own contents and a hash of the root node of the index
|
|
tree. As mentioned above, the master node is always written to the flash whenever
|
|
the index is persisted (ie. on index commit).
|
|
|
|
Using this approach only UBIFS index nodes and the master node are changed to
|
|
include a hash. All other types of nodes will remain unchanged. This reduces
|
|
the storage overhead which is precious for users of UBIFS (ie. embedded
|
|
devices).
|
|
|
|
::
|
|
|
|
+---------------+
|
|
| Master Node |
|
|
| (hash) |
|
|
+---------------+
|
|
|
|
|
v
|
|
+-------------------+
|
|
| Index Node #1 |
|
|
| |
|
|
| branch0 branchn |
|
|
| (hash) (hash) |
|
|
+-------------------+
|
|
| ... | (fanout: 8)
|
|
| |
|
|
+-------+ +------+
|
|
| |
|
|
v v
|
|
+-------------------+ +-------------------+
|
|
| Index Node #2 | | Index Node #3 |
|
|
| | | |
|
|
| branch0 branchn | | branch0 branchn |
|
|
| (hash) (hash) | | (hash) (hash) |
|
|
+-------------------+ +-------------------+
|
|
| ... | ... |
|
|
v v v
|
|
+-----------+ +----------+ +-----------+
|
|
| Data Node | | INO Node | | DENT Node |
|
|
+-----------+ +----------+ +-----------+
|
|
|
|
|
|
Figure 3: Coverage areas of index node hash and master node HMAC
|
|
|
|
|
|
|
|
The most important part for robustness and power-cut safety is to atomically
|
|
persist the hash and file contents. Here the existing UBIFS logic for how
|
|
changed nodes are persisted is already designed for this purpose such that
|
|
UBIFS can safely recover if a power-cut occurs while persisting. Adding
|
|
hashes to index nodes does not change this since each hash will be persisted
|
|
atomically together with its respective node.
|
|
|
|
|
|
Journal Authentication
|
|
~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
The journal is authenticated too. Since the journal is continuously written
|
|
it is necessary to also add authentication information frequently to the
|
|
journal so that in case of a powercut not too much data can't be authenticated.
|
|
This is done by creating a continuous hash beginning from the commit start node
|
|
over the previous reference nodes, the current reference node, and the bud
|
|
nodes. From time to time whenever it is suitable authentication nodes are added
|
|
between the bud nodes. This new node type contains a HMAC over the current state
|
|
of the hash chain. That way a journal can be authenticated up to the last
|
|
authentication node. The tail of the journal which may not have a authentication
|
|
node cannot be authenticated and is skipped during journal replay.
|
|
|
|
We get this picture for journal authentication::
|
|
|
|
,,,,,,,,
|
|
,......,...........................................
|
|
,. CS , hash1.----. hash2.----.
|
|
,. | , . |hmac . |hmac
|
|
,. v , . v . v
|
|
,.REF#0,-> bud -> bud -> bud.-> auth -> bud -> bud.-> auth ...
|
|
,..|...,...........................................
|
|
, | ,
|
|
, | ,,,,,,,,,,,,,,,
|
|
. | hash3,----.
|
|
, | , |hmac
|
|
, v , v
|
|
, REF#1 -> bud -> bud,-> auth ...
|
|
,,,|,,,,,,,,,,,,,,,,,,
|
|
v
|
|
REF#2 -> ...
|
|
|
|
|
V
|
|
...
|
|
|
|
Since the hash also includes the reference nodes an attacker cannot reorder or
|
|
skip any journal heads for replay. An attacker can only remove bud nodes or
|
|
reference nodes from the end of the journal, effectively rewinding the
|
|
filesystem at maximum back to the last commit.
|
|
|
|
The location of the log area is stored in the master node. Since the master
|
|
node is authenticated with a HMAC as described above, it is not possible to
|
|
tamper with that without detection. The size of the log area is specified when
|
|
the filesystem is created using `mkfs.ubifs` and stored in the superblock node.
|
|
To avoid tampering with this and other values stored there, a HMAC is added to
|
|
the superblock struct. The superblock node is stored in LEB 0 and is only
|
|
modified on feature flag or similar changes, but never on file changes.
|
|
|
|
|
|
LPT Authentication
|
|
~~~~~~~~~~~~~~~~~~
|
|
|
|
The location of the LPT root node on the flash is stored in the UBIFS master
|
|
node. Since the LPT is written and read atomically on every commit, there is
|
|
no need to authenticate individual nodes of the tree. It suffices to
|
|
protect the integrity of the full LPT by a simple hash stored in the master
|
|
node. Since the master node itself is authenticated, the LPTs authenticity can
|
|
be verified by verifying the authenticity of the master node and comparing the
|
|
LTP hash stored there with the hash computed from the read on-flash LPT.
|
|
|
|
|
|
Key Management
|
|
--------------
|
|
|
|
For simplicity, UBIFS authentication uses a single key to compute the HMACs
|
|
of superblock, master, commit start and reference nodes. This key has to be
|
|
available on creation of the filesystem (`mkfs.ubifs`) to authenticate the
|
|
superblock node. Further, it has to be available on mount of the filesystem
|
|
to verify authenticated nodes and generate new HMACs for changes.
|
|
|
|
UBIFS authentication is intended to operate side-by-side with UBIFS encryption
|
|
(fscrypt) to provide confidentiality and authenticity. Since UBIFS encryption
|
|
has a different approach of encryption policies per directory, there can be
|
|
multiple fscrypt master keys and there might be folders without encryption.
|
|
UBIFS authentication on the other hand has an all-or-nothing approach in the
|
|
sense that it either authenticates everything of the filesystem or nothing.
|
|
Because of this and because UBIFS authentication should also be usable without
|
|
encryption, it does not share the same master key with fscrypt, but manages
|
|
a dedicated authentication key.
|
|
|
|
The API for providing the authentication key has yet to be defined, but the
|
|
key can eg. be provided by userspace through a keyring similar to the way it
|
|
is currently done in fscrypt. It should however be noted that the current
|
|
fscrypt approach has shown its flaws and the userspace API will eventually
|
|
change [FSCRYPT-POLICY2].
|
|
|
|
Nevertheless, it will be possible for a user to provide a single passphrase
|
|
or key in userspace that covers UBIFS authentication and encryption. This can
|
|
be solved by the corresponding userspace tools which derive a second key for
|
|
authentication in addition to the derived fscrypt master key used for
|
|
encryption.
|
|
|
|
To be able to check if the proper key is available on mount, the UBIFS
|
|
superblock node will additionally store a hash of the authentication key. This
|
|
approach is similar to the approach proposed for fscrypt encryption policy v2
|
|
[FSCRYPT-POLICY2].
|
|
|
|
|
|
Future Extensions
|
|
=================
|
|
|
|
In certain cases where a vendor wants to provide an authenticated filesystem
|
|
image to customers, it should be possible to do so without sharing the secret
|
|
UBIFS authentication key. Instead, in addition the each HMAC a digital
|
|
signature could be stored where the vendor shares the public key alongside the
|
|
filesystem image. In case this filesystem has to be modified afterwards,
|
|
UBIFS can exchange all digital signatures with HMACs on first mount similar
|
|
to the way the IMA/EVM subsystem deals with such situations. The HMAC key
|
|
will then have to be provided beforehand in the normal way.
|
|
|
|
|
|
References
|
|
==========
|
|
|
|
[CRYPTSETUP2] http://www.saout.de/pipermail/dm-crypt/2017-November/005745.html
|
|
|
|
[DMC-CBC-ATTACK] http://www.jakoblell.com/blog/2013/12/22/practical-malleability-attack-against-cbc-encrypted-luks-partitions/
|
|
|
|
[DM-INTEGRITY] https://www.kernel.org/doc/Documentation/device-mapper/dm-integrity.rst
|
|
|
|
[DM-VERITY] https://www.kernel.org/doc/Documentation/device-mapper/verity.rst
|
|
|
|
[FSCRYPT-POLICY2] https://www.spinics.net/lists/linux-ext4/msg58710.html
|
|
|
|
[UBIFS-WP] http://www.linux-mtd.infradead.org/doc/ubifs_whitepaper.pdf
|