Add description of compression algorithm.
(Logical change 1.5)

Description of the NTFS (de)compression algorithm (based on a modified LZ77
algorithm)

Copyright (c) 2001 Anton Altaparmakov.

This document is published under the GNU General Public License.

Credits: This is based on notes taken from various places (most notably from
Regis Duchesne's NTFS documentation and from various LZ77 descriptions) and
further refined by looking at a few compressed streams to figure out some
uncertainties.

Note: You should also read the run list description with regard to compression
in linux-ntfs/include/layout.h. Just search for "Attribute compression".
FIXME: Should merge the info from there into this document some time.

Compressed data is organized in logical "compression" blocks (cb). Each cb has
a size (cb_size) of 2^compression_unit clusters. In all versions of Windows
NTFS (NT/2k/XP, NTFS 1.2-3.1), the only valid compression_unit is 4, IOW, each
cb is 2^4 = 16 clusters in size.

We detect and warn about a compression_unit != 4 but we try to decompress the
data anyway.

Compression is only supported for cluster sizes between 512 and 4096 bytes.
Thus a cb can be between 8 and 64kiB in size.

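As a quick sanity check, the cb size in bytes is just the cluster size shifted
left by compression_unit. A minimal sketch (the variable names here are purely
illustrative, not taken from the driver):

	/* cb size in bytes = cluster size * 2^compression_unit. */
	cb_size = cluster_size << compression_unit;
	/* e.g. 512 << 4 = 8192 (8kiB) and 4096 << 4 = 65536 (64kiB). */
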
Each cb is independent of the other cbs and is thus the minimal unit we have
to parse even if we wanted to decompress only one byte.

Also, a cb can be totally sparse and this would be indicated as a sparse cb in
the run list.

Thus, we need to look at the run list of the compressed data stream, starting
at the beginning of the first cb overlapping @page. So we convert the page
offset into units of clusters (vcn), and round the vcn down to a multiple of
cb_size clusters.

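In code, the rounding is a simple shift operation. A minimal sketch, assuming
the standard compression_unit of 4 and illustrative names (page_offset,
cluster_size_bits and friends are not necessarily what the driver uses):

	/* Convert the byte offset of @page into units of clusters. */
	vcn = page_offset >> cluster_size_bits;
	/* Round down to a multiple of cb_size clusters (2^4 = 16). */
	start_vcn = (vcn >> 4) << 4;
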
We then scan the run list for the appropriate position. Based on what we find
there, we decide how to proceed.

If the cb is not compressed at all, and covers the whole of @page, we pretend
to be accessing an uncompressed file, so we fall back to what we do in
aops.c::ntfs_file_readpage(), i.e. we do:

	return block_read_full_page(page, ntfs_file_get_block);

If the cb is completely sparse, and covers the whole of @page, we can just
zero out @page and complete the I/O (set @page up-to-date, unlock it, and
finally return 0).

In all other cases we initiate the decompression engine, but first some more
on the compression algorithm.

Before compression the data of each cb is further divided into 4kiB blocks,
which we call "sub compression" blocks (sb), each including a header
specifying its compressed length. So we could just scan the cb for the first
sb overlapping @page and skip the sbs before that, or we could decompress the
whole cb, injecting the superfluous decompressed pages into the page cache as
a form of read-ahead (this is what zisofs does, for example).

In either case, we then need to read and decompress all sbs overlapping @page,
potentially having to decompress one or more other cbs, too.

As soon as @page is completed we could either stop or continue until we finish
the current cb, injecting pages as we go along (again following the zisofs
example).

Because the sbs follow each other directly, we need to actually read in the
whole cb in order to be able to scan through the cb to find the first sb
overlapping @page, so it does make sense to follow the zisofs approach of
decompressing the whole cb and injecting pages as we go along. So all
discussion from now on will assume that we are going to do that, although it
might make sense not to decompress any sbs located before @page, because this
would be a kind of "read-behind", which is probably silly unless someone is
reading the file backwards. Performing read-ahead by decompressing all sbs
following @page, OTOH, is very likely to be a good idea.

So, we read the whole cb from disk and start at the first sb.

As mentioned above, each sb starts with a header. The header is 16 bits, of
which the lower twelve bits (i.e. bits 0 to 11) are the length (L) - 3 of the
sb (including the two bytes for the header itself, or L - 1 not counting the
two bytes for the header). The higher four bits are set to 1011 (0xb) by the
compressor for a compressed block, or to 0000 for an uncompressed block, but
the decompressor only checks the most significant bit, taking a 1 to signify a
compressed block, and a 0 an uncompressed block.

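A rough sketch of the header decode (the buffer and variable names are made up
for illustration; only the bit layout comes from the description above):

	/* Read the 16-bit sb header; NTFS stores it little endian. */
	unsigned int sb_header = buf[0] | (buf[1] << 8);
	/* Bits 0 to 11 hold L - 3, where L includes the two header bytes. */
	unsigned int sb_length = (sb_header & 0x0fff) + 3;
	/* Bit 15 set means the sb is compressed, clear means uncompressed. */
	int sb_is_compressed = (sb_header & 0x8000) != 0;
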
So from the header we know how many compressed bytes we need to decompress to
obtain the next 4kiB of uncompressed data, and if we didn't want to decompress
this sb we could just seek to the next one using the length read from the
header. We could then continue seeking until we reach the first sb overlapping
@page.

In either case, we will reach an sb which we want to decompress.

Having dealt with the 16-bit header of the sb, we now have length bytes of
compressed data to decompress. This compressed stream is further split into
tokens which are organized into groups of eight tokens. Each token group (tg)
starts with a tag byte, which is an eight-bit bitmap, the bits specifying the
type of each of the following eight tokens. The least significant bit (LSB)
corresponds to the first token and the most significant bit (MSB) corresponds
to the last token.

The two types of tokens are symbol tokens, specified by a zero bit, and phrase
tokens, specified by a set bit.

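The resulting outer loop of a decompressor is therefore roughly the following.
This is only a sketch of the structure implied by the text, not the driver's
actual code; src, dst, sb_end and dst_end are illustrative names, and the
decode_phrase_token() helper is hypothetical (sketched further below):

	while (src < sb_end && dst < dst_end) {
		unsigned int tag = *src++;	/* tag byte for up to 8 tokens */
		int bit;

		/* The LSB describes the first token, the MSB the last. */
		for (bit = 0; bit < 8 && src < sb_end && dst < dst_end; bit++) {
			if (tag & (1 << bit))
				decode_phrase_token();	/* back reference */
			else
				*dst++ = *src++;	/* symbol token: literal */
		}
	}
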
A symbol token (st) is a single byte and is to be taken literally and copied
into the sliding window (the decompressed data).

A phrase token (pt) is a pointer back into the sliding window (in bytes),
together with a length (again in bytes), starting at the byte the back pointer
is pointing to. Thus a phrase token defines a sequence of bytes in the sliding
window which need to be copied at the current position into the sliding window
(the decompressed data stream).

Each pt consists of 2 bytes split into the back pointer (p) and the length (l),
each of variable bit width (but the sum of the widths of p and l is fixed at
16 bits). p is at least 4 bits and l is at most 12 bits.

The most significant bits contain the back pointer (p), while the least
significant bits contain the length (l).

l is actually stored as the number of bytes minus 3 (unsigned), as anything
shorter than that would be at least as long as the 2 bytes needed for the
actual pt, so no compression would be achieved.

p is stored as the positive number of bytes minus 1 (unsigned), as going zero
bytes back is meaningless.

Note that decompression has to occur byte by byte, as it is possible that some
of the bytes pointed to by the pt will only be generated in the sliding window
as the byte sequence pointed to by the pt is being copied into it!

To give a concrete example: a block full of the letter A would be compressed
by storing the byte A once as a symbol token, followed by a single phrase
token with back pointer -1 (p = 0, therefore go back by -(0 + 1) bytes) and
length 4095 (l = 0xffc, therefore length 0xffc + 3 bytes).

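This example also shows why the byte-by-byte copy matters: when the phrase
token is processed, only one byte (the single A) exists in the sliding window,
and each copied byte immediately becomes the source of the next one. A minimal
sketch of such an overlap-safe copy (illustrative names only):

	/* Copy 'length' bytes from 'dst - back' to 'dst', one byte at a
	 * time, so that bytes written by this very copy can be re-read. */
	while (length--) {
		*dst = *(dst - back);
		dst++;
	}
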
The widths of p and l are determined from the current position within the
decompressed data (cur_pos). We don't actually care about the widths as such,
however, but instead we want the mask (l_mask) with which to AND the pt to
obtain l, and the number of bits (p_shift) by which to right shift the pt to
obtain p. These are determined using the following algorithm:

	for (i = cur_pos, l_mask = 0xfff, p_shift = 12; i >= 0x10; i >>= 1) {
		l_mask >>= 1;
		p_shift--;
	}

Note that, as usual in NTFS, the sb header, as well as each pt, are stored in
little endian format.

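Putting the pieces together, the body of the hypothetical
decode_phrase_token() from the earlier sketch could look roughly like this,
under the same assumptions (illustrative names, no bounds checking):

	/* Read the 16-bit pt, stored little endian. */
	unsigned int pt = src[0] | (src[1] << 8);
	src += 2;

	/* Split the pt using the current l_mask and p_shift. */
	unsigned int length = (pt & l_mask) + 3;	/* l stored as len - 3 */
	unsigned int back = (pt >> p_shift) + 1;	/* p stored as dist - 1 */

	/* Copy byte by byte so overlapping back references work. */
	while (length--) {
		*dst = *(dst - back);
		dst++;
	}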