clarified dictionary in format description

2025-01-25 06:23:21 +08:00 · 2016-09-02 17:04:49 -07:00 · 855766d73d
commit 855766d73d
parent 2b26ad1947
4 changed files with 19 additions and 18 deletions
--- a/1
+++ b/1
@@ -2,6 +2,7 @@ v1.0.1
 New : contrib/pzstd, parallel version of zstd, by Nick Terrell
 Fixed : CLI -d output to stdout by default when input is stdin (#322)
 Fixed : CLI correctly detects console on Mac OS-X
+Fixed : compatibility with OpenBSD, reported by Juan Francisco Cantero Hurtado (#319)
 Fixed : zstd-pgo, reported by octoploid (#329)

 v1.0.0
--- a/lib/compress/zstd_compress.c
+++ b/lib/compress/zstd_compress.c
@@ -8,7 +8,6 @@
 */


-
 /*-*******************************************************
 *  Compiler specifics
 *********************************************************/
--- a/lib/dictBuilder/zdict.c
+++ b/lib/dictBuilder/zdict.c
@@ -463,12 +463,6 @@ static U32 ZDICT_dictSize(const dictItem* dictList)
 }


-#define DISPLAYUPDATE(l, ...) if (g_displayLevel>=l) { \
-            if (ZDICT_clockSpan(displayClock) > refreshRate)  \
-            { displayClock = clock(); DISPLAY(__VA_ARGS__); \
-            if (g_displayLevel>=4) fflush(stdout); } }
-static const clock_t refreshRate = CLOCKS_PER_SEC * 3 / 10;
-
 static size_t ZDICT_trainBuffer(dictItem* dictList, U32 dictListSize,
                            const void* const buffer, size_t bufferSize,   /* buffer must end with noisy guard band */
                            const size_t* fileSizes, unsigned nbFiles,
@@ -481,6 +475,12 @@ static size_t ZDICT_trainBuffer(dictItem* dictList, U32 dictListSize,
    U32* filePos = (U32*)malloc(nbFiles * sizeof(*filePos));
    size_t result = 0;
    clock_t displayClock = 0;
+    clock_t const refreshRate = CLOCKS_PER_SEC * 3 / 10;
+
+#   define DISPLAYUPDATE(l, ...) if (g_displayLevel>=l) { \
+            if (ZDICT_clockSpan(displayClock) > refreshRate)  \
+            { displayClock = clock(); DISPLAY(__VA_ARGS__); \
+            if (g_displayLevel>=4) fflush(stdout); } }

    /* init */
    DISPLAYLEVEL(2, "\r%70s\r", "");   /* clean display line */
--- a/zstd_compression_format.md
+++ b/zstd_compression_format.md
@@ -551,7 +551,7 @@ Let's presume the following Huffman tree must be described :
 The tree depth is 4, since its smallest element uses 4 bits.
 Value `5` will not be listed, nor will values above `5`.
 Values from `0` to `4` will be listed using `Weight` instead of `Number_of_Bits`.
-Weight formula is : 
+Weight formula is :
 ```
 Weight = Number_of_Bits ? (Max_Number_of_Bits + 1 - Number_of_Bits) : 0
 ```
@@ -779,7 +779,7 @@ which specifies `Baseline` and `Number_of_Bits` to add.
 _Codes_ are FSE compressed,
 and interleaved with raw additional bits in the same bitstream.

-##### Literals length codes 
+##### Literals length codes

 Literals length codes are values ranging from `0` to `35` included.
 They define lengths from 0 to 131071 bytes.
@@ -1126,10 +1126,10 @@ When `Repeated_Offset2` is used, it's swapped with `Repeated_Offset1`.
 Dictionary format
 -----------------

-`zstd` is compatible with "pure content" dictionaries, free of any format restriction.
+`zstd` is compatible with "raw content" dictionaries, free of any format restriction.
 But dictionaries created by `zstd --train` follow a format, described here.

-__Pre-requisites__ : a dictionary has a known length,
+__Pre-requisites__ : a dictionary has a size,
                     defined either by a buffer limit, or a file size.

 | `Magic_Number` | `Dictionary_ID` | `Entropy_Tables` | `Content` |
@@ -1151,20 +1151,21 @@ _Reserved ranges :_
              - high range : >= (2^31)

 __`Entropy_Tables`__ : following the same format as a [compressed blocks].
-            They are stored in following order :
-            Huffman tables for literals, FSE table for offsets,
-            FSE table for match lengths, and FSE table for literals lengths.
-            It's finally followed by 3 offset values, populating recent offsets,
-            stored in order, 4-bytes little-endian each, for a total of 12 bytes.
+              They are stored in following order :
+              Huffman tables for literals, FSE table for offsets,
+              FSE table for match lengths, and FSE table for literals lengths.
+              It's finally followed by 3 offset values, populating recent offsets,
+              stored in order, 4-bytes little-endian each, for a total of 12 bytes.

-__`Content`__ : Where the actual dictionary content is.
-              Content size depends on Dictionary size.
+__`Content`__ : The rest of the dictionary is its content.
+              The content act as a "past" in front of data to compress or decompress.

 [compressed blocks]: #the-format-of-compressed_block


 Version changes
 ---------------
+- 0.2.1 : clarify field names, by Przemyslaw Skibinski
 - 0.2.0 : numerous format adjustments for zstd v0.8
 - 0.1.2 : limit Huffman tree depth to 11 bits
 - 0.1.1 : reserved dictID ranges
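
Note on the `Weight` formula quoted in the `zstd_compression_format.md` hunk above: it may be easier to follow as code. The sketch below is illustrative only and is not taken from the zstd sources. The helper names `weight_from_bits` and `bits_from_weight` are hypothetical, and the second helper assumes the decoder simply inverts the same formula.

```c
#include <assert.h>

/* Hypothetical helper (not part of zstd): map a symbol's Number_of_Bits
 * to the Weight listed in the Huffman tree description, following
 *   Weight = Number_of_Bits ? (Max_Number_of_Bits + 1 - Number_of_Bits) : 0
 * A symbol that is absent from the tree has 0 bits and gets weight 0. */
static unsigned weight_from_bits(unsigned nbBits, unsigned maxNbBits)
{
    assert(nbBits <= maxNbBits);
    return nbBits ? (maxNbBits + 1 - nbBits) : 0;
}

/* Hypothetical inverse, assuming a decoder recovers the code length
 * by applying the same relation in the other direction. */
static unsigned bits_from_weight(unsigned weight, unsigned maxNbBits)
{
    assert(weight <= maxNbBits);
    return weight ? (maxNbBits + 1 - weight) : 0;
}
```

With the tree depth of 4 used in the quoted example, a symbol coded on 4 bits gets weight 1, a symbol coded on 1 bit gets weight 4, and an unused symbol keeps weight 0.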