Commit Graph

51 Commits

Author SHA1 Message Date
Yann Collet
64e8511b26 added clarifications for sizes of compressed huffman blocks and streams. 2023-03-08 15:31:36 -08:00
Yann Collet
832f559b0b clarify zstd specification for Huffman blocks
Following detailed comments from @dweiller in #3508.
2023-02-18 18:18:16 -08:00
Yann Collet
6a9c525903 spec update : require minimum nb of literals for 4-streams mode
Reported by @shulib :
the specification for 4-streams mode
doesn't work when the amount of literals to compress is 5 bytes.
Extending it, it also doesn't work for sizes 1 or 2.

This patch updates the specification and the implementation
to require a minimum of 6 literals to trigger or accept the 4-streams mode.

The impact is expected to be a no-op :
the 4-streams mode is never triggered for such a small quantity of literals anyway,
since it would be wasteful (it costs ~7.3 bytes more than single-stream mode).
An informal lower limit is set at ~256 bytes,
so the technical minimum is very far from this limit.

This is just meant for completeness of the specification.
2022-12-22 16:14:34 -08:00
W. Felix Handte
5d693cc38c Coalesce Almost All Copyright Notices to Standard Phrasing
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do sed -i '/Copyright .* \(Yann Collet\)\|\(Meta Platforms\)/ s/Copyright .*/Copyright (c) Meta Platforms, Inc. and affiliates./' $f; done

git checkout HEAD -- build/VS2010/libzstd-dll/libzstd-dll.rc build/VS2010/zstd/zstd.rc tests/test-license.py contrib/linux-kernel/test/include/linux/xxhash.h examples/streaming_compression_thread_pool.c lib/legacy/zstd_v0*.c lib/legacy/zstd_v0*.h
nano ./programs/windres/zstd.rc
nano ./build/VS2010/zstd/zstd.rc
nano ./build/VS2010/libzstd-dll/libzstd-dll.rc
```
2022-12-20 12:52:34 -05:00
W. Felix Handte
7f12f24cf4 Rewrite Copyright Date Ranges from -present to -2022
Apparently it's better. Somehow.

```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do echo $f; sed -i 's/\-present/-2022/' $f; done

git checkout HEAD -- build/meson/
```
2022-12-20 12:44:56 -05:00
W. Felix Handte
36d5c2f326 Update Copyright Year ('2021' -> 'present')
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f);
do
  sed -i 's/\-2021/-present/' $f;
done

git checkout HEAD -- .github/workflows/dev-short-tests.yml # fix bad match
```
2022-12-20 12:42:50 -05:00
W. Felix Handte
8927f985ff Update Copyright Headers 'Facebook' -> 'Meta Platforms'
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora \) -prune -o -type f);
do
  sed -i 's/Facebook, Inc\./Meta Platforms, Inc. and affiliates./' $f;
done
```
2022-12-20 12:37:57 -05:00
Danielle Rozenblit
4dffc35f2e Convert references to https from http 2022-12-14 06:58:35 -08:00
Yann Collet
f33ccd2d1b fix small error in format documentation example
reported by @dkcasset
fix #3142
2022-05-24 04:47:49 -07:00
Dominique Pelle
b772f53952 Typo and grammar fixes 2022-03-12 08:58:04 +01:00
Dimitris Apostolou
ebbd675998 Fix typos 2021-11-13 10:04:04 +02:00
Yann Collet
0b0b62d1cf minor mention of RFC8878
more recent update
2021-05-15 23:04:46 -07:00
senhuang42
1d6d64afa3 Change year to 2021 for compression format file 2021-01-11 08:53:29 -05:00
W. Felix Handte
2d46d764cf Update Zstd Compression Format to Clarify Repcode Behavior 2020-12-09 20:03:58 -05:00
senhuang42
8adeb9f1e6 Updated repcode documentation to reflect dict content size 2020-09-22 13:24:27 -04:00
senhuang42
9dcfe4d7b7 Update documentation about repcodes in dictionaries 2020-09-22 13:02:26 -04:00
Yann Collet
11a392ce23 minor markdown formatting fix 2020-05-26 13:15:35 -07:00
Yann Collet
bb3c9bf43a updated spec on dictID==0
Specified decoder behavior on receiving a frame with dictID=0.

Pushed paragraph on reserved DictID ranges into the Dictionary Format section.
2020-05-25 08:15:09 -07:00
Yann Collet
098b36e9ab clarifications for Block_Maximum_Size
as a follow up of #1882
2019-11-13 09:50:15 -08:00
Yann Collet
ff7bd16c0a clarifications for the FSE decoding table
requested in #1782
2019-10-18 17:48:12 -07:00
Yann Collet
97bb38635c number instead of nb
suggested by @terrelln
2019-08-17 08:04:42 +02:00
Yann Collet
1e07eb4d5c clarifications on the meaning of field Block_Size
following comments from Intel's Smita Kumar.
2019-08-16 15:15:25 +02:00
W. Felix Handte
a2861d75eb [doc] Bump Format Spec Version 2019-07-17 18:55:45 -04:00
W. Felix Handte
c05b270edc [doc] Remove Limitation that Compressed Block is Smaller than Uncompressed Content
This changes the size limit on compressed blocks to match that of the other
block types: they may not be larger than the `Block_Maximum_Decompressed_Size`,
which is the smaller of the `Window_Size` and 128 KB, removing the additional
restriction that had been placed on `Compressed_Block`s, that they be smaller
than the decompressed content they represent.

Several things motivate removing this restriction. On the one hand, this
restriction is not useful for decoders: the decoder must nonetheless be
prepared to accept compressed blocks that are the full
`Block_Maximum_Decompressed_Size`. And on the other, this bound is actually
artificially limiting. If block representations were entirely independent,
a compressed representation of a block that is larger than the contents of the
block would be ipso facto useless, and it would be strictly better to send it
as a `Raw_Block`. However, blocks are not entirely independent, and it can
make sense to pay the cost of encoding custom entropy tables in a block, even
if that pushes that block size over the size of the data it represents,
because those tables can be re-used by subsequent blocks.

Finally, as far as I can tell, this restriction in the spec is not currently
enforced in any Zstandard implementation, nor has it ever been. This change
should therefore be safe to make.
2019-07-17 18:55:45 -04:00
Yann Collet
9bf00707c7 minor clarifications of history update rules 2018-10-26 15:51:51 -07:00
Ulrich Kunitz
f0fe9b0f02 Reverted removal of a trailing space.
My editor removes trailing spaces while saving. To avoid confusing things, I
reverted that change.
2018-10-23 08:43:19 +02:00
Ulrich Kunitz
4f702e4445 Fixed a typo
I fixed a typo in the last commit. Many thanks to @terrelln for pointing
that out.
2018-10-23 08:36:50 +02:00
Ulrich Kunitz
c7942caff0 Clarify special case of offset history update
If the current sequence has literal length of zero then an offset value
of three is handled in a special manner. While I implemented a golang
decoder I had to consult the educational decoder for clarification on
the update of the offset history in that case. This commit provides the
clarification that the offset value Repeated_Offset1-1 is handled as a
new offset and is added to the offset history accordingly.
2018-10-22 23:46:43 +02:00
Yann Collet
72a3adf826 updated format documentation
to match last edits of RFC8478.
2018-09-25 16:34:26 -07:00
Yann Collet
55a8f84a2c spec clarification
following #1305 comments from @ulikunitz
2018-09-05 12:31:33 -07:00
Nick Terrell
c1a7defee1 Small fixes to zstd specification
Update to keep in sync with the RFC.
2018-07-10 15:07:36 -07:00
Yann Collet
c1e6347717 fixed minor typos, detected by @terrelln 2018-06-21 18:08:11 -07:00
Yann Collet
7639db939f updated Zstandard frame format
adding clarifications from IETF RFC DISCUSS.
2018-06-21 17:55:55 -07:00
Yann Collet
a4c9c4defe update Zstandard format specification
answering a few questions from IETF RFC Discuss stage.
2018-05-31 10:47:44 -07:00
Nick Terrell
73f4c890cd Clarify what happens when Number_of_Sequences == 0 2018-05-22 16:12:33 -07:00
Yann Collet
82ad249645 Clarifications of Zstandard format specification
from IETF RFC review
2018-04-30 12:36:55 -07:00
Shawn Landden
914d983879 fix unbounded range
I think you meant 8 MiB or smaller, instead of an unbounded (and illogical) range
2017-12-21 16:15:12 -08:00
Yann Collet
fccb46fbe0 minor spelling fixes 2017-11-18 11:28:00 -08:00
Yann Collet
e8d35cc5e9 minor formulation change, recommended by @ulikunitz 2017-08-20 10:39:20 -07:00
Yann Collet
d0d06e421f added alternative representation for huffman bitstream 2017-08-19 12:26:09 -07:00
Yann Collet
8b12812147 fix #803 : wrong example in huffman bitstream section, reported by @ulikunitz 2017-08-19 12:17:57 -07:00
Yann Collet
a935d67bf1 minor typo fixes in specification 2017-03-31 16:19:04 -07:00
Yann Collet
14433ca1ad numerous typos and clarifications in format specification
fix limit values of Window_Size
bump version to 0.2.5
2017-03-31 15:45:58 -07:00
Sean Purcell
3bee41a70e Add default distributions and fix typos 2017-02-21 10:20:36 -08:00
Sean Purcell
042419ec2a Restructure Format Specification 2017-02-17 16:24:26 -08:00
Yann Collet
20bed4210c changed format specification version number 2017-01-27 12:16:16 -08:00
Sean Purcell
d86153d903 Edits as per comments, and change wildcard 'X' to '?' 2017-01-26 16:58:25 -08:00
Sean Purcell
81c9670226 Fixed commented issues 2017-01-26 11:15:34 -08:00
Sean Purcell
ab226d4828 Updated format specification to be easier to understand 2017-01-25 16:42:41 -08:00
Nick Terrell
d82efd8a70 ZSTD_compress_usingDict() when dict gets loaded
Specify that when `dict == NULL || dictSize < 8` no dictionary
gets loaded.
Also add some periods.
2016-11-02 18:07:16 -07:00