Commit Graph

1638 Commits

Author SHA1 Message Date
W. Felix Handte
80790c587b Copy the Dict Table Into the Context for Large Compressions 2018-03-12 14:58:43 -04:00
W. Felix Handte
9dcd9abc14 Make LZ4F_compressFrame_usingCDict Take a Compression Context 2018-03-12 14:58:43 -04:00
W. Felix Handte
14ce912b70 Switch Current Offset to 1 Only When in External Dictionary Context Mode 2018-03-12 14:58:43 -04:00
W. Felix Handte
cea09d67a9 Hoist Table Reset One Level Up 2018-03-12 14:58:43 -04:00
W. Felix Handte
68c6bd17b8 Set Dictionary Context Pointer Rather than Copying the Context In 2018-03-12 14:58:43 -04:00
W. Felix Handte
73cc39327e Lookup Matches in Separate Dictionary Context 2018-03-12 14:58:43 -04:00
W. Felix Handte
62cb52b341 Initialize Current Offset to 1 2018-03-12 14:58:43 -04:00
W. Felix Handte
7060bcabf0 Only Re-Alloc / Reset When Needed When Switching Between Regular and High Compression Modes 2018-03-12 14:58:43 -04:00
W. Felix Handte
b3628cb0c5 Avoid Resetting the Context When Possible 2018-03-12 14:58:43 -04:00
W. Felix Handte
aa36e118f1 Const-ify Table Arg to LZ4_getPosition(OnHash) 2018-03-12 14:58:43 -04:00
W. Felix Handte
d6a3024dbb Add LZ4_compress_fast_safeExtState Function 2018-03-12 14:58:43 -04:00
W. Felix Handte
f34fb3c42d Add Bounds Check to locateBuffDiff 2018-03-12 14:58:43 -04:00
W. Felix Handte
5709891de6 Add a Table Type Field to LZ4_stream_t 2018-03-12 14:58:43 -04:00
W. Felix Handte
6933f5ad9c Remove Obsolete Stream Functions to Free Space in LZ4_stream_t 2018-03-12 14:58:43 -04:00
W. Felix Handte
6d156fea56 Allow Empty Dictionaries 2018-03-12 14:58:43 -04:00
W. Felix Handte
8c006b19bb Add a Benchmarking Tool For Compression with Context Re-Use 2018-03-12 14:58:43 -04:00
Yann Collet
9dc249ee3c
Merge pull request #483 from lz4/dev
update to dev
2018-03-09 13:10:40 -08:00
Yann Collet
6c23f03b93 fix #482: change CFLAGS to CXXFLAGS
as they are associated with $(CXX)
2018-03-09 11:54:32 -08:00
Yann Collet
6d4e60e365 fix #481: ensure liblz4.a dependency for make all
`make all` will trigger several sub-directory makefiles.
several of them need `liblz4.a`.
When built with `-j#`, several concurrent builds of liblz4.a are started.

Make liblz4.a a dependency, which is built once,
before moving to sub-directory Makefiles
2018-03-09 09:57:29 -08:00
Yann Collet
b5233d3726 updated LZ4F_compressBound() documentation
to clarify it includes potentially buffered data.
2018-02-27 23:23:27 -08:00
Yann Collet
85201c4beb
Merge pull request #479 from lz4/check
added target make check
2018-02-26 16:40:32 -08:00
Yann Collet
0ddd1ceb1d added target make check
according to GNU Makefile conventions,
the Makefile should feature a make check target
to self-test the generated program:
https://www.gnu.org/prep/standards/html_node/Standard-Targets.html .

this is much less thorough and less taxing than `make test`,
and can be run on any target in a reasonable timeframe (several seconds).
2018-02-26 14:09:46 -08:00
Yann Collet
860ff77909
Merge pull request #478 from lz4/mergeOpt
merge lz4opt.h into lz4hc.c
2018-02-26 14:06:31 -08:00
Yann Collet
39fda9a447 bumped version number to v1.8.2
updated NEWS with current progress
2018-02-26 13:50:04 -08:00
Yann Collet
ba115386fa update code comment on LZ4 streaming interface
notably regarding LZ4_saveDict() speed advantage,
answering #477.
2018-02-26 13:31:18 -08:00
Yann Collet
1882b10e41
Merge pull request #476 from lz4/mflimit
edge case fix : compress up to end-mflimit (12 bytes)
2018-02-26 12:29:54 -08:00
Yann Collet
550b40849f merge lz4opt.h into lz4hc.c
Having a dedicated file for optimal parser
made sense during its creation,
it allowed Przemyslaw to work more freely on lz4opt, with less dependency on lz4hc,
moreover, the optimal parser was more complex, with its own search functions.

Since the optimal parser was rewritten last year, it's now a lot lighter.
It makes more sense now to integrate it directly inside lz4hc.c,
making it easier to edit (editors are a bit "lost" inside a `*.h` dependent on its #include position),
it also reduces the number of files in the project,
which fits pretty well with lz4 objectives.
(adding lz4hc requires "just" lz4hc.h and lz4hc.c).
2018-02-25 00:32:09 -08:00
Yann Collet
7173a631db edge case : compress up to end-mflimit (12 bytes)
The LZ4 block format specification
states that the last match must start
at a minimum distance of 12 bytes from the end of the block.

However, out of an abundance of caution,
the reference implementation would actually stop searching matches
at 13 bytes from the end of the block.

This patch fixes this small detail.
The new version is now able to properly compress a limit case
such as `aaaaaaaabaaa\n`
as reported by Gao Xiang (@hsiangkao).

Obviously, it doesn't change a lot of things.
This is just one additional match candidate per block, with a maximum match length of 7 (since last 5 bytes must remain literals).

With default policy, blocks are 4 MB long, so it doesn't happen too often.
Compressing silesia.tar at default level 1 saves 5 bytes (100930101 -> 100930096).
At max level 12, it saves a grand 16 bytes (77389871 -> 77389855).

The impact is a bit more visible when blocks are smaller, hence more numerous.
For example, compressing silesia with blocks of 64 KB (using -12 -B4D) saves 543 bytes (77304583 -> 77304040).
So the smaller the packet size, the more visible the impact.

And it happens we have a ton of scenarios with little blocks using LZ4 compression ...

And a useless "hooray" sidenote :
the patch improves the LZ4 compression record of silesia (using -12 -B7D --no-frame-crc) by 16 bytes (77270672 -> 77270656)
and the record on enwik9 by 44 bytes (371680396 -> 371680352) (previously claimed by [smallz4](http://create.stephan-brumme.com/smallz4/) ).
2018-02-24 11:47:53 -08:00
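The arithmetic in the commit message above can be sketched in C. This is an illustration only, not the lz4 source: the two constants take their values from the message (12-byte minimum distance from the end of the block, 5 trailing literal bytes), and the helper names `last_match_start` and `bounded_match_length` are hypothetical.

```c
#include <stddef.h>

#define MFLIMIT      12  /* last match must start at least 12 bytes before block end */
#define LASTLITERALS  5  /* the final 5 bytes of a block must remain literals */

/* Largest offset at which a match may start, or -1 if the block is too short. */
long last_match_start(size_t blockSize)
{
    if (blockSize < MFLIMIT) return -1;
    return (long)(blockSize - MFLIMIT);
}

/* Length of the match between src[pos..] and src[ref..] (with ref < pos),
 * stopping before the last LASTLITERALS bytes of the block. */
size_t bounded_match_length(const char* src, size_t blockSize,
                            size_t ref, size_t pos)
{
    size_t matchLimit;
    size_t len = 0;
    if (blockSize < LASTLITERALS) return 0;
    matchLimit = blockSize - LASTLITERALS;
    while (pos + len < matchLimit && src[ref + len] == src[pos + len]) len++;
    return len;
}
```

For the 13-byte input `aaaaaaaabaaa\n`, `last_match_start(13)` is 1, and `bounded_match_length(s, 13, 0, 1)` is 7, which is the maximum match length the message mentions (12 - 5).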
Yann Collet
99c26729b5
Merge pull request #475 from lz4/betterBench
Better bench measurements for small inputs
2018-02-21 05:48:58 -08:00
Yann Collet
71e16fa11a
Merge pull request #471 from lz4/fasterHC
Faster HC
2018-02-20 21:04:07 -08:00
Yann Collet
179670f32f use TIMELOOP_NANOSEC,
as suggested by @terrelln
2018-02-20 15:26:59 -08:00
Yann Collet
25b16e8a2e added one assert()
suggested by @terrelln
2018-02-20 15:25:45 -08:00
Yann Collet
34c1634d4b fixed minor conversion warning 2018-02-20 14:13:13 -08:00
Yann Collet
ae3dededed ensure bench speed measurement is more accurate for small inputs
Previous method would produce too many time() invocations,
becoming a significant fraction of workload measured.

The new strategy is to use time() only once per batch,
and dynamically resize batch size so that each round lasts approximately 1 second.

This only matters for small inputs.
Measurements for large files (such as silesia.tar) are much less impacted
(though decoding speed is so fast that even medium-size files will notice an improvement).
2018-02-20 13:09:13 -08:00
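The strategy described in the commit above (read the clock once per batch, then resize the batch so a round lasts about one second) might look roughly like this. It is a minimal sketch under stated assumptions, not the code in bench.c: the names `next_batch_size`, `now_ns` and `time_batch` are hypothetical, and `clock()` stands in for whatever timer the benchmark really uses.

```c
#include <stdint.h>
#include <time.h>

#define TARGET_NS 1000000000ULL   /* aim for rounds of roughly 1 second */

/* Scale the batch so the next round should last about TARGET_NS.
 * Assumes batch * TARGET_NS fits in 64 bits (true for realistic batch sizes). */
uint64_t next_batch_size(uint64_t batch, uint64_t elapsedNs)
{
    uint64_t next;
    if (elapsedNs == 0) return batch * 10;   /* too fast to measure: grow */
    next = (batch * TARGET_NS) / elapsedNs;
    return next ? next : 1;                  /* always run at least once */
}

/* Timer stand-in based on clock(); an assumption made for this sketch. */
uint64_t now_ns(void)
{
    return (uint64_t)((double)clock() * 1e9 / (double)CLOCKS_PER_SEC);
}

/* Run `work` batch times, reading the timer only before and after the batch. */
uint64_t time_batch(void (*work)(void*), void* arg, uint64_t batch)
{
    uint64_t i;
    uint64_t const start = now_ns();
    for (i = 0; i < batch; i++) work(arg);
    return now_ns() - start;
}
```

A driver would call `time_batch`, divide the elapsed time by the batch size to get a per-call cost, and feed the elapsed time into `next_batch_size` for the following round. With only two timer reads per batch, timer overhead stays negligible even when each call to `work` is very short.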
Yann Collet
1a233c5f0f update bench.c to use fewer time() invocations
translating into more accurate speed measurements for small sources
2018-02-20 11:37:19 -08:00
Yann Collet
d74f079748 update API doc regarding double-buffer strategy
answering question #473
2018-02-18 11:00:33 -08:00
Yann Collet
9f338ae204
Merge pull request #472 from hobomind/dev
fix: missed semicolon at programs/lz4io.c:954
2018-02-14 13:00:50 -08:00
hobomind
b202c67234
fix: missed semicolon at programs/lz4io.c:954 2018-02-14 18:47:56 +03:00
Yann Collet
3d3d5af4e1
Merge pull request #470 from lz4/fasterDec
Faster decoding speed
2018-02-12 16:56:45 -08:00
Yann Collet
d3a13397d9 slight hc speed benefit (~+1%)
by optimizing countback
2018-02-12 00:01:58 -08:00
Yann Collet
219abab74b removed LZ4_copy8
better use memcpy() directly
2018-02-11 22:20:09 -08:00
Yann Collet
2b674bf02f slightly improved hc compression speed (+~1-2%)
by removing bad candidates faster.
2018-02-11 02:45:36 -08:00
Yann Collet
3ad3b0f850 slightly improved decompression speed (~+1-2%)
by making shortcut slightly more common
2018-02-11 01:43:20 -08:00
Yann Collet
f76ee4e267
Merge pull request #469 from mathstuf/intel-windows-packing-selection
intel: do not use __attribute__((packed)) on Windows
2018-02-08 08:45:25 -08:00
Ben Boeckel
c4671be550 intel: do not use __attribute__((packed)) on Windows
On Windows, the Intel compiler is closer to MSVC rather than GCC and
does not support the GCC attribute syntax.

Fixes #468
2018-02-08 09:15:27 -05:00
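The compiler selection described in the commit above could be expressed along these lines. This is a hedged sketch rather than the actual lz4 preprocessor logic: the macro `SKETCH_USE_PACKED` and the names `unalign32` and `read32` are hypothetical, and the fallback path uses plain `memcpy()`.

```c
#include <stdint.h>
#include <string.h>

/* Use the GCC attribute syntax only where it is supported.
 * The Intel compiler on Windows follows MSVC conventions, so it is excluded. */
#if defined(__GNUC__) || (defined(__INTEL_COMPILER) && !defined(_WIN32))
#  define SKETCH_USE_PACKED 1
#else
#  define SKETCH_USE_PACKED 0
#endif

#if SKETCH_USE_PACKED
/* Packed union: tells the compiler the value may be unaligned. */
typedef union { uint32_t u32; } __attribute__((packed)) unalign32;
uint32_t read32(const void* p) { return ((const unalign32*)p)->u32; }
#else
/* Portable fallback: memcpy() is always valid for unaligned reads. */
uint32_t read32(const void* p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
#endif
```

Either branch returns the same value for a given address; the guard only decides which syntax the compiler is asked to accept.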
Yann Collet
ea25250c99 fixed code comment as detected in #466
Also clarified a few API code comments
and updated associated html documentation
2018-02-07 02:21:25 -08:00
Yann Collet
20e969e579 fuzzer: added low address compression test
is expected to work on linux+gcc only.
2018-02-05 15:19:00 -08:00
Yann Collet
e3f73fa6a6
Merge pull request #461 from terrelln/docs
Clarify the requirements of the LZ4 streaming API
2018-02-01 16:14:54 -08:00
Nick Terrell
e832a3d87a Clarify the requirements of the LZ4 streaming API 2018-02-01 16:08:59 -08:00
Yann Collet
99a81c89f0
Merge pull request #458 from lz4/ff161
Minor change to LZ4 Frame format specification
2018-02-01 10:55:02 -08:00