Commit Graph

9666 Commits

Author SHA1 Message Date
William Ahern
cca3544708 restore combine.sh bash performance while still sticking to POSIX 2022-08-10 16:51:17 -07:00
Yann Collet
1b249cf075
Merge pull request #3235 from facebook/docTraining
[easy] added a few documentation words about dictionary training
2022-08-08 12:06:49 -07:00
Yann Collet
3f7a1b1328 added a few documentation words about dictionary training
partially answering questions such as #3233
which looks for guidance within `exmaples/`.
2022-08-05 17:09:22 +02:00
Nick Terrell
03cc84fddb
Add explicit --pass-through flag and default to enabled for *cat (#3223)
Fixes #3211.

Adds the `--[no-]pass-through` flag which enables/disables pass-through mode.

* `zstdcat`, `zcat`, and `gzcat` default to `--pass-through`.
  Pass-through mode can be disabled by passing `--no-pass-through`.
* All other binaries default to not setting pass-through mode.
  However, we preserve the legacy behavior of enabling pass-through
  mode when writing to stdout with `-f` set, unless pass-through
  mode is explicitly disabled with `--no-pass-through`.

Adds a new test for this behavior that should codify the behavior we want.
2022-08-04 17:15:59 -07:00
zengyijing
d0dcc9d775
fix issue #3144 (#3226)
* fix issue #3144

* add test case for verbose-wlog

Co-authored-by: zengyijing <yijingzeng@fb.com>
2022-08-04 13:51:14 -07:00
Yann Collet
e818fa8eb0
Merge pull request #3232 from facebook/fileiotypes_nomemh
fileio_types.h : avoid dependency on mem.h
2022-08-03 22:57:16 +02:00
Yann Collet
9e90b180c5
Merge pull request #3231 from facebook/fileio_missingInclude
[easy] fixed missing include
2022-08-03 22:48:05 +02:00
Yann Collet
3dfcafacd7 fileio_types.h : avoid dependency on mem.h
fileio_types.h cannot be parsed by itself
because it relies on basic types defined in `lib/common/mem.h`.
As for #3231, it likely wasn't detected because `mem.h` was probably included before within target files.
But this is not proper.

A "easy" solution would be to add the missing include,
but each dependency should be considered "bad" by default,
and only allowed if it brings some tangible value.

In this case, since these types are only used to declare internal structure variables
which are effectively only flags,
I believe it's really not valuable to add a dependency on `mem.h` for this purpose
while the standard `int` type can do the same job.

I was expecting some compiler warnings following this change,
but it turns out we don't use `-Wconversion` by default on `zstd` source code,
so there is none.

Nevertheless, I enabled `-Wconversion` locally and proceeded to fix a few conversion warnings in the process.

Adding `-Wconversion` to the list of flags used for `zstd` is something I would be favorable over the long term,
but it cannot be done overnight,
because the nb of places where this warning is triggered is daunting.
Better progressively reduce the nb of triggered `-Wconversion` warnings before enabling this flag by default.
2022-08-03 21:39:35 +02:00
Yann Collet
a925362534 minor : fixed missing include
I presume it was not detected before
because "fileio.h" is probably always included after "util.h".
2022-08-03 20:52:15 +02:00
Nick Terrell
a70ca2bd7d
Fix off-by-one error in superblock mode (#3221)
Fixes #3212.

Long literal and match lengths had an off-by-one error in ZSTD_getSequenceLength.
Fix the off-by-one error, and add a golden compression test that catches the bug.
Also run all the golden tests in the cli-tests framework.
2022-08-03 11:28:39 -07:00
Felix Handte
7e6278a706
Merge pull request #3196 from mileshu/dev
[T124890272] Mark 2 Obsolete Functions(ZSTD_copy*Ctx) Deprecated in Zstd
2022-08-02 12:34:04 -04:00
Miles Hu
201f2e339b Merge branch 'dev' of https://github.com/mileshu/zstd into dev 2022-08-01 22:52:47 -07:00
Miles HU
c450f9f952 [T124890272] Mark 2 Obsolete Functions(ZSTD_copy*Ctx) Deprecated in Zstd
The discussion for this task is here: facebook/zstd#3128.

This task can probably be scoped to the first part: marking these functions deprecated.
We'll later look at removal when we roll out v1.6.0.
2022-08-01 22:45:52 -07:00
Nick Terrell
0f4fd28a64
Deprecate ZSTD_getDecompressedSize() (#3225)
Fixes #3158.

Mark ZSTD_getDecompressedSize() as deprecated and replaced by ZSTD_getFrameContentSize().
2022-08-01 11:52:14 -07:00
Elliot Gorokhovsky
28ceb63503
Merge pull request #3220 from embg/issue3200
Disallow empty string as argument for --output-dir-flat and --output-dir-mirror
2022-08-01 14:04:57 -04:00
Qiongsi Wu
1b445c1c2e
Fix hash4Ptr for big endian (#3227) 2022-08-01 10:41:24 -07:00
Yonatan Komornik
ae4670466c
stdin multiple file fixes (#3222)
* Fixes for https://github.com/facebook/zstd/issues/3206 - bugs when handling stdin as part of multiple files.

* new line at end of multiple-files.sh
2022-07-29 16:13:07 -07:00
Elliot Gorokhovsky
f9f27de91c Disallow empty output directory 2022-07-29 14:48:33 -07:00
Tom Wang
d4a5bc4efc
Add warning when multi-thread decompression is requested (#3208)
When user pass in argument for both decompression and multi-thread, print a warning message
to indicate that multi-threaded decompression is not supported.

* Add warning when multi-thread decompression is requested
* add test case for multi-threaded decoding warning
   Expectation is for -d -T0 we will not throw any warning,
   and see warning for any other -d -T(>1) inputs
2022-07-29 12:51:58 -07:00
Chris Burgess
2b9fde932b
Fix small file passthrough (#3215) 2022-07-29 12:22:46 -07:00
orbea
1e09cffd9b
zlibWrapper: Update for zlib 1.2.12 (#3217)
In zlib 1.2.12 the OF macro was changed to _Z_OF breaking any
project that used zlibWrapper. To fix this the OF has been
changed to _Z_OF everywhere and _Z_OF is defined as OF in the
case it is not yet defined for zlib 1.2.11 and older.

Fixes: https://github.com/facebook/zstd/issues/3216
2022-07-29 12:22:10 -07:00
Qiongsi Wu
b1bbb0eb4c
[AIX] Fix Compiler Flags and Bugs on AIX to Pass All Tests (#3219)
* Fixing compiler warnings

* Replace the old -s flag with the -Wl,-s flag

* Fixing compiler warnings

* Fixing the linker strip flag and tests/code not working as expected on AIX
2022-07-29 12:21:59 -07:00
Elliot Gorokhovsky
e1873ad576 Fix buffer underflow for null dir1 2022-07-29 11:10:47 -07:00
Jun He
ec5fdcde19
lib: add hint to generate more pipeline friendly code (#3138)
With statistic data of test data files of silesia
the chance of position beyond highThreshold is very
low (~1.3%@L8 in most cases, all <2.5%), and is in
"lowprob area". Add the branch hint so compiler can
get better pipiline codegen.
With this change it is observed ~1% of mozilla and
xml, and slight (0.3%~0.8%) but consistent uplift on
other files on Arm N1.

Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: Id9ba1d5c767e975290b5c1bf0ecce906544f4ade
2022-07-29 10:28:04 -07:00
Jun He
558cf20d0d
decomp: add prefetch for matched seq on aarch64 (#3164)
match is used for following sequence copy. It is
only updated when extDict is needed, which is a
low probability case. So it can be prefetched to
reduce cache miss.
The benchmarks on various Arm platforms showed
uplift from 1% ~ 14% with gcc-11/clang-14.

Signed-off-by: Jun He <jun.he@arm.com>
Change-Id: If201af4799d2455d74c79f8387404439d7f684ae
2022-07-29 10:27:20 -07:00
Mathew R Gordon
85d633042d
Add transparency and optimize logo (#3218)
Make the front page look better in dark GH themes
2022-07-29 10:17:31 -07:00
Elliot Gorokhovsky
e5db7c93f5
Merge pull request #3197 from embg/docstring_clarify
Clarify benchmark chunking docstring
2022-07-26 13:26:15 -04:00
Elliot Gorokhovsky
bef1d9a831
Merge pull request #3209 from zhuhan0/dev
[largeNbDicts] Second try at fixing decompression segfault to always create compressInstructions
2022-07-26 13:19:38 -04:00
Han Zhu
6255f994d3 [largeNbDicts] Second try at fixing decompression segfault to always create compressInstructions
Summary:
Freeing an uninitialized pointer is undefined behavior. This caused a segfault
when compiling the benchmark with Clang -O3 and benching decompression.

V2: always create compressInstructions but check if cctxParams is NULL before
setting CCtx params to avoid segfault.

Test Plan:
make and run
2022-07-21 11:55:01 -07:00
Elliot Gorokhovsky
466e13f722
Merge pull request #3205 from zhuhan0/dev
[contrib][largeNbDicts] Fix decompression segfault; Add additional benchmark metrics
2022-07-20 16:07:04 -04:00
Han Zhu
d993a288e0 [largeNbDicts] Add an option to print out median speed
Summary:
Added an option -p# where -p0 (default) sets the aggregation method to fastest
speed while -p1 sets the aggregation method to median. Also added a new column
in the csv file to report this option's value.

Test Plan:
``
$ ./largeNbDicts -1 --nbDicts=1 -D ~/benchmarks/html/html_8_16K.32K.dict
~/benchmarks/html/html_8_16K/*
loading 7450 files...
created src buffer of size 83.4 MB
split input into 7450 blocks
loading dictionary /home/zhuhan/benchmarks/html/html_8_16K.32K.dict
compressing at level 1 without dictionary : Ratio=3.03  (28827863 bytes)
compressed using a 32768 bytes dictionary : Ratio=4.28  (20410262 bytes)
generating 1 dictionaries, using 0.1 MB of memory
Compression Speed : 306.0 MB/s
Fastest Speed : 310.6 MB/s

$ ./largeNbDicts -1 --nbDicts=1 -p1 -D ~/benchmarks/html/html_8_16K.32K.dict
~/benchmarks/html/html_8_16K/*
loading 7450 files...
created src buffer of size 83.4 MB
split input into 7450 blocks
loading dictionary /home/zhuhan/benchmarks/html/html_8_16K.32K.dict
compressing at level 1 without dictionary : Ratio=3.03  (28827863 bytes)
compressed using a 32768 bytes dictionary : Ratio=4.28  (20410262 bytes)
generating 1 dictionaries, using 0.1 MB of memory
Compression Speed : 306.9 MB/s
Median Speed : 298.4 MB/s
```
2022-07-20 11:19:41 -07:00
Han Zhu
b550f9b77e [largeNbDicts] Print more metrics into csv file
Summary:
Add column headers and data for whether it's a compression or a decompression
run, compression level, nbDicts and dictAttachPref in additional to
compr/decompr speed.

Test Plan:
Example output:

```
./largeNbDicts
Compression/Decompression,Level,nbDicts,dictAttachPref,Speed
Compression,1,1,0,300.9
Compression,1,1,1,296.4
Compression,1,1,2,307.8
Compression,1,10,0,292.3
Compression,1,100,0,293.3
Compression,3,110,0,106.0
Decompression,-1,110,-1,155.6
Decompression,-1,110,-1,709.4
Decompression,-1,120,-1,709.1
Decompression,-1,120,-1,734.6
```
2022-07-19 16:50:28 -07:00
Han Zhu
d0c88afe6d [largeNbDicts] Fix decompression segfault in createCompressInstructions
Benchmarking decompression results in a segfault in `createCompressInstructions`
because `cctxParams` is NULL. Skip running that function if we are not benching
compression.
2022-07-19 13:55:52 -07:00
udayanbapat
43f21a600e
Intial commit to address 3090. Added support to decompress empty block. (#3118)
* Intial commit to address 3090. Added support to decompress empty block

* Update zstd_decompress_block.c

Addressed review comments for the case of 'set_basic'

* Update lib/decompress/zstd_decompress_block.c

Co-authored-by: Nick Terrell <nickrterrell@gmail.com>

* Update lib/decompress/zstd_decompress_block.c

Co-authored-by: Nick Terrell <nickrterrell@gmail.com>

Co-authored-by: Nick Terrell <nickrterrell@gmail.com>
2022-07-14 11:54:34 -07:00
Elliot Gorokhovsky
6d75b36b7f Clarify -B docstring 2022-07-14 00:22:21 -04:00
Miles HU
6b233d5d41 [T124890272] Mark 2 Obsolete Functions(ZSTD_copy*Ctx) Deprecated in Zstd
The discussion for this task is here: facebook/zstd#3128.

This task can probably be scoped to the first part: marking these functions deprecated.
We'll later look at removal when we roll out v1.6.0.
2022-07-13 11:00:05 -07:00
Miles HU
a5655e4017 Revert "T119975957"
This reverts commit 962746edff.
2022-07-12 11:17:25 -07:00
Miles HU
962746edff T119975957
Signed-off-by: Miles HU <yuanpu@fb.com>
2022-07-08 15:01:36 -07:00
Felix Handte
02ef78be58
Merge pull request #3184 from htnhan/features/list_verbose_to_show_dictionary_id
zstd -lv <file> to show dictID
2022-07-08 16:04:39 -04:00
htnhan
d7eb829af5 Detect multiple dictIDs in one file 2022-07-08 12:20:50 -05:00
htnhan
cc8c98485a zstd -lv <file> to show dictID 2022-07-05 21:28:33 -05:00
Elliot Gorokhovsky
3ef92cfcd4
Merge pull request #3180 from nocnokneo/MSVCBuildTests
Fix ZSTD_BUILD_TESTS=ON with MSVC
2022-07-05 13:13:34 -04:00
Taylor Braun-Jones
cd9d0a7e6e Fix ZSTD_BUILD_TESTS=ON build with MSVC
Fixes:

    Command line error D8021 : invalid numeric argument '/Wno-deprecated-declarations'
2022-06-30 13:20:42 -04:00
Elliot Gorokhovsky
5d2fb4288f
Merge pull request #3179 from embg/1.5.3_bump
Prepare v1.5.3
2022-06-29 13:03:52 -07:00
Elliot Gorokhovsky
bb3839a78c make -C programs zstd.1 2022-06-29 14:55:14 -04:00
Elliot Gorokhovsky
5c382bf110 1.5.3 version bump 2022-06-29 14:45:53 -04:00
Elliot Gorokhovsky
e9d6fc867a
Merge pull request #3177 from embg/dms_prefetch2
Add prefetchCDictTables CCtxParam (+10-20% cold dict compression speed)
2022-06-24 08:24:43 -07:00
Elliot Gorokhovsky
cb9e341129 Nits 2022-06-23 16:59:21 -04:00
Elliot Gorokhovsky
bb4a3c71ef
Update README.md for fuzzers (#3174)
* Update README.md for fuzzers

* Add ls corpora/*crash command

* nit

* Clarify wording and add Nick's command

* Minor clarification
2022-06-22 21:02:07 -04:00
Elliot Gorokhovsky
747e06f4f6 Add tests 2022-06-22 17:05:23 -04:00