commit | 5603d30f81c99fbe55f48d953e3df728720ff982 | |
---|---|---|
author | Alexey Tourbin <alexey.tourbin@gmail.com> | Thu Apr 26 07:46:26 2018 +0300 |
committer | Alexey Tourbin <alexey.tourbin@gmail.com> | Thu Apr 26 08:23:54 2018 +0300 |
tree | a0c27174b8e5e9661f760ae86879c9f71790faf3 | |
parent | b4eda8d08f307dfc3cbd0d06baea4e6c581b70de | |
lz4.c: fixed the LZ4_decompress_safe_continue case

The previous change broke decoding with a ring buffer. That's because I didn't realize that the "double dictionary mode" was possible, i.e. that the decoding routine can look both at the first part of the dictionary passed as prefix and the second part passed via dictStart+dictSize.

So this change introduces the LZ4_decompress_safe_doubleDict helper, which handles this "double dictionary" situation. (This is a bit of a misnomer, there is only one dictionary, but I can't think of a better name, and perhaps the designation is not all too bad.) The helper is used only once, in LZ4_decompress_safe_continue, it should be inlined with LZ4_FORCE_O2_GCC_PPC64LE attached to LZ4_decompress_safe_continue.

(Also, in the helper functions, I change the dictStart parameter type to "const void*", to avoid a cast when calling helpers. In the helpers, the upcast to "BYTE*" is still required, for compatibility with C++.)

So this fixes the case of LZ4_decompress_safe_continue, and I'm surprised by the fact that the fuzzer is now happy and does not detect a similar problem with LZ4_decompress_fast_continue. So before fixing LZ4_decompress_fast_continue, the next logical step is to enhance the fuzzer.
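The "double dictionary" situation described in the commit message can be illustrated with a small sketch. This is not the code of LZ4_decompress_safe_doubleDict; the function name `sketch_copy_match` and its parameters are hypothetical and chosen only for illustration. It assumes the history consists of an external segment (`dictStart`, `dictSize`) that logically precedes a prefix starting at `lowPrefix` and ending at the current output position `op`, and it copies a match whose source may start in the external segment and continue into the prefix.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of lz4.c: copies `length` bytes to `op`
 * from `offset` bytes back in the logical history. The history is the
 * external segment [dictStart, dictStart + dictSize) followed by the
 * prefix [lowPrefix, op). Returns 0 on success, -1 on an invalid offset. */
int sketch_copy_match(uint8_t *op, const uint8_t *lowPrefix,
                      const uint8_t *dictStart, size_t dictSize,
                      size_t offset, size_t length)
{
    size_t prefixLen = (size_t)(op - lowPrefix);

    /* The match must start inside the combined history. */
    if (offset == 0 || offset > prefixLen + dictSize) return -1;

    for (size_t i = 0; i < length; i++) {
        /* Bytes present in the prefix so far, including those just copied. */
        size_t produced = prefixLen + i;
        if (offset > produced) {
            /* Source byte still lies in the external segment. */
            op[i] = dictStart[dictSize - (offset - produced)];
        } else {
            /* Source byte lies in the prefix (possibly overlapping op). */
            op[i] = lowPrefix[produced - offset];
        }
    }
    return 0;
}
```

A byte-by-byte loop is used here for clarity; a production decoder would typically split the copy into bulk copies for each segment.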
LZ4 is a lossless compression algorithm, providing compression speed at 400 MB/s per core, scalable with multi-core CPUs. It features an extremely fast decoder, with speed in multiple GB/s per core, typically reaching RAM speed limits on multi-core systems.
Speed can be tuned dynamically by selecting an “acceleration” factor, which trades compression ratio for more speed. On the other end, a high compression derivative, LZ4_HC, is also provided, trading CPU time for improved compression ratio. All versions feature the same decompression speed.
The LZ4 library is provided as open-source software under a BSD 2-Clause license.
Branch | Status |
---|---|
master | |
dev | |
Branch Policy:
- The “master” branch is considered stable, at all times.
- The “dev” branch is the one where all contributions must be merged before being promoted to master.
- If you plan to propose a patch, please commit into the “dev” branch, or its own feature branch. Direct commits to “master” are not permitted.
The benchmark uses lzbench, from @inikep, compiled with GCC v6.2.0 on 64-bit Linux. The reference system uses a Core i7-3930K CPU @ 4.5GHz. The benchmark evaluates compression of the reference Silesia Corpus in single-thread mode.
Compressor | Ratio | Compression | Decompression |
---|---|---|---|
memcpy | 1.000 | 7300 MB/s | 7300 MB/s |
LZ4 fast 8 (v1.7.3) | 1.799 | 911 MB/s | 3360 MB/s |
LZ4 default (v1.7.3) | 2.101 | 625 MB/s | 3220 MB/s |
LZO 2.09 | 2.108 | 620 MB/s | 845 MB/s |
QuickLZ 1.5.0 | 2.238 | 510 MB/s | 600 MB/s |
Snappy 1.1.3 | 2.091 | 450 MB/s | 1550 MB/s |
LZF v3.6 | 2.073 | 365 MB/s | 820 MB/s |
Zstandard 1.1.1 -1 | 2.876 | 330 MB/s | 930 MB/s |
Zstandard 1.1.1 -3 | 3.164 | 200 MB/s | 810 MB/s |
zlib deflate 1.2.8 -1 | 2.730 | 100 MB/s | 370 MB/s |
LZ4 HC -9 (v1.7.3) | 2.720 | 34 MB/s | 3240 MB/s |
zlib deflate 1.2.8 -6 | 3.099 | 33 MB/s | 390 MB/s |
LZ4 is also compatible with and well optimized for x32 mode, in which it gains roughly 10% additional speed.
    make
    make install     # this command may require root access

LZ4's `Makefile` supports standard Makefile conventions, including staged installs, redirection, or command redefinition. It is compatible with parallel builds (`-j#`).
The raw LZ4 block compression format is detailed within lz4_Block_format.
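As an illustration of the block format, here is a minimal decoder sketch. It is not the reference implementation in lz4.c; the function name `sketch_lz4_block_decode` is hypothetical, and the code favors readability over speed. It assumes the layout of a sequence as: a token byte (literal length in the high nibble, match length in the low nibble), optional length extension bytes, the literals, a 2-byte little-endian offset, and optional match length extension bytes, with the last sequence ending after its literals.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the LZ4 API): decodes one raw LZ4 block.
 * Returns the number of bytes written to dst, or -1 on malformed input
 * or insufficient output capacity. */
int sketch_lz4_block_decode(const uint8_t *src, size_t srcSize,
                            uint8_t *dst, size_t dstCapacity)
{
    size_t ip = 0, op = 0;

    while (ip < srcSize) {
        uint8_t token = src[ip++];

        /* Literal length: high nibble, extended while bytes equal 255. */
        size_t litLen = token >> 4;
        if (litLen == 15) {
            uint8_t b;
            do {
                if (ip >= srcSize) return -1;
                b = src[ip++];
                litLen += b;
            } while (b == 255);
        }
        if (litLen > srcSize - ip || litLen > dstCapacity - op) return -1;
        for (size_t i = 0; i < litLen; i++) dst[op++] = src[ip++];

        /* The last sequence stops right after its literals. */
        if (ip == srcSize) break;

        /* Match offset: 2 bytes, little-endian; 0 is invalid. */
        if (srcSize - ip < 2) return -1;
        size_t offset = (size_t)src[ip] | ((size_t)src[ip + 1] << 8);
        ip += 2;
        if (offset == 0 || offset > op) return -1;

        /* Match length: low nibble, same extension scheme, plus 4. */
        size_t matchLen = token & 15;
        if (matchLen == 15) {
            uint8_t b;
            do {
                if (ip >= srcSize) return -1;
                b = src[ip++];
                matchLen += b;
            } while (b == 255);
        }
        matchLen += 4;
        if (matchLen > dstCapacity - op) return -1;

        /* Byte-by-byte copy so overlapping matches (offset < length) repeat. */
        for (size_t i = 0; i < matchLen; i++, op++) dst[op] = dst[op - offset];
    }
    return (int)op;
}
```

For example, the 10-byte block `10 61 01 00 50 62 63 64 65 66` holds one literal `a`, a 4-byte match at offset 1, and five trailing literals `bcdef`. This sketch only resolves matches inside the current output buffer; it does not handle dictionaries or the end-of-block restrictions an encoder must respect.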
To compress an arbitrarily long file or data stream, multiple blocks are required. Organizing these blocks and providing a common header format to handle their content is the purpose of the Frame format, defined in lz4_Frame_format. Interoperable versions of LZ4 must respect this frame format.
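To make the idea of a common header concrete, the following sketch reads the start of a frame: the 4-byte little-endian magic number `0x184D2204`, then the FLG and BD descriptor bytes. It is not the library's frame API; the names `SketchFrameInfo` and `sketch_lz4_frame_peek` are hypothetical. As simplifying assumptions, it does not verify the header checksum byte, does not check reserved bits, and does not read the optional content size or dictionary ID fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure: fields decoded from the FLG and BD bytes. */
typedef struct {
    int    version;            /* FLG bits 7-6, expected to be 1 */
    int    blockIndependence;  /* FLG bit 5 */
    int    blockChecksum;      /* FLG bit 4 */
    int    hasContentSize;     /* FLG bit 3 */
    int    contentChecksum;    /* FLG bit 2 */
    int    hasDictID;          /* FLG bit 0 */
    size_t blockMaxSize;       /* from BD bits 6-4 */
} SketchFrameInfo;

/* Hypothetical helper: returns 0 and fills *info if src begins with an
 * LZ4 frame header, -1 otherwise. The header checksum is NOT verified. */
int sketch_lz4_frame_peek(const uint8_t *src, size_t srcSize,
                          SketchFrameInfo *info)
{
    /* Block maximum size table for BD values 4..7. */
    static const size_t blockSizes[4] = {
        (size_t)64 << 10, (size_t)256 << 10, (size_t)1 << 20, (size_t)4 << 20
    };

    if (srcSize < 6) return -1;  /* magic (4) + FLG + BD */

    uint32_t magic = (uint32_t)src[0] | ((uint32_t)src[1] << 8) |
                     ((uint32_t)src[2] << 16) | ((uint32_t)src[3] << 24);
    if (magic != 0x184D2204u) return -1;

    uint8_t flg = src[4];
    uint8_t bd  = src[5];

    info->version = flg >> 6;
    if (info->version != 1) return -1;
    info->blockIndependence = (flg >> 5) & 1;
    info->blockChecksum     = (flg >> 4) & 1;
    info->hasContentSize    = (flg >> 3) & 1;
    info->contentChecksum   = (flg >> 2) & 1;
    info->hasDictID         = flg & 1;

    unsigned sizeId = (bd >> 4) & 7;
    if (sizeId < 4) return -1;   /* values below 4 are not valid sizes */
    info->blockMaxSize = blockSizes[sizeId - 4];
    return 0;
}
```

Given the header bytes `04 22 4D 18 64 40 A7`, the sketch reports independent blocks, a content checksum, and a 64 KB maximum block size. A complete reader would go on to validate the header checksum and then iterate over the blocks.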
Beyond the C reference source, many contributors have created versions of lz4 in multiple languages (Java, C#, Python, Perl, Ruby, etc.). A list of known source ports is maintained on the LZ4 Homepage.