15 Aug, 2020
1 commit
-
This patch replaces all memcpy() calls with LZ4_memcpy() which calls
__builtin_memcpy() so the compiler can inline it.

LZ4 relies heavily on memcpy() with a constant size being inlined. In x86
and i386 pre-boot environments memcpy() cannot be inlined because memcpy()
doesn't get defined as __builtin_memcpy().

An equivalent patch has been applied upstream so that the next import
won't lose this change [1].

I've measured the kernel decompression speed using QEMU before and after
this patch for the x86_64 and i386 architectures. The speed-up is about
10x as shown below.

Code   Arch    Kernel Size  Time    Speed
v5.8   x86_64  11504832 B   148 ms   79 MB/s
patch  x86_64  11503872 B    13 ms  885 MB/s
v5.8   i386     9621216 B    91 ms  106 MB/s
patch  i386     9620224 B    10 ms  962 MB/s

I also measured the time to decompress the initramfs on x86_64, i386, and
arm. All three show the same decompression speed before and after, as
expected.

[1] https://github.com/lz4/lz4/pull/890
Signed-off-by: Nick Terrell
Signed-off-by: Andrew Morton
Cc: Yann Collet
Cc: Gao Xiang
Cc: Sven Schmidt
Cc: Greg Kroah-Hartman
Cc: Ingo Molnar
Cc: Arvind Sankar
Link: http://lkml.kernel.org/r/20200803194022.2966806-1-nickrterrell@gmail.com
Signed-off-by: Linus Torvalds
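
The inlining argument in the commit above can be sketched in a few lines of
C. This is a minimal sketch, assuming a GCC-compatible compiler; the
LZ4_memcpy name comes from the commit message, while copy8() is a
hypothetical helper added only for illustration, and the exact kernel
definition may differ.

```c
#include <string.h>

/*
 * Route copies through the compiler builtin when it is available, so that
 * a constant-size copy can be expanded inline even in environments where
 * memcpy() is an ordinary out-of-line function.
 */
#if defined(__GNUC__)
#define LZ4_memcpy(dst, src, size) __builtin_memcpy(dst, src, size)
#else
#define LZ4_memcpy(dst, src, size) memcpy(dst, src, size)
#endif

/* Hypothetical helper: copy exactly 8 bytes, a size known at compile time. */
static void copy8(void *dst, const void *src)
{
	LZ4_memcpy(dst, src, 8);
}
```

With a constant size such as 8, the compiler is free to replace the call
with a couple of load/store instructions, which is the effect the commit
relies on.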
11 Jun, 2020
1 commit
-
This operation was intentional, but tools such as smatch will warn that it
might not have been.

Signed-off-by: Andrew Morton
Cc: Yann Collet
Cc: Vasily Averin
Cc: Gao Xiang
Link: http://lkml.kernel.org/r/3bf931c6ea0cae3e23f3485801986859851b4f04.camel@perches.com
Signed-off-by: Linus Torvalds
21 Sep, 2019
1 commit
-
Kbuild now complains (rightly) about it.
Signed-off-by: Linus Torvalds
21 May, 2019
1 commit
-
Add SPDX license identifiers to all Make/Kconfig files which:

 - Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only
Signed-off-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
31 Oct, 2018
1 commit
-
Update the LZ4 compression module based on LZ4 v1.8.3 in order for the
erofs file system to use the newest LZ4_decompress_safe_partial() which
can now decode exactly the number of bytes requested [1] to take the place
of the open hacked code in the erofs file system itself.

Currently, apart from the erofs file system, no other users use
LZ4_decompress_safe_partial, so no worry about the interface.

In addition, LZ4 v1.8.x boosts up decompression speed compared to the
current code which is based on LZ4 v1.7.3, mainly due to shortcut
optimization for the specific common LZ4-sequences [2].

lzbench testdata (tested in kirin710, 8 cores, 4 big cores
at 2189MHz, 2GB DDR RAM at 1622MHz, with enwik8 testdata [3]):

Compressor name  Compress.  Decompress.  Compr. size  Ratio   Filename
memcpy           5004 MB/s  4924 MB/s    100000000    100.00  enwik8
lz4hc 1.7.3 -9     12 MB/s   653 MB/s     42203253     42.20  enwik8
lz4hc 1.8.0 -9     12 MB/s   908 MB/s     42203096     42.20  enwik8
lz4hc 1.8.3 -9     11 MB/s   965 MB/s     42203094     42.20  enwik8

[1] https://github.com/lz4/lz4/issues/566
    https://github.com/lz4/lz4/commit/08d347b5b217b011ff7487130b79480d8cfdaeb8

[2] v1.8.1 perf: slightly faster compression and decompression speed
    https://github.com/lz4/lz4/commit/a31b7058cb97e4393da55e78a77a1c6f0c9ae038
    v1.8.2 perf: slightly faster HC compression and decompression speed
    https://github.com/lz4/lz4/commit/45f8603aae389d34c689d3ff7427b314071ccd2c
    https://github.com/lz4/lz4/commit/1a191b3f8d26b50a7c1d41590b529ec308d768cd

[3] http://mattmahoney.net/dc/textdata.html
    http://mattmahoney.net/dc/enwik8.zip

Link: http://lkml.kernel.org/r/1537181207-21932-1-git-send-email-gaoxiang25@huawei.com
Signed-off-by: Gao Xiang
Tested-by: Guo Xuenan
Cc: Colin Ian King
Cc: Yann Collet
Cc: Greg Kroah-Hartman
Cc: Fang Wei
Cc: Chao Yu
Cc: Miao Xie
Cc: Sven Schmidt
Cc: Kyungsik Lee
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
04 Oct, 2017
1 commit
-
Don't populate the read-only arrays dec32table and dec64table on the
stack, instead make them both static const. Makes the object code
smaller by over 10K bytes:

Before:
   text    data     bss     dec     hex filename
  31500       0       0   31500    7b0c lib/lz4/lz4_decompress.o

After:
   text    data     bss     dec     hex filename
  20237     176       0   20413    4fbd lib/lz4/lz4_decompress.o

(gcc version 7.2.0 x86_64)
Link: http://lkml.kernel.org/r/20170921221939.20820-1-colin.king@canonical.com
Signed-off-by: Colin Ian King
Cc: Christophe JAILLET
Cc: Sven Schmidt
Cc: Arnd Bergmann
Cc: Joe Perches
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
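
The difference between the two forms described in the commit above can be
sketched as follows. This is only an illustration: the function names are
hypothetical and the table values are placeholders, not necessarily those
used in lib/lz4/lz4_decompress.c.

```c
#include <stddef.h>

/*
 * Variant 1: a plain local array. Depending on the compiler, its contents
 * may be written to the stack again on every call.
 */
static int lookup_on_stack(size_t i)
{
	const int table[8] = { 0, 1, 2, 1, 4, 4, 4, 4 };

	return table[i & 7];
}

/*
 * Variant 2: static const data. The table is emitted once into read-only
 * storage and no per-call setup code is needed.
 */
static int lookup_static(size_t i)
{
	static const int table[8] = { 0, 1, 2, 1, 4, 4, 4, 4 };

	return table[i & 7];
}
```

Both functions return the same values; only where the table lives changes,
which is what moves bytes from text to data in the size output above.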
25 Feb, 2017
2 commits
-
Remove the functions introduced as wrappers for providing backwards
compatibility to the prior LZ4 version. They're not needed anymore
since there are no callers left.

Link: http://lkml.kernel.org/r/1486321748-19085-6-git-send-email-4sschmid@informatik.uni-hamburg.de
Signed-off-by: Sven Schmidt
Cc: Bongkyu Kim
Cc: Rui Salvaterra
Cc: Sergey Senozhatsky
Cc: Greg Kroah-Hartman
Cc: Herbert Xu
Cc: David S. Miller
Cc: Anton Vorontsov
Cc: Colin Cross
Cc: Kees Cook
Cc: Tony Luck
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Patch series "Update LZ4 compressor module", v7.

This patchset updates the LZ4 compression module to a version based on
LZ4 v1.7.3, which makes it possible to use the fast compression algorithm
aka LZ4 fast. LZ4 fast provides an "acceleration" parameter as a tradeoff
between high compression ratio and high compression speed.

We want to use LZ4 fast in order to support compression in lustre and
(mostly, based on that) investigate data reduction techniques on behalf
of storage systems.

Also, it will be useful for other users of LZ4 compression, as with LZ4
fast it is possible to enable applications to use fast and/or high
compression depending on the use case. For instance, ZRAM is offering a
LZ4 backend and could benefit from an updated LZ4 in the kernel.

LZ4 homepage: http://www.lz4.org/
LZ4 source repository: https://github.com/lz4/lz4
Source version: 1.7.3

Benchmark (taken from [1], Core i5-4300U @1.9GHz):
----------------|--------------|----------------|----------
Compressor      | Compression  | Decompression  | Ratio
----------------|--------------|----------------|----------
memcpy          |  4200 MB/s   |  4200 MB/s     | 1.000
LZ4 fast 50     |  1080 MB/s   |  2650 MB/s     | 1.375
LZ4 fast 17     |   680 MB/s   |  2220 MB/s     | 1.607
LZ4 fast 5      |   475 MB/s   |  1920 MB/s     | 1.886
LZ4 default     |   385 MB/s   |  1850 MB/s     | 2.101

[1] http://fastcompression.blogspot.de/2015/04/sampling-or-faster-lz4.html

[PATCH 1/5] lib: Update LZ4 compressor module
[PATCH 2/5] lib/decompress_unlz4: Change module to work with new LZ4 module version
[PATCH 3/5] crypto: Change LZ4 modules to work with new LZ4 module version
[PATCH 4/5] fs/pstore: fs/squashfs: Change usage of LZ4 to work with new LZ4 version
[PATCH 5/5] lib/lz4: Remove back-compat wrappers

This patch (of 5):
Update the LZ4 kernel module to LZ4 v1.7.3 by Yann Collet. The kernel
module is inspired by the previous work by Chanho Min. The updated LZ4
module will not break existing code since the patchset contains
appropriate changes.

API changes:

New method LZ4_compress_fast which differs from the variant available in
kernel by the new acceleration parameter, which makes it possible to trade
compression ratio for more compression speed and vice versa.

LZ4_decompress_fast is the respective decompression method, featuring a
very fast decoder (multiple GB/s per core), able to reach RAM speed in
multi-core systems. The decompressor can decompress data compressed with
LZ4 fast as well as with the LZ4 HC (high compression) algorithm.

Also the useful functions LZ4_decompress_safe_partial and
LZ4_compress_destsize were added. The latter reverses the logic by
trying to compress as much data as possible from source to dest while
the former aims to decompress partial blocks of data.

A bunch of streaming functions were also added which allow
compressing/decompressing data in multiple steps (so called "streaming
mode").

The methods lz4_compress and lz4_decompress_unknownoutputsize are now
known as LZ4_compress_default and LZ4_decompress_safe, respectively. The
old methods will be removed since there are no callers left in the code.

[arnd@arndb.de: fix KERNEL_LZ4 support]
Link: http://lkml.kernel.org/r/20170208211946.2839649-1-arnd@arndb.de
[akpm@linux-foundation.org: simplify]
[akpm@linux-foundation.org: fix the simplification]
[4sschmid@informatik.uni-hamburg.de: fix performance regressions]
Link: http://lkml.kernel.org/r/1486898178-17125-2-git-send-email-4sschmid@informatik.uni-hamburg.de
[4sschmid@informatik.uni-hamburg.de: v8]
Link: http://lkml.kernel.org/r/1487182598-15351-2-git-send-email-4sschmid@informatik.uni-hamburg.de
Link: http://lkml.kernel.org/r/1486321748-19085-2-git-send-email-4sschmid@informatik.uni-hamburg.de
Signed-off-by: Sven Schmidt
Signed-off-by: Arnd Bergmann
Cc: Bongkyu Kim
Cc: Rui Salvaterra
Cc: Sergey Senozhatsky
Cc: Greg Kroah-Hartman
Cc: Herbert Xu
Cc: David S. Miller
Cc: Anton Vorontsov
Cc: Colin Cross
Cc: Kees Cook
Cc: Tony Luck
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
14 Apr, 2016
2 commits
-
These identifiers are bogus. The interested architectures should define
HAVE_EFFICIENT_UNALIGNED_ACCESS whenever relevant to do so. If this
isn't true for some arch, it should be fixed in the arch definition.

Signed-off-by: Rui Salvaterra
Reviewed-by: Sergey Senozhatsky
Signed-off-by: Greg Kroah-Hartman
-
Based on Sergey's test patch [1], this fixes zram with lz4 compression
on big endian cpus.

Note that the 64-bit preprocessor test is not a cleanup, it's part of
the fix, since those identifiers are bogus (for example, __ppc64__
isn't defined anywhere else in the kernel, which means we'd fall into
the 32-bit definitions on ppc64).

Tested on ppc64 with no regression on x86_64.

[1] http://marc.info/?l=linux-kernel&m=145994470805853&w=4
Cc: stable@vger.kernel.org
Suggested-by: Sergey Senozhatsky
Signed-off-by: Rui Salvaterra
Reviewed-by: Sergey Senozhatsky
Signed-off-by: Greg Kroah-Hartman
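
For background on why byte order matters here: the LZ4 block format stores
each match offset as a 16-bit little-endian value, so the decoder has to
produce the same number on little and big endian hosts. One portable way to
do that is sketched below. read_le16() is a hypothetical helper for
illustration and is not claimed to be the form used in the kernel fix.

```c
#include <stdint.h>

/*
 * Assemble a 16-bit little-endian value byte by byte. Because no wider
 * load is performed, the result does not depend on host endianness or on
 * the alignment of p.
 */
static uint16_t read_le16(const unsigned char *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}
```

For the byte sequence 0x34 0x12 this yields 0x1234 on any host.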
25 May, 2015
1 commit
-
Sometimes, on x86_64, decompression fails with the following
error:

    Decompressing Linux...

    Decoding failed

     -- System halted

This condition is not needed for a 64bit kernel (from commit d5e7caf):

    if( ... ||
        (op + COPYLENGTH) > oend)
        goto _output_error

macro LZ4_SECURE_COPY() tests op and does not copy any data
when op exceeds the value.

added by analogy to lz4_uncompress_unknownoutputsize(...)
Signed-off-by: Krzysztof Kolasa
Tested-by: Alexander Kuleshov
Tested-by: Caleb Jorden
Signed-off-by: Greg Kroah-Hartman
25 Mar, 2015
1 commit
-
There's no reason to allocate the dec{32,64}table on the stack; it
just wastes a bunch of instructions setting them up and, of course,
also consumes quite a bit of stack. Using size_t for such small
integers is a little excessive.

$ scripts/bloat-o-meter /tmp/built-in.o lib/built-in.o
add/remove: 2/2 grow/shrink: 2/0 up/down: 1304/-1548 (-244)
function                                     old     new   delta
lz4_decompress_unknownoutputsize              55     718    +663
lz4_decompress                                55     632    +577
dec64table                                     -      32     +32
dec32table                                     -      32     +32
lz4_uncompress                               747       -    -747
lz4_uncompress_unknownoutputsize             801       -    -801

The now inlined lz4_uncompress functions used to have a stack
footprint of 176 bytes (according to -fstack-usage); their inlinees
have increased their stack use from 32 bytes to 48 and 80 bytes,
respectively.

Signed-off-by: Rasmus Villemoes
Signed-off-by: Greg Kroah-Hartman
17 Mar, 2015
1 commit
-
If part of the compressed data is corrupted, or the compressed data is
entirely fake, memory accesses beyond the limit are possible.

This is the log from my system using lz4 decompression.
   [6502]data abort, halting
   [6503]r0  0x00000000 r1  0x00000000 r2  0xdcea0ffc r3  0xdcea0ffc
   [6509]r4  0xb9ab0bfd r5  0xdcea0ffc r6  0xdcea0ff8 r7  0xdce80000
   [6515]r8  0x00000000 r9  0x00000000 r10 0x00000000 r11 0xb9a98000
   [6522]r12 0xdcea1000 usp 0x00000000 ulr 0x00000000 pc  0x820149bc
   [6528]spsr 0x400001f3
and the memory addresses of some variables at the moment are
    ref:0xdcea0ffc, op:0xdcea0ffc, oend:0xdcea1000

As you can see, COPYLENGTH is 8 bytes, so @ref and @op can access the
memory over @oend.

Signed-off-by: JeHyeon Yeon
Reviewed-by: David Sterba
Signed-off-by: Greg Kroah-Hartman
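
The overrun in the log above can be reproduced with plain address
arithmetic. The sketch below is an illustration only: COPYLENGTH and the
addresses come from the commit message, while wildcopy_fits() is a
hypothetical helper and not the check that was added to the kernel.

```c
#include <stdint.h>

#define COPYLENGTH 8

/*
 * Return 1 when a COPYLENGTH-byte store starting at op ends at or before
 * oend, and 0 when it would run past the end of the output buffer.
 * The subtraction form avoids computing op + COPYLENGTH, which could
 * itself wrap around.
 */
static int wildcopy_fits(uintptr_t op, uintptr_t oend)
{
	return op <= oend && oend - op >= COPYLENGTH;
}
```

With op at 0xdcea0ffc and oend at 0xdcea1000 only 4 bytes remain, so an
8-byte copy does not fit and the function returns 0.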
04 Jul, 2014
1 commit
-
Jan points out that I forgot to make the needed fixes to the
lz4_uncompress_unknownoutputsize() function to mirror the changes done
in lz4_decompress() with regards to potential pointer overflows.

The only in-kernel user of this function is the zram code, which only
takes data from a valid compressed buffer that it made itself, so it's
not a big issue. But due to external kernel modules using this
function, it's better to be safe here.

Reported-by: Jan Beulich
Cc: "Don A. Bailey"
Cc: stable
Signed-off-by: Greg Kroah-Hartman
28 Jun, 2014
1 commit
-
There is one other possible overrun in the lz4 code as implemented by
Linux at this point in time (which differs from the upstream lz4
codebase, but will get synced in a future kernel release.) As
pointed out by Don, we also need to check the overflow in the data
itself.

While we are at it, replace the odd error return value with just a
"simple" -1 value as the return value is never used for anything other
than a basic "did this work or not" check.

Reported-by: "Don A. Bailey"
Reported-by: Willy Tarreau
Cc: stable
Signed-off-by: Greg Kroah-Hartman
24 Jun, 2014
1 commit
-
Given some pathologically compressed data, lz4 could possibly decide to
wrap a few internal variables, causing unknown things to happen. Catch
this before the wrapping happens and abort the decompression.

Reported-by: "Don A. Bailey"
Cc: stable
Signed-off-by: Greg Kroah-Hartman
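
Catching a wrap "before the wrapping happens" generally means testing the
remaining headroom instead of performing the addition first. A minimal
sketch of that idea follows; advance_wraps() is a hypothetical helper and
the kernel fix may express the check differently.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return 1 when advancing the address ptr by length bytes would wrap
 * around the address space. The comparison is done against the remaining
 * headroom, so no overflowing addition is ever evaluated.
 */
static int advance_wraps(uintptr_t ptr, size_t length)
{
	return length > UINTPTR_MAX - ptr;
}
```

A decoder can call such a check on every length it reads from the
compressed stream and abort as soon as it returns 1.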
12 Sep, 2013
1 commit
-
LZ4 compression and decompression functions require input/output
parameters of different signedness: unsigned char for compression and
signed char for decompression.

Change decompression API to require "(const) unsigned char *".
Signed-off-by: Sergey Senozhatsky
Cc: Kyungsik Lee
Cc: Geert Uytterhoeven
Cc: Yann Collet
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
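
As a side note on why unsigned bytes are the natural type for compressed
data: each LZ4 sequence starts with a token byte whose high four bits and
low four bits are read as two separate small integers. With unsigned char
the shift and mask below are well defined for all 256 byte values. The
helper names are hypothetical and serve only as an illustration.

```c
/* High nibble of the token byte (literal length field). */
static unsigned token_literal_bits(const unsigned char *ip)
{
	return (unsigned)(*ip >> 4);
}

/* Low nibble of the token byte (match length field). */
static unsigned token_match_bits(const unsigned char *ip)
{
	return (unsigned)(*ip & 0x0F);
}
```

For a token of 0xF3 the two fields are 15 and 3.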
24 Aug, 2013
1 commit
-
The LZ4 code is listed as using the "BSD 2-Clause License".
Signed-off-by: Richard Laager
Acked-by: Kyungsik Lee
Cc: Chanho Min
Cc: Richard Yao
Signed-off-by: Andrew Morton
[ The 2-clause BSD can be just converted into GPL, but that's rude and
pointless, so don't do it - Linus ]
Signed-off-by: Linus Torvalds
10 Jul, 2013
3 commits
-
This patchset is for supporting LZ4 compression and the crypto API using
it.

As shown below, the size of data is a little bit bigger but compressing
speed is faster under the enabled unaligned memory access. We can use
lz4 de/compression through crypto API as well. Also, it will be useful
for another potential user of lz4 compression.

lz4 Compression Benchmark:
Compiler: ARM gcc 4.6.4
ARMv7, 1 GHz based board
   Kernel: linux 3.4
   Uncompressed data Size: 101 MB
          Compressed Size  Compression Speed
   LZO    72.1MB           32.1MB/s, 33.0MB/s(UA)
   LZ4    75.1MB           30.4MB/s, 35.9MB/s(UA)
   LZ4HC  59.8MB            2.4MB/s,  2.5MB/s(UA)
- UA: Unaligned memory Access support
- Latest patch set for LZO applied

This patch:

Add support for LZ4 compression in the Linux Kernel. LZ4 Compression APIs
for kernel are based on LZ4 implementation by Yann Collet and were changed
for kernel coding style.

LZ4 homepage : http://fastcompression.blogspot.com/p/lz4.html
LZ4 source repository : http://code.google.com/p/lz4/
svn revision : r90

Two APIs are added:

lz4_compress() supports basic lz4 compression, whereas lz4hc_compress()
supports high compression: CPU performance gets lower but the compression
ratio gets higher. Also, we require the pre-allocated working memory with
the defined size, and the destination buffer must be allocated with the
size of lz4_compressbound.

[akpm@linux-foundation.org: make lz4_compresshcctx() static]
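
The destination buffer sizing mentioned above can be sketched as follows.
This is a minimal sketch, assuming the bound has the same form as upstream
LZ4's LZ4_COMPRESSBOUND macro (input size plus one byte per 255 input
bytes plus 16); compress_bound_sketch() is a hypothetical name, and the
kernel header should be consulted for the authoritative definition.

```c
#include <stddef.h>

/*
 * Assumed worst-case output size for incompressible input: the input
 * itself, one extra length byte per 255 input bytes, and a small constant.
 */
static size_t compress_bound_sketch(size_t isize)
{
	return isize + (isize / 255) + 16;
}
```

Under this assumption a 4096-byte page needs a destination buffer of
4096 + 16 + 16 = 4128 bytes.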
Signed-off-by: Chanho Min
Cc: "Darrick J. Wong"
Cc: Bob Pearson
Cc: Richard Weinberger
Cc: Herbert Xu
Cc: Yann Collet
Cc: Kyungsik Lee
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Add support for extracting LZ4-compressed kernel images, as well as
LZ4-compressed ramdisk images in the kernel boot process.

Signed-off-by: Kyungsik Lee
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Russell King
Cc: Borislav Petkov
Cc: Florian Fainelli
Cc: Yann Collet
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
-
Add support for LZ4 decompression in the Linux Kernel. LZ4 Decompression
APIs for kernel are based on LZ4 implementation by Yann Collet.

Benchmark Results(PATCH v3)
Compiler: Linaro ARM gcc 4.6.2

1. ARMv7, 1.5GHz based board
   Kernel: linux 3.4
   Uncompressed Kernel Size: 14MB
        Compressed Size  Decompression Speed
   LZO  6.7MB            20.1MB/s, 25.2MB/s(UA)
   LZ4  7.3MB            29.1MB/s, 45.6MB/s(UA)

2. ARMv7, 1.7GHz based board
   Kernel: linux 3.7
   Uncompressed Kernel Size: 14MB
        Compressed Size  Decompression Speed
   LZO  6.0MB            34.1MB/s, 52.2MB/s(UA)
   LZ4  6.5MB            86.7MB/s
- UA: Unaligned memory Access support
- Latest patch set for LZO applied

This patch set is for adding support for LZ4-compressed Kernel. LZ4 is a
very fast lossless compression algorithm and it also features an extremely
fast decoder [1].

But we already have five decompressors, and one question which does
arise, however, is that of where do we stop adding new ones? This issue
has been discussed and a conclusion was reached [2].

Russell King said that we should have:

 - one decompressor which is the fastest
 - one decompressor for the highest compression ratio
 - one popular decompressor (eg conventional gzip)

If we have a replacement one for one of these, then it should do exactly
that: replace it.

The benchmark shows an 8% increase in image size vs a 66% increase
in decompression speed compared to LZO (which has been known as the
fastest decompressor in the Kernel). Therefore the "fast but may not be
small" compression title has clearly been taken by LZ4 [3].

[1] http://code.google.com/p/lz4/
[2] http://thread.gmane.org/gmane.linux.kbuild.devel/9157
[3] http://thread.gmane.org/gmane.linux.kbuild.devel/9347

LZ4 homepage: http://fastcompression.blogspot.com/p/lz4.html
LZ4 source repository: http://code.google.com/p/lz4/

Signed-off-by: Kyungsik Lee
Signed-off-by: Yann Collet
Cc: "H. Peter Anvin"
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Russell King
Cc: Borislav Petkov
Cc: Florian Fainelli
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
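
The 8% and 66% figures quoted in the commit above appear to correspond to
the 1.7GHz board results (LZO 6.0MB at 52.2MB/s with unaligned access,
LZ4 6.5MB at 86.7MB/s). The arithmetic can be checked with a small helper;
percent_increase() is a hypothetical name used only for this illustration.

```c
/* Relative increase of value over base, expressed in percent. */
static double percent_increase(double base, double value)
{
	return (value - base) / base * 100.0;
}
```

percent_increase(6.0, 6.5) gives roughly 8.3 for the image size, and
percent_increase(52.2, 86.7) gives roughly 66.1 for the decompression
speed.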