20 Jun, 2020

1 commit

  • There are several files for which I was unable to find a proper
    place, and three that are still in plain old text format.

    Let's sweep that stuff under the carpet, as we'd like to keep the
    root directory clean.

    We can discuss them later and move them to better places.

    Signed-off-by: Mauro Carvalho Chehab
    Link: https://lore.kernel.org/r/11bd0d75e65a874f7c276a0aeab0fe13f3376f5f.1592203650.git.mchehab+huawei@kernel.org
    Signed-off-by: Jonathan Corbet

    Mauro Carvalho Chehab
     

12 Jun, 2020

1 commit

  • In some rare cases, for input data over 32 KB, lzo-rle could encode two
    different inputs to the same compressed representation, so that
    decompression is then ambiguous (i.e. data may be corrupted - although
    zram is not affected because it operates over 4 KB pages).

    This modifies the compressor without changing the decompressor or the
    bitstream format, such that:

    - there is no change to how data produced by the old compressor is
    decompressed

    - an old decompressor will correctly decode data from the updated
    compressor

    - performance and compression ratio are not affected

    - we avoid introducing a new bitstream format

    In testing over 12.8M real-world files totalling 903 GB, three files
    were affected by this bug. I also constructed 37M semi-random 64 KB
    files totalling 2.27 TB, and saw no affected files. Finally I tested
    over files constructed to contain each of the ~1024 possible bad input
    sequences; for all of these cases, updated lzo-rle worked correctly.

    There is no significant impact to performance or compression ratio.

    Signed-off-by: Dave Rodgman
    Signed-off-by: Andrew Morton
    Cc: Mark Rutland
    Cc: Dave Rodgman
    Cc: Willy Tarreau
    Cc: Sergey Senozhatsky
    Cc: Markus F.X.J. Oberhumer
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Cc: Chao Yu
    Cc:
    Link: http://lkml.kernel.org/r/20200507100203.29785-1-dave.rodgman@arm.com
    Signed-off-by: Linus Torvalds

    Dave Rodgman
     

26 Sep, 2019

1 commit

  • Fix an unaligned access which breaks on platforms where this is not
    permitted (e.g., Sparc).

    Link: http://lkml.kernel.org/r/20190912145502.35229-1-dave.rodgman@arm.com
    Signed-off-by: Dave Rodgman
    Cc: Dave Rodgman
    Cc: Markus F.X.J. Oberhumer
    Cc: Minchan Kim
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Rodgman
     

21 May, 2019

2 commits


06 Apr, 2019

1 commit

  • For very short input data (0 - 1 bytes), lzo-rle was not behaving
    correctly. Fix this behaviour and update documentation accordingly.

    For zero-length input, lzo v0 outputs an end-of-stream marker only,
    which was misinterpreted by lzo-rle as a bitstream version number.
    Ensure bitstream versions > 0 require a minimum stream length of 5.
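
    The disambiguation can be sketched as follows (a simplified,
    illustrative helper, not the kernel's exact code; the function name
    is made up):

```c
#include <stddef.h>

/* A versioned lzo-rle stream begins with the byte 17 followed by a
 * version byte, but a zero-length v0 input compresses to a short
 * end-of-stream marker that also begins with 17. Requiring at least
 * 5 bytes before trusting the version byte keeps the two apart. */
static int bitstream_version(const unsigned char *in, size_t len)
{
	if (len >= 5 && in[0] == 17)
		return in[1];	/* versioned stream (lzo-rle) */
	return 0;		/* legacy v0 stream */
}
```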

    This also fixes a bug in handling the tail for very short inputs when
    a bitstream version is present.

    Link: http://lkml.kernel.org/r/20190326165857.34613-1-dave.rodgman@arm.com
    Signed-off-by: Dave Rodgman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Rodgman
     

08 Mar, 2019

5 commits

  • To prevent any issues with persistent data, separate lzo-rle from lzo so
    that it is treated as a separate algorithm, and lzo is still available.

    Link: http://lkml.kernel.org/r/20190205155944.16007-3-dave.rodgman@arm.com
    Signed-off-by: Dave Rodgman
    Cc: David S. Miller
    Cc: Greg Kroah-Hartman
    Cc: Herbert Xu
    Cc: Markus F.X.J. Oberhumer
    Cc: Matt Sealey
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Cc: Richard Purdie
    Cc: Sergey Senozhatsky
    Cc: Sonny Rao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Rodgman
     
  • Patch series "lib/lzo: run-length encoding support", v5.

    Following on from the previous lzo-rle patchset:

    https://lkml.org/lkml/2018/11/30/972

    This patchset contains only the RLE patches, and should be applied on
    top of the non-RLE patches ( https://lkml.org/lkml/2019/2/5/366 ).

    Previously, some questions were raised around the RLE patches. I've
    done some additional benchmarking to answer these questions. In short:

    - RLE offers significant additional performance (data-dependent)

    - I didn't measure any regressions that were clearly outside the noise

    One concern with this patchset was around performance - specifically,
    measuring RLE impact separately from Matt Sealey's patches (CTZ & fast
    copy). I have done some additional benchmarking which I hope clarifies
    the benefits of each part of the patchset.

    Firstly, I've captured some memory via /dev/fmem from a Chromebook with
    many tabs open which is starting to swap, and then split this into 4178
    4k pages. I've excluded the all-zero pages (as zram does), and also the
    no-zero pages (which won't tell us anything about RLE performance).
    This should give a realistic test dataset for zram. What I found was
    that the data is VERY bimodal: 44% of pages in this dataset contain 5%
    or fewer zeros, and 44% contain over 90% zeros (30% if you include the
    no-zero pages). This supports the idea of special-casing zeros in zram.

    Next, I've benchmarked four variants of lzo on these pages (on 64-bit
    Arm at max frequency): baseline LZO; baseline + Matt Sealey's patches
    (aka MS); baseline + RLE only; baseline + MS + RLE. Numbers are for
    weighted roundtrip throughput (the weighting reflects that zram does
    more compression than decompression).

    https://drive.google.com/file/d/1VLtLjRVxgUNuWFOxaGPwJYhl_hMQXpHe/view?usp=sharing

    Matt's patches help in all cases for Arm (and no effect on Intel), as
    expected.

    RLE also behaves as expected: with few zeros present, it makes no
    difference; above ~75%, it gives a good improvement (50 - 300 MB/s on
    top of the benefit from Matt's patches).

    Best performance is seen with both MS and RLE patches.

    Finally, I have benchmarked the same dataset on an x86-64 device. Here,
    the MS patches make no difference (as expected); RLE helps, similarly as
    on Arm. There were no definite regressions; allowing for observational
    error, 0.1% (3/4178) of cases had a regression > 1 standard deviation,
    of which the largest was 4.6% (1.2 standard deviations). I think this
    is probably within the noise.

    https://drive.google.com/file/d/1xCUVwmiGD0heEMx5gcVEmLBI4eLaageV/view?usp=sharing

    One point to note is that the graphs show RLE appears to help very
    slightly with no zeros present! This is because the extra code causes
    the clang optimiser to change code layout in a way that happens to have
    a significant benefit. Taking baseline LZO and adding a do-nothing line
    like "__builtin_prefetch(out_len);" immediately before the "goto next"
    has the same effect. So this is a real, but basically spurious effect -
    it's small enough not to upset the overall findings.

    This patch (of 3):

    When using zram, we frequently encounter long runs of zero bytes. This
    adds a special case which identifies runs of zeros and encodes them
    using run-length encoding.

    This is faster for both compression and decompression. For high-entropy
    data which doesn't hit this case, the impact is minimal.

    Compression ratio is within a few percent in all cases.

    This modifies the bitstream in a way which is backwards compatible
    (i.e., we can decompress old bitstreams, but old versions of lzo cannot
    decompress new bitstreams).
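
    The special case can be sketched roughly as follows (an illustrative
    helper under an assumed name, not the actual kernel compressor loop):

```c
#include <stddef.h>

/* Hypothetical sketch: count the run of zero bytes at the start of `in`,
 * capped at `len`. lzo-rle emits such runs with a single run-length
 * code instead of ordinary LZO literal/match sequences. */
static size_t zero_run_length(const unsigned char *in, size_t len)
{
	size_t n = 0;

	while (n < len && in[n] == 0)
		n++;
	return n;
}
```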

    Link: http://lkml.kernel.org/r/20190205155944.16007-2-dave.rodgman@arm.com
    Signed-off-by: Dave Rodgman
    Cc: David S. Miller
    Cc: Greg Kroah-Hartman
    Cc: Herbert Xu
    Cc: Markus F.X.J. Oberhumer
    Cc: Matt Sealey
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Cc: Richard Purdie
    Cc: Sergey Senozhatsky
    Cc: Sonny Rao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Rodgman
     
  • Enable faster 8-byte copies on arm64.

    Link: http://lkml.kernel.org/r/20181127161913.23863-6-dave.rodgman@arm.com
    Link: http://lkml.kernel.org/r/20190205141950.9058-4-dave.rodgman@arm.com
    Signed-off-by: Matt Sealey
    Signed-off-by: Dave Rodgman
    Cc: David S. Miller
    Cc: Greg Kroah-Hartman
    Cc: Herbert Xu
    Cc: Markus F.X.J. Oberhumer
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Cc: Richard Purdie
    Cc: Sergey Senozhatsky
    Cc: Sonny Rao
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matt Sealey
     
  • LZO leaves some performance on the table by not realising that arm64 can
    optimize count-trailing-zeros bit operations.

    Add CONFIG_ARM64 to the checked definitions alongside CONFIG_X86_64 to
    enable the use of rbit/clz instructions on full 64-bit quantities.
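
    The effect can be illustrated with a small sketch (the helper name is
    hypothetical; __builtin_ctzll is what compilers lower to rbit + clz
    on arm64, and it is undefined for a zero argument):

```c
/* Find the index of the first byte (counting from the low end, i.e. in
 * little-endian memory order) at which two 64-bit words differ. LZO uses
 * this kind of trailing-zero count to extend matches a word at a time.
 * Requires a != b. */
static int first_diff_byte(unsigned long long a, unsigned long long b)
{
	return __builtin_ctzll(a ^ b) / 8;
}
```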

    Link: http://lkml.kernel.org/r/20181127161913.23863-5-dave.rodgman@arm.com
    Link: http://lkml.kernel.org/r/20190205141950.9058-3-dave.rodgman@arm.com
    Signed-off-by: Matt Sealey
    Signed-off-by: Dave Rodgman
    Cc: David S. Miller
    Cc: Greg Kroah-Hartman
    Cc: Herbert Xu
    Cc: Markus F.X.J. Oberhumer
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Cc: Richard Purdie
    Cc: Sergey Senozhatsky
    Cc: Sonny Rao
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matt Sealey
     
  • Patch series "lib/lzo: performance improvements", v5.

    This patch (of 3):

    Modify the ifdefs in lzodefs.h to be more consistent with normal kernel
    macros (e.g., change __aarch64__ to CONFIG_ARM64).
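
    As a small illustration (the guarded macro name here is hypothetical;
    only the tested symbol changes):

```c
/* Before: compiler-defined target macro */
#if defined(__aarch64__)
#define LZO_USE_CTZ64 1
#endif

/* After: the corresponding Kconfig symbol, consistent with normal
 * kernel macros */
#if defined(CONFIG_ARM64)
#define LZO_USE_CTZ64 1
#endif
```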

    Link: http://lkml.kernel.org/r/20190205141950.9058-2-dave.rodgman@arm.com
    Signed-off-by: Dave Rodgman
    Cc: Herbert Xu
    Cc: David S. Miller
    Cc: Nitin Gupta
    Cc: Richard Purdie
    Cc: Markus F.X.J. Oberhumer
    Cc: Minchan Kim
    Cc: Sergey Senozhatsky
    Cc: Sonny Rao
    Cc: Greg Kroah-Hartman
    Cc: Matt Sealey
    Cc: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Rodgman
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boilerplate text.

    This patch is based on work done by Thomas Gleixner, Kate Stewart, and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier should be applied
    to a file was done in a spreadsheet of side-by-side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files, created by Philippe Ombredanne. Philippe prepared the
    base worksheet and did an initial spot review of a few thousand files.

    The 4.13 kernel was the starting point of the analysis, with 60,537 files
    assessed. Kate Stewart did a file-by-file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    should be applied to the file. She confirmed any determination that was
    not immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

28 Sep, 2014

2 commits

  • This fix ensures that we can never hit an integer overflow while adding
    255s when parsing a variable-length encoding. It works differently from
    commit 206a81c ("lzo: properly check for overruns"): instead of
    ensuring that we don't overrun the input, which is tricky to guarantee
    due to many assumptions in the code, it simply bounds the cumulative
    number of 255s read so that the count cannot overflow.

    MAX_255_COUNT is the maximum number of times we can add 255 to a base
    count without overflowing an integer. The multiply will overflow when
    multiplying 255 by more than MAXINT/255. The sum will overflow earlier,
    depending on the base count. Since the base count is taken from a u8
    and a few bits, it is safe to assume that it will always be lower than
    or equal to 2*255, thus we can always prevent any overflow by accepting
    two fewer 255 steps.
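
    The bounded parse can be sketched like this (simplified and with
    approximate names, not the exact lib/lzo code):

```c
#include <stddef.h>

/* A length is encoded as a run of 0x00 bytes, each worth 255, followed
 * by a final non-zero byte. Capping the run at MAX_255_COUNT guarantees
 * that run * 255 + base cannot overflow a size_t, without having to
 * reason about input overruns elsewhere in the decompressor. */
#define MAX_255_COUNT	((((size_t)~0) / 255) - 2)

static int parse_count(const unsigned char **pp, const unsigned char *end,
		       size_t base, size_t *out)
{
	const unsigned char *ip = *pp;
	size_t run = 0;

	while (ip < end && *ip == 0) {
		ip++;
		run++;
	}
	if (ip >= end || run > MAX_255_COUNT)
		return -1;	/* truncated input or would overflow */
	*out = run * 255 + *ip++ + base;
	*pp = ip;
	return 0;
}
```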

    This patch also reduces the CPU overhead and actually increases
    performance by 1.1% compared to the initial code, while the previous
    fix cost 3.1% (measured on x86_64).

    The fix needs to be backported to all currently supported stable kernels.

    Reported-by: Willem Pinckaers
    Cc: "Don A. Bailey"
    Cc: stable
    Signed-off-by: Willy Tarreau
    Signed-off-by: Greg Kroah-Hartman

    Willy Tarreau
     
  • This reverts commit 206a81c ("lzo: properly check for overruns").

    As analysed by Willem Pinckaers, this fix is still incomplete on
    certain rare corner cases, and it is easier to restart from the
    original code.

    Reported-by: Willem Pinckaers
    Cc: "Don A. Bailey"
    Cc: stable
    Signed-off-by: Willy Tarreau
    Signed-off-by: Greg Kroah-Hartman

    Willy Tarreau
     

24 Jun, 2014

1 commit

  • The lzo decompressor can, if given some really crazy data, possibly
    overrun some variable types. Modify the checking logic to properly
    detect overruns before they happen.

    Reported-by: "Don A. Bailey"
    Tested-by: "Don A. Bailey"
    Cc: stable
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

21 Feb, 2013

2 commits


12 Jan, 2010

1 commit

  • This patch series adds generic support for creating and extracting
    LZO-compressed kernel images, as well as support for using such images on
    the x86 and ARM architectures, and support for creating and using
    LZO-compressed initrd and initramfs images.

    Russell King said:

    : Testing on a Cortex A9 model:
    : - lzo decompressor is 65% of the time gzip takes to decompress a kernel
    : - lzo kernel is 9% larger than a gzip kernel
    :
    : which I'm happy to say confirms your figures when comparing the two.
    :
    : However, when comparing your new gzip code to the old gzip code:
    : - new is 99% of the size of the old code
    : - new takes 42% of the time to decompress than the old code
    :
    : What this means is that for a proper comparison, the results get even better:
    : - lzo is 7.5% larger than the old gzip'd kernel image
    : - lzo takes 28% of the time that the old gzip code took
    :
    : So the expense seems definitely worth the effort. The only reason I
    : can think of ever using gzip would be if you needed the additional
    : compression (eg, because you have limited flash to store the image.)
    :
    : I would argue that the default for ARM should therefore be LZO.

    This patch:

    The lzo compressor is worse than gzip at compression, but faster at
    extraction. Here are some figures for an ARM board I'm working on:

    Uncompressed size:  3.24 MB
    gzip:               1.61 MB    0.72s
    lzo:                1.75 MB    0.48s

    So for a compression ratio that is still relatively close to gzip, it's
    much faster to extract, at least in that case.

    This part contains:
    - Makefile routine to support lzo compression
    - Fixes to the existing lzo compressor so that it can be used in
    compressed kernels
    - wrapper around the existing lzo1x_decompress, as it only extracts one
    block at a time, while we need to extract a whole file here
    - config dialog for kernel compression

    [akpm@linux-foundation.org: coding-style fixes]
    [akpm@linux-foundation.org: cleanup]
    Signed-off-by: Albin Tonnerre
    Tested-by: Wu Zhangjin
    Acked-by: "H. Peter Anvin"
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Tested-by: Russell King
    Acked-by: Russell King
    Cc: Ralf Baechle
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Albin Tonnerre
     

26 Jul, 2008

1 commit


11 Apr, 2008

1 commit

  • Shifting a little-endian value seems strange; this probably meant to
    shift the cpu-order variable, as in the previous section of the switch
    statement.

    Signed-off-by: Harvey Harrison
    Acked-by: Richard Purdie
    Signed-off-by: Linus Torvalds

    Harvey Harrison
     

01 Aug, 2007

1 commit

  • Add some casts to the LZO compression algorithm after they were removed
    during cleanup and shouldn't have been.

    Signed-off-by: Richard Purdie
    Cc: Edward Shishkin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Richard Purdie
     

11 Jul, 2007

1 commit

  • This is a hybrid version of the patch to add the LZO1X compression
    algorithm to the kernel. Nitin and I have merged the best parts of
    the various patches to form this version, which we're both happy with
    (and are jointly signing off).

    The performance of this version is equivalent to the original minilzo code
    it was based on. Bytecode comparisons have also been made on ARM, i386 and
    x86_64 with favourable results.

    There are several users of LZO lined up, including jffs2, crypto, and
    reiser4, since it's much faster than zlib.

    Signed-off-by: Nitin Gupta
    Signed-off-by: Richard Purdie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Richard Purdie