12 Oct, 2018

1 commit

  • The previous patch introduced very large kernel stack usage and a Makefile
    change to hide the warning about it.

    From what I can tell, a number of things went wrong here:

    - The BCH_MAX_T constant was set to the maximum value for 'n',
    not the maximum for 't', which is much smaller.

    - The stack usage is actually larger than the entire kernel stack
    on some architectures that can use 4KB stacks (m68k, sh, c6x), which
    leads to an immediate overrun.

    - The justification in the patch description claimed that nothing
    changed; however, that is not the case even without the two points above:
    the configuration is machine specific, and most boards never use the
    maximum BCH_ECC_WORDS() length but instead have something much smaller.
    That maximum would only apply to machines that use both the maximum
    block size and the maximum ECC strength.

    The largest value for 't' that I could find is '32', which in turn leads
    to a 60 byte array instead of 2048 bytes. Making it '64' for future
    extension also seems worthwhile, with 120 bytes for the array. Anything
    larger won't fit into the OOB area on NAND flash.
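
    The 60 and 120 byte figures can be reproduced with a short calculation.
    The sketch below is illustrative only: it assumes the ECC buffer holds
    m * t parity bits rounded up to whole 32-bit words, and that m = 15,
    which the text above does not state. The names ecc_words(), ecc_bytes()
    and ASSUMED_M are hypothetical and are not taken from lib/bch.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round-up integer division, in the spirit of the kernel's DIV_ROUND_UP(). */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Assumed Galois field order; this value is an assumption for illustration. */
#define ASSUMED_M 15

/* Number of 32-bit words needed to hold m * t parity bits. */
static unsigned int ecc_words(unsigned int m, unsigned int t)
{
	assert(m > 0 && t > 0);
	return DIV_ROUND_UP(m * t, 32);
}

/* Size in bytes of a uint32_t array holding those words. */
static size_t ecc_bytes(unsigned int m, unsigned int t)
{
	return ecc_words(m, t) * sizeof(uint32_t);
}
```

    Under these assumptions, ecc_bytes(ASSUMED_M, 32) evaluates to 60 and
    ecc_bytes(ASSUMED_M, 64) to 120, matching the array sizes quoted above.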

    With that changed, the warning can be enabled again.

    Only linux-4.19+ contains the breakage, so this is only needed
    as a stable backport if it does not make it into the release.

    Fixes: 02361bc77888 ("lib/bch: Remove VLA usage")
    Reported-by: Ard Biesheuvel
    Cc: stable@vger.kernel.org
    Signed-off-by: Arnd Bergmann
    Signed-off-by: Boris Brezillon

    Arnd Bergmann
     

22 Jun, 2018

1 commit

  • In the quest to remove all stack VLA usage from the kernel[1], this
    allocates a fixed size stack array to cover the range needed for
    bch. This was done instead of a preallocation on the SLAB due to
    performance reasons, shown by Ivan Djelic:

    little-endian, type sizes: int=4 long=8 longlong=8
    cpu: Intel(R) Core(TM) i5 CPU         650  @ 3.20GHz
    calibration: iter=4.9143µs niter=2034 nsamples=200 m=13 t=4

      Buffer allocation |  Encoding throughput (Mbit/s)
    ---------------------------------------------------
     on-stack, VLA      |   3988
     on-stack, fixed    |   4494
     kmalloc            |   1967

    So this change actually improves performance too, it seems.

    The resulting stack allocation can get rather large; without
    CONFIG_BCH_CONST_PARAMS, it will allocate 4096 bytes, which
    trips the stack size checking:

    lib/bch.c: In function ‘encode_bch’:
    lib/bch.c:261:1: warning: the frame size of 4432 bytes is larger than 2048 bytes [-Wframe-larger-than=]

    Even the default case for "allmodconfig" (with CONFIG_BCH_CONST_M=14 and
    CONFIG_BCH_CONST_T=4) would have started throwing a warning:

    lib/bch.c: In function ‘encode_bch’:
    lib/bch.c:261:1: warning: the frame size of 2288 bytes is larger than 2048 bytes [-Wframe-larger-than=]

    But this is how large it's always been; it was just hidden from
    the checker because it was a VLA. So the Makefile has been adjusted to
    silence this warning for anything smaller than 4500 bytes, which should
    provide room for normal cases while remaining low enough to catch any
    future pathological situations.
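
    The difference between the two on-stack variants can be sketched as
    follows. This is not the code from lib/bch.c: the function names and
    MAX_WORDS are hypothetical, and the XOR loop merely stands in for real
    work. With a VLA the frame size depends on a run-time value, so a
    compile-time frame size check has nothing fixed to compare against;
    with a fixed-size array the worst case is always reserved and visible.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical compile-time upper bound on the number of words. */
#define MAX_WORDS 30

/* VLA variant: the size of 'tmp' is only known at run time. */
static uint32_t xor_words_vla(const uint32_t *src, unsigned int len)
{
	assert(len > 0);	/* a zero-length VLA is undefined behaviour */
	uint32_t tmp[len];	/* variable-length array */
	uint32_t acc = 0;
	unsigned int i;

	memcpy(tmp, src, len * sizeof(*tmp));
	for (i = 0; i < len; i++)
		acc ^= tmp[i];
	return acc;
}

/* Fixed variant: 'tmp' always occupies MAX_WORDS words of stack. */
static uint32_t xor_words_fixed(const uint32_t *src, unsigned int len)
{
	uint32_t tmp[MAX_WORDS];
	uint32_t acc = 0;
	unsigned int i;

	assert(len <= MAX_WORDS);	/* callers must respect the bound */
	memcpy(tmp, src, len * sizeof(*tmp));
	for (i = 0; i < len; i++)
		acc ^= tmp[i];
	return acc;
}
```

    Both functions return the same result for any len between 1 and
    MAX_WORDS; only the way the stack space is reserved differs.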

    [1] https://lkml.kernel.org/r/CA+55aFzCG-zNmZwX4A2FQpadafLfEzK6CC=qPXydAacU1RqZWA@mail.gmail.com

    Signed-off-by: Kees Cook
    Reviewed-by: Ivan Djelic
    Tested-by: Ivan Djelic
    Acked-by: Boris Brezillon
    Signed-off-by: Boris Brezillon

    Kees Cook
     

11 Mar, 2011

1 commit

  • This is a new software BCH encoding/decoding library, similar to the shared
    Reed-Solomon library.

    Binary BCH (Bose-Chaudhuri-Hocquenghem) codes are widely used to correct
    errors in NAND flash devices requiring more than 1-bit ecc correction; they
    are generally better suited for NAND flash than RS codes because NAND bit
    errors do not occur in bursts. Latest SLC NAND devices typically require at
    least 4-bit ecc protection per 512-byte block.

    This library provides software encoding/decoding, but may also be used with
    ASIC/SoC hardware BCH engines to perform error correction. It is
    currently being used for this purpose on an OMAP3630 board (4bit/8bit HW BCH). It
    has also been used to decode raw dumps of NAND devices with on-die BCH ecc
    engines (e.g. Micron 4bit ecc SLC devices).

    Latest NAND devices (including SLC) can exhibit high error rates (typically
    a dozen or more bitflips per hour during stress tests); in order to
    minimize the performance impact of error correction, this library
    implements recently developed algorithms for fast polynomial root finding
    (see bch.c header for details) instead of the traditional exhaustive Chien
    root search; a few performance figures are provided below:

    Platform: arm926ejs @ 468 MHz, 32 KiB icache, 16 KiB dcache
    BCH ecc : 4-bit per 512 bytes

    Encoding average throughput: 250 Mbits/s

    Error correction time (compared with Chien search):

             average   worst     average (Chien)   worst (Chien)
    ----------------------------------------------------------
    1 bit    8.5 µs    11 µs     200 µs            383 µs
    2 bit    9.7 µs    12.5 µs   477 µs            728 µs
    3 bit    18.1 µs   20.6 µs   758 µs            1010 µs
    4 bit    19.5 µs   23 µs     1028 µs           1280 µs

    In the above figures, "worst" is meant in terms of error pattern, not in
    terms of cache miss / page faults effects (not taken into account here).

    The library has been extensively tested on the following platforms: x86,
    x86_64, arm926ejs, omap3630, qemu-ppc64, qemu-mips.

    Signed-off-by: Ivan Djelic
    Signed-off-by: David Woodhouse

    Ivan Djelic