15 Sep, 2017

1 commit

  • Pull zstd support from Chris Mason:
    "Nick Terrell's patch series to add zstd support to the kernel has been
    floating around for a while. After talking with Dave Sterba, Herbert
    and Phillip, we decided to send the whole thing in as one pull
    request.

    zstd is a big win in speed over zlib and in compression ratio over
    lzo, and the compression team here at FB has gotten great results
    using it in production. Nick will continue to update the kernel side
    with new improvements from the open source zstd userland code.

    Nick has a number of benchmarks for the main zstd code in his lib/zstd
    commit:

    I ran the benchmarks on an Ubuntu 14.04 VM with 2 cores and 4 GiB
    of RAM. The VM is running on a MacBook Pro with a 3.1 GHz Intel
    Core i7 processor, 16 GB of RAM, and an SSD. I benchmarked using
    `silesia.tar` [3], which is 211,988,480 B large. Run the following
    commands for the benchmark:

    sudo modprobe zstd_compress_test
    sudo mknod zstd_compress_test c 245 0
    sudo cp silesia.tar zstd_compress_test

    The reported time is the run time of the userland `cp`.
    The MB/s is computed with

    211,988,480 B / time(buffer size, hash)

    which includes the time to copy from userland.
    The Adjusted MB/s is computed with

    211,988,480 B / (time(buffer size, hash) - time(buffer size, none)).

    The memory reported is the amount of memory the compressor
    requests.

    | Method | Size (B) | Time (s) | Ratio | MB/s | Adj MB/s | Mem (MB) |
    |----------|----------|----------|-------|---------|----------|----------|
    | none | 211988480 | 0.100 | 1 | 2119.88 | - | - |
    | zstd -1 | 73645762 | 1.044 | 2.878 | 203.05 | 224.56 | 1.23 |
    | zstd -3 | 66988878 | 1.761 | 3.165 | 120.38 | 127.63 | 2.47 |
    | zstd -5 | 65001259 | 2.563 | 3.261 | 82.71 | 86.07 | 2.86 |
    | zstd -10 | 60165346 | 13.242 | 3.523 | 16.01 | 16.13 | 13.22 |
    | zstd -15 | 58009756 | 47.601 | 3.654 | 4.45 | 4.46 | 21.61 |
    | zstd -19 | 54014593 | 102.835 | 3.925 | 2.06 | 2.06 | 60.15 |
    | zlib -1 | 77260026 | 2.895 | 2.744 | 73.23 | 75.85 | 0.27 |
    | zlib -3 | 72972206 | 4.116 | 2.905 | 51.50 | 52.79 | 0.27 |
    | zlib -6 | 68190360 | 9.633 | 3.109 | 22.01 | 22.24 | 0.27 |
    | zlib -9 | 67613382 | 22.554 | 3.135 | 9.40 | 9.44 | 0.27 |
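
As a worked check of the formulas above, the sketch below recomputes the zstd -1 row of the table. It assumes the numerator is the size of `silesia.tar` (211,988,480 B) and that MB means 10^6 bytes, which is what reproduces the table values. The function names are illustrative and not part of the benchmark module.

```python
# Recompute Ratio, MB/s and Adjusted MB/s for one table row.
# Assumptions: MB = 1,000,000 bytes; the input is silesia.tar.
INPUT_BYTES = 211_988_480
BASELINE_SECONDS = 0.100  # "none" row: time spent only copying from userland


def ratio(compressed_bytes):
    return INPUT_BYTES / compressed_bytes


def mb_per_s(seconds):
    return INPUT_BYTES / seconds / 1e6


def adjusted_mb_per_s(seconds, baseline=BASELINE_SECONDS):
    # Subtract the copy-only time so only the compressor is measured.
    return INPUT_BYTES / (seconds - baseline) / 1e6


# zstd -1 row: 73,645,762 B written in 1.044 s
zstd1 = (
    round(ratio(73_645_762), 3),
    round(mb_per_s(1.044), 2),
    round(adjusted_mb_per_s(1.044), 2),
)
print(zstd1)
```

The three printed values correspond to the Ratio, MB/s and Adj MB/s columns of the zstd -1 row.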

    I benchmarked zstd decompression using the same method on the same
    machine. The benchmark file is located in the upstream zstd repo
    under `contrib/linux-kernel/zstd_decompress_test.c` [4]. The
    memory reported is the amount of memory required to decompress
    data compressed with the given compression level. If you know the
    maximum size of your input, you can reduce the memory usage of
    decompression irrespective of the compression level.

    | Method | Time (s) | MB/s | Adjusted MB/s | Memory (MB) |
    |----------|----------|---------|---------------|-------------|
    | none | 0.025 | 8479.54 | - | - |
    | zstd -1 | 0.358 | 592.15 | 636.60 | 0.84 |
    | zstd -3 | 0.396 | 535.32 | 571.40 | 1.46 |
    | zstd -5 | 0.396 | 535.32 | 571.40 | 1.46 |
    | zstd -10 | 0.374 | 566.81 | 607.42 | 2.51 |
    | zstd -15 | 0.379 | 559.34 | 598.84 | 4.61 |
    | zstd -19 | 0.412 | 514.54 | 547.77 | 8.80 |
    | zlib -1 | 0.940 | 225.52 | 231.68 | 0.04 |
    | zlib -3 | 0.883 | 240.08 | 247.07 | 0.04 |
    | zlib -6 | 0.844 | 251.17 | 258.84 | 0.04 |
    | zlib -9 | 0.837 | 253.27 | 261.07 | 0.04 |

    I ran a long series of tests and benchmarks on the btrfs side and the
    gains are very similar to the core benchmarks Nick ran."

    * 'zstd-minimal' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs:
    squashfs: Add zstd support
    btrfs: Add zstd support
    lib: Add zstd modules
    lib: Add xxhash module

    Linus Torvalds
     

18 Aug, 2017

1 commit


16 Aug, 2017

2 commits

  • Add skeleton code for compression heuristics. For now it iterates over
    all the pages, but in the end always says "yes, compress please", i.e. it
    does not change the current behaviour.

    In the future we're going to add various heuristics to analyze the data.
    This patch can be used as a baseline for measuring their effectiveness
    and performance.
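
A minimal model of the skeleton described above, written in Python rather than kernel C: it walks every page of a range and then always answers "compress". The function name, the page size and the byte-string representation of a range are hypothetical, chosen only for illustration.

```python
PAGE_SIZE = 4096  # assumed page size for this illustration


def should_compress(data):
    """Walk all pages of `data` and return (decision, pages visited)."""
    pages_seen = 0
    for offset in range(0, len(data), PAGE_SIZE):
        page = data[offset:offset + PAGE_SIZE]  # a real heuristic would inspect this
        pages_seen += 1
    # Skeleton behaviour: always say "yes, compress please".
    return True, pages_seen
```

Because the decision is constant, timing this walk against no walk at all gives the baseline cost that later heuristics can be compared with.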

    Signed-off-by: Timofey Titovets
    Reviewed-by: David Sterba
    [ enhanced changelog, modified comments ]
    Signed-off-by: David Sterba

    Timofey Titovets
     
  • Add zstd compression and decompression support to BtrFS. zstd at its
    fastest level compresses almost as well as zlib, while offering much
    faster compression and decompression, approaching lzo speeds.

    I benchmarked btrfs with zstd compression against no compression, lzo
    compression, and zlib compression. I benchmarked two scenarios: copying
    a set of files to btrfs and then reading the files; and copying a tarball
    to btrfs, extracting it to btrfs, and then reading the extracted files.
    After every operation, I call `sync` and include the sync time.
    Between every pair of operations I unmount and remount the filesystem
    to avoid caching. The benchmark files can be found in the upstream
    zstd source repository under
    `contrib/linux-kernel/{btrfs-benchmark.sh,btrfs-extract-benchmark.sh}`
    [1] [2].

    I ran the benchmarks on an Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
    The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
    16 GB of RAM, and an SSD.

    The first compression benchmark is copying 10 copies of the unzipped
    Silesia corpus [3] into a BtrFS filesystem mounted with
    `-o compress-force=Method`. The decompression benchmark times how long
    it takes to `tar` all 10 copies into `/dev/null`. The compression ratio is
    measured by comparing the output of `df` and `du`. See the benchmark file
    [1] for details. I benchmarked multiple zstd compression levels, although
    the patch uses zstd level 1.

    | Method | Ratio | Compression MB/s | Decompression MB/s |
    |---------|-------|------------------|--------------------|
    | None | 0.99 | 504 | 686 |
    | lzo | 1.66 | 398 | 442 |
    | zlib | 2.58 | 65 | 241 |
    | zstd 1 | 2.57 | 260 | 383 |
    | zstd 3 | 2.71 | 174 | 408 |
    | zstd 6 | 2.87 | 70 | 398 |
    | zstd 9 | 2.92 | 43 | 406 |
    | zstd 12 | 2.93 | 21 | 408 |
    | zstd 15 | 3.01 | 11 | 354 |
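
The trade-off claimed in this commit message can be read straight off the table. The short sketch below uses only the zlib and zstd 1 rows to compute how much faster zstd 1 is for nearly the same ratio. The dictionary layout is just an illustration.

```python
# (ratio, compression speed, decompression speed) copied from the table above
results = {
    "zlib": (2.58, 65, 241),
    "zstd 1": (2.57, 260, 383),
}

zlib_ratio, zlib_comp, zlib_decomp = results["zlib"]
zstd_ratio, zstd_comp, zstd_decomp = results["zstd 1"]

compress_speedup = zstd_comp / zlib_comp              # how many times faster to write
decompress_speedup = zstd_decomp / zlib_decomp        # how many times faster to read
ratio_loss = (zlib_ratio - zstd_ratio) / zlib_ratio   # relative loss in ratio

print(f"compress x{compress_speedup:.2f}, "
      f"decompress x{decompress_speedup:.2f}, "
      f"ratio loss {ratio_loss:.2%}")
```

On these numbers zstd 1 writes about four times faster than zlib while giving up well under one percent of the compression ratio.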

    The next benchmark first copies `linux-4.11.6.tar` [4] to btrfs. Then it
    measures the compression ratio, extracts the tar, and deletes the tar.
    Then it measures the compression ratio again, and `tar`s the extracted
    files into `/dev/null`. See the benchmark file [2] for details.

    | Method | Tar Ratio | Extract Ratio | Copy (s) | Extract (s)| Read (s) |
    |--------|-----------|---------------|----------|------------|----------|
    | None | 0.97 | 0.78 | 0.981 | 5.501 | 8.807 |
    | lzo | 2.06 | 1.38 | 1.631 | 8.458 | 8.585 |
    | zlib | 3.40 | 1.86 | 7.750 | 21.544 | 11.744 |
    | zstd 1 | 3.57 | 1.85 | 2.579 | 11.479 | 9.389 |

    [1] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/btrfs-benchmark.sh
    [2] https://github.com/facebook/zstd/blob/dev/contrib/linux-kernel/btrfs-extract-benchmark.sh
    [3] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
    [4] https://cdn.kernel.org/pub/linux/kernel/v4.x/linux-4.11.6.tar.xz

    zstd source repository: https://github.com/facebook/zstd

    Signed-off-by: Nick Terrell
    Signed-off-by: Chris Mason

    Nick Terrell
     

06 Jul, 2017

1 commit

  • Pull btrfs updates from David Sterba:
    "The core updates improve error handling (mostly related to bios), with
    the usual incremental work on the GFP_NOFS (mis)use removal,
    refactoring or cleanups. Except the two top patches, all have been in
    for-next for an extensive amount of time.

    User visible changes:

    - statx support

    - quota override tunable

    - improved compression thresholds

    - obsoleted mount option alloc_start

    Core updates:

    - bio-related updates:
        - faster bio cloning
        - no allocation failures
        - preallocated flush bios

    - more kvzalloc use, memalloc_nofs protections, GFP_NOFS updates

    - prep work for btree_inode removal

    - dir-item validation

    - qgroup fixes and updates

    - cleanups:
        - removed unused struct members, unused code, refactoring
        - argument refactoring (fs_info/root, caller -> callee sink)
        - SEARCH_TREE ioctl docs"

    * 'for-4.13-part1' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (115 commits)
    btrfs: Remove false alert when fiemap range is smaller than on-disk extent
    btrfs: Don't clear SGID when inheriting ACLs
    btrfs: fix integer overflow in calc_reclaim_items_nr
    btrfs: scrub: fix target device intialization while setting up scrub context
    btrfs: qgroup: Fix qgroup reserved space underflow by only freeing reserved ranges
    btrfs: qgroup: Introduce extent changeset for qgroup reserve functions
    btrfs: qgroup: Fix qgroup reserved space underflow caused by buffered write and quotas being enabled
    btrfs: qgroup: Return actually freed bytes for qgroup release or free data
    btrfs: qgroup: Cleanup btrfs_qgroup_prepare_account_extents function
    btrfs: qgroup: Add quick exit for non-fs extents
    Btrfs: rework delayed ref total_bytes_pinned accounting
    Btrfs: return old and new total ref mods when adding delayed refs
    Btrfs: always account pinned bytes when dropping a tree block ref
    Btrfs: update total_bytes_pinned when pinning down extents
    Btrfs: make BUG_ON() in add_pinned_bytes() an ASSERT()
    Btrfs: make add_pinned_bytes() take an s64 num_bytes instead of u64
    btrfs: fix validation of XATTR_ITEM dir items
    btrfs: Verify dir_item in iterate_object_props
    btrfs: Check name_len before in btrfs_del_root_ref
    btrfs: Check name_len before reading btrfs_get_name
    ...

    Linus Torvalds
     

20 Jun, 2017

1 commit


09 Jun, 2017

1 commit

  • Replace bi_error with a new bi_status to allow for a clear conversion.
    Note that device mapper overloaded bi_error with a private value, which
    we'll have to keep around at least for now and thus propagate to a
    proper blk_status_t value.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

28 Feb, 2017

5 commits


30 Nov, 2016

1 commit


12 Mar, 2016

1 commit


17 Feb, 2015

1 commit


01 Dec, 2014

1 commit

  • Don Bailey noticed that our page zeroing for compression at end-io time
    isn't complete. This reworks a patch from Linus to push the zeroing
    into the zlib and lzo specific functions instead of trying to handle the
    corners inside btrfs_decompress_buf2page.
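
The idea of zeroing inside the decompression helper can be sketched as follows. This is a user-space Python illustration with a hypothetical helper name, not the btrfs code: whatever part of the destination page the decompressed bytes do not cover is explicitly filled with zeros, so stale contents never reach the reader.

```python
import zlib

PAGE_SIZE = 4096  # assumed page size for this illustration


def decompress_into_page(compressed, page):
    """Decompress into `page` (a bytearray) and zero the uncovered tail."""
    data = zlib.decompress(compressed)[:len(page)]
    page[:len(data)] = data
    # Zero everything past the decompressed bytes.
    page[len(data):] = bytes(len(page) - len(data))
    return len(data)


page = bytearray(b"\xff" * PAGE_SIZE)  # simulate stale page contents
n = decompress_into_page(zlib.compress(b"hello"), page)
```

Doing the zeroing in the same helper that knows how many bytes were produced avoids having the caller guess where valid data ends.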

    Signed-off-by: Chris Mason
    Reviewed-by: Josef Bacik
    Reported-by: Don A. Bailey
    cc: stable@vger.kernel.org
    Signed-off-by: Linus Torvalds

    Chris Mason
     

07 May, 2013

1 commit

  • Big patch, but all it does is add statics to functions which
    are in fact static, then remove the associated dead-code fallout.

    removed functions:

    btrfs_iref_to_path()
    __btrfs_lookup_delayed_deletion_item()
    __btrfs_search_delayed_insertion_item()
    __btrfs_search_delayed_deletion_item()
    find_eb_for_page()
    btrfs_find_block_group()
    range_straddles_pages()
    extent_range_uptodate()
    btrfs_file_extent_length()
    btrfs_scrub_cancel_devid()
    btrfs_start_transaction_lflush()

    btrfs_print_tree() is left because it is used for debugging.
    btrfs_start_transaction_lflush() and btrfs_reada_detach() are
    left for symmetry.

    ulist.c functions are left, another patch will take care of those.

    Signed-off-by: Eric Sandeen
    Signed-off-by: Josef Bacik

    Eric Sandeen
     

22 Mar, 2012

1 commit


02 May, 2011

1 commit


22 Dec, 2010

3 commits

  • Add a common function to copy decompressed data from working buffer
    to bio pages.

    Signed-off-by: Li Zefan

    Li Zefan
     
  • Lzo is a much faster compression algorithm than gzip, so it would allow
    more users to enable transparent compression, and users can trade off
    compression ratio against speed for different applications.

    Usage:

    # mount -t btrfs -o compress[=<zlib,lzo>] dev /mnt
    or
    # mount -t btrfs -o compress-force[=<zlib,lzo>] dev /mnt
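
To make the option syntax concrete, here is a small Python sketch of how such a mount option string could be interpreted. It is not the kernel parser. The function name is hypothetical, and it assumes that a bare `compress` keeps selecting zlib, the only algorithm available before this patch.

```python
def parse_compress_option(option):
    """Return (algorithm, forced) for a compress / compress-force option."""
    name, _, value = option.partition("=")
    if name not in ("compress", "compress-force"):
        raise ValueError(f"not a compression option: {option!r}")
    algorithm = value or "zlib"  # assumption: a bare option means zlib
    if algorithm not in ("zlib", "lzo"):
        raise ValueError(f"unknown compression type: {algorithm!r}")
    return algorithm, name == "compress-force"
```

For example, `parse_compress_option("compress-force=lzo")` yields the lzo algorithm with the forced flag set, while `parse_compress_option("compress")` falls back to zlib.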

    "-o compress" without argument is still allowed for compatibility.

    Compatibility:

    If we mount a filesystem with lzo compression, it can no longer be
    mounted by old kernels. One reason is that an old kernel would otherwise
    hand compressed data that sits in inline extents directly to the user.

    Performance:

    The test copied a linux source tarball (~400M) from an ext4 partition
    to the btrfs partition, and then extracted it.

    (time in seconds)
                 lzo     zlib    nocompress
    copy:        10.6    21.7    14.9
    extract:     70.1    94.4    66.6

    (data size in MB)
                 lzo     zlib    nocompress
    copy:        185.87  108.69  394.49
    extract:     193.80  132.36  381.21

    Changelog:

    v1 -> v2:
    - Select LZO_COMPRESS and LZO_DECOMPRESS in btrfs Kconfig.
    - Add incompatibility flag.
    - Fix error handling in compress code.

    Signed-off-by: Li Zefan

    Li Zefan
     
  • Make the code aware of compression type, instead of always assuming
    zlib compression.

    Also make the zlib workspace function as common code for all
    compression types.

    Signed-off-by: Li Zefan

    Li Zefan
     

30 Oct, 2008

1 commit

  • This is a large change for adding compression on reading and writing,
    both for inline and regular extents. It does some fairly large
    surgery to the writeback paths.

    Compression is off by default and enabled by mount -o compress. Even
    when the -o compress mount option is not used, it is possible to read
    compressed extents off the disk.

    If compression for a given set of pages fails to make them smaller, the
    file is flagged to avoid future compression attempts.
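
The "give up when it does not help" rule can be illustrated with a short Python sketch using the standard `zlib` module. The class and attribute names are hypothetical, and this models only the decision, not the btrfs writeback path.

```python
import hashlib
import zlib


class FileState:
    def __init__(self):
        self.nocompress = False  # set once compression stops paying off


def write_range(state, data):
    """Return the bytes that would be stored for `data`."""
    if state.nocompress:
        return data
    compressed = zlib.compress(data)
    if len(compressed) >= len(data):
        state.nocompress = True  # avoid future compression attempts
        return data
    return compressed


# High-entropy bytes (SHA-256 digests) do not shrink under zlib.
noise = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(128))
state = FileState()
stored = write_range(state, noise)
```

After the incompressible write the flag stays set, so even a highly compressible later write to the same file is stored as is.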

    * While finding delalloc extents, the pages are locked before being sent down
    to the delalloc handler. This allows the delalloc handler to do complex things
    such as cleaning the pages, marking them writeback and starting IO on their
    behalf.

    * Inline extents are inserted at delalloc time now. This allows us to compress
    the data before inserting the inline extent, and it allows us to insert
    an inline extent that spans multiple pages.

    * All of the in-memory extent representations (extent_map.c, ordered-data.c etc)
    are changed to record both an in-memory size and an on disk size, as well
    as a flag for compression.

    From a disk format point of view, the extent pointers in the file are changed
    to record the on disk size of a given extent and some encoding flags.
    Space in the disk format is allocated for compression encoding, as well
    as encryption and a generic 'other' field. Neither the encryption nor the
    'other' field is currently used.

    In order to limit the amount of data read for a single random read in the
    file, the size of a compressed extent is limited to 128k. This is a
    software only limit, the disk format supports u64 sized compressed extents.

    In order to limit the ram consumed while processing extents, the uncompressed
    size of a compressed extent is limited to 256k. This is a software only limit
    and will be subject to tuning later.

    Checksumming is still done on compressed extents, and it is done on the
    uncompressed version of the data. This way additional encodings can be
    layered on without having to figure out which encoding to checksum.

    Compression happens at delalloc time, which is basically single threaded because
    it is usually done by a single pdflush thread. This makes it tricky to
    spread the compression load across all the cpus on the box. We'll have to
    look at parallel pdflush walks of dirty inodes at a later time.

    Decompression is hooked into readpages and it does spread across CPUs nicely.

    Signed-off-by: Chris Mason

    Chris Mason