02 May, 2011

2 commits


22 Dec, 2010

1 commit


19 Sep, 2009

1 commit


12 Sep, 2009

2 commits

  • Data COW means that whenever we write to a file, we replace any old
    extent pointers with new ones. There was a window where a readpage
    might find the old extent pointers on disk and cache them in the
    extent_map tree in ram in the middle of a given write replacing them.

    Even though both the readpage and the write had their respective bytes
    in the file locked, the extent readpage inserts may cover more bytes than
    it had locked down.

    This commit closes the race by keeping the new extent pinned in the extent
    map tree until after the on-disk btree is properly set up with the new
    extent pointers.

    Signed-off-by: Chris Mason

    Chris Mason
     
  • There are two main users of the extent_map tree. The
    first is regular file inodes, where it is evenly spread
    between readers and writers.

    The second is the chunk allocation tree, which maps blocks from
    logical addresses to physical ones, and it is 99.99% reads.

    The mapping tree is a point of lock contention during heavy IO
    workloads, so this commit switches things to a rw lock.

    Signed-off-by: Chris Mason

    Chris Mason
     

10 Nov, 2008

1 commit

  • The decompress code doesn't take the logical offset in the extent
    pointer into account. If the logical offset isn't zero, data
    will be decompressed into the wrong pages.

    The solution used here is to record the starting offset of the extent
    in the file separately from the logical start of the extent_map struct.
    This allows us to avoid problems inserting overlapping extents.

    Signed-off-by: Yan Zheng

    Yan Zheng
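
A minimal sketch of the idea in the commit above, assuming a simplified mapping structure: `struct em_sketch`, its field names and `decompressed_offset` are hypothetical and only illustrate keeping the extent's original file offset apart from the start of the mapped range.

```c
#include <stdint.h>

/* Hypothetical, simplified extent mapping. */
struct em_sketch {
    uint64_t start;      /* file offset where this mapping begins */
    uint64_t len;        /* bytes covered by this mapping */
    uint64_t orig_start; /* file offset where the whole extent's data begins */
};

/*
 * Byte position inside the decompressed extent data that corresponds to
 * file offset file_pos. Subtracting em->start instead would ignore the
 * extent's logical offset and place data in the wrong pages.
 */
uint64_t decompressed_offset(const struct em_sketch *em, uint64_t file_pos)
{
    return file_pos - em->orig_start;
}
```

For a mapping that starts at file offset 8192 but belongs to an extent whose data begins at file offset 4096, the first mapped byte sits 4096 bytes into the decompressed data rather than at position 0.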
     

31 Oct, 2008

2 commits

  • This patch updates btrfs-progs for fallocate support.

    fallocate is a little different in Btrfs because we need to tell the
    COW system that a given preallocated extent doesn't need to be
    cow'd as long as there are no snapshots of it. This leverages the
    -o nodatacow checks.

    Signed-off-by: Yan Zheng

    Yan Zheng
     
  • This patch splits the hole insertion code out of btrfs_setattr
    into btrfs_cont_expand and updates btrfs_get_extent to properly
    handle the case where file extent items are not contiguous.

    Signed-off-by: Yan Zheng

    Yan Zheng
     

30 Oct, 2008

1 commit

  • This is a large change for adding compression on reading and writing,
    both for inline and regular extents. It does some fairly large
    surgery to the writeback paths.

    Compression is off by default and enabled by mount -o compress. Even
    when the -o compress mount option is not used, it is possible to read
    compressed extents off the disk.

    If compression for a given set of pages fails to make them smaller, the
    file is flagged to avoid future compression attempts later.

    * While finding delalloc extents, the pages are locked before being sent down
    to the delalloc handler. This allows the delalloc handler to do complex things
    such as cleaning the pages, marking them writeback and starting IO on their
    behalf.

    * Inline extents are inserted at delalloc time now. This allows us to compress
    the data before inserting the inline extent, and it allows us to insert
    an inline extent that spans multiple pages.

    * All of the in-memory extent representations (extent_map.c, ordered-data.c etc)
    are changed to record both an in-memory size and an on disk size, as well
    as a flag for compression.

    From a disk format point of view, the extent pointers in the file are changed
    to record the on disk size of a given extent and some encoding flags.
    Space in the disk format is allocated for compression encoding, as well
    as encryption and a generic 'other' field. Neither the encryption nor the
    'other' field is currently used.

    In order to limit the amount of data read for a single random read in the
    file, the size of a compressed extent is limited to 128k. This is a
    software-only limit; the disk format supports u64 sized compressed extents.

    In order to limit the ram consumed while processing extents, the uncompressed
    size of a compressed extent is limited to 256k. This is a software-only limit
    and will be subject to tuning later.

    Checksumming is still done on compressed extents, and it is done on the
    uncompressed version of the data. This way additional encodings can be
    layered on without having to figure out which encoding to checksum.

    Compression happens at delalloc time, which is basically single threaded because
    it is usually done by a single pdflush thread. This makes it tricky to
    spread the compression load across all the cpus on the box. We'll have to
    look at parallel pdflush walks of dirty inodes at a later time.

    Decompression is hooked into readpages and it does spread across CPUs nicely.

    Signed-off-by: Chris Mason

    Chris Mason
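
The size limits and the "keep it only if it got smaller" rule from the commit above can be sketched as follows. This is an illustration, not the kernel code: the macro and function names are hypothetical, and rejecting a result that exceeds the 128k compressed cap is an assumption about how the limit is applied.

```c
#include <stdbool.h>
#include <stdint.h>

/* Software-only limits named in the commit message. */
#define MAX_COMPRESSED   (128 * 1024) /* cap on the compressed (on-disk) size */
#define MAX_UNCOMPRESSED (256 * 1024) /* cap on the uncompressed (in-memory) size */

/* Hypothetical record holding both sizes plus a compression flag. */
struct extent_sizes {
    uint64_t ram_bytes;  /* in-memory, uncompressed size */
    uint64_t disk_bytes; /* on-disk size */
    bool compressed;
};

/* How many bytes of a dirty range to feed into one compressed extent. */
uint64_t next_chunk_len(uint64_t remaining)
{
    return remaining < MAX_UNCOMPRESSED ? remaining : MAX_UNCOMPRESSED;
}

/*
 * Keep the compressed copy only if it is actually smaller than the input
 * and (assumption) fits under the compressed-size cap.
 */
bool keep_compressed(uint64_t raw_len, uint64_t compressed_len)
{
    return compressed_len < raw_len && compressed_len <= MAX_COMPRESSED;
}

/* Fill in the size record for one chunk after a compression attempt. */
struct extent_sizes describe_extent(uint64_t raw_len, uint64_t compressed_len)
{
    struct extent_sizes s;

    s.ram_bytes = raw_len;
    s.compressed = keep_compressed(raw_len, compressed_len);
    s.disk_bytes = s.compressed ? compressed_len : raw_len;
    return s;
}
```

A caller that sees `compressed == false` for a chunk could then flag the file to skip later compression attempts, as the commit message describes.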
     

25 Sep, 2008

24 commits


11 Sep, 2007

2 commits

  • XFS updates the ondisk inode size only after the data I/O has finished,
    so it needs a hook when the writepage end_bio handler has finished.

    Might not be worth applying as-is as the per-page callback is very
    inefficient. What XFS really wants is a callback when writeout of a
    whole extent has completed. This delayed i_size update scheme might
    be worthwhile for btrfs as well, btw.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Chris Mason

    Christoph Hellwig
     
  • generic_bmap is completely trivial, while the extent to bh mapping in
    btrfs is rather complex. So provide an extent_bmap instead that takes
    a get_extent callback and can be used by filesystems using the extent_map
    code.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Chris Mason

    Christoph Hellwig
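
The callback-driven bmap described in the commit above can be sketched like this. The names `bmap_sketch`, `get_extent_fn`, `struct extent_sketch` and `demo_get_extent` are hypothetical and the signatures do not match the kernel's; the point is only that the block lookup delegates to a filesystem-supplied extent callback. Returning 0 for an unmapped block follows the usual bmap convention.

```c
#include <stdint.h>

/* Hypothetical, simplified extent: a file byte range and its disk byte address. */
struct extent_sketch {
    uint64_t start;       /* file offset in bytes */
    uint64_t len;         /* length in bytes */
    uint64_t block_start; /* disk address in bytes */
};

/* Filesystem-supplied lookup: fills *out and returns 0, or nonzero for a hole. */
typedef int (*get_extent_fn)(void *inode, uint64_t file_off,
                             struct extent_sketch *out);

/* Map a file block number to a disk block number; 0 means "not mapped". */
uint64_t bmap_sketch(void *inode, uint64_t block, unsigned int blkbits,
                     get_extent_fn get_extent)
{
    struct extent_sketch em;
    uint64_t file_off = block << blkbits;

    if (get_extent(inode, file_off, &em) != 0)
        return 0;
    return (em.block_start + (file_off - em.start)) >> blkbits;
}

/* Demo callback: treats the "inode" as a single extent and nothing else. */
int demo_get_extent(void *inode, uint64_t file_off, struct extent_sketch *out)
{
    const struct extent_sketch *e = inode;

    if (file_off < e->start || file_off - e->start >= e->len)
        return -1;
    *out = *e;
    return 0;
}
```

Because the lookup logic lives behind the callback, any filesystem that can describe its extents this way could reuse the same bmap helper.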
     

30 Aug, 2007

1 commit


28 Aug, 2007

2 commits