24 Dec, 2009

1 commit


08 Dec, 2009

1 commit


04 Dec, 2009

1 commit

  • That is "success", "unknown", "through", "performance", "[re|un]mapping",
    "access", "default", "reasonable", "[con]currently", "temperature",
    "channel", "[un]used", "application", "example", "hierarchy", "therefore",
    "[over|under]flow", "contiguous", "threshold", "enough" and others.

    Signed-off-by: André Goddard Rosa
    Signed-off-by: Jiri Kosina

    André Goddard Rosa
     

03 Dec, 2009

2 commits

  • A sparse check finds some endianness problems and some other minor
    issues. There is also an obsolete function which should be removed.
    This patch resolves all of these.

    Signed-off-by: Tao Ma
    Signed-off-by: Joel Becker

    Tao Ma
     
  • ocfs2 refcount tree is stored as an extent tree while
    the leaf ocfs2_refcount_rec points to a refcount block.

    The following steps can trigger a kernel panic.
    mkfs.ocfs2 -b 512 -C 1M --fs-features=refcount $DEVICE
    mount -t ocfs2 $DEVICE $MNT_DIR
    FILE_NAME=$RANDOM
    FILE_NAME_1=$RANDOM
    FILE_REF="${FILE_NAME}_ref"
    FILE_REF_1="${FILE_NAME}_ref_1"
    for((i=0;i> $MNT_DIR/$FILE_NAME
    cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME_1
    done
    for((i=0;i> $MNT_DIR/$FILE_NAME
    done

    for((i=0;i> $MNT_DIR/$FILE_NAME
    cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME_1
    done

    cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME

    for((i=0;i> $MNT_DIR/$FILE_NAME
    cat /mnt/1048576 >> $MNT_DIR/$FILE_NAME_1
    done
    reflink $MNT_DIR/$FILE_NAME $MNT_DIR/$FILE_REF
    # write_f is a program which will write some bytes to a file at offset.
    # write_f -f file_name -l offset -w write_bytes.
    ./write_f -f $MNT_DIR/$FILE_REF -l $[310*1048576] -w 4096
    ./write_f -f $MNT_DIR/$FILE_REF -l $[306*1048576] -w 4096
    ./write_f -f $MNT_DIR/$FILE_REF -l $[311*1048576] -w 4096
    ./write_f -f $MNT_DIR/$FILE_NAME -l $[310*1048576] -w 4096
    ./write_f -f $MNT_DIR/$FILE_NAME -l $[311*1048576] -w 4096
    reflink $MNT_DIR/$FILE_NAME $MNT_DIR/$FILE_REF_1
    ./write_f -f $MNT_DIR/$FILE_NAME -l $[311*1048576] -w 4096
    #kernel panic here.

    The reason is that if the ocfs2_extent_rec is the last record
    in a leaf extent block, the old solution fails to find a
    suitable end cpos. This patch therefore walks through the b-tree,
    finds the next sub-root and gets the cpos at which the next
    sub-tree starts.

    By the way, I have run Tristan's test case against the patched kernel
    for several days and this type of kernel panic has not happened again.

    Signed-off-by: Tao Ma
    Signed-off-by: Joel Becker

    Tao Ma
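
The b-tree walk described in the commit above can be pictured with a toy tree. This is a minimal Python sketch, not the kernel code: the `Node` class and `next_subtree_cpos` are hypothetical names, and the tree stores only start cpos values. On the way down, every level that has a right sibling tightens the bound, so the deepest such sibling gives the cpos where the next sub-tree starts.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    # Start cpos of each record. Interior nodes also carry one child per
    # record in `children`; leaves have no children.
    cpos: List[int]
    children: List["Node"] = field(default_factory=list)


def next_subtree_cpos(root: Node, target: int) -> Optional[int]:
    """Return the cpos at which the sub-tree following the leaf that covers
    `target` starts, or None if that leaf is the rightmost one."""
    node = root
    next_start = None
    while node.children:
        # Pick the last record whose start cpos is <= target.
        idx = 0
        for i, c in enumerate(node.cpos):
            if c <= target:
                idx = i
        # A right sibling at this level bounds everything below node[idx].
        if idx + 1 < len(node.cpos):
            next_start = node.cpos[idx + 1]
        node = node.children[idx]
    return next_start
```

When the record is the last one in its leaf, the returned value plays the role of the end cpos; a result of None means the leaf is the rightmost one and no following sub-tree exists.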
     

29 Oct, 2009

2 commits


23 Sep, 2009

22 commits

  • The ioctl takes 3 parameters (old_path, new_path and preserve)
    and calls vfs_reflink. It is useful when we backport
    reflink features to old kernels.

    Signed-off-by: Tao Ma

    Tao Ma
     
  • Implement ocfs2_reflink.

    Signed-off-by: Tao Ma

    Tao Ma
     
  • reflink has 2 options for the destination file:
    1. snapshot: reflink will attempt to preserve ownership, permissions,
    and all other security state in order to create a full snapshot.
    2. new file: it will acquire the data extent sharing but will see the
    file's security state and attributes initialized as a new file.

    So add the option to ocfs2.

    Signed-off-by: Tao Ma

    Tao Ma
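
The two destination modes in the commit above can be sketched as a single function with a `preserve` flag. This is an illustrative Python model only: `SecurityState`, its fields and `reflink_security_state` are hypothetical names, and the real security state covers more than the three fields shown.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SecurityState:
    uid: int
    gid: int
    mode: int


def reflink_security_state(src: SecurityState, caller_uid: int,
                           caller_gid: int, default_mode: int,
                           preserve: bool) -> SecurityState:
    """Security state of the reflink destination (data extents are shared
    in both modes; only the attributes differ)."""
    if preserve:
        # Snapshot: carry over ownership and permissions from the source.
        return replace(src)
    # New file: initialise the state as for a freshly created file.
    return SecurityState(caller_uid, caller_gid, default_mode)
```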
     
  • The whole reflink touches the refcount tree 2 times:
    1. It adds the clusters in the extent record to the tree if they
    weren't refcounted before.
    2. It adds 1 refcount to these clusters when it adds these
    extent records to the tree.

    So we shouldn't merge in the 1st operation, since the 2nd one will
    soon be called and we may have to split the record again. Merging
    first and splitting soon afterwards is a waste of time, so we only
    merge in the 2nd round. This is done by adding a new internal
    __ocfs2_increase_refcount and calling it with "not-merge" for the
    1st refcount operation in reflink.

    A side effect is that we don't need to worry too much about
    the metadata allocation in the 2nd round, since it will only merge
    and no split will happen for those records.

    Signed-off-by: Tao Ma

    Tao Ma
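
The cost of merging in the 1st round can be seen in a small model. The sketch below is Python, not the kernel implementation: records are `[cpos, clusters, refcount]` lists, all function names are hypothetical, and it assumes each range is either new to the tree or lies inside a single record. It counts how many splits the two rounds need.

```python
from typing import List, Tuple


def merge_records(recs: List[list]) -> None:
    """Coalesce adjacent records that are contiguous and share a refcount."""
    i = 0
    while i + 1 < len(recs):
        a, b = recs[i], recs[i + 1]
        if a[0] + a[1] == b[0] and a[2] == b[2]:
            a[1] += b[1]
            del recs[i + 1]
        else:
            i += 1


def increase_refcount(recs: List[list], cpos: int, length: int,
                      merge: bool) -> int:
    """Add one reference to [cpos, cpos + length); return the split count."""
    splits = 0
    end = cpos + length
    for i, (r_cpos, r_len, r_cnt) in enumerate(recs):
        r_end = r_cpos + r_len
        if r_cpos <= cpos and end <= r_end:
            parts = []
            if r_cpos < cpos:
                parts.append([r_cpos, cpos - r_cpos, r_cnt])
            parts.append([cpos, length, r_cnt + 1])
            if end < r_end:
                parts.append([end, r_end - end, r_cnt])
            splits = len(parts) - 1
            recs[i:i + 1] = parts
            break
    else:
        # Not refcounted before: create a record with refcount 1.
        recs.append([cpos, length, 1])
        recs.sort()
    if merge:
        merge_records(recs)
    return splits


def reflink_rounds(extents: List[Tuple[int, int]],
                   merge_first_round: bool) -> Tuple[List[list], int]:
    recs: List[list] = []
    splits = 0
    for cpos, length in extents:   # round 1: attach the extents
        splits += increase_refcount(recs, cpos, length, merge_first_round)
    for cpos, length in extents:   # round 2: add the new file's reference
        splits += increase_refcount(recs, cpos, length, True)
    return recs, splits
```

For two physically contiguous extents `(0, 8)` and `(8, 8)`, merging in round 1 produces one record that round 2 has to split again, while the "not-merge" variant reaches the same final tree without any split.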
     
  • Signed-off-by: Tao Ma

    Tao Ma
     
  • Now with xattr refcount support, we need to check whether
    we have xattr refcounted before we remove the refcount tree.

    The mechanism is now:
    1) Check whether i_clusters == 0. If not, exit.
    2) Check whether we have i_xattr_loc in the dinode. If yes, exit.
    3) Check whether we have inline xattr stored outside. If yes, exit.
    4) Remove the tree.

    Signed-off-by: Tao Ma

    Tao Ma
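
The four checks in the commit above amount to a chain of early exits. A minimal Python sketch follows; the function and parameter names are hypothetical, and it assumes that an `i_xattr_loc` of 0 means the dinode has no xattr block.

```python
def should_remove_refcount_tree(i_clusters: int, i_xattr_loc: int,
                                inline_xattr_stored_outside: bool) -> bool:
    """Return True only when nothing can still reference the refcount tree."""
    if i_clusters != 0:
        return False  # step 1: the file still has data clusters
    if i_xattr_loc != 0:
        return False  # step 2: an xattr block exists (assumed: 0 == none)
    if inline_xattr_stored_outside:
        return False  # step 3: an inline xattr keeps its value outside
    return True       # step 4: safe to remove the tree
```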
     
  • In ocfs2, when an xattr's value is larger than OCFS2_XATTR_INLINE_SIZE,
    it is kept outside of the blocks in which we store the xattr entry,
    and those clusters are also stored in a b-tree. This patch attaches
    all these clusters to the refcount tree as well.

    Signed-off-by: Tao Ma

    Tao Ma
     
  • In order to make the 2 transactions (xattr and CoW) independent of each
    other, we CoW the whole xattr out when we are setting it.

    Signed-off-by: Tao Ma

    Tao Ma
     
  • We currently use the pagecache to duplicate clusters in CoW,
    but that isn't suitable for the xattr case. So abstract it out
    so that the caller can decide which method to use.

    Signed-off-by: Tao Ma

    Tao Ma
     
  • A reflink creates a snapshot of a file, which means the attributes
    must be identical with three exceptions: nlink, ino, and ctime.

    As for time changes, here is a brief description:

    1. Source file:
       1) atime: ignore. Let the lazy atime code handle that.
       2) mtime: don't touch.
       3) ctime: if we change the tree (adding REFCOUNTED to at least one
          extent), update it.
    2. Destination file:
       1) atime: ignore.
       2) mtime: we want it to appear identical to the source.
       3) ctime: update.

    The idea here is that an ls -l will show the same time for the
    src and target, since it shows mtime. Backup software like rsync and tar
    will treat the new file correctly too.

    Signed-off-by: Tao Ma

    Tao Ma
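
The timestamp rules in the commit above can be written down as one small function. This Python sketch uses a hypothetical `Times` record and plain integers for timestamps; atime is deliberately left untouched on both files.

```python
from dataclasses import dataclass


@dataclass
class Times:
    atime: int
    mtime: int
    ctime: int


def apply_reflink_times(src: Times, dst: Times, now: int,
                        tree_changed: bool) -> None:
    """Update source and destination timestamps after a reflink."""
    # Source: atime and mtime stay as they are; ctime changes only if
    # at least one extent was newly marked REFCOUNTED.
    if tree_changed:
        src.ctime = now
    # Destination: mtime mirrors the source, ctime is always updated.
    dst.mtime = src.mtime
    dst.ctime = now
```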
     
  • 2 major functions are added in this patch.

    ocfs2_attach_refcount_tree will create a new refcount tree for the
    old file if it doesn't have one and insert all the extent records
    into the tree if they are not refcounted.

    ocfs2_create_reflink_node will:
    1. set the refcount tree on the new file.
    2. call ocfs2_duplicate_extent_list, which will iterate over all the
    extents of the old file, insert them into the new file and increase
    the corresponding reference count.

    Signed-off-by: Tao Ma

    Tao Ma
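
The division of labour between the two functions in the commit above can be modelled in a few lines. This is a toy Python sketch under simplifying assumptions: the refcount tree is a dict keyed by `(p_cpos, clusters)`, and the `Extent` and `Inode` classes are hypothetical stand-ins for the on-disk structures.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Extent:
    v_cpos: int      # logical cluster offset in the file
    p_cpos: int      # physical cluster offset on disk
    clusters: int
    refcounted: bool = False


@dataclass
class Inode:
    extents: List[Extent] = field(default_factory=list)
    tree: Optional[Dict[Tuple[int, int], int]] = None


def attach_refcount_tree(old: Inode) -> None:
    """Give the old file a tree if needed and record unrefcounted extents."""
    if old.tree is None:
        old.tree = {}
    for ext in old.extents:
        if not ext.refcounted:
            old.tree[(ext.p_cpos, ext.clusters)] = 1
            ext.refcounted = True


def create_reflink_node(old: Inode, new: Inode) -> None:
    """Share the tree with the new file and duplicate the extent list."""
    new.tree = old.tree
    for ext in old.extents:
        new.extents.append(Extent(ext.v_cpos, ext.p_cpos, ext.clusters, True))
        old.tree[(ext.p_cpos, ext.clusters)] += 1
```

After both calls, the two files point at the same physical clusters and every shared extent carries a reference count of 2.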
     
  • When we truncate a file to a specific size which resides in a reflinked
    cluster, we need to CoW it, since ocfs2_zero_range_for_truncate will
    zero the space after the size (just another type of write).

    So we add a "max_cpos" in ocfs2_refcount_cow so that it will stop when
    it hits the max cluster offset.

    Signed-off-by: Tao Ma

    Tao Ma
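
One way to picture the truncate case in the commit above is a helper that decides which cluster needs CoW and where CoW must stop. The Python sketch below is illustrative only: `truncate_cow_range` is a hypothetical name, extents are `(v_cpos, clusters, refcounted)` tuples, and bounding the CoW to the single cluster holding the new size (`max_cpos = cpos + 1`) is an assumption of this sketch, as is skipping sizes that fall exactly on a cluster boundary.

```python
from typing import List, Optional, Tuple


def truncate_cow_range(new_size: int, cluster_size: int,
                       extents: List[Tuple[int, int, bool]]
                       ) -> Optional[Tuple[int, int]]:
    """Return (cpos, max_cpos) for the cluster to CoW, or None."""
    if new_size % cluster_size == 0:
        # Nothing is left to zero inside a partially kept cluster.
        return None
    cpos = new_size // cluster_size  # cluster that holds the new EOF
    for v_cpos, clusters, refcounted in extents:
        if v_cpos <= cpos < v_cpos + clusters:
            # Only a shared cluster has to be copied before zeroing.
            return (cpos, cpos + 1) if refcounted else None
    return None
```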
     
  • During CoW, if the old extent record is refcounted, we allocate
    some new clusters and do CoW. We can improve on this: if the old
    extent has refcount=1, it is now only used by this file, so we
    don't need to allocate new clusters; we just remove the refcounted
    flag. We also have to remove the record from the refcount tree
    without freeing the clusters.

    Signed-off-by: Tao Ma

    Tao Ma
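
The refcount=1 shortcut in the commit above can be sketched as follows. This is a Python illustration with hypothetical names: the tree is a dict from physical cpos to reference count, the extent is a plain dict, and `allocate` is a caller-supplied stand-in for the cluster allocator.

```python
from typing import Callable, Dict


def make_extent_writable(tree: Dict[int, int], ext: dict,
                         allocate: Callable[[int], int]) -> bool:
    """Prepare a refcounted extent for writing; return True if data
    had to be copied to newly allocated clusters."""
    count = tree[ext["p_cpos"]]
    if count == 1:
        # Sole owner: drop the record but keep the clusters in place.
        del tree[ext["p_cpos"]]
        ext["refcounted"] = False
        return False
    # Shared: release one reference and move to new clusters.
    tree[ext["p_cpos"]] = count - 1
    ext["p_cpos"] = allocate(ext["clusters"])
    ext["refcounted"] = False
    return True
```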
     
  • This patch adds CoW support for a refcounted record.

    The whole process will be:
    1. Calculate how many clusters we need to CoW and where we start.
    Extents that are not completely encompassed by the write will
    be broken on 1MB boundaries.
    2. Do CoW for the clusters with the help of the page cache.
    3. Change the b-tree structure with the newly allocated clusters.

    Signed-off-by: Tao Ma

    Tao Ma
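
Step 1 of the commit above can be approximated by widening the write to 1MB boundaries and clamping the result to the extent. The Python sketch below is a simplification, not the kernel's calculation: `cow_range` is a hypothetical name, `clusters_per_mb` is derived by the caller from the cluster size, and the write is assumed to overlap the extent.

```python
from typing import Tuple


def cow_range(ext_start: int, ext_len: int, w_start: int, w_len: int,
              clusters_per_mb: int) -> Tuple[int, int]:
    """Return (start, length) in clusters of the region to CoW."""
    ext_end = ext_start + ext_len
    w_end = w_start + w_len
    c = clusters_per_mb
    # Round the write outwards to 1MB boundaries ...
    aligned_start = (w_start // c) * c
    aligned_end = -(-w_end // c) * c  # ceiling division
    # ... but never beyond the extent itself.
    start = max(ext_start, aligned_start)
    end = min(ext_end, aligned_end)
    return start, end - start
```

With 4KB clusters (256 clusters per 1MB), a 10-cluster write at cluster 300 inside a 1024-cluster extent yields the 1MB-aligned region starting at cluster 256; a write that covers the whole extent yields exactly the extent.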
     
  • Add 'Decrement refcount for delete' into the normal truncate
    process. So for a refcounted extent record, call the refcount rec
    decrement instead of cluster free.

    Signed-off-by: Tao Ma

    Tao Ma
     
  • Add function ocfs2_mark_extent_refcounted which can mark
    an extent refcounted.

    Signed-off-by: Tao Ma

    Tao Ma
     
  • Given a physical cpos and length, decrement the refcount
    in the tree. If the refcount for any portion of the extent goes
    to zero, that portion is queued for freeing.

    Signed-off-by: Tao Ma

    Tao Ma
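
The decrement described in the commit above can be sketched over a flat list of records. This is a Python model with hypothetical names, not the kernel code: records are `(cpos, clusters, refcount)` tuples sorted by cpos, and any portion whose count would reach zero is returned in a separate list instead of being kept.

```python
from typing import List, Tuple

Rec = Tuple[int, int, int]  # (cpos, clusters, refcount)


def decrease_refcount(recs: List[Rec], cpos: int, length: int
                      ) -> Tuple[List[Rec], List[Tuple[int, int]]]:
    """Drop one reference from [cpos, cpos + length).

    Returns (remaining records, ranges queued for freeing)."""
    end = cpos + length
    kept: List[Rec] = []
    to_free: List[Tuple[int, int]] = []
    for r_cpos, r_len, r_cnt in recs:
        r_end = r_cpos + r_len
        lo, hi = max(r_cpos, cpos), min(r_end, end)
        if lo >= hi:
            kept.append((r_cpos, r_len, r_cnt))  # no overlap
            continue
        if r_cpos < lo:
            kept.append((r_cpos, lo - r_cpos, r_cnt))  # untouched head
        if r_cnt > 1:
            kept.append((lo, hi - lo, r_cnt - 1))
        else:
            to_free.append((lo, hi - lo))  # count reached zero
        if hi < r_end:
            kept.append((hi, r_end - hi, r_cnt))  # untouched tail
    return kept, to_free
```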
     
  • Given a physical cpos and length, increment the refcount
    in the tree. If the extent has not been seen before, a refcount
    record is created for it. Refcount records may be merged or
    split by this operation.

    Signed-off-by: Tao Ma

    Tao Ma
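
The increment in the commit above, including record creation, splitting and merging, can be sketched by cutting the cluster space at every record boundary. This Python model is an illustration with hypothetical names; unlike the kernel it rebuilds the whole record list, and it also merges neighbouring records that were already mergeable before the call.

```python
from typing import List, Tuple

Rec = Tuple[int, int, int]  # (cpos, clusters, refcount)


def increase_refcount(recs: List[Rec], cpos: int, length: int) -> List[Rec]:
    """Add one reference to [cpos, cpos + length) and return new records."""
    end = cpos + length
    points = {cpos, end}
    for r_cpos, r_len, _ in recs:
        points.update((r_cpos, r_cpos + r_len))
    cuts = sorted(points)

    pieces: List[Rec] = []
    for a, b in zip(cuts, cuts[1:]):
        # Count before the operation; 0 if the piece was never seen.
        old = next((cnt for c, l, cnt in recs if c <= a < c + l), 0)
        new = old + 1 if cpos <= a < end else old
        if new:
            pieces.append((a, b - a, new))

    merged: List[Rec] = []
    for c, l, cnt in pieces:
        if merged and merged[-1][0] + merged[-1][1] == c \
                and merged[-1][2] == cnt:
            pc, pl, _ = merged.pop()
            merged.append((pc, pl + l, cnt))
        else:
            merged.append((c, l, cnt))
    return merged
```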
     
  • Add basic refcount tree root operation.

    Signed-off-by: Tao Ma

    Tao Ma
     
  • Implement locking around struct ocfs2_refcount_tree. This protects
    all read/write operations on refcount trees. ocfs2_refcount_tree
    has its own lock and its own caching_info, protecting buffers among
    multiple nodes.

    Callers must call ocfs2_lock_refcount_tree before operating on
    the tree and unlock it afterwards.

    ocfs2_refcount_trees are referenced by the block number of the
    refcount tree root block, so we create an rb-tree on the ocfs2_super
    to look them up.

    Signed-off-by: Tao Ma

    Tao Ma
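
The lookup-then-lock discipline in the commit above can be modelled in user space. In this Python sketch a dict stands in for the rb-tree on the super block and a `threading.Lock` stands in for the cluster-wide lock; `RefcountTree`, `Super` and `locked_tree` are hypothetical names and nothing here reflects the DLM behaviour.

```python
import threading
from contextlib import contextmanager
from typing import Dict, Iterator


class RefcountTree:
    def __init__(self, root_blkno: int) -> None:
        self.root_blkno = root_blkno
        self.lock = threading.Lock()       # stand-in for the tree's own lock
        self.cache: Dict[int, bytes] = {}  # stand-in for its caching info


class Super:
    def __init__(self) -> None:
        # Stand-in for the rb-tree keyed by root block number.
        self._trees: Dict[int, RefcountTree] = {}
        self._trees_lock = threading.Lock()

    def get_tree(self, root_blkno: int) -> RefcountTree:
        """Look up the tree for a root block, creating it on first use."""
        with self._trees_lock:
            tree = self._trees.get(root_blkno)
            if tree is None:
                tree = self._trees[root_blkno] = RefcountTree(root_blkno)
            return tree

    @contextmanager
    def locked_tree(self, root_blkno: int) -> Iterator[RefcountTree]:
        """Lock before operating on the tree, unlock afterwards."""
        tree = self.get_tree(root_blkno)
        with tree.lock:
            yield tree
```

A caller would wrap every read or write of the tree in `with sb.locked_tree(blkno) as tree:` so the unlock cannot be forgotten.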
     
  • The refcount tree should use its own caching info so that when
    we downconvert the refcount tree lock, we can drop all the
    cached buffer heads.

    Signed-off-by: Tao Ma

    Tao Ma
     
  • Signed-off-by: Tao Ma

    Tao Ma