06 Jan, 2009

8 commits

  • Use the db_check field of ocfs2_dir_block_trailer to crc/ecc the
    dirblocks.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     
  • Future ocfs2 features metaecc and indexed directories need to store a
    little bit of data in each dirblock. For compatibility, we place this
    in a trailer at the end of the dirblock. The trailer presents itself as
    an empty dirent, so that if the features are turned off, it can be reused
    without requiring a tunefs scan.

    This code adds the trailer and validates it when the block is read in.

    [ Mark is the original author, but I reinserted this code before his
    dir index work. -- Joel ]

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Mark Fasheh
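
A minimal user-space sketch of the "trailer that looks like an empty dirent" idea described above. The struct layouts, the 512-byte block size and every `sketch_` name are assumptions made for illustration; they are not the real ocfs2 on-disk format. The point is only that a walker which knows nothing about trailers sees one more unused record and still ends exactly at the block boundary.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_BLOCK_SIZE 512u

/* Hypothetical, simplified dirent header (not the real ocfs2 layout). */
struct sketch_dirent {
    uint64_t inode;      /* 0 means "unused entry" */
    uint16_t rec_len;    /* distance to the next entry */
    uint8_t  name_len;
    uint8_t  file_type;
};

/* Hypothetical trailer: its first fields mirror a dirent header. */
struct sketch_trailer {
    uint64_t compat_inode;    /* always 0, so it reads as unused */
    uint16_t compat_rec_len;  /* sizeof(struct sketch_trailer) */
    uint8_t  compat_name_len; /* always 0 */
    uint8_t  reserved;
    uint8_t  signature[8];
    uint64_t check;           /* room for crc/ecc data */
};

/* Format an empty block: one free dirent followed by the trailer. */
static void sketch_init_block(unsigned char *block)
{
    struct sketch_dirent de;
    struct sketch_trailer tr;

    memset(block, 0, SKETCH_BLOCK_SIZE);
    memset(&de, 0, sizeof(de));
    memset(&tr, 0, sizeof(tr));

    de.rec_len = (uint16_t)(SKETCH_BLOCK_SIZE - sizeof(tr));
    memcpy(block, &de, sizeof(de));

    tr.compat_rec_len = (uint16_t)sizeof(tr);
    memcpy(tr.signature, "DIRTRL1", 8);
    memcpy(block + SKETCH_BLOCK_SIZE - sizeof(tr), &tr, sizeof(tr));
}

/* Count in-use entries the way a trailer-unaware walker would. */
static int sketch_count_used(const unsigned char *block)
{
    size_t off = 0;
    int used = 0;

    while (off + sizeof(struct sketch_dirent) <= SKETCH_BLOCK_SIZE) {
        struct sketch_dirent de;

        memcpy(&de, block + off, sizeof(de));
        if (de.rec_len < sizeof(de))
            return -1;               /* corrupt record length */
        if (de.inode != 0)
            used++;
        off += de.rec_len;
    }
    return off == SKETCH_BLOCK_SIZE ? used : -1;
}

/* Validate the trailer when the block is read in. */
static int sketch_trailer_valid(const unsigned char *block)
{
    struct sketch_trailer tr;

    memcpy(&tr, block + SKETCH_BLOCK_SIZE - sizeof(tr), sizeof(tr));
    return tr.compat_inode == 0 &&
           tr.compat_rec_len == sizeof(tr) &&
           memcmp(tr.signature, "DIRTRL1", 8) == 0;
}
```

Because the walker only relies on `inode` and `rec_len`, the trailer space can be handed back to ordinary dirents later without rewriting the rest of the block.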
     
  • The per-metadata-type ocfs2_journal_access_*() functions hook up jbd2
    commit triggers and allow us to compute metadata ecc right before the
    buffers are written out. This commit provides ecc for inodes, extent
    blocks, group descriptors, and quota blocks. It is not safe to use
    extended attributes and metaecc at the same time yet.

    The ocfs2_extent_tree and ocfs2_path abstractions in alloc.c both hide
    the type of block at their root. Before, it didn't matter, but now the
    root block must use the appropriate ocfs2_journal_access_*() function.
    To keep this abstract, the structures now have a pointer to the matching
    journal_access function and a wrapper call to call it.

    A few places use naked ocfs2_write_block() calls instead of adding the
    blocks to the journal. We make sure to calculate their checksum and ecc
    before the write.

    Since we pass around the journal_access functions, let's typedef them
    in ocfs2.h.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
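
The "pointer to the matching journal_access function plus a wrapper" arrangement can be sketched roughly as follows. This is a user-space illustration only: the buffer type, the simplified one-argument signature and all `sketch_` names are assumptions, not the kernel's definitions.

```c
#include <stddef.h>

enum sketch_block_type { SKETCH_INODE, SKETCH_EXTENT_BLOCK };

/* Hypothetical stand-in for a buffer head. */
struct sketch_buffer {
    enum sketch_block_type type;
    int journaled;
};

/* One typedef shared by every per-metadata-type access function. */
typedef int (*sketch_journal_access_func)(struct sketch_buffer *bh);

/* Per-type access functions: each refuses a buffer of the wrong type. */
static int sketch_journal_access_di(struct sketch_buffer *bh)
{
    if (bh->type != SKETCH_INODE)
        return -1;
    bh->journaled = 1;
    return 0;
}

static int sketch_journal_access_eb(struct sketch_buffer *bh)
{
    if (bh->type != SKETCH_EXTENT_BLOCK)
        return -1;
    bh->journaled = 1;
    return 0;
}

/* The tree hides what kind of block sits at its root. */
struct sketch_extent_tree {
    struct sketch_buffer *root_bh;
    sketch_journal_access_func root_journal_access;
};

/* Wrapper: generic tree code never names the root block type. */
static int sketch_et_root_journal_access(struct sketch_extent_tree *et)
{
    return et->root_journal_access(et->root_bh);
}
```

Generic code only ever calls the wrapper, so adding a new root type means supplying one more access function when the tree is set up.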
     
  • Add quota calls for allocation and freeing of inodes and space, and update
    the estimates of the number of credits needed for a transaction. Move inode
    allocation out of ocfs2_mknod_locked() because vfs_dq_init() must be called
    outside of a transaction.

    Signed-off-by: Jan Kara
    Signed-off-by: Mark Fasheh

    Jan Kara
     
  • Now that we've centralized the ocfs2_read_virt_blocks() code, let's use
    it in ocfs2_read_dir_block().

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     
  • Add an optional validation hook to ocfs2_read_blocks(). Now the
    validation function is only called when a block was actually read off of
    disk. It is not called when the buffer was in cache.

    We add a buffer state bit BH_NeedsValidate to flag these buffers. It
    must always be one higher than the last JBD2 buffer state bit.

    The dinode, dirblock, extent_block, and xattr_block validators are
    lifted to this scheme directly. The group_descriptor validator needs to
    be split into two pieces. The first part only needs the gd buffer and
    is passed to ocfs2_read_block(). The second part requires the dinode as
    well, and is called every time. It's only 3 compares, so it's tiny.
    This also allows us to clean up the non-fatal gd check used by resize.c.
    It now has no magic argument.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
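
A small user-space model of the behaviour described above: the validation hook runs only when a block actually comes off "disk", and a per-buffer flag (standing in for BH_NeedsValidate) marks buffers that still need checking. The in-memory disk, the cache array and all `sketch_` names are assumptions for illustration.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_NBLOCKS 4
#define SKETCH_BLKSZ   16

struct sketch_buffer {
    unsigned char data[SKETCH_BLKSZ];
    int uptodate;        /* contents are already cached */
    int needs_validate;  /* stand-in for BH_NeedsValidate */
};

typedef int (*sketch_validate_func)(struct sketch_buffer *bh);

static unsigned char sketch_disk[SKETCH_NBLOCKS][SKETCH_BLKSZ];
static struct sketch_buffer sketch_cache[SKETCH_NBLOCKS];
static int sketch_validate_calls;

/* Read one block; call validate() only if it was read from disk. */
static int sketch_read_block(unsigned int blkno, struct sketch_buffer **out,
                             sketch_validate_func validate)
{
    struct sketch_buffer *bh;

    if (blkno >= SKETCH_NBLOCKS)
        return -1;
    bh = &sketch_cache[blkno];

    if (!bh->uptodate) {
        /* Cache miss: "I/O", then flag the buffer for validation. */
        memcpy(bh->data, sketch_disk[blkno], SKETCH_BLKSZ);
        bh->needs_validate = (validate != NULL);
    }

    if (bh->needs_validate) {
        bh->needs_validate = 0;
        if (validate(bh) != 0)
            return -1;   /* leave it !uptodate so it is re-read later */
    }

    bh->uptodate = 1;
    *out = bh;
    return 0;
}

/* Example validator: checks a made-up 4-byte signature. */
static int sketch_validate_sig(struct sketch_buffer *bh)
{
    sketch_validate_calls++;
    return memcmp(bh->data, "BLK1", 4) == 0 ? 0 : -1;
}
```

A second read of the same good block is served from the cache and does not call the validator again; a block that fails validation is never marked up to date.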
     
  • We have ocfs2_bread() as a vestige of the original ext-based dir code.
    It's only used by directories, though. Turn it into
    ocfs2_read_dir_block(), with a prototype matching the other metadata
    read functions. It's set up to validate dirblocks when the time comes.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     
  • The ocfs2 code currently reads inodes off disk with a simple
    ocfs2_read_block() call. Each place that does this has a different set
    of sanity checks it performs. Some check only the signature. A couple
    validate the block number (the block read vs di->i_blkno). A couple
    others check for VALID_FL. Only one place validates i_fs_generation. A
    couple check nothing. Even when an error is found, they don't all do
    the same thing.

    We wrap inode reading into ocfs2_read_inode_block(). This will validate
    all the above fields, going readonly if they are invalid (they never
    should be). ocfs2_read_inode_block_full() is provided for the places
    that want to pass read_block flags. Every caller is passing a struct
    inode with a valid ip_blkno, so we don't need a separate blkno argument
    either.

    We will remove the validation checks from the rest of the code in a
    later commit, as they are no longer necessary.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
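
The set of checks listed above (signature, block number, VALID_FL, fs generation) could be gathered into one validator along these lines. This is a host-byte-order sketch with a hypothetical, heavily trimmed inode struct; the field names, flag value and error codes are assumptions, and the real code also has to handle endianness and going readonly.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_INODE_SIGNATURE "INODE01"
#define SKETCH_VALID_FL        0x1u

/* Hypothetical, trimmed-down inode block. */
struct sketch_dinode {
    char     signature[8];
    uint64_t blkno;
    uint32_t flags;
    uint32_t fs_generation;
};

enum sketch_inode_err {
    SKETCH_OK = 0,
    SKETCH_BAD_SIGNATURE,
    SKETCH_BAD_BLKNO,
    SKETCH_NOT_VALID,
    SKETCH_BAD_GENERATION
};

/* Run every sanity check in one place instead of ad hoc per caller. */
static enum sketch_inode_err
sketch_validate_inode_block(const struct sketch_dinode *di,
                            uint64_t blkno_read, uint32_t fs_generation)
{
    if (memcmp(di->signature, SKETCH_INODE_SIGNATURE,
               sizeof(di->signature)) != 0)
        return SKETCH_BAD_SIGNATURE;
    if (di->blkno != blkno_read)
        return SKETCH_BAD_BLKNO;      /* block read vs. recorded blkno */
    if (!(di->flags & SKETCH_VALID_FL))
        return SKETCH_NOT_VALID;
    if (di->fs_generation != fs_generation)
        return SKETCH_BAD_GENERATION;
    return SKETCH_OK;
}
```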
     

15 Oct, 2008

5 commits

  • ocfs2_read_blocks() currently requires the CACHED flag for cached I/O.
    However, that's the common case. Let's flip it around and provide an
    IGNORE_CACHE flag for the special users. This has the added benefit of
    cleaning up the code some (ignore_cache takes on its special meaning
    earlier in the loop).

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     
  • ocfs2's cached buffer I/O goes through ocfs2_read_block(s)(). dir.c had
    a naked wait_on_buffer() to wait for some readahead, but it should
    use ocfs2_read_block() instead.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     
  • dir.c is the only place using ocfs2_bread(), so let's make it static to
    that file.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     
  • More than 30 callers of ocfs2_read_block() pass exactly OCFS2_BH_CACHED.
    Only six pass a different flag set. Rather than have every caller care,
    let's make ocfs2_read_block() take no flags and always do a cached read.
    The remaining six places can call ocfs2_read_blocks() directly.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     
  • Now that synchronous readers are using ocfs2_read_blocks_sync(), all
    callers of ocfs2_read_blocks() are passing an inode. Use it
    unconditionally. Since it's there, we don't need to pass the
    ocfs2_super either.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     

14 Oct, 2008

8 commits

  • This is pointless as brelse() already does the check.

    Signed-off-by: Mark Fasheh

    Mark Fasheh
     
  • The original get/put_extent_tree() functions held a reference on
    et_root_bh. However, every single caller already has a safe reference,
    making the get/put cycle irrelevant.

    We change ocfs2_get_*_extent_tree() to ocfs2_init_*_extent_tree(). It
    no longer gets a reference on et_root_bh. ocfs2_put_extent_tree() is
    removed. Callers now have a simpler init+use pattern.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     
  • We now have three different kinds of extent trees in ocfs2: inode data
    (dinode), extended attributes (xattr_tree), and extended attribute
    values (xattr_value). There is a nice abstraction for them,
    ocfs2_extent_tree, but it is hidden in alloc.c. All the calling
    functions have to pick amongst a varied API and pass in type bits and
    often extraneous pointers.

    A better way is to make ocfs2_extent_tree a first-class object.
    Everyone converts their object to an ocfs2_extent_tree() via the
    ocfs2_get_*_extent_tree() calls, then uses the ocfs2_extent_tree for all
    tree calls to alloc.c.

    This simplifies a lot of callers and improves readability. It also
    provides an easy way to add additional extent tree types, as they only
    need to be defined in alloc.c with an ocfs2_get_*_extent_tree()
    function.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     
  • Add some thin wrappers around ocfs2_insert_extent() for each of the 3
    different btree types: ocfs2_dinode_insert_extent(),
    ocfs2_xattr_value_insert_extent() and ocfs2_xattr_tree_insert_extent(). The
    last is for the xattr index btree, which will be used in a followup patch.

    All the old callers in file.c etc. will call ocfs2_dinode_insert_extent(),
    while the other two handle the xattr cases. The extent tree is also
    initialized by these functions.

    When storing an xattr value which is too large, we will allocate some
    clusters for it, and ocfs2_extent_list and ocfs2_extent_rec will also be
    used here. In order to re-use the b-tree operation code, a new parameter
    named "private" is added to ocfs2_extent_tree; it indicates the root of
    the ocfs2_extent_list. The reason is that we can no longer deduce the root
    from the buffer_head. It may be in an inode, an ocfs2_xattr_block or, even
    worse, anywhere in an ocfs2_xattr_bucket.

    Signed-off-by: Tao Ma
    Signed-off-by: Mark Fasheh

    Tao Ma
     
  • Factor out the non-inode specifics of ocfs2_do_extend_allocation() into a
    more generic function, ocfs2_do_cluster_allocation().
    ocfs2_do_extend_allocation() calls ocfs2_do_cluster_allocation() now, but
    the latter can be used for other btree types as well.

    Signed-off-by: Tao Ma
    Signed-off-by: Mark Fasheh

    Tao Ma
     
  • The old extent tree operations assume that the ocfs2_extent_list in
    ocfs2_dinode is the tree root. Since xattrs will also use
    ocfs2_extent_list to store large values for an xattr entry, we refactor
    the tree operations so that xattrs can use them directly.

    The refactoring includes 4 steps:
    1. Abstract set/get of last_eb_blk and update_clusters, since they may
    be stored in different locations for dinode and xattr.
    2. Add a new structure named ocfs2_extent_tree to indicate the
    extent tree the operation will work on.
    3. Remove all uses of fe_bh and di; use root_bh and root_el in the
    extent tree instead. So every fe_bh is now replaced with
    et->root_bh, and el with root_el accordingly.
    4. Make ocfs2_lock_allocators generic. Until now it could only be used
    for file extend allocation, but the whole function is useful when we want
    to store large EAs.

    Note: This patch doesn't touch ocfs2_commit_truncate() since it is not used
    for anything other than truncate inode data btrees.

    Signed-off-by: Tao Ma
    Signed-off-by: Mark Fasheh

    Tao Ma
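
Step 1 of the refactoring above (abstracting set/get of last_eb_blk and update_clusters behind the extent tree) might look roughly like this in a user-space sketch. The two root structs, the ops table layout and every `sketch_` name are assumptions for illustration; only the idea of per-root-type accessors is taken from the commit message.

```c
#include <stdint.h>

/* Two hypothetical root objects that keep the same data in different places. */
struct sketch_dinode {
    uint32_t clusters;
    uint64_t last_eb_blk;
};

struct sketch_xattr_value_root {
    uint64_t last_eb_blk;
    uint32_t clusters;
};

struct sketch_extent_tree;

struct sketch_et_ops {
    void     (*set_last_eb_blk)(struct sketch_extent_tree *et, uint64_t blkno);
    uint64_t (*get_last_eb_blk)(const struct sketch_extent_tree *et);
    void     (*update_clusters)(struct sketch_extent_tree *et, uint32_t added);
};

struct sketch_extent_tree {
    const struct sketch_et_ops *ops;
    void *object;   /* the root object, interpreted only by ops */
};

/* dinode accessors */
static void sketch_di_set_last(struct sketch_extent_tree *et, uint64_t blkno)
{
    ((struct sketch_dinode *)et->object)->last_eb_blk = blkno;
}
static uint64_t sketch_di_get_last(const struct sketch_extent_tree *et)
{
    return ((const struct sketch_dinode *)et->object)->last_eb_blk;
}
static void sketch_di_update(struct sketch_extent_tree *et, uint32_t added)
{
    ((struct sketch_dinode *)et->object)->clusters += added;
}

/* xattr value accessors */
static void sketch_xv_set_last(struct sketch_extent_tree *et, uint64_t blkno)
{
    ((struct sketch_xattr_value_root *)et->object)->last_eb_blk = blkno;
}
static uint64_t sketch_xv_get_last(const struct sketch_extent_tree *et)
{
    return ((const struct sketch_xattr_value_root *)et->object)->last_eb_blk;
}
static void sketch_xv_update(struct sketch_extent_tree *et, uint32_t added)
{
    ((struct sketch_xattr_value_root *)et->object)->clusters += added;
}

static const struct sketch_et_ops sketch_dinode_et_ops = {
    sketch_di_set_last, sketch_di_get_last, sketch_di_update
};
static const struct sketch_et_ops sketch_xattr_value_et_ops = {
    sketch_xv_set_last, sketch_xv_get_last, sketch_xv_update
};

static void sketch_init_dinode_extent_tree(struct sketch_extent_tree *et,
                                           struct sketch_dinode *di)
{
    et->ops = &sketch_dinode_et_ops;
    et->object = di;
}

static void sketch_init_xattr_value_extent_tree(struct sketch_extent_tree *et,
                                                struct sketch_xattr_value_root *xv)
{
    et->ops = &sketch_xattr_value_et_ops;
    et->object = xv;
}

/* Generic tree code: works on any root type through the ops table. */
static void sketch_et_grow(struct sketch_extent_tree *et,
                           uint64_t new_last_eb, uint32_t added)
{
    et->ops->set_last_eb_blk(et, new_last_eb);
    et->ops->update_clusters(et, added);
}
```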
     
  • ocfs2_extend_meta_needed(), ocfs2_calc_extend_credits() and
    ocfs2_reserve_new_metadata() are all useful for extent tree operations. But
    they are all limited to an inode btree because they use a struct
    ocfs2_dinode parameter. Change their parameter to struct ocfs2_extent_list
    (the part of an ocfs2_dinode they actually use) so that the xattr btree code
    can use these functions.

    Signed-off-by: Tao Ma
    Signed-off-by: Mark Fasheh

    Tao Ma
     
  • ocfs2_num_free_extents() is used to find the number of free extent records
    in an inode btree. Hence, it takes an "ocfs2_dinode" parameter. We want to
    use this for extended attribute trees in the future, so genericize the
    interface to take a buffer head. A future patch will allow that buffer_head
    to contain any structure rooting an ocfs2 btree.

    Signed-off-by: Tao Ma
    Signed-off-by: Mark Fasheh

    Tao Ma
     

23 Aug, 2008

2 commits


04 Mar, 2008

1 commit

  • replace all:
    little_endian_variable = cpu_to_leX(leX_to_cpu(little_endian_variable) +
    expression_in_cpu_byteorder);
    with:
    leX_add_cpu(&little_endian_variable, expression_in_cpu_byteorder);
    generated with semantic patch

    Signed-off-by: Marcin Slusarz
    Signed-off-by: Mark Fasheh

    Marcin Slusarz
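
To illustrate the replacement above outside the kernel, here is a user-space sketch of the 32-bit case. The `sketch_` helpers are stand-ins written for this example (the kernel provides its own cpu_to_le32(), le32_to_cpu() and le32_add_cpu()); they store the value as little-endian bytes regardless of host byte order.

```c
#include <stdint.h>

/* Interpret the bytes of a stored little-endian value as a CPU integer. */
static uint32_t sketch_le32_to_cpu(uint32_t le)
{
    const uint8_t *p = (const uint8_t *)&le;

    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

/* Lay out a CPU integer as little-endian bytes. */
static uint32_t sketch_cpu_to_le32(uint32_t v)
{
    uint32_t le;
    uint8_t *p = (uint8_t *)&le;

    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)((v >> 8) & 0xff);
    p[2] = (uint8_t)((v >> 16) & 0xff);
    p[3] = (uint8_t)((v >> 24) & 0xff);
    return le;
}

/* The helper form: add a CPU-order value to a little-endian variable. */
static void sketch_le32_add_cpu(uint32_t *var, uint32_t val)
{
    *var = sketch_cpu_to_le32(sketch_le32_to_cpu(*var) + val);
}
```

The open-coded form and the helper produce the same stored bytes; the helper just names the variable once, which removes a class of copy-and-paste mistakes.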
     

03 Feb, 2008

1 commit


26 Jan, 2008

1 commit


07 Nov, 2007

1 commit


17 Oct, 2007

1 commit

  • Fix f_version type: should be u64 instead of long

    There is a type inconsistency between struct inode i_version and struct file
    f_version.

    fs.h:

    struct inode
    u64 i_version;

    and

    struct file
    unsigned long f_version;

    Users do:

    fs/ext3/dir.c:

    if (filp->f_version != inode->i_version) {

    So why isn't f_version a u64? It becomes a problem if versions get
    higher than 2^32 and we are on an architecture where longs are 32 bits.

    This patch changes the f_version type to u64, and updates the users accordingly.

    It applies to 2.6.23-rc2-mm2.

    Signed-off-by: Mathieu Desnoyers
    Cc: Martin Bligh
    Cc: "Randy.Dunlap"
    Cc: Al Viro
    Cc: Mark Fasheh
    Cc: Christoph Hellwig
    Cc: "J. Bruce Fields"
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mathieu Desnoyers
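
The mismatch can be reproduced in user space with fixed-width types. In this sketch `uint32_t` models `unsigned long` on an architecture with 32-bit longs; the struct and function names are hypothetical.

```c
#include <stdint.h>

struct sketch_inode    { uint64_t i_version; };
struct sketch_file_old { uint32_t f_version; }; /* models a 32-bit long */
struct sketch_file_new { uint64_t f_version; };

/* Same comparison as the directory code, with the old field type. */
static int sketch_dir_changed_old(const struct sketch_file_old *f,
                                  const struct sketch_inode *i)
{
    return f->f_version != i->i_version;
}

/* Same comparison with the widened field. */
static int sketch_dir_changed_new(const struct sketch_file_new *f,
                                  const struct sketch_inode *i)
{
    return f->f_version != i->i_version;
}
```

Once i_version exceeds 32 bits, a cached copy in the narrow field loses its upper bits, so the old comparison reports a change even immediately after the version was saved.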
     

13 Oct, 2007

11 commits

  • Modify ocfs2_dir_foreach_blk() to optionally return any error from the
    filldir callback. This way ocfs2_dirforeach() can terminate early, as
    opposed to always passing through the entire directory. This fixes a bug
    introduced during a previous code refactor where ocfs2_empty_dir() would
    loop infinitely.

    Signed-off-by: Mark Fasheh

    Mark Fasheh
     
  • Create all new directories with OCFS2_INLINE_DATA_FL and the inline data
    bytes formatted as an empty directory. Inode size field reflects the actual
    amount of inline data available, which makes searching for dirent space
    very similar to the regular directory search.

    Inline-data directories are automatically pushed out to extents on any
    insert request which is too large for the available space.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • This splits out extent based directory read support and implements
    inline-data versions of those functions. All knowledge of inline-data versus
    extent based directories is internalized. For lookups the code uses
    ocfs2_find_entry_id(), full dir iterations make use of
    ocfs2_dir_foreach_blk_id().

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • The check to see if a new dirent would fit in an old one is pretty ugly, and
    it's done at least twice. Clean things up by putting this in its own
    easier-to-read function.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
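
A sketch of what such a "would it fit" helper can look like for ext-style variable-length dirents. The struct, the 4-byte rounding and the `sketch_` names are assumptions modelled on that general scheme, not a copy of the function this commit adds.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ext-style directory entry. */
struct sketch_dirent {
    uint64_t inode;     /* 0 means the record is unused */
    uint16_t rec_len;   /* total bytes owned by this record */
    uint8_t  name_len;
    uint8_t  file_type;
    char     name[255];
};

#define SKETCH_DIR_ROUND ((size_t)3)

/* Bytes a record with this name length really needs, rounded up to 4. */
#define SKETCH_DIR_REC_LEN(name_len) \
    ((offsetof(struct sketch_dirent, name) + (size_t)(name_len) + \
      SKETCH_DIR_ROUND) & ~SKETCH_DIR_ROUND)

/* Can a new record of new_rec_len bytes be placed in or after de? */
static int sketch_dirent_would_fit(const struct sketch_dirent *de,
                                   size_t new_rec_len)
{
    /* Case 1: an unused record that is big enough on its own. */
    if (de->inode == 0 && de->rec_len >= new_rec_len)
        return 1;

    /* Case 2: a live record with enough slack after its own name. */
    if (de->rec_len >= SKETCH_DIR_REC_LEN(de->name_len) + new_rec_len)
        return 1;

    return 0;
}
```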
     
  • ocfs2_rename() does direct manipulation of the dirent it's gotten back from
    a directory search. Wrap this manipulation inside of a function so that we
    can transparently change directory update behavior in the future. As an
    added bonus, this gets rid of an ugly macro.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • A couple paths which needed to just match a parent dir + name pair to an
    inode number were a bit messy because they had to deal with
    ocfs2_find_files_on_disk() which returns a larger number of values. Provide
    a convenience function, ocfs2_lookup_ino_from_name() which internalizes all
    the extra accounting.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • We can preserve the behavior of ocfs2_empty_dir(), while getting rid of the
    open coded directory walk by just providing a smart filldir callback. This
    also automatically gets to use the dir readahead code, though in this case
    any advantage is minor at best.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
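
The "smart filldir callback" approach can be sketched in user space as below. The entry array stands in for a directory, and the iterator and callback signatures are simplified assumptions, not the kernel's filldir_t. The callback returns nonzero on the first entry other than "." or "..", which lets the iterator stop early.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct sketch_entry {
    const char *name;
    uint64_t ino;
};

/* Simplified callback type: nonzero return stops the walk. */
typedef int (*sketch_filldir_t)(void *priv, const char *name, uint64_t ino);

/* Generic directory walk; counts visited entries if asked to. */
static int sketch_dir_foreach(const struct sketch_entry *ents, size_t n,
                              void *priv, sketch_filldir_t filldir,
                              size_t *visited)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int ret;

        if (visited)
            (*visited)++;
        ret = filldir(priv, ents[i].name, ents[i].ino);
        if (ret)
            return ret;      /* propagate and terminate early */
    }
    return 0;
}

struct sketch_empty_priv {
    int seen_other;
};

/* Callback: anything besides "." and ".." means "not empty". */
static int sketch_empty_filldir(void *priv, const char *name, uint64_t ino)
{
    struct sketch_empty_priv *p = priv;

    (void)ino;
    if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0)
        return 0;
    p->seen_other = 1;
    return 1;
}

/* Returns 1 if the directory holds only "." and "..". */
static int sketch_empty_dir(const struct sketch_entry *ents, size_t n,
                            size_t *visited)
{
    struct sketch_empty_priv p = { 0 };

    sketch_dir_foreach(ents, n, &p, sketch_empty_filldir, visited);
    return !p.seen_other;
}
```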
     
  • ocfs2_queue_orphans() has an open coded readdir loop which can easily just
    use a directory accessor function.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • filldir_t can take this, so don't turn de->inode into a 32 bit value. Right
    now this doesn't make a difference since no ocfs2 inodes overflow that, but
    it could be a nasty surprise later on if some kernel code is calling
    ocfs2_dir_foreach_blk() and expecting real inode numbers back...

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • Put this in its own function so that the functionality can be overridden.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • The code for adding, removing, deleting directory entries was splattered all
    over namei.c. I'd rather have this all centralized, so that it's easier to
    make changes for inline dir data, and eventually indexed directories.

    None of the code in any of the functions was changed. I only removed the
    static keyword from some prototypes so that they could be exported.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     

11 Jul, 2007

1 commit

  • This can now be trivially supported with re-use of our existing extend code.

    ocfs2_allocate_unwritten_extents() takes a start offset and a byte length
    and iterates over the inode, adding extents (marked as unwritten) until len
    is reached. Existing extents are skipped over.

    Signed-off-by: Mark Fasheh

    Mark Fasheh
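
The iteration described above (walk a byte range, add unwritten extents where nothing is allocated, skip what already exists) can be modelled with a per-cluster state array. This is a user-space sketch under stated assumptions: the flat cluster map, the state enum and the function name are hypothetical, and a real implementation works on extent records inside transactions.

```c
#include <stddef.h>
#include <stdint.h>

enum sketch_state {
    SKETCH_HOLE = 0,     /* nothing allocated */
    SKETCH_WRITTEN,      /* allocated and holding data */
    SKETCH_UNWRITTEN     /* allocated, reads back as zeros */
};

/*
 * Mark every hole touched by [start, start + len) as unwritten.
 * Returns the number of clusters added, or -1 if the range is
 * outside the map.
 */
static int sketch_allocate_unwritten(enum sketch_state *map, size_t nclusters,
                                     uint64_t start, uint64_t len,
                                     uint32_t cluster_size)
{
    uint64_t first, last, c;
    int added = 0;

    if (len == 0 || cluster_size == 0)
        return 0;

    first = start / cluster_size;
    last = (start + len - 1) / cluster_size;
    if (last >= nclusters)
        return -1;

    for (c = first; c <= last; c++) {
        if (map[c] != SKETCH_HOLE)
            continue;                 /* existing extents are skipped */
        map[c] = SKETCH_UNWRITTEN;
        added++;
    }
    return added;
}
```

Note that the byte range is widened to whole clusters, so a request starting mid-cluster still covers that cluster.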