17 Oct, 2007

1 commit

  • Plug ocfs2 into the ->write_begin and ->write_end aops.

    A bunch of custom code is now gone - the iovec iteration stuff during write
    and the ocfs2 splice write actor.

    Signed-off-by: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

13 Oct, 2007

25 commits

  • * master.kernel.org:/pub/scm/linux/kernel/git/gregkh/driver-2.6: (75 commits)
    PM: merge device power-management source files
    sysfs: add copyrights
    kobject: update the copyrights
    kset: add some kerneldoc to help describe what these strange things are
    Driver core: rename ktype_edd and ktype_efivar
    Driver core: rename ktype_driver
    Driver core: rename ktype_device
    Driver core: rename ktype_class
    driver core: remove subsystem_init()
    sysfs: move sysfs file poll implementation to sysfs_open_dirent
    sysfs: implement sysfs_open_dirent
    sysfs: move sysfs_dirent->s_children into sysfs_dirent->s_dir
    sysfs: make sysfs_root a regular directory dirent
    sysfs: open code sysfs_attach_dentry()
    sysfs: make s_elem an anonymous union
    sysfs: make bin attr open get active reference of parent too
    sysfs: kill unnecessary NULL pointer check in sysfs_release()
    sysfs: kill unnecessary sysfs_get() in open paths
    sysfs: reposition sysfs_dirent->s_mode.
    sysfs: kill sysfs_update_file()
    ...

    Linus Torvalds
     
  • A kset should not have its name set directly, so dynamically set the
    name at runtime.

    This is needed to remove the static array in the kobject structure which
    will be changed in a future patch.

    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     
  • Modify ocfs2_dir_foreach_blk() to optionally return any error from the
    filldir callback. This way ocfs2_dirforeach() can terminate early, as
    opposed to always passing through the entire directory. This fixes a bug
    introduced during a previous code refactor where ocfs2_empty_dir() would
    loop infinitely.

    Signed-off-by: Mark Fasheh

    Mark Fasheh
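
The early-termination behaviour described in this entry can be sketched in plain C. This is a minimal userspace illustration, not the ocfs2 code: dir_foreach(), entry_cb, not_empty_cb() and dir_is_empty() are hypothetical names, and the "directory" is just an array of strings. The point is only that the iterator hands back the callback's non-zero return value instead of walking every entry.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: return 0 to continue, non-zero to stop. */
typedef int (*entry_cb)(void *priv, const char *name);

/*
 * Walk `count` names, calling `cb` on each one. Returns the first
 * non-zero value produced by the callback, or 0 if every entry was
 * visited.
 */
static int dir_foreach(const char *const *names, size_t count,
                       entry_cb cb, void *priv)
{
    for (size_t i = 0; i < count; i++) {
        int ret = cb(priv, names[i]);
        if (ret)
            return ret; /* propagate the callback's result and stop */
    }
    return 0;
}

/* Counts visited entries and stops at the first one that is not "." or "..". */
static int not_empty_cb(void *priv, const char *name)
{
    size_t *visited = priv;

    (*visited)++;
    if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0)
        return 0;
    return 1;
}

/* Returns 1 if only "." and ".." are present, 0 otherwise. */
static int dir_is_empty(const char *const *names, size_t count,
                        size_t *visited)
{
    *visited = 0;
    return dir_foreach(names, count, not_empty_cb, visited) == 0;
}
```

With the entries ".", "..", "a", "b", "c" the walk stops after the third entry, so an emptiness check never has to scan the rest of the directory.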
     
  • Create all new directories with OCFS2_INLINE_DATA_FL and the inline data
    bytes formatted as an empty directory. Inode size field reflects the actual
    amount of inline data available, which makes searching for dirent space
    very similar to the regular directory search.

    Inline-data directories are automatically pushed out to extents on any
    insert request which is too large for the available space.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • This splits out extent based directory read support and implements
    inline-data versions of those functions. All knowledge of inline-data versus
    extent based directories is internalized. For lookups the code uses
    ocfs2_find_entry_id(), full dir iterations make use of
    ocfs2_dir_foreach_blk_id().

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • This fixes up write, truncate, mmap, and RESVSP/UNRESVSP to understand inline
    inode data.

    For the most part, the changes to the core write code can be relied on to do
    the heavy lifting. Any code calling ocfs2_write_begin (including shared
    writeable mmap) can count on it doing the right thing with respect to
    growing inline data to an extent tree.

    Size reducing truncates, including UNRESVSP, can simply zero that portion of
    the inode block being removed. Size increasing truncates, including RESVSP,
    have to be a little bit smarter and grow the inode to an extent tree if
    necessary.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • This hooks up ocfs2_readpage() to populate a page with data from an inode
    block. Direct IO reads from inline data are modified to fall back to
    buffered I/O. Appropriate checks are also placed in the extent map code to
    avoid reading an extent list when inline data might be stored.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • Add the disk, network and memory structures needed to support data in inode.

    Struct ocfs2_inline_data is defined and embedded in ocfs2_dinode for storing
    inline data.

    A new inode field, i_dyn_features, is added to facilitate tracking of
    dynamic inode state. Since it will be used often, we want to mirror it on
    ocfs2_inode_info, and transfer it via the meta data lvb.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • The check to see if a new dirent would fit in an old one is pretty ugly, and
    it's done at least twice. Clean things up by putting this in its own
    easier-to-read function.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
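
A "would this dirent fit" helper of the kind this entry describes might look like the sketch below. It assumes an ext2-style record layout (a fixed header followed by the name, padded to a 4-byte boundary). The struct, the constants and the function names are illustrative assumptions, not taken from the commit.

```c
#include <stdint.h>

/* Assumed layout: 12-byte fixed header, records padded to 4 bytes. */
#define REC_HEADER_LEN 12u
#define REC_ROUND      3u

/* Space needed on disk for an entry whose name is `name_len` bytes long. */
static unsigned int rec_len_for_name(unsigned int name_len)
{
    return (name_len + REC_HEADER_LEN + REC_ROUND) & ~REC_ROUND;
}

/* Hypothetical in-memory view of a directory record header. */
struct dirent_hdr {
    uint64_t inode;    /* 0 means the slot is unused */
    uint16_t rec_len;  /* total bytes covered by this record */
    uint8_t  name_len; /* length of the name currently stored */
};

/*
 * Returns non-zero if a new record of `new_rec_len` bytes fits in `de`:
 * either the slot is unused and large enough, or the slack left after
 * the existing name is large enough.
 */
static int dirent_would_fit(const struct dirent_hdr *de,
                            unsigned int new_rec_len)
{
    if (de->inode == 0)
        return de->rec_len >= new_rec_len;

    return de->rec_len >= rec_len_for_name(de->name_len) + new_rec_len;
}
```

Keeping both cases in one function means every caller that searches for free dirent space applies the same rule.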
     
  • ocfs2_rename() does direct manipulation of the dirent it's gotten back from
    a directory search. Wrap this manipulation inside of a function so that we
    can transparently change directory update behavior in the future. As an
    added bonus, this gets rid of an ugly macro.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • A couple of paths which needed to just match a parent dir + name pair to an
    inode number were a bit messy because they had to deal with
    ocfs2_find_files_on_disk() which returns a larger number of values. Provide
    a convenience function, ocfs2_lookup_ino_from_name() which internalizes all
    the extra accounting.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • We can preserve the behavior of ocfs2_empty_dir(), while getting rid of the
    open coded directory walk by just providing a smart filldir callback. This
    also automatically gets to use the dir readahead code, though in this case
    any advantage is minor at best.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • ocfs2_queue_orphans() has an open coded readdir loop which can easily just
    use a directory accessor function.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • filldir_t can take this, so don't turn de->inode into a 32 bit value. Right
    now this doesn't make a difference since no ocfs2 inodes overflow that, but
    it could be a nasty surprise later on if some kernel code is calling
    ocfs2_dir_foreach_blk() and expecting real inode numbers back...

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • Put this in its own function so that the functionality can be overridden.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • The code for adding, removing, deleting directory entries was splattered all
    over namei.c. I'd rather have this all centralized, so that it's easier to
    make changes for inline dir data, and eventually indexed directories.

    None of the code in any of the functions was changed. I only removed the
    static keyword from some prototypes so that they could be exported.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • We'll want to reuse most of this when pushing inline data back out to an
    extent. Keeping this part as a separate patch helps to keep the upcoming
    changes for write support uncluttered.

    The core portion of ocfs2_zero_cluster_pages() responsible for making sure a
    page is mapped and properly dirtied is abstracted out into its own
    function, ocfs2_map_and_dirty_page(). Actual functionality doesn't change,
    though zeroing becomes optional.

    We also turn part of ocfs2_free_write_ctxt() into a common function for
    unlocking and freeing a page array. This operation is very common (and
    uniform) for Ocfs2 cluster sizes greater than page size, so it makes sense
    to keep the code in one place.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • By doing this, we can remove any higher level logic which has to have
    knowledge of btree functionality - any callers of ocfs2_write_begin() can
    now expect it to do anything necessary to prepare the inode for new data.

    Signed-off-by: Mark Fasheh
    Reviewed-by: Joel Becker

    Mark Fasheh
     
  • ocfs2-tools added some on-disk fields and flags which are used by
    tunefs.ocfs2.

    Signed-off-by: Mark Fasheh

    Mark Fasheh
     
  • Signed-off-by: Denis Cheng
    Signed-off-by: Mark Fasheh

    Denis Cheng
     
  • Implement sops->show_options() so as to allow /proc/mounts to show the mount
    options.

    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Sunil Mushran
     
  • This is technically harmless (recovery will clean it out later), but leaves
    a bogus entry in the slot_map which really shouldn't be there.

    Signed-off-by: Mark Fasheh

    Mark Fasheh
     
  • c_used_tail_recs in struct ocfs2_merge_ctxt is only ever set, so we can
    remove it.

    Signed-off-by: Mark Fasheh

    Mark Fasheh
     
  • delete_tail_recs in ocfs2_try_to_merge_extent() was only ever set, remove
    it.

    Signed-off-by: Tao Mao
    Signed-off-by: Mark Fasheh

    Tao Mao
     
  • ocfs2_insert_type->ins_free_records was only used in one place, and was set
    incorrectly in most places. We can free up some memory and lose some code by
    removing this.

    * Small warning fixup contributed by Andrew Morton

    Signed-off-by: Tao Mao
    Signed-off-by: Mark Fasheh

    Tao Mao
     

12 Oct, 2007

1 commit


10 Oct, 2007

1 commit

  • As bi_end_io is only called once when the request is complete,
    the 'size' argument is now redundant. Remove it.

    Now there is no need for bio_endio to subtract the size completed
    from bi_size. So don't do that either.

    While we are at it, change bi_end_io to return void.

    Signed-off-by: Neil Brown
    Signed-off-by: Jens Axboe

    NeilBrown
     

04 Oct, 2007

1 commit


21 Sep, 2007

4 commits

  • The ocfs2_vote_msg and ocfs2_response_msg structs needed to be
    packed to ensure identical sizes on 32-bit and 64-bit arches. Without this,
    we had inadvertently broken 32/64 bit cross mounts.

    Signed-off-by: Sunil Mushran
    Signed-off-by: Mark Fasheh

    Sunil Mushran
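
Why packing matters for a wire format can be shown with a small sketch. The structs below are hypothetical and do not reproduce the fields of ocfs2_vote_msg or ocfs2_response_msg. The attribute syntax assumes GCC or Clang, and _Static_assert assumes C11.

```c
#include <stdint.h>

/*
 * Without packing, the compiler may pad for alignment. On x86-64 the
 * 64-bit member is 8-byte aligned, so this struct is typically 24 bytes;
 * on 32-bit x86 it is typically 16 bytes.
 */
struct msg_unpacked {
    uint32_t type;
    uint64_t blkno;
    uint32_t node;
};

/* With packing, no padding is inserted, so the layout is the same everywhere. */
struct msg_packed {
    uint32_t type;
    uint64_t blkno;
    uint32_t node;
} __attribute__((packed));

/* Fails the build if the on-wire size ever changes. */
_Static_assert(sizeof(struct msg_packed) == 16,
               "wire message must be exactly 16 bytes");
```

If two nodes disagree on sizeof for a message struct, they disagree on how many bytes to send and expect, which is one way a mixed 32/64-bit cluster can break.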
     
  • The target page offsets were being incorrectly set a second time in
    ocfs2_prepare_page_for_write(), which was causing problems on a 16k page
    size kernel. Additionally, ocfs2_write_failure() was incorrectly using those
    parameters instead of the parameters for the individual page being cleaned
    up.

    Signed-off-by: Mark Fasheh

    Mark Fasheh
     
  • This was broken for file systems whose cluster size is greater than page
    size. Pos needs to be incremented as we loop through the descriptors, and
    len needs to be capped to the size of a single cluster.

    Signed-off-by: Mark Fasheh

    Mark Fasheh
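
The loop shape this entry describes (advance pos on each pass, cap len at a single cluster) can be sketched as follows. This is a userspace illustration under assumed names: struct chunk and split_by_cluster() are hypothetical, and the cluster size is assumed to be a power of two given as cluster_bits.

```c
#include <stdint.h>

/* Hypothetical descriptor for the part of a write that lands in one cluster. */
struct chunk {
    uint64_t pos; /* byte offset where this piece starts */
    uint32_t len; /* bytes in this piece, never crossing a cluster boundary */
};

/*
 * Split the byte range [pos, pos + len) into per-cluster pieces.
 * Returns the number of pieces written to `out` (at most `max_chunks`).
 */
static unsigned int split_by_cluster(uint64_t pos, uint32_t len,
                                     unsigned int cluster_bits,
                                     struct chunk *out,
                                     unsigned int max_chunks)
{
    const uint64_t csize = 1ull << cluster_bits;
    unsigned int n = 0;

    while (len > 0 && n < max_chunks) {
        /* Bytes left before the next cluster boundary. */
        uint64_t room = csize - (pos & (csize - 1));
        uint32_t this_len = len < room ? len : (uint32_t)room;

        out[n].pos = pos;
        out[n].len = this_len;
        n++;

        pos += this_len; /* pos must advance on every pass */
        len -= this_len;
    }
    return n;
}
```

For a 4096-byte cluster, a 5000-byte write at offset 4000 yields three pieces of 96, 4096 and 808 bytes.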
     
  • The ocfs2 write code loops through a page much like the block code, except
    that ocfs2 allocation units can be any size, including larger than page
    size. Typically it's equal to or larger than page size - most kernels run 4k
    pages, the minimum ocfs2 allocation (cluster) size.

    Some changes introduced during 2.6.23 changed the way writes to pages are
    handled, and inadvertently broke support for > 4k page size. Instead of just
    writing one cluster at a time, we now handle the whole page in one pass.

    This means that multiple (small) separate allocations might happen in the
    same pass. The allocation code however typically optimizes by getting the
    maximum which was reserved. This triggered a BUG_ON in the extend code where
    it'd ask for a single bit (for one part of a > 4k page) and get back more
    than it asked for.

    Fix this by providing a variant of the high level allocation function which
    allows the caller to specify a maximum. The traditional function remains and
    just calls the new one with a maximum determined from the initial
    reservation.

    Signed-off-by: Mark Fasheh

    Mark Fasheh
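
The fix described in this entry, a capped variant plus a traditional wrapper around it, can be sketched as follows. All names here (struct reservation, claim_bits_max(), claim_bits()) are hypothetical, and the allocator is reduced to simple bookkeeping so that only the interface shape is shown.

```c
/* Hypothetical reservation: how many bits were reserved and handed out. */
struct reservation {
    unsigned int bits_wanted;
    unsigned int bits_given;
};

/*
 * Claim between `min_bits` and `max_bits` from the reservation.
 * Stores the amount claimed in `*num_bits`. Returns 0 on success and
 * -1 if the request cannot be satisfied.
 */
static int claim_bits_max(struct reservation *res, unsigned int min_bits,
                          unsigned int max_bits, unsigned int *num_bits)
{
    unsigned int left = res->bits_wanted - res->bits_given;

    if (min_bits == 0 || min_bits > max_bits || left < min_bits)
        return -1;

    /* Never hand back more than the caller's maximum. */
    *num_bits = left < max_bits ? left : max_bits;
    res->bits_given += *num_bits;
    return 0;
}

/*
 * Traditional entry point: same behaviour as before, implemented by
 * passing whatever is left of the reservation as the maximum.
 */
static int claim_bits(struct reservation *res, unsigned int min_bits,
                      unsigned int *num_bits)
{
    unsigned int left = res->bits_wanted - res->bits_given;

    return claim_bits_max(res, min_bits, left, num_bits);
}
```

A caller that needs exactly one bit for part of a large page passes a maximum of 1, while existing callers keep getting as much of their reservation as is available.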
     

12 Sep, 2007

3 commits

  • We were setting i_blocks too early - before truncating any allocation.
    Correct things to set i_blocks after the allocation change.

    Signed-off-by: Mark Fasheh

    Mark Fasheh
     
  • In ocfs2_alloc_write_ctxt, the length in clusters of the write is calculated
    from the byte length only. This causes problems if a write starts near the
    end of one cluster and runs into a second cluster while "len" is smaller
    than the cluster size. In that case, we actually have to write 2 clusters,
    so the start position has to be taken into consideration as well.

    Signed-off-by: Tao Ma
    Signed-off-by: Mark Fasheh

    tao.ma@oracle.com
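
The arithmetic behind this fix can be checked with a short sketch. The function names are hypothetical, and the cluster size is assumed to be a power of two expressed as cluster_bits.

```c
#include <stdint.h>

/* Cluster count derived from the byte length alone (the problematic form). */
static uint32_t clusters_for_bytes(uint32_t len, unsigned int cluster_bits)
{
    const uint64_t csize = 1ull << cluster_bits;

    return (uint32_t)((len + csize - 1) >> cluster_bits);
}

/* Cluster count that also accounts for where the write starts. */
static uint32_t clusters_for_write(uint64_t pos, uint32_t len,
                                   unsigned int cluster_bits)
{
    uint64_t first, last;

    if (len == 0)
        return 0;

    first = pos >> cluster_bits;             /* cluster holding the first byte */
    last = (pos + len - 1) >> cluster_bits;  /* cluster holding the last byte */
    return (uint32_t)(last - first + 1);
}
```

With 4096-byte clusters, a 200-byte write at offset 4000 covers bytes 4000 through 4199. The length-only calculation reports one cluster, but the write touches clusters 0 and 1.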
     
  • For some mount option types, ocfs2_parse_options() will try to access
    sb->s_fs_info to get at the ocfs2 private superblock. Unfortunately, that
    hasn't been allocated yet and will cause a kernel crash.

    Fix this by storing options in a struct which can then get pushed into the
    ocfs2_super once it's been allocated later. If we need more options which
    store to the ocfs2_super in the future, we can just add fields to this struct.

    Signed-off-by: Tiger Yang
    Signed-off-by: Mark Fasheh

    Tiger Yang
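
The ordering fix in this entry (parse into a standalone struct first, copy into the superblock once it exists) can be sketched in userspace C. Everything here is an assumption made for illustration: struct mount_options, struct fs_super, the option names "nointr" and "commit=", and both functions are hypothetical stand-ins rather than the ocfs2 definitions.

```c
#include <stdlib.h>
#include <string.h>

#define OPT_NOINTR 0x1ul /* illustrative flag bit */

/* Parsed options live here until the superblock has been allocated. */
struct mount_options {
    unsigned long flags;
    unsigned int commit_interval;
};

/* Hypothetical private superblock. */
struct fs_super {
    unsigned long flags;
    unsigned int commit_interval;
};

/* Parse a comma-separated option string without touching any superblock. */
static int parse_options(const char *opts, struct mount_options *mopt)
{
    mopt->flags = 0;
    mopt->commit_interval = 0;

    while (opts && *opts) {
        size_t n = strcspn(opts, ","); /* length of the current token */

        if (n == 6 && strncmp(opts, "nointr", 6) == 0)
            mopt->flags |= OPT_NOINTR;
        else if (n > 7 && strncmp(opts, "commit=", 7) == 0)
            mopt->commit_interval =
                (unsigned int)strtoul(opts + 7, NULL, 10);
        else if (n != 0)
            return -1; /* unknown option */

        opts += n;
        if (*opts == ',')
            opts++;
    }
    return 0;
}

/* Allocate the superblock later and push the parsed options into it. */
static struct fs_super *super_alloc(const struct mount_options *mopt)
{
    struct fs_super *sb = malloc(sizeof(*sb));

    if (!sb)
        return NULL;
    sb->flags = mopt->flags;
    sb->commit_interval = mopt->commit_interval;
    return sb;
}
```

Because parse_options() only writes to its own struct, it is safe to call before the superblock allocation, and a new option only needs a new field in struct mount_options.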
     

10 Aug, 2007

4 commits