26 Mar, 2016

1 commit

  • In update_backups() there exists a problem of crossing the boundary as
    follows:

    we assume that lun will be resized to 1TB(cluster_size is 32kb), it will
    include 0~33554431 cluster, in update_backups func, it will backup super
    block in location of 1TB which is the 33554432th cluster, so the
    phenomenon of crossing the boundary happens.

    Signed-off-by: Yiwen Jiang
    Reviewed-by: Joseph Qi
    Cc: Xue jiufei
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    jiangyiwen
     

23 Jan, 2016

1 commit

  • parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
    inode_foo(inode) being mutex_foo(&inode->i_mutex).

    Please, use those for access to ->i_mutex; over the coming cycle
    ->i_mutex will become rwsem, with ->lookup() done with it held
    only shared.

    Signed-off-by: Al Viro

    Al Viro
     

30 Dec, 2015

1 commit

  • When resizing, it firstly extends the last gd. Once it should backup
    super in the gd, it calculates new backup super and update the
    corresponding value.

    But it currently doesn't consider the situation that the backup super is
    already done. And in this case, it still sets the bit in gd bitmap and
    then decrease from bg_free_bits_count, which leads to a corrupted gd and
    trigger the BUG in ocfs2_block_group_set_bits:

    BUG_ON(le16_to_cpu(bg->bg_free_bits_count) < num_bits);

    So check whether the backup super is done and then do the updates.

    Signed-off-by: Joseph Qi
    Reviewed-by: Jiufei Xue
    Reviewed-by: Yiwen Jiang
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     

05 Jun, 2014

2 commits

  • Ocfs2 cluster size may be 1MB, which has 20 bits. When resize, the
    input new clusters is mostly the number of clusters in a group
    descriptor(32256).

    Since the input clusters is defined as type int, so it will overflow
    when shift left 20 bits and then lead to incorrect global bitmap i_size.

    Signed-off-by: Joseph Qi
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     
  • Parameters new_clusters and first_new_cluster are not used in
    ocfs2_update_last_group_and_inode, so remove them.

    Signed-off-by: Joseph Qi
    Reviewed-by: joyce.xue
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
     

13 Nov, 2013

2 commits


07 Mar, 2011

1 commit

  • mlog_exit is used to record the exit status of a function.
    But because it is added in so many functions, if we enable it,
    the system logs get filled up quickly and cause too much I/O.
    So actually no one can open it for a production system or even
    for a test.

    This patch just try to remove it or change it. So:
    1. if all the error paths already use mlog_errno, it is just removed.
    Otherwise, it will be replaced by mlog_errno.
    2. if it is used to print some return value, it is replaced with
    mlog(0,...).
    mlog_exit_ptr is changed to mlog(0.
    All those mlog(0,...) will be replaced with trace events later.

    Signed-off-by: Tao Ma

    Tao Ma
     

22 Feb, 2011

2 commits


21 Feb, 2011

1 commit

  • ENTRY is used to record the entry of a function.
    But because it is added in so many functions, if we enable it,
    the system logs get filled up quickly and cause too much I/O.
    So actually no one can open it for a production system or even
    for a test.

    So for mlog_entry_void, we just remove it.
    for mlog_entry(...), we replace it with mlog(0,...), and they
    will be replace by trace event later.

    Signed-off-by: Tao Ma

    Tao Ma
     

06 May, 2010

1 commit

  • jbd[2]_journal_dirty_metadata() only returns 0. It's been returning 0
    since before the kernel moved to git. There is no point in checking
    this error.

    ocfs2_journal_dirty() has been faithfully returning the status since the
    beginning. All over ocfs2, we have blocks of code checking this can't
    fail status. In the past few years, we've tried to avoid adding these
    checks, because they are pointless. But anyone who looks at our code
    assumes they are needed.

    Finally, ocfs2_journal_dirty() is made a void function. All error
    checking is removed from other files. We'll BUG_ON() the status of
    jbd2_journal_dirty_metadata() just in case they change it someday. They
    won't.

    Signed-off-by: Joel Becker

    Joel Becker
     

13 Apr, 2010

2 commits


05 Sep, 2009

2 commits

  • The next step in divorcing metadata I/O management from struct inode is
    to pass struct ocfs2_caching_info to the journal functions. Thus the
    journal locks a metadata cache with the cache io_lock function. It also
    can compare ci_last_trans and ci_created_trans directly.

    This is a large patch because of all the places we change
    ocfs2_journal_access..(handle, inode, ...) to
    ocfs2_journal_access..(handle, INODE_CACHE(inode), ...).

    Signed-off-by: Joel Becker

    Joel Becker
     
  • We are really passing the inode into the ocfs2_read/write_blocks()
    functions to get at the metadata cache. This commit passes the cache
    directly into the metadata block functions, divorcing them from the
    inode.

    Signed-off-by: Joel Becker

    Joel Becker
     

06 Jan, 2009

5 commits

  • The per-metadata-type ocfs2_journal_access_*() functions hook up jbd2
    commit triggers and allow us to compute metadata ecc right before the
    buffers are written out. This commit provides ecc for inodes, extent
    blocks, group descriptors, and quota blocks. It is not safe to use
    extened attributes and metaecc at the same time yet.

    The ocfs2_extent_tree and ocfs2_path abstractions in alloc.c both hide
    the type of block at their root. Before, it didn't matter, but now the
    root block must use the appropriate ocfs2_journal_access_*() function.
    To keep this abstract, the structures now have a pointer to the matching
    journal_access function and a wrapper call to call it.

    A few places use naked ocfs2_write_block() calls instead of adding the
    blocks to the journal. We make sure to calculate their checksum and ecc
    before the write.

    Since we pass around the journal_access functions. Let's typedef them
    in ocfs2.h.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     
  • Add an optional validation hook to ocfs2_read_blocks(). Now the
    validation function is only called when a block was actually read off of
    disk. It is not called when the buffer was in cache.

    We add a buffer state bit BH_NeedsValidate to flag these buffers. It
    must always be one higher than the last JBD2 buffer state bit.

    The dinode, dirblock, extent_block, and xattr_block validators are
    lifted to this scheme directly. The group_descriptor validator needs to
    be split into two pieces. The first part only needs the gd buffer and
    is passed to ocfs2_read_block(). The second part requires the dinode as
    well, and is called every time. It's only 3 compares, so it's tiny.
    This also allows us to clean up the non-fatal gd check used by resize.c.
    It now has no magic argument.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     
  • We have a clean call for validating group descriptors, but every place
    that wants the always does a read_block()+validate() call pair. Create
    a toplevel ocfs2_read_group_descriptor() that does the right
    thing. This allows us to leverage the single call point later for
    fancier handling. We also add validation of gd->bg_generation against
    the superblock and gd->bg_blkno against the block we thought we read.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     
  • Currently the validation of group descriptors is directly duplicated so
    that one version can error the filesystem and the other (resize) can
    just report the problem. Consolidate to one function that takes a
    boolean. Wrap that function with the old call for the old users.

    This is in preparation for lifting the read+validate step into a
    single function.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     
  • Random places in the code would check a dinode bh to see if it was
    valid. Not only did they do different levels of validation, they
    handled errors in different ways.

    The previous commit unified inode block reads, validating all block
    reads in the same place. Thus, these haphazard checks are no longer
    necessary. Rather than eliminate them, however, we change them to
    BUG_ON() checks. This ensures the assumptions remain true. All of the
    code paths to these checks have been audited to ensure they come from a
    validated inode read.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     

15 Oct, 2008

3 commits

  • More than 30 callers of ocfs2_read_block() pass exactly OCFS2_BH_CACHED.
    Only six pass a different flag set. Rather than have every caller care,
    let's make ocfs2_read_block() take no flags and always do a cached read.
    The remaining six places can call ocfs2_read_blocks() directly.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     
  • Now that synchronous readers are using ocfs2_read_blocks_sync(), all
    callers of ocfs2_read_blocks() are passing an inode. Use it
    unconditionally. Since it's there, we don't need to pass the
    ocfs2_super either.

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     
  • The ocfs2_read_blocks() function currently handles sync reads, cached,
    reads, and sometimes cached reads. We're going to add some
    functionality to it, so first we should simplify it. The uncached,
    synchronous reads are much easer to handle as a separate function, so we
    instroduce ocfs2_read_blocks_sync().

    Signed-off-by: Joel Becker
    Signed-off-by: Mark Fasheh

    Joel Becker
     

11 Mar, 2008

1 commit


26 Jan, 2008

3 commits

  • If we know a buffer_head is non-null, then brelse() is unnecessary and
    put_bh() can be used instead. Also, an explicit check for NULL is
    unnecessary when using brelse(). This patch only covers buffer_head_io.c and
    resize.c, which have recently added code which exhibits this problem.

    Signed-off-by: Mark Fasheh

    Mark Fasheh
     
  • This patch adds the ability for a userspace program to request that a
    properly formatted cluster group be added to the main allocation bitmap for
    an Ocfs2 file system. The request is made via an ioctl, OCFS2_IOC_GROUP_ADD.
    On a high level, this is similar to ext3, but we use a different ioctl as
    the structure which has to be passed through is different.

    During an online resize, tunefs.ocfs2 will format any new cluster groups
    which must be added to complete the resize, and call OCFS2_IOC_GROUP_ADD on
    each one. Kernel verifies that the core cluster group information is valid
    and then does the work of linking it into the global allocation bitmap.

    Signed-off-by: Tao Ma
    Signed-off-by: Mark Fasheh

    Tao Ma
     
  • This patch adds the ability for a userspace program to request an extend of
    last cluster group on an Ocfs2 file system. The request is made via ioctl,
    OCFS2_IOC_GROUP_EXTEND. This is derived from EXT3_IOC_GROUP_EXTEND, but is
    obviously Ocfs2 specific.

    tunefs.ocfs2 would call this for an online-resize operation if the last
    cluster group isn't full.

    Signed-off-by: Tao Ma
    Signed-off-by: Mark Fasheh

    Tao Ma