06 Apr, 2018

2 commits

  • The two functions are no longer used.

    Link: http://lkml.kernel.org/r/1519609595-26229-1-git-send-email-ge.changwei@h3c.com
    Signed-off-by: Changwei Ge
    Reviewed-by: Andrew Morton
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Joseph Qi
    Cc: Changwei Ge
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Changwei Ge
     
  • We could use 'osb' instead of 'OCFS2_SB()' to make code more elegant.

    Link: http://lkml.kernel.org/r/5A702111.7090907@huawei.com
    Signed-off-by: Jun Piao
    Reviewed-by: Yiwen Jiang
    Reviewed-by: Andrew Morton
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Joseph Qi
    Cc: Changwei Ge
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    piaojun
     

01 Feb, 2018

2 commits

  • We should unlock bh_stat if bg->bg_free_bits_count > bg->bg_bits

    Link: http://lkml.kernel.org/r/1516843095-23680-1-git-send-email-ge.changwei@h3c.com
    Signed-off-by: Changwei Ge
    Suggested-by: Jan Kara
    Reviewed-by: Andrew Morton
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Joseph Qi
    Cc: Changwei Ge
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Changwei Ge
     
  • Stack variable fe is no longer used, so trim it to save some CPU cycles
    and stack space.

    Link: http://lkml.kernel.org/r/63ADC13FD55D6546B7DECE290D39E373F1F5A8DD@H3CMLB14-EX.srv.huawei-3com.com
    Signed-off-by: Changwei Ge
    Reviewed-by: Joseph Qi
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Changwei Ge
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Changwei Ge
     

16 Nov, 2017

1 commit


07 Sep, 2017

1 commit

  • Clean up some unused functions and parameters.

    Link: http://lkml.kernel.org/r/598A5E21.2080807@huawei.com
    Signed-off-by: Jun Piao
    Reviewed-by: Alex Chen
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Cc: Joseph Qi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jun Piao
     

20 Sep, 2016

1 commit

  • If ocfs2_reserve_cluster_bitmap_bits() fails with ENOSPC, it will try to
    free the truncate log and then retry. Since
    ocfs2_try_to_free_truncate_log() locks and unlocks the global bitmap
    inode itself, we have to unlock it before calling this function. But if
    the retried reservation then fails without the global bitmap inode lock
    having been taken, the error handling branch unlocks it again and hits
    a BUG.

    The same issue exists when no retry is needed and ocfs2_inode_lock()
    fails. So fix it.

    Fixes: 2070ad1aebff ("ocfs2: retry on ENOSPC if sufficient space in truncate log")
    Link: http://lkml.kernel.org/r/57D91939.6030809@huawei.com
    Signed-off-by: Joseph Qi
    Signed-off-by: Jiufei Xue
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
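
The double unlock described in the commit above can be illustrated with a small userspace model. This is only a sketch: `fake_lock`, `fake_lock_take()`, `fake_lock_drop()` and `reserve_with_retry()` are hypothetical stand-ins, not the kernel functions, and the model assumes the retried reservation itself succeeds. The idea is to track whether the lock is currently held and to unlock in the error path only when it is.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the global bitmap inode lock. */
struct fake_lock {
    int depth; /* how many times the lock is currently held */
};

static int fake_lock_take(struct fake_lock *l, bool fail)
{
    if (fail)
        return -EIO; /* lock was NOT taken */
    l->depth++;
    return 0;
}

static void fake_lock_drop(struct fake_lock *l)
{
    assert(l->depth > 0); /* dropping an unheld lock models the BUG */
    l->depth--;
}

/*
 * reserve_fails: the first reservation returns -ENOSPC.
 * relock_fails:  retaking the lock after freeing the truncate log fails.
 * On success the lock stays held for the caller; on error it is released
 * only if this function still holds it.
 */
int reserve_with_retry(struct fake_lock *l, bool reserve_fails,
                       bool relock_fails)
{
    bool locked = false;
    int status;

    status = fake_lock_take(l, false);
    if (status < 0)
        goto bail;
    locked = true;

    status = reserve_fails ? -ENOSPC : 0;
    if (status == -ENOSPC) {
        /* The truncate log step takes the lock itself, so drop it first. */
        fake_lock_drop(l);
        locked = false;

        /* (freeing the truncate log would happen here) */

        status = fake_lock_take(l, relock_fails);
        if (status < 0)
            goto bail;
        locked = true;
        status = 0; /* assumption: the retried reservation succeeds */
    }

bail:
    if (status < 0 && locked)
        fake_lock_drop(l);
    return status;
}
```

Without the `locked` flag, the `bail` path would drop the lock unconditionally on error and trip the assertion whenever retaking the lock fails.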
     

03 Aug, 2016

1 commit

  • The testcase "mmaptruncate" in the ocfs2 test suite always fails with an
    ENOSPC error on a small volume (say less than 10G). This testcase
    repeatedly performs "extend" and "truncate" on a file: it truncates the
    file to 1/2 of the size, and then extends it to 100% of the size. The
    main bitmap will quickly run out of space because the "truncate" code
    prevents the truncate log from being flushed by
    ocfs2_schedule_truncate_log_flush(osb, 1), while the truncate log may
    have cached lots of clusters.

    So retry the allocation after flushing the truncate log when ENOSPC is
    returned. We cannot reuse the deleted blocks before the transaction is
    committed. Fortunately, we already have a function to do this:
    ocfs2_try_to_free_truncate_log(). We just need to remove the "static"
    modifier and put it in the right place.

    The "unlock"/"lock" code isn't elegant, but there seems to be no better
    option.

    [zren@suse.com: locking fix]
    Link: http://lkml.kernel.org/r/1468031546-4797-1-git-send-email-zren@suse.com
    Link: http://lkml.kernel.org/r/1466586469-5541-1-git-send-email-zren@suse.com
    Signed-off-by: Eric Ren
    Reviewed-by: Gang He
    Reviewed-by: Joseph Qi
    Reviewed-by: Mark Fasheh
    Cc: Joel Becker
    Cc: Junxiao Bi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Ren
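
The retry-on-ENOSPC idea from the commit above can be sketched with a toy accounting model. Everything here (`toy_volume`, `toy_alloc()`, `toy_truncate()`, `toy_flush_truncate_log()`, `toy_alloc_retry()`) is a hypothetical name for illustration; the real code also has to deal with locking and journal commits, which this sketch leaves out.

```c
#include <assert.h>
#include <errno.h>

/* Toy model: clusters are either free in the main bitmap or parked in
 * the truncate log, where they cannot be allocated yet. */
struct toy_volume {
    unsigned free_clusters; /* free in the main bitmap */
    unsigned tl_clusters;   /* truncated, but still in the truncate log */
};

static int toy_alloc(struct toy_volume *v, unsigned want)
{
    if (v->free_clusters < want)
        return -ENOSPC;
    v->free_clusters -= want;
    return 0;
}

/* Truncating only moves clusters into the truncate log. */
static void toy_truncate(struct toy_volume *v, unsigned n)
{
    v->tl_clusters += n;
}

/* Flushing returns the logged clusters to the main bitmap. */
static unsigned toy_flush_truncate_log(struct toy_volume *v)
{
    unsigned n = v->tl_clusters;

    v->free_clusters += n;
    v->tl_clusters = 0;
    return n;
}

/* On ENOSPC, flush the truncate log once and retry the allocation. */
int toy_alloc_retry(struct toy_volume *v, unsigned want)
{
    int status;

    assert(want > 0);
    status = toy_alloc(v, want);
    if (status == -ENOSPC && toy_flush_truncate_log(v) > 0)
        status = toy_alloc(v, want);
    return status;
}
```

The retry happens at most once, and only if the flush actually returned clusters, so a volume that is genuinely full still reports ENOSPC.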
     

23 Jan, 2016

1 commit

  • parallel to mutex_{lock,unlock,trylock,is_locked,lock_nested},
    inode_foo(inode) being mutex_foo(&inode->i_mutex).

    Please, use those for access to ->i_mutex; over the coming cycle
    ->i_mutex will become rwsem, with ->lookup() done with it held
    only shared.

    Signed-off-by: Al Viro

    Al Viro
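
The wrapper pattern described in the commit above, inode_foo(inode) standing for mutex_foo(&inode->i_mutex), can be sketched in userspace. The `toy_mutex` and `toy_inode` types and all `toy_*` functions below are hypothetical and single-threaded; they only show why callers that go through the wrappers do not need to change when the underlying lock type does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, single-threaded stand-in for a mutex. */
struct toy_mutex {
    bool held;
};

static void toy_mutex_lock(struct toy_mutex *m)
{
    assert(!m->held); /* a real mutex would block here instead */
    m->held = true;
}

static void toy_mutex_unlock(struct toy_mutex *m)
{
    assert(m->held);
    m->held = false;
}

static bool toy_mutex_trylock(struct toy_mutex *m)
{
    if (m->held)
        return false;
    m->held = true;
    return true;
}

struct toy_inode {
    struct toy_mutex i_mutex;
};

/* Callers use only these wrappers and never touch ->i_mutex directly, so
 * replacing the lock type later means editing the wrappers alone. */
static inline void toy_inode_lock(struct toy_inode *inode)
{
    toy_mutex_lock(&inode->i_mutex);
}

static inline void toy_inode_unlock(struct toy_inode *inode)
{
    toy_mutex_unlock(&inode->i_mutex);
}

static inline bool toy_inode_trylock(struct toy_inode *inode)
{
    return toy_mutex_trylock(&inode->i_mutex);
}

static inline bool toy_inode_is_locked(const struct toy_inode *inode)
{
    return inode->i_mutex.held;
}
```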
     

06 Nov, 2015

1 commit

  • Currently cluster allocation always tries to find a victim chain (the
    chain with the most free space), and this may lead to poor performance
    because of discontiguous allocation in some scenarios.

    Our test case uses block size 4k, cluster size 1M and the mount option
    localalloc=2048 (2G). Since a gd is 32256M (about 31.5G) and a localalloc
    window is only 2G, creating a 50G file will result in 2G from gd0, 2G from
    gd1, ...

    One way to improve performance is to enlarge the localalloc window size
    (max 31104M), but this will make the end user feel that about 30G is
    suddenly "missing", and localalloc currently does not support stealing,
    which means one node cannot use another node's localalloc even if it is
    in fact unused. So recording the last gd used for allocation, and
    continuing with that gd if it has enough space for a localalloc window,
    makes the allocation as contiguous as possible.

    Our test results are below (in IOPS), measured with iometer running in
    a VM whose dynamic vhd virtual disk is stored on ocfs2.

    IO model                 Original   After   Improved(%)
    16K60%Write100%Random         703     876        24.59%
    8K90%Write100%Random          735     827        12.59%
    4K100%Write100%Random         859     915         6.52%
    4K100%Read100%Random         2092    2600        24.30%

    Signed-off-by: Joseph Qi
    Tested-by: Norton Zhu
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joseph Qi
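
The effect of continuing with the last gd, as described in the commit above, can be made concrete with a simplified simulation. This is a sketch under stated assumptions: each group is modelled as one counter of free clusters, the number of groups (`TOY_NGROUPS`) is made up, and all `toy_*` names are hypothetical. It counts how often two consecutive localalloc windows come from different groups.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NGROUPS     8u     /* hypothetical number of groups */
#define TOY_GD_CLUSTERS 32256u /* clusters per gd (1M clusters) */
#define TOY_WINDOW      2048u  /* localalloc window in clusters (2G) */

/* "Victim" policy: pick the group with the most free clusters. */
static size_t toy_most_free(const unsigned free_bits[], size_t n)
{
    size_t best = 0;

    for (size_t i = 1; i < n; i++)
        if (free_bits[i] > free_bits[best])
            best = i;
    return best;
}

/* Allocate `windows` localalloc windows and count how many times the
 * group changes between consecutive windows. With prefer_last set, the
 * previous group is reused while it still has room for a whole window. */
unsigned toy_count_group_switches(unsigned windows, bool prefer_last)
{
    unsigned free_bits[TOY_NGROUPS];
    size_t last = TOY_NGROUPS; /* sentinel: no previous group yet */
    unsigned switches = 0;

    for (size_t i = 0; i < TOY_NGROUPS; i++)
        free_bits[i] = TOY_GD_CLUSTERS;

    for (unsigned w = 0; w < windows; w++) {
        size_t g;

        if (prefer_last && last < TOY_NGROUPS &&
            free_bits[last] >= TOY_WINDOW)
            g = last;
        else
            g = toy_most_free(free_bits, TOY_NGROUPS);

        assert(free_bits[g] >= TOY_WINDOW); /* model must not run dry */
        if (last < TOY_NGROUPS && g != last)
            switches++;
        free_bits[g] -= TOY_WINDOW;
        last = g;
    }
    return switches;
}
```

A 50G file needs 25 windows of 2G. In this model `toy_count_group_switches(25, false)` returns 24, because every window lands in a different group than the one before, while `toy_count_group_switches(25, true)` returns 1: fifteen windows fit in the first group and the remaining ten in the second.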
     

05 Sep, 2015

3 commits


15 Apr, 2015

1 commit


04 Apr, 2014

3 commits

  • In ocfs2_info_handle_freeinode() and ocfs2_test_inode_bit(), after
    calling ocfs2_get_system_file_inode() to get an inode reference, if
    ocfs2_info_scan_inode_alloc() or ocfs2_inode_lock() fails, we should
    iput the inode alloc to avoid leaking the inode.

    Signed-off-by: jiangyiwen
    Reviewed-by: Joseph Qi
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    jiangyiwen
     
  • After updating alloc_dinode counts in ocfs2_alloc_dinode_update_counts(),
    if ocfs2_alloc_dinode_update_bitmap() failed, there is a rare case that
    some space may be lost.

    So, roll back alloc_dinode counts when ocfs2_block_group_set_bits()
    failed.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Younger Liu
    Reviewed-by: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Younger Liu
     
  • Ensure that ocfs2_update_inode_fsync_trans() is called any time we touch
    an inode in a given transaction. This is a follow-on to the previous
    patch to reduce lock contention and deadlocking during an fsync
    operation.

    Signed-off-by: Darrick J. Wong
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Wengang
    Cc: Greg Marsden
    Cc: Srinivas Eeda
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Darrick J. Wong
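
The second commit in this group (rolling back alloc_dinode counts when setting the bits fails) follows a common update-then-undo pattern. The sketch below is a userspace illustration only: `toy_alloc_inode`, its fields, and the `toy_*` functions are hypothetical, and the failure of the bitmap step is injected through a parameter rather than produced by real I/O.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the counters kept in the allocator inode. */
struct toy_alloc_inode {
    unsigned used;       /* bits accounted as used */
    unsigned chain_free; /* free bits accounted to the chain */
};

static void toy_update_counts(struct toy_alloc_inode *a, unsigned n)
{
    a->used += n;
    a->chain_free -= n;
}

static void toy_rollback_counts(struct toy_alloc_inode *a, unsigned n)
{
    a->used -= n;
    a->chain_free += n;
}

/* set_bits_status models the result of the later bitmap update: 0 on
 * success, a negative errno on failure. If it fails, the counts are
 * restored so that no space is accounted as used without being set in
 * the bitmap. */
int toy_claim_bits(struct toy_alloc_inode *a, unsigned n,
                   int set_bits_status)
{
    assert(a->chain_free >= n);

    toy_update_counts(a, n);
    if (set_bits_status < 0) {
        toy_rollback_counts(a, n);
        return set_bits_status;
    }
    return 0;
}
```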
     

22 Jan, 2014

1 commit


13 Nov, 2013

1 commit

  • The only reason for sb_getblk() failing is if it can't allocate the
    buffer_head. So return ENOMEM instead when it fails.

    [joseph.qi@huawei.com: ocfs2_symlink_get_block() and ocfs2_read_blocks_sync() and ocfs2_read_blocks() need the same change]
    Signed-off-by: Rui Xiang
    Reviewed-by: Jie Liu
    Reviewed-by: Mark Fasheh
    Cc: Joel Becker
    Cc: Joseph Qi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rui Xiang
     

04 Jul, 2013

2 commits

  • Cc: Jie Liu
    Cc: Joel Becker
    Cc: Mark Fasheh
    Cc: Sunil Mushran
    Cc: Younger Liu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • In ocfs2_relink_block_group(), we roll back all the changes if notifying
    the intent to modify buffers for a metadata update fails, even if the
    relevant buffer has not yet been modified or dirtied at that point. That
    is not quite right because:

    - No buffer has been modified or dirtied if the call to
    ocfs2_journal_access_gd() against the previous block group buffer fails

    - Only the previous block group buffer has been dirtied if the call to
    ocfs2_journal_access_gd() against the block group buffer fails

    - There is no need to roll back the change for the file entry buffer at all

    These problems do not cause anything to go wrong, but the rollbacks are
    unnecessary. This patch fixes them and kills the useless bg_ptr variable
    as well.

    Signed-off-by: Jie Liu
    Cc: Younger Liu
    Cc: Sunil Mushran
    Cc: Mark Fasheh
    Cc: Joel Becker
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jie Liu
     

28 Feb, 2013

1 commit

  • ocfs2_block_group_alloc_discontig() disables chain relink by setting
    ac->ac_allow_chain_relink = 0 because it grabs clusters from multiple
    cluster groups.

    It doesn't keep the credits for all chain relink, but
    ocfs2_claim_suballoc_bits() overrides this in this call trace:
    ocfs2_block_group_claim_bits()->ocfs2_claim_clusters()->
    __ocfs2_claim_clusters()->ocfs2_claim_suballoc_bits()
    ocfs2_claim_suballoc_bits() sets ac->ac_allow_chain_relink = 1, then
    calls ocfs2_search_chain() one time and disables it again, and then we
    run out of credits.

    The fix is to allow relink by default and disable it in
    ocfs2_block_group_alloc_discontig().

    Without this patch, end users will run into a crash due to running out
    of credits, with a backtrace like this:

    RIP: 0010:[] []
    jbd2_journal_dirty_metadata+0x164/0x170 [jbd2]
    RSP: 0018:ffff8801b919b5b8 EFLAGS: 00010246
    RAX: 0000000000000000 RBX: ffff88022139ddc0 RCX: ffff880159f652d0
    RDX: ffff880178aa3000 RSI: ffff880159f652d0 RDI: ffff880087f09bf8
    RBP: ffff8801b919b5e8 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000001e00 R11: 00000000000150b0 R12: ffff880159f652d0
    R13: ffff8801a0cae908 R14: ffff880087f09bf8 R15: ffff88018d177800
    FS: 00007fc9b0b6b6e0(0000) GS:ffff88022fd40000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
    CR2: 000000000040819c CR3: 0000000184017000 CR4: 00000000000006e0
    DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
    Process dd (pid: 9945, threadinfo ffff8801b919a000, task ffff880149a264c0)
    Call Trace:
    ocfs2_journal_dirty+0x2f/0x70 [ocfs2]
    ocfs2_relink_block_group+0x111/0x480 [ocfs2]
    ocfs2_search_chain+0x455/0x9a0 [ocfs2]
    ...

    Signed-off-by: Xiaowei.Hu
    Reviewed-by: Srinivas Eeda
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xiaowei.Hu
     

14 Apr, 2012

1 commit


31 Mar, 2011

1 commit


07 Mar, 2011

1 commit

  • mlog_exit is used to record the exit status of a function.
    But because it is added in so many functions, if we enable it,
    the system logs get filled up quickly and cause too much I/O.
    So actually no one can open it for a production system or even
    for a test.

    This patch just tries to remove it or change it. So:
    1. If all the error paths already use mlog_errno, it is just removed.
    Otherwise, it is replaced by mlog_errno.
    2. If it is used to print some return value, it is replaced with
    mlog(0,...).
    mlog_exit_ptr is changed to mlog(0,...).
    All those mlog(0,...) will be replaced with trace events later.

    Signed-off-by: Tao Ma

    Tao Ma
     

22 Feb, 2011

2 commits


21 Feb, 2011

1 commit

  • ENTRY is used to record the entry of a function.
    But because it is added in so many functions, if we enable it,
    the system logs get filled up quickly and cause too much I/O.
    So actually no one can open it for a production system or even
    for a test.

    So for mlog_entry_void, we just remove it.
    For mlog_entry(...), we replace it with mlog(0,...), and those calls
    will be replaced by trace events later.

    Signed-off-by: Tao Ma

    Tao Ma
     

02 Nov, 2010

1 commit

  • "gadget", "through", "command", "maintain", "maintain", "controller", "address",
    "between", "initiali[zs]e", "instead", "function", "select", "already",
    "equal", "access", "management", "hierarchy", "registration", "interest",
    "relative", "memory", "offset", "already",

    Signed-off-by: Uwe Kleine-König
    Signed-off-by: Jiri Kosina

    Uwe Kleine-König
     

16 Oct, 2010

1 commit


12 Oct, 2010

1 commit

  • This patch adds a safety check to ensure bg_free_bits_count doesn't exceed
    bg_bits in a group descriptor. This is to avoid on-disk corruption that
    was seen recently.

    debugfs: group
    Group Chain: 179   Parent Inode: 11   Generation: 2959379682
    CRC32: 00000000   ECC: 0000
    ##   Block#     Total   Used         Free    Contig   Size
    0    52803072   32256   4294965350   34202   18207    4032
    ......

    Signed-off-by: Srinivas Eeda
    Signed-off-by: Joel Becker

    Srinivas Eeda
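
The huge "Used" value in the debugfs output above is what unsigned arithmetic produces once the free count exceeds the total: 32256 - 34202 wraps around to 4294965350 in 32 bits. The sketch below shows both the wraparound and the kind of check the commit describes. The function names are hypothetical, and the 16-bit field widths are an assumption made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* A group descriptor is consistent only if its free count does not
 * exceed its total bit count. */
bool toy_gd_counts_valid(uint16_t bg_bits, uint16_t bg_free_bits_count)
{
    return bg_free_bits_count <= bg_bits;
}

/* "Used" computed as total minus free in unsigned 32-bit arithmetic.
 * When free > total, the result wraps modulo 2^32 instead of going
 * negative. */
uint32_t toy_used_bits(uint16_t bg_bits, uint16_t bg_free_bits_count)
{
    return (uint32_t)bg_bits - (uint32_t)bg_free_bits_count;
}
```

With the values from the output, `toy_used_bits(32256, 34202)` yields 4294965350, matching the "Used" column, and `toy_gd_counts_valid(32256, 34202)` is false, which is the condition the check is meant to catch.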
     

24 Sep, 2010

1 commit


08 Sep, 2010

4 commits

  • This allows code which needs to know the eventual block number of an inode
    but can't allocate it yet due to transaction or lock ordering. For example,
    ocfs2_create_inode_in_orphan() currently gives a junk blkno for preparation
    of the orphan dir because it can't yet know where the actual inode is placed
    - that code is actually in ocfs2_mknod_locked. This is a problem when the
    orphan dirs are indexed as the junk inode number will create an index entry
    which goes unused (and fails the later removal from the orphan dir). Now
    with these interfaces, ocfs2_create_inode_in_orphan() can run the block
    group search (and get back the inode block number) *before* any actual
    allocation occurs.

    Signed-off-by: Mark Fasheh
    Signed-off-by: Tao Ma

    Mark Fasheh
     
  • ocfs2_search_chain() makes the same updates as
    ocfs2_alloc_dinode_update_counts to the alloc inode. Instead of open coding
    the bitmap update, use our helper function.

    Signed-off-by: Mark Fasheh
    Signed-off-by: Tao Ma

    Mark Fasheh
     
  • We were setting ac->ac_last_group in ocfs2_claim_suballoc_bits from
    res->sr_bg_blkno. Unfortunately, res->sr_bg_blkno is going to be zero under
    normal (non-fragmented) circumstances. The discontig block group patches
    effectively turned off that feature. Fix this by correctly calculating what
    the next group hint should be.

    Acked-by: Tao Ma
    Signed-off-by: Mark Fasheh
    Tested-by: Goldwyn Rodrigues
    Signed-off-by: Tao Ma

    Mark Fasheh
     
  • We have added discontig block groups, so an inode can now be
    allocated in a discontig block group. So get
    it in ocfs2_get_suballoc_slot_bit.

    The old ocfs2_test_suballoc_bit gets the group block number
    from the allocation inode, which is wrong. Fix it by
    passing the right group.

    Acked-by: Mark Fasheh
    Signed-off-by: Tao Ma

    Tao Ma
     

13 Jul, 2010

1 commit

  • In ocfs2_block_group_alloc, we set c_blkno by bg->bg_blkno.
    But actually bg->bg_blkno is already changed to little endian
    in ocfs2_block_group_fill. So remove the extra cpu_to_le64.

    Reported-by: Marcos Matsunaga
    Signed-off-by: Tao Ma
    Signed-off-by: Joel Becker

    Tao Ma
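
The extra cpu_to_le64 removed by the commit above is the kind of bug that stays invisible on little-endian hosts, where the conversion is a no-op, and only shows up on big-endian ones, where it is a byte swap. The sketch below models the big-endian case with a portable byte swap; `toy_swab64()` and `toy_cpu_to_le64_be_host()` are hypothetical helpers written for this illustration, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the byte order of a 64-bit value. */
uint64_t toy_swab64(uint64_t x)
{
    uint64_t r = 0;

    for (int i = 0; i < 8; i++) {
        r = (r << 8) | (x & 0xff);
        x >>= 8;
    }
    return r;
}

/* Model of a CPU-to-little-endian conversion as it would behave on a
 * big-endian host: a plain byte swap. */
uint64_t toy_cpu_to_le64_be_host(uint64_t x)
{
    return toy_swab64(x);
}
```

Converting once gives the little-endian representation. Converting a value that is already little-endian a second time swaps it back to host order, so on a big-endian host the field would end up holding the wrong byte order.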
     

19 May, 2010

1 commit


27 Apr, 2010

1 commit

  • ac_last_group is used to record the last block group we
    used during allocation. But the initialization process
    only calls ocfs2_which_suballoc_group and fails to
    use suballoc_loc properly. So let us do it.
    Another function, ocfs2_test_suballoc_bit, also needs the same fix.

    I have searched all the callers of ocfs2_which_suballoc_group,
    and all of them take suballoc_loc into account now.

    Signed-off-by: Tao Ma

    Tao Ma
     

22 Mar, 2010

1 commit