07 Feb, 2014

1 commit

  • Even when using the same jbd2 handle, we cannot roll back a
    transaction. So once an error occurs after clusters have been
    successfully allocated, those clusters will never be used, which
    means they are lost. For example, ocfs2_claim_clusters can succeed
    while expanding a file, but ocfs2_insert_extent can then fail. We
    therefore need to free the allocated clusters if they end up unused
    (a sketch of the pattern follows at the end of this entry).

    Signed-off-by: Zongxun Wang
    Signed-off-by: Joseph Qi
    Acked-by: Joel Becker
    Cc: Mark Fasheh
    Cc: Li Zefan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zongxun Wang
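
    A minimal sketch of the cleanup pattern this entry describes,
    written in kernel style; the call signatures are simplified and
    the surrounding variables (handle, osb, et, data_ac, meta_ac,
    cpos, flags, clusters_to_add) are assumed from context:

        /* Illustrative sketch only, not the literal patch: once
         * ocfs2_claim_clusters() has succeeded, a later failure
         * (e.g. in ocfs2_insert_extent()) cannot be undone through
         * jbd2, so the claimed clusters must be freed explicitly
         * or they are lost. */
        u32 bit_off, num_bits;
        u64 block;
        int status;

        status = ocfs2_claim_clusters(handle, data_ac, clusters_to_add,
                                      &bit_off, &num_bits);
        if (status < 0)
                goto bail;

        block = ocfs2_clusters_to_blocks(osb->sb, bit_off);
        status = ocfs2_insert_extent(handle, et, cpos, block,
                                     num_bits, flags, meta_ac);
        if (status < 0) {
                /* No rollback is possible: free the claimed
                 * clusters ourselves so they are not leaked. */
                ocfs2_free_clusters(handle, data_ac->ac_inode,
                                    data_ac->ac_bh, block, num_bits);
                goto bail;
        }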
     

06 May, 2010

2 commits

  • I have observed that the current size of 8M gives us pretty poor
    fragmentation behavior on multi-threaded workloads which do lots of
    writes.

    Generally, I can increase the size of local alloc windows and
    observe a marked decrease in fragmentation, even up to and beyond
    window sizes of 512 megabytes. This makes sense for a couple of
    reasons - a larger local alloc means more room for reservation
    windows, and on multi-node workloads the larger local alloc helps
    because we don't have to do window slides as often.

    Also, I removed the OCFS2_DEFAULT_LOCAL_ALLOC_SIZE constant as it is no
    longer used and the comment above it was out of date.

    To test fragmentation, I used a workload which launched 4 threads
    that did 4k writes into a series of about 140 alternating files (a
    rough reconstruction is sketched at the end of this entry).

    With resv_level=2 and a 4k/4k file system, I observed the following
    average fragmentation for various localalloc= parameters:

    localalloc=    avg. fragmentation
    8              48
    32             16
    64             10
    120            7

    On larger cluster sizes, the difference is more dramatic.

    The new default size tops out at 256M, which we'll only get for
    cluster sizes of 32K and above.

    Signed-off-by: Mark Fasheh
    Signed-off-by: Joel Becker

    Mark Fasheh
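
    A rough reconstruction of the kind of workload described above;
    the thread count, write size, and file count come from the entry,
    while the directory name, per-thread write count, and the rest are
    illustrative guesses:

        /* Fragmentation workload sketch: 4 threads doing 4k appends
         * across ~140 alternating files. Assumes ./frag-test exists
         * on the ocfs2 mount. Build with: cc -pthread frag.c */
        #include <fcntl.h>
        #include <pthread.h>
        #include <stdio.h>
        #include <string.h>
        #include <unistd.h>

        #define NUM_THREADS        4
        #define NUM_FILES          140
        #define WRITE_SIZE         4096
        #define WRITES_PER_THREAD  25600 /* ~100M per thread, a guess */

        static void *writer(void *arg)
        {
                long id = (long)arg, i;
                char buf[WRITE_SIZE], path[64];

                memset(buf, 'x', sizeof(buf));
                for (i = 0; i < WRITES_PER_THREAD; i++) {
                        /* Alternate across the file set so the
                         * threads' allocations interleave on disk. */
                        snprintf(path, sizeof(path),
                                 "frag-test/file-%03ld",
                                 (id + i * NUM_THREADS) % NUM_FILES);
                        int fd = open(path,
                                      O_WRONLY | O_CREAT | O_APPEND,
                                      0644);
                        if (fd < 0) {
                                perror("open");
                                break;
                        }
                        if (write(fd, buf, WRITE_SIZE) != WRITE_SIZE)
                                perror("write");
                        close(fd);
                }
                return NULL;
        }

        int main(void)
        {
                pthread_t tid[NUM_THREADS];
                long i;

                for (i = 0; i < NUM_THREADS; i++)
                        pthread_create(&tid[i], NULL, writer, (void *)i);
                for (i = 0; i < NUM_THREADS; i++)
                        pthread_join(tid[i], NULL);
                return 0;
        }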
     
  • This patch pulls the local alloc sizing code into localalloc.c and provides
    a callout to it from ocfs2_fill_super(). Behavior is essentially unchanged
    except that I correctly calculate the maximum local alloc size. The old code
    in ocfs2_parse_options() calculated the max size as:

    ocfs2_local_alloc_size(sb) * 8

    which is correct, in bits. Unfortunately, though, the option passed
    in is in megabytes. Ultimately, this bug made no real difference -
    the shrink code would catch a too-large size and bring it down to
    something reasonable. Still, it's less than efficient as-is (the
    correct conversion is sketched at the end of this entry).

    Signed-off-by: Mark Fasheh
    Signed-off-by: Joel Becker

    Mark Fasheh
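
    A short sketch of the correct conversion; the helper below is
    illustrative rather than the kernel's code, with la_bitmap_bytes
    standing in for the value ocfs2_local_alloc_size(sb) returns:

        /* Each bit in the local alloc bitmap tracks one cluster, so
         * the capacity in bits is a cluster count; scale by the
         * cluster size and shift down to megabytes before comparing
         * against the localalloc= mount option. Illustrative only. */
        static unsigned int la_max_mb(unsigned int la_bitmap_bytes,
                                      unsigned int cluster_size)
        {
                unsigned long long bits =
                        (unsigned long long)la_bitmap_bytes * 8;

                return (unsigned int)((bits * cluster_size) >> 20);
        }

    For example, a 4k bitmap with 4k clusters gives 4096 * 8 = 32768
    bits, i.e. 32768 clusters or 128M - whereas the old code compared
    the megabyte option against the raw bit count of 32768.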
     

14 Oct, 2008

1 commit

  • Ocfs2's local allocator disables itself for the duration of a mount
    when it has trouble allocating a large enough area from the primary
    bitmap. That can cause performance problems, especially for disks
    which were only temporarily full or fragmented. This patch allows
    the allocator to shrink its window first, before being disabled.
    Later, it can also be re-enabled so that any performance drop is
    minimized.

    To do this, we allow the value of osb->local_alloc_bits to be shrunk when
    needed. The default value is recorded in a mostly read-only variable so that
    we can re-initialize when required.

    Locking had to be updated so that we could protect changes to
    local_alloc_bits. Mostly this involves protecting various local alloc values
    with the osb spinlock. A new state is also added, OCFS2_LA_THROTTLED,
    which is used when the local allocator has shrunk but is not
    disabled. If the available space dips below 1 megabyte, the local
    alloc file is disabled. In either case, local alloc is re-enabled 30
    seconds after the event, or when an appropriate number of bits is
    seen in the primary bitmap (the resulting state transitions are
    sketched at the end of this entry).

    Signed-off-by: Mark Fasheh

    Mark Fasheh
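
    A hedged sketch of the states described above; OCFS2_LA_THROTTLED
    is the name from the entry, but the enum layout, threshold, and
    transition helper below are simplified illustrations:

        /* Simplified illustration, not the kernel's exact logic. */
        enum ocfs2_la_state {
                OCFS2_LA_ENABLED,   /* normal operation */
                OCFS2_LA_THROTTLED, /* window shrunk, still in use */
                OCFS2_LA_DISABLED,  /* below 1M free: turned off */
        };

        #define LA_ONE_MB (1024ULL * 1024)

        static enum ocfs2_la_state
        la_next_state(unsigned long long free_bytes,
                      unsigned long long window_bytes)
        {
                if (free_bytes < LA_ONE_MB)
                        return OCFS2_LA_DISABLED;
                if (free_bytes < window_bytes)
                        return OCFS2_LA_THROTTLED;
                return OCFS2_LA_ENABLED;
        }

    Per the entry, both THROTTLED and DISABLED recover to ENABLED 30
    seconds after the event, or once enough free bits reappear in the
    primary bitmap, with transitions made under the osb spinlock.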
     

21 Sep, 2007

1 commit

  • The ocfs2 write code loops through a page much like the block code, except
    that ocfs2 allocation units can be any size, including larger than page
    size. Typically it's equal to or larger than page size - most kernels run 4k
    pages, the minimum ocfs2 allocation (cluster) size.

    Some changes introduced during 2.6.23 changed the way writes to
    pages are handled, and inadvertently broke support for > 4k page
    size. Instead of just writing one cluster at a time, we now handle
    the whole page in one pass.

    This means that multiple (small) separate allocations might happen
    in the same pass. The allocation code, however, typically optimizes
    by handing back the maximum which was reserved. This triggered a
    BUG_ON in the extend code where it would ask for a single bit (for
    one part of a > 4k page) and get back more than it asked for.

    Fix this by providing a variant of the high-level allocation
    function which allows the caller to specify a maximum. The
    traditional function remains and simply calls the new one with a
    maximum determined from the initial reservation (sketched at the
    end of this entry).

    Signed-off-by: Mark Fasheh

    Mark Fasheh
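
    A minimal sketch of the wrapper pattern described above; the names
    and the reserved_clusters() helper are hypothetical stand-ins, not
    the actual kernel symbols:

        typedef unsigned int u32;
        struct inode;

        /* Hypothetical: how many clusters the initial reservation
         * covered. */
        static u32 reserved_clusters(struct inode *inode);

        /* New variant: the caller bounds how many clusters may be
         * allocated in one call, so a request for a single bit of a
         * > 4k page never comes back larger than asked. */
        static int extend_allocation_max(struct inode *inode,
                                         u32 clusters_to_add,
                                         u32 max_clusters);

        /* Traditional entry point, unchanged for existing callers:
         * it derives the cap from the initial reservation. */
        static int extend_allocation(struct inode *inode,
                                     u32 clusters_to_add)
        {
                return extend_allocation_max(inode, clusters_to_add,
                                             reserved_clusters(inode));
        }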
     

02 Dec, 2006

2 commits


04 Jan, 2006

1 commit