19 Jan, 2011

1 commit

  • o Jeff Moyer was doing some testing on a RAM-backed disk, and
    blkiocg_lookup_group() showed up high in the profile, right after
    memcpy(). Similarly, somebody else reported that blkiocg_lookup_group()
    is eating 6% extra CPU. Looking at the code, I can't think why the
    overhead of this function is so high; one factor is that it is called
    with very high frequency (once for every IO).

    o For a lot of folks the blkio controller will be compiled in, but they
    might not have actually created any cgroups. Hence, optimize the case
    of the root cgroup, where we can avoid calling blkiocg_lookup_group()
    if IO is happening in the root group (the common case); see the sketch
    below.
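
    Below is a minimal sketch of that fast path. The wrapper name and the
    cached root-group pointer are illustrative assumptions; only
    blkio_root_cgroup and blkiocg_lookup_group() come from the code being
    discussed.

        /* Hypothetical wrapper: skip the per-IO hash lookup when no
         * cgroups have been created and the IO belongs to the root group. */
        static struct blkio_group *lookup_group_fast(struct blkio_cgroup *blkcg,
                                                     void *key,
                                                     struct blkio_group *root_blkg)
        {
                if (blkcg == &blkio_root_cgroup)        /* common case */
                        return root_blkg;               /* cached, no hash walk */

                return blkiocg_lookup_group(blkcg, key); /* slow path */
        }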

    Reported-by: Jeff Moyer
    Signed-off-by: Vivek Goyal
    Acked-by: Jeff Moyer
    Signed-off-by: Jens Axboe

    Vivek Goyal
     

02 Dec, 2010

2 commits

  • o I was discussing which variables are being updated without a spin
    lock and why we need barriers, and Oleg pointed out that the smp_rmb()
    should sit between the read of td->limits_changed and the read of
    tg->limits_changed. This patch fixes it.

    o Following is one possible sequence of events. Say cpu0 is executing
    throtl_update_blkio_group_read_bps() and cpu1 is executing
    throtl_process_limit_change().

    cpu0                                    cpu1

    tg->limits_changed = true;
    smp_mb__before_atomic_inc();
    atomic_inc(&td->limits_changed);

                                            if (!atomic_read(&td->limits_changed))
                                                    return;

                                            if (tg->limits_changed)
                                                    do_something;

    If cpu0 has updated tg->limits_changed and td->limits_changed, we want
    to make sure that if the update to td->limits_changed is visible on
    cpu1, then the update to tg->limits_changed is also visible.

    Oleg pointed out that to ensure this we need to insert an smp_rmb()
    between the td->limits_changed read and the tg->limits_changed read.

    o I had erroneously put smp_rmb() before atomic_read(&td->limits_changed).
    This patch fixes it.
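
    The corrected reader-side ordering, as a sketch (the surrounding
    function shape is illustrative; the field names are from the
    description above):

        /* Reader side, e.g. throtl_process_limit_change() */
        if (!atomic_read(&td->limits_changed))
                return;

        /*
         * Pairs with smp_mb__before_atomic_inc() on the update side: if
         * the td->limits_changed increment is visible here, the earlier
         * store to tg->limits_changed must be visible as well.
         */
        smp_rmb();

        if (tg->limits_changed) {
                /* recalculate this group's dispatch time */
        }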

    Reported-by: Oleg Nesterov
    Signed-off-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Vivek Goyal
     
  • o During some testing I did the following and noticed that throttling
    stops working.

    - Put a very low limit on a cgroup, say 1 byte per second.
    - Start some reads; this sets slice_end to a very high value.
    - Change the limit to a higher value, say 1MB/s.
    - Now IO unthrottles and finishes as expected.
    - Try the read again: IO is not throttled to 1MB/s as expected.

    o What is happening:
    - Initially, the low limit sets slice_end to a very high value.
    - While updating the limit, slice_end is not truncated.
    - The very high slice_end keeps the existing slice valid for a very
    long time, so a new slice does not start.
    - tg_may_dispatch() is called in blk_throtl_bio(), and trim_slice()
    is not called in this path. So slice_start is some old value and
    practically we are able to do a huge amount of IO.

    o There are many ways this could be fixed. I have fixed it by
    adjusting/cleaning up slice_end in trim_slice(). Generally we extend
    a slice if a bio is big and can't be dispatched in one slice. After
    dispatching the bio, readjust slice_end to make sure we don't end up
    with huge values, as sketched below.
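
    A sketch of the adjustment, assuming slice bookkeeping in jiffies and
    a throtl_slice window as in the throttling code; the clamp is the
    essential part:

        /* After dispatching a bio, pull an inflated slice_end back in so
         * a stale slice cannot suppress the start of a new one. */
        tg->slice_end[rw] = min(tg->slice_end[rw], jiffies + throtl_slice);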

    Signed-off-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Vivek Goyal
     

16 Nov, 2010

1 commit


02 Oct, 2010

2 commits


01 Oct, 2010

3 commits

  • o Randy Dunlap reported the following linux-next build failure. This patch fixes it.

    on i386:

    blk-throttle.c:(.text+0x1abb8): undefined reference to `__udivdi3'
    blk-throttle.c:(.text+0x1b1dc): undefined reference to `__udivdi3'

    o The bytes_per_second interface is 64-bit, and I was doing 64-bit
    division even on 32-bit platforms without the help of the special
    macros/functions, hence the failure.
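
    A sketch of the portable pattern (the helper name is illustrative): a
    plain u64-by-u64 '/' on i386 makes gcc emit a call to __udivdi3, which
    the kernel does not provide, whereas do_div() divides a u64 by a
    32-bit value in place:

        #include <asm/div64.h>          /* do_div() */

        /* Hypothetical helper: convert a 64-bit bytes-per-second limit
         * into bytes per jiffy without a 64/64 division. */
        static inline u64 bytes_per_jiffy(u64 bytes_per_second)
        {
                u64 tmp = bytes_per_second;

                do_div(tmp, HZ);        /* tmp = bytes_per_second / HZ */
                return tmp;
        }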

    Signed-off-by: Vivek Goyal
    Reported-by: Randy Dunlap
    Signed-off-by: Jens Axboe

    Vivek Goyal
     
  • o Currently any cgroup throttle limit change is processed
    asynchronously, and the change does not take effect until a new bio
    is dispatched from the same group.

    o It might happen that a user sets a ridiculously low limit on
    throttling, say 1 byte per second on reads. In such cases simple
    operations like mounting a disk can wait for a very long time.

    o Once a bio is throttled, there is no easy way out of that wait even
    if the user increases the read limit later.

    o This patch fixes it. Now if a user changes the cgroup limits, we
    recalculate the bio dispatch time according to the new limits.

    o We can't take the queue lock under blkcg_lock, hence after the
    change I wake up the dispatch thread again, which recalculates the
    time. So some variables are synchronized across two threads without a
    lock, and I had to make use of barriers (see the sketch below). Hoping
    I have used the barriers correctly; any review, especially of the
    memory barrier code, will help.
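
    A sketch of the update side as described (the limit field and the
    wakeup helper are illustrative; the flag/counter names are from the
    changelog):

        tg->bps[READ] = new_bps;        /* store the new limit */

        /* Order the flag store before the counter increment, pairing
         * with the smp_rmb() on the dispatch side. */
        tg->limits_changed = true;
        smp_mb__before_atomic_inc();
        atomic_inc(&td->limits_changed);

        /* Can't take the queue lock under blkcg_lock, so just kick the
         * dispatch worker; it recalculates the dispatch time. */
        throtl_schedule_delayed_work(td, 0);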

    Signed-off-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Vivek Goyal
     
  • o Currently all the dynamically allocated groups, except the root
    group, are added to td->tg_list. This was not a problem so far, but in
    the next patch I will traverse td->tg_list to process any limit
    updates on a group. If the root group is not in tg_list, then the root
    group's updates are not processed.

    o It is better to add the root group to tg_list as well, instead of
    doing special processing for it during limit updates; a sketch
    follows.
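
    A sketch of the change, assuming the hlist-based tg_list/tg_node
    naming mentioned above and an embedded root_tg in throtl_data:

        /* At init time, put the root group on the same list as the
         * dynamically allocated groups, so limit updates reach it too. */
        hlist_add_head(&td->root_tg.tg_node, &td->tg_list);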

    Signed-off-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Vivek Goyal
     

16 Sep, 2010

2 commits