12 Jul, 2011

2 commits

  • Move the variables used for the think time check into a separate struct.
    This prepares for adding think time checks for the service tree and
    group. No functional change.

    Signed-off-by: Shaohua Li
    Acked-by: Vivek Goyal
    Signed-off-by: Jens Axboe

    Shaohua Li
     
  • fs_excl is a poor man's priority inheritance for filesystems to hint to
    the block layer that an operation is important. It was never clearly
    specified, not widely adopted, and will not prevent starvation in many
    cases (like across cgroups).

    fs_excl was introduced with the time sliced CFQ IO scheduler, to
    indicate when a process held FS exclusive resources and thus needed
    a boost.

    It doesn't cover all file systems, and it was never fully complete.
    Let's kill it.

    Signed-off-by: Justin TerAvest
    Signed-off-by: Jens Axboe

    Justin TerAvest
     

11 Jul, 2011

1 commit

  • There is no consistency among filesystems as to which bios (or requests)
    are marked as being metadata. It's interesting to expose this in traces,
    but we shouldn't schedule the requests differently based on whether or
    not they're marked as being metadata.

    Signed-off-by: Justin TerAvest
    Signed-off-by: Jens Axboe

    Justin TerAvest
     

08 Jul, 2011

2 commits

  • I'm often confused about why preemption is not disabled when changing the
    blk_plug list. It would be better to add a comment here in case others
    have similar concerns.

    Signed-off-by: Shaohua Li
    Signed-off-by: Jens Axboe

    Shaohua Li
     
  • When testing a fio script with a large I/O depth, I found that total
    throughput drops compared to a relatively small I/O depth. The reason is
    that the thread accumulates a large number of requests in its plug list,
    which causes some delay (this surely depends on CPU speed).
    I think we had better have a threshold for requests. When the threshold is
    reached, it means that requests are not being merged and that queue lock
    contention isn't severe when pushing per-task requests to the queue, so
    the main advantages of blk plug don't exist. We can force a plug list
    flush in this case.
    With this, my test throughput actually increases and almost equals that of
    the small I/O depth. Another side effect is that irq off time decreases in
    blk_flush_plug_list() for a large I/O depth.
    BLK_MAX_REQUEST_COUNT is chosen arbitrarily, but 16 is effective at
    reducing lock contention for me. I'm open on this, though; 32 is OK in my
    test too.

    Signed-off-by: Shaohua Li
    Signed-off-by: Jens Axboe

    Shaohua Li
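
The threshold described in the commit above can be modelled in user space. This is only a rough sketch, not the kernel code: `struct toy_plug`, `toy_plug_add()` and `toy_plug_flush()` are hypothetical names invented for illustration, and the only value taken from the commit message is the limit of 16. It shows a per-task list being flushed as soon as a new request would push it past the limit.

```c
#include <assert.h>
#include <stddef.h>

/* Value named in the commit message; everything else here is hypothetical. */
#define BLK_MAX_REQUEST_COUNT 16

struct toy_plug {
    size_t count;       /* requests currently held on the per-task list */
    size_t flushes;     /* how many times the list was flushed */
    size_t dispatched;  /* requests pushed to the (imaginary) queue */
};

/* Push everything held on the plug list to the queue in one go. */
static void toy_plug_flush(struct toy_plug *plug)
{
    plug->dispatched += plug->count;
    plug->count = 0;
    plug->flushes++;
}

/* Add one request, forcing a flush first if the list is already full. */
static void toy_plug_add(struct toy_plug *plug)
{
    if (plug->count >= BLK_MAX_REQUEST_COUNT)
        toy_plug_flush(plug);
    plug->count++;
    /* The list never grows beyond the threshold. */
    assert(plug->count <= BLK_MAX_REQUEST_COUNT);
}
```

In this model, adding 40 requests to an empty plug triggers two forced flushes and leaves 8 requests pending for the final, regular flush.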
     

07 Jul, 2011

2 commits

  • Fix headers_check error introduced by 390192b30057:

    include/linux/fd.h:6: included file 'linux/compat.h' is not exported

    Signed-off-by: Johannes Stezenbach
    Signed-off-by: Jens Axboe

    Johannes Stezenbach
     
  • Due to the recently identified overflow in read_capacity_16() it was
    possible for max_discard_sectors to be zero but still have discards
    enabled on the associated device's queue.

    Eliminate the possibility for blkdev_issue_discard to infinitely loop.

    Interestingly this issue wasn't identified until a device, whose
    discard_granularity was 0 due to read_capacity_16 overflow, was consumed
    by blk_stack_limits() to construct limits for a higher-level DM
    multipath device. The multipath device's resulting limits never had the
    discard limits stacked because blk_stack_limits() will only do so if
    the bottom device's discard_granularity != 0. This resulted in the
    multipath device's limits.max_discard_sectors being 0.

    Signed-off-by: Mike Snitzer
    Signed-off-by: Jens Axboe

    Mike Snitzer
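
To illustrate why a zero max_discard_sectors can make a chunking loop spin forever, here is a small user-space sketch. It is not the kernel's blkdev_issue_discard(): `toy_issue_discard()` is a hypothetical stand-in, and returning -EOPNOTSUPP for the zero case is an assumption of this sketch rather than something stated in the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: split nr_sects into chunks of at most
 * max_discard_sectors and report how many chunks were issued.
 */
static int toy_issue_discard(uint64_t nr_sects, uint32_t max_discard_sectors,
                             unsigned int *chunks)
{
    assert(chunks != NULL);
    *chunks = 0;

    /*
     * Guard: with a zero limit every chunk would be empty, nr_sects would
     * never shrink, and the loop below would never terminate.
     */
    if (max_discard_sectors == 0)
        return -EOPNOTSUPP;

    while (nr_sects > 0) {
        uint64_t chunk = nr_sects < max_discard_sectors
                             ? nr_sects : max_discard_sectors;
        nr_sects -= chunk;   /* always makes progress because chunk > 0 */
        (*chunks)++;
    }
    return 0;
}
```

With a limit of 32 sectors, a 100-sector range is issued as four chunks; with a limit of 0 the call fails immediately instead of looping.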
     

02 Jul, 2011

1 commit

  • On a Linux x86_64 host with 32-bit userspace, running
    qemu or even just "qemu-img create -f qcow2 some.img 1G"
    causes a kernel warning:

    ioctl32(qemu-img:5296): Unknown cmd fd(3) cmd(00005326){t:'S';sz:0} arg(7fffffff) on some.img
    ioctl32(qemu-img:5296): Unknown cmd fd(3) cmd(801c0204){t:02;sz:28} arg(fff77350) on some.img

    ioctl 00005326 is CDROM_DRIVE_STATUS,
    ioctl 801c0204 is FDGETPRM.

    The warning appears because the Linux compat-ioctl handler for these
    ioctls only applies to block devices, while qemu also uses the ioctls on
    plain files.

    Signed-off-by: Johannes Stezenbach
    Acked-by: Arnd Bergmann
    Signed-off-by: Jens Axboe

    Johannes Stezenbach
     

01 Jul, 2011

2 commits

  • Currently, only open(2) is defined as the 'clearing' point. It has
    two roles - first, it's an acknowledgement from userland indicating
    that the event has been received and kernel can clear pending states
    and proceed to generate more events. Secondly, it's passed on to
    device drivers as a hint indicating that a synchronization point has
    been reached and it might want to take a deeper look at the device.

    The latter currently is only used by sr which uses two different
    mechanisms - GET_EVENT_MEDIA_STATUS_NOTIFICATION and TEST_UNIT_READY
    to discover events, where the former is lighter weight and safe to be
    used repeatedly but may not provide full coverage. Among other
    things, GET_EVENT can't detect media removal while TUR can.

    This patch makes close(2) - blkdev_put() - indicate clearing hint for
    MEDIA_CHANGE to drivers. disk_check_events() is renamed to
    disk_flush_events() and updated to take @mask for events to flush
    which is or'd to ev->clearing and will be passed to the driver on the
    next ->check_events() invocation.

    This change makes sr generate MEDIA_CHANGE when media is ejected from
    userland - e.g. with eject(1).

    Note: Given the current usage, it seems @clearing hint is needlessly
    complex. disk_clear_events() can simply clear all events and the hint
    can be boolean @flush.

    Signed-off-by: Tejun Heo
    Cc: Kay Sievers
    Signed-off-by: Jens Axboe

    Tejun Heo
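
The clearing-hint flow described in the commit above (a mask or'd into ev->clearing and handed to the driver on the next check) can be sketched in user space. Everything below is a hypothetical model: the `toy_` names, the event bit values, and the light/full probe split are assumptions made for illustration and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event bits for this sketch; not the kernel's definitions. */
#define TOY_EVENT_MEDIA_CHANGE   (1u << 0)
#define TOY_EVENT_EJECT_REQUEST  (1u << 1)

/* A light probe stands in for GET_EVENT, a full probe for TEST_UNIT_READY. */
enum toy_probe { TOY_PROBE_LIGHT, TOY_PROBE_FULL };

struct toy_disk_events {
    unsigned int clearing;  /* hints accumulated since the last check */
};

/* Model of the flush step: OR the caller's mask into the pending hints. */
static void toy_flush_events(struct toy_disk_events *ev, unsigned int mask)
{
    assert(ev != NULL);
    ev->clearing |= mask;
}

/* Model of a driver: take the deeper look only when hinted to do so. */
static enum toy_probe toy_driver_check(unsigned int clearing)
{
    return (clearing & TOY_EVENT_MEDIA_CHANGE) ? TOY_PROBE_FULL
                                               : TOY_PROBE_LIGHT;
}

/* Model of the next check: pass the hints to the driver exactly once. */
static enum toy_probe toy_check_events(struct toy_disk_events *ev)
{
    unsigned int clearing;

    assert(ev != NULL);
    clearing = ev->clearing;
    ev->clearing = 0;
    return toy_driver_check(clearing);
}
```

In this model, a close that flushes with the media-change bit makes the next check use the full probe, and later checks fall back to the light probe until another hint arrives.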
     
  • Conflicts:
    block/blk-throttle.c
    block/cfq-iosched.c

    Signed-off-by: Jens Axboe

    Jens Axboe
     

30 Jun, 2011

8 commits


27 Jun, 2011

4 commits

  • ioc->ioc_data is RCU protected, so use the correct API to access it.
    This doesn't change any behavior; it just makes the code consistent.

    Signed-off-by: Shaohua Li
    Cc: stable@kernel.org # after ab4bd22d
    Signed-off-by: Jens Axboe

    Shaohua Li
     
  • I got an RCU warning at boot. ioc->ioc_data is rcu_dereferenced, but
    rcu_read_lock is not held.

    Signed-off-by: Shaohua Li
    Cc: stable@kernel.org # after ab4bd22d
    Signed-off-by: Jens Axboe

    Shaohua Li
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/sfrench/cifs-2.6:
    cifs: mark CONFIG_CIFS_NFSD_EXPORT as BROKEN
    cifs: free blkcipher in smbhash

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6:
    cifs: propagate errors from cifs_get_root() to mount(2)
    cifs: tidy cifs_do_mount() up a bit
    cifs: more breakage on mount failures
    cifs: close sget() races
    cifs: pull freeing mountdata/dropping nls/freeing cifs_sb into cifs_umount()
    cifs: move cifs_umount() call into ->kill_sb()
    cifs: pull cifs_mount() call up
    sanitize cifs_umount() prototype
    cifs: initialize ->tlink_tree in cifs_setup_cifs_sb()
    cifs: allocate mountdata earlier
    cifs: leak on mount if we share superblock
    cifs: don't pass superblock to cifs_mount()
    cifs: don't leak nls on mount failure
    cifs: double free on mount failure
    take bdi setup/destruction into cifs_mount/cifs_umount

    Acked-by: Steve French

    Linus Torvalds
     

26 Jun, 2011

1 commit


25 Jun, 2011

17 commits