04 May, 2017

12 commits

  • Expose the fifo lists, cached next requests, batching state, and
    dispatch list. It would also be possible to expose the sorted lists,
    but there are no existing seq_file helpers for rbtrees.
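
    The fifo lists lend themselves to a seq_file iterator built on the
    stock list helpers. A minimal sketch, assuming a simplified
    scheduler-private struct (the real mq-deadline one differs):

      #include <linux/blkdev.h>
      #include <linux/seq_file.h>

      /* Stand-in for the scheduler's private data (illustrative). */
      struct sched_data {
              spinlock_t lock;
              struct list_head fifo_list;
      };

      /* The stock seq_list_* helpers do the position bookkeeping for
       * list-backed attributes; nothing equivalent exists for rbtrees. */
      static void *fifo_start(struct seq_file *m, loff_t *pos)
      {
              struct sched_data *sd = m->private;

              spin_lock(&sd->lock);
              return seq_list_start(&sd->fifo_list, *pos);
      }

      static void *fifo_next(struct seq_file *m, void *v, loff_t *pos)
      {
              struct sched_data *sd = m->private;

              return seq_list_next(v, &sd->fifo_list, pos);
      }

      static void fifo_stop(struct seq_file *m, void *v)
      {
              struct sched_data *sd = m->private;

              spin_unlock(&sd->lock);
      }

      static int fifo_show(struct seq_file *m, void *v)
      {
              struct request *rq = list_entry(v, struct request, queuelist);

              seq_printf(m, "%p\n", rq);
              return 0;
      }

      static const struct seq_operations fifo_seq_ops = {
              .start  = fifo_start,
              .next   = fifo_next,
              .stop   = fifo_stop,
              .show   = fifo_show,
      };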

    Signed-off-by: Omar Sandoval
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Omar Sandoval
     
  • Expose the domain token pools, asynchronous sbitmap depth, domain
    request lists, and batching state.

    Signed-off-by: Omar Sandoval
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Omar Sandoval
     
  • This provides the infrastructure for schedulers to expose their internal
    state through debugfs. We add a list of queue attributes and a list of
    hctx attributes to struct elevator_type and wire them up when switching
    schedulers.
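
    A hedged sketch of the wiring this describes; the field shapes are
    illustrative rather than copied from the patch:

      /* One debugfs file per entry: .show renders into a seq_file,
       * .write (if set) accepts input. */
      struct blk_mq_debugfs_attr {
              const char *name;
              umode_t mode;
              int (*show)(void *data, struct seq_file *m);
              ssize_t (*write)(void *data, const char __user *buf,
                               size_t count, loff_t *ppos);
      };

      struct elevator_type {
              /* ... existing fields ... */

              /* Walked on scheduler switch: one debugfs file is created
               * per entry under the queue and hctx directories. */
              const struct blk_mq_debugfs_attr *queue_debugfs_attrs;
              const struct blk_mq_debugfs_attr *hctx_debugfs_attrs;
      };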

    Signed-off-by: Omar Sandoval
    Reviewed-by: Hannes Reinecke

    Add missing seq_file.h header in blk-mq-debugfs.h

    Signed-off-by: Jens Axboe

    Omar Sandoval
     
  • Originally, I tied debugfs registration/unregistration together with
    sysfs. There's no reason to do this, and it's getting in the way of
    letting schedulers define their own debugfs attributes. Instead, tie the
    debugfs registration to the lifetime of the structures themselves.

    The saner lifetimes mean we can also get rid of the extra mq directory
    and move everything one level up. I.e., nvme0n1/mq/hctx0/tags is now
    just nvme0n1/hctx0/tags.

    Signed-off-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Omar Sandoval
     
  • Preparation for adding more declarations.

    Signed-off-by: Omar Sandoval
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Omar Sandoval
     
  • In commit e869b5462f83 ("blk-mq: Unregister debugfs attributes
    earlier"), we shuffled the debugfs cleanup around so that the "state"
    attribute was removed before we freed the blk-mq data structures.
    However, later changes are going to undo that, so we need to explicitly
    disallow running a dead queue.
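
    A minimal sketch of the guard this implies, assuming it sits at the
    top of the debugfs write handler (handler name and errno are
    illustrative):

      static ssize_t queue_state_write(void *data, const char __user *buf,
                                       size_t count, loff_t *ppos)
      {
              struct request_queue *q = data;

              /* The attribute can again outlive queue teardown, so
               * refuse to poke a dead queue. */
              if (blk_queue_dead(q))
                      return -ENOENT;

              /* ... parse "run"/"start" and act on the live queue ... */
              return count;
      }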

    [Omar: rebased and updated commit message]
    Signed-off-by: Omar Sandoval
    Signed-off-by: Bart Van Assche
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • A large part of blk-mq-debugfs.c is file_operations and seq_file
    boilerplate. This sucks as is but will suck even more when schedulers
    can define their own debugfs entries. Factor it all out into a single
    blk_mq_debugfs_fops which multiplexes as needed. We store the
    request_queue, blk_mq_hw_ctx, or blk_mq_ctx in the parent directory
    dentry, which is kind of hacky, but it works.
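
    A hedged sketch of the multiplexing scheme (details may differ from
    the actual patch): the file's inode carries its attribute
    descriptor, the parent directory's inode carries the object, and one
    open routine reunites the two:

      #include <linux/debugfs.h>
      #include <linux/seq_file.h>

      static int blk_mq_debugfs_show(struct seq_file *m, void *v)
      {
              const struct blk_mq_debugfs_attr *attr = m->private;
              /* The hacky part: the directory's inode holds the
               * request_queue, hctx, or ctx pointer. */
              void *data = d_inode(m->file->f_path.dentry->d_parent)->i_private;

              return attr->show(data, m);
      }

      static int blk_mq_debugfs_open(struct inode *inode, struct file *file)
      {
              /* i_private of the file is its blk_mq_debugfs_attr. */
              return single_open(file, blk_mq_debugfs_show, inode->i_private);
      }

      static const struct file_operations blk_mq_debugfs_fops = {
              .open           = blk_mq_debugfs_open,
              .read           = seq_read,
              .llseek         = seq_lseek,
              .release        = single_release,
      };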

    Signed-off-by: Omar Sandoval
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Omar Sandoval
     
  • It's not clear what these numbered directories represent unless you
    consult the code. We're about to get rid of the intermediate "mq"
    directory, so these would be even more confusing without that context.

    Signed-off-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Omar Sandoval
     
  • Slightly more readable, and we also strip leading spaces.

    Signed-off-by: Omar Sandoval
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Omar Sandoval
     
  • blk_queue_flags_store() currently truncates and returns a short write if
    the operation being written is too long. This can give us weird results,
    like this:

    $ echo "run bar" > state
    echo: write error: invalid argument
    $ dmesg
    [ 1103.075435] blk_queue_flags_store: unsupported operation bar. Use either 'run' or 'start'

    Instead, return an error if the user does this. While we're here, make
    the argument names consistent with everywhere else in this file.
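
    A sketch of the fixed handler along those lines, as it would sit in
    blk-mq-debugfs.c; the buffer size, messages, and queue helpers are
    assumptions:

      static ssize_t blk_queue_flags_store(void *data, const char __user *buf,
                                           size_t count, loff_t *ppos)
      {
              struct request_queue *q = data;
              char opbuf[16] = { }, *op;

              /* Fail outright instead of consuming a prefix and
               * returning a short count (which made the shell retry
               * the tail "bar" above). */
              if (count >= sizeof(opbuf)) {
                      pr_err("%s: operation too long\n", __func__);
                      goto err;
              }

              if (copy_from_user(opbuf, buf, count))
                      return -EFAULT;

              op = strstrip(opbuf);   /* also drops leading spaces */
              if (strcmp(op, "run") == 0) {
                      blk_mq_run_hw_queues(q, true);
              } else if (strcmp(op, "start") == 0) {
                      blk_mq_start_stopped_hw_queues(q, true);
              } else {
                      pr_err("%s: unsupported operation '%s'\n", __func__, op);
      err:
                      pr_err("%s: use either 'run' or 'start'\n", __func__);
                      return -EINVAL;
              }
              return count;
      }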

    Signed-off-by: Omar Sandoval
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Omar Sandoval
     
  • Make sure the spelled-out flag names match the definitions. This also
    adds a missing hctx state, BLK_MQ_S_START_ON_RUN, and a missing
    cmd_flag, __REQ_NOUNMAP.
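
    The usual pattern for keeping the spelled-out names tied to their
    definitions, sketched with an abridged flag list; it relies on the
    BLK_MQ_S_* values being bit numbers, so they index the array
    directly:

      #define HCTX_STATE_NAME(name) [BLK_MQ_S_##name] = #name
      static const char *const hctx_state_name[] = {
              HCTX_STATE_NAME(STOPPED),
              HCTX_STATE_NAME(TAG_ACTIVE),
              HCTX_STATE_NAME(SCHED_RESTART),
              HCTX_STATE_NAME(START_ON_RUN),  /* previously missing */
      };
      #undef HCTX_STATE_NAME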

    Signed-off-by: Omar Sandoval
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Omar Sandoval
     
  • Separating the flags with '|' reads more naturally than spaces.

    Signed-off-by: Omar Sandoval
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Omar Sandoval
     

27 Apr, 2017

6 commits


21 Apr, 2017

1 commit


11 Apr, 2017

2 commits


22 Mar, 2017

2 commits

  • Currently, statistics are gathered in ~0.13s windows, and users grab
    the statistics whenever they need them. This is not ideal for either
    of the two in-tree users:

    1. Writeback throttling wants its own dynamically sized window of
       statistics. Since the blk-stats statistics are reset after every
       window and the wbt windows don't line up with the blk-stats
       windows, wbt doesn't see every I/O.
    2. Polling currently grabs the statistics on every I/O. Again,
       depending on how the window lines up, we may miss some I/Os. It's
       also unnecessary overhead to get the statistics on every I/O; the
       hybrid polling heuristic would be just as happy with the
       statistics from the previous full window.

    This reworks the blk-stats infrastructure to be callback-based: users
    register a callback that they want called at a given time with all of
    the statistics from the window during which the callback was active.
    Users can dynamically bucketize the statistics. wbt and polling both
    currently use read vs. write, but polling can be extended to further
    subdivide based on request size.

    The callbacks are kept on an RCU list, and each callback has percpu
    stats buffers. There will only be a few users, so the overhead on the
    I/O completion side is low. The stats flushing is also simplified
    considerably: since the timer function is responsible for clearing the
    statistics, we don't have to worry about stale statistics.

    wbt is a trivial conversion. After the conversion, the windowing problem
    mentioned above is fixed.

    For polling, we register an extra callback that caches the previous
    window's statistics in the struct request_queue for the hybrid polling
    heuristic to use.

    Since we no longer have a single stats buffer for the request queue,
    this also removes the sysfs and debugfs stats entries. To replace those,
    we add a debugfs entry for the poll statistics.
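
    A hedged sketch of a user in the callback model; the blk_stat_*
    names follow the API this commit introduces, but treat the exact
    signatures, the stat field, and the consume_stats() helper as
    assumptions:

      /* Hypothetical consumer of one window's worth of statistics. */
      void consume_stats(struct blk_rq_stat *read, struct blk_rq_stat *write);

      /* Two buckets, split by direction. */
      static unsigned int my_bucket_fn(const struct request *rq)
      {
              return op_is_write(req_op(rq)) ? WRITE : READ;
      }

      /* Runs when the window closes, with that window's full stats. */
      static void my_timer_fn(struct blk_stat_callback *cb)
      {
              consume_stats(&cb->stat[READ], &cb->stat[WRITE]);

              /* Re-arm if we still want samples; otherwise the window
               * simply ends and the per-I/O overhead stops. */
              blk_stat_activate_msecs(cb, 100);
      }

      static int my_stats_setup(struct request_queue *q)
      {
              struct blk_stat_callback *cb;

              cb = blk_stat_alloc_callback(my_timer_fn, my_bucket_fn,
                                           2, NULL);
              if (!cb)
                      return -ENOMEM;
              blk_stat_add_callback(q, cb);
              /* Arm the first window; my_timer_fn re-arms after that. */
              blk_stat_activate_msecs(cb, 100);
              return 0;
      }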

    Signed-off-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Omar Sandoval
     
  • The stats buckets will become generic soon, so make the existing users
    use the common READ and WRITE definitions instead of ones internal to
    blk-stat.

    Signed-off-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Omar Sandoval
     

03 Feb, 2017

1 commit


02 Feb, 2017

4 commits


01 Feb, 2017

1 commit

  • Instead of keeping two levels of indirection for request types, fold it
    all into the operations. The little caveat here is that previously
    cmd_type only applied to struct request, while the request and bio op
    fields were set to plain REQ_OP_READ/WRITE even for passthrough
    operations.

    Instead, this patch adds new REQ_OP_* for SCSI passthrough and driver
    private requests, although it has to add two for each so that we can
    communicate the data in/out nature of the request.
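
    Sketched, the shape of the additions (opcode values here are an
    assumption): one in/out pair per passthrough class, with the data
    direction kept in the low bit so op_is_write() keeps working:

      enum req_opf {
              REQ_OP_READ             = 0,
              REQ_OP_WRITE            = 1,
              /* ... filesystem ops ... */

              /* SCSI passthrough: device-to-host / host-to-device. */
              REQ_OP_SCSI_IN          = 32,
              REQ_OP_SCSI_OUT         = 33,
              /* Driver-private requests, same in/out split. */
              REQ_OP_DRV_IN           = 34,
              REQ_OP_DRV_OUT          = 35,
      };

      /* The low bit of the op still gives the data direction. */
      static inline bool op_is_write(unsigned int op)
      {
              return (op & 1);
      }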

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

27 Jan, 2017

9 commits