01 Mar, 2013

1 commit

  • Pull block IO core bits from Jens Axboe:
    "Below are the core block IO bits for 3.9. It was delayed a few days
    since my workstation kept crashing every 2-8h after pulling it into
    current -git, but turns out it is a bug in the new pstate code (divide
    by zero, will report separately). In any case, it contains:

    - The big cfq/blkcg update from Tejun and Vivek.

    - Additional block and writeback tracepoints from Tejun.

    - Improvement of the should sort (based on queues) logic in the plug
    flushing.

    - _io() variants of the wait_for_completion() interface, using
    io_schedule() instead of schedule() to contribute to io wait
    properly.

    - Various little fixes.

    You'll get two trivial merge conflicts, which should be easy enough to
    fix up"

    Fix up the trivial conflicts due to hlist traversal cleanups (commit
    b67bfe0d42ca: "hlist: drop the node parameter from iterators").

    * 'for-3.9/core' of git://git.kernel.dk/linux-block: (39 commits)
    block: remove redundant check to bd_openers()
    block: use i_size_write() in bd_set_size()
    cfq: fix lock imbalance with failed allocations
    drivers/block/swim3.c: fix null pointer dereference
    block: don't select PERCPU_RWSEM
    block: account iowait time when waiting for completion of IO request
    sched: add wait_for_completion_io[_timeout]
    writeback: add more tracepoints
    block: add block_{touch|dirty}_buffer tracepoint
    buffer: make touch_buffer() an exported function
    block: add @req to bio_{front|back}_merge tracepoints
    block: add missing block_bio_complete() tracepoint
    block: Remove should_sort judgement when flush blk_plug
    block,elevator: use new hashtable implementation
    cfq-iosched: add hierarchical cfq_group statistics
    cfq-iosched: collect stats from dead cfqgs
    cfq-iosched: separate out cfqg_stats_reset() from cfq_pd_reset_stats()
    blkcg: make blkcg_print_blkgs() grab q locks instead of blkcg lock
    block: RCU free request_queue
    blkcg: implement blkg_[rw]stat_recursive_sum() and blkg_[rw]stat_merge()
    ...

    Linus Torvalds
     

15 Feb, 2013

1 commit

  • Using wait_for_completion() to wait for an IO request to be executed
    results in wrong iowait time accounting. For example, a system whose
    only task is doing write() and fdatasync() on a block device can be
    reported as idle instead of iowaiting, as it should be, because
    blkdev_issue_flush() calls wait_for_completion(), which in turn calls
    schedule(), which does not increment the iowait proc counter and thus
    does not turn on iowait time accounting.

    The patch makes the block layer use wait_for_completion_io() instead
    of wait_for_completion() where appropriate to account iowait time
    correctly.

    Signed-off-by: Vladimir Davydov
    Signed-off-by: Jens Axboe

    Vladimir Davydov
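
    A minimal sketch of the call-site change this describes (illustrative
    only, not the exact kernel patch): the _io variant sleeps via the
    io_schedule() path, so the waiting task is counted toward iowait.

        #include <linux/completion.h>

        /* Illustrative helper: wait for an already-submitted request. */
        static void wait_for_io_request(struct completion *done)
        {
                /* Before: the sleep is invisible to iowait accounting. */
                /* wait_for_completion(done); */

                /* After: wait_for_completion_io() sleeps through the
                 * io_schedule() path, so this wait shows up as iowait. */
                wait_for_completion_io(done);
        }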
     

08 Feb, 2013

1 commit

  • Move the sysctl-related bits from include/linux/sched.h into
    a new file: include/linux/sched/sysctl.h. Then update source
    files requiring access to those bits by including the new
    header file.

    Signed-off-by: Clark Williams
    Cc: Peter Zijlstra
    Cc: Steven Rostedt
    Link: http://lkml.kernel.org/r/20130207094659.06dced96@riff.lan
    Signed-off-by: Ingo Molnar

    Clark Williams
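
    As an illustration of the resulting include change (a sketch, not a
    specific hunk from the patch): code that only needs the scheduler
    sysctl declarations now pulls in the dedicated header.

        /* Before: the declarations came in via the catch-all header. */
        #include <linux/sched.h>

        /* After: include the new, narrower header for the sysctl bits,
         * e.g. sysctl_hung_task_timeout_secs. */
        #include <linux/sched/sysctl.h>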
     

18 Dec, 2012

1 commit

  • Pull block layer core updates from Jens Axboe:
    "Here are the core block IO bits for 3.8. The branch contains:

    - The final version of the surprise device removal fixups from Bart.

    - Don't hide EFI partitions under advanced partition types. It's
    fairly widespread these days. This is especially dangerous for
    systems that have both msdos and efi partition tables, where you
    want to keep them in sync.

    - Cleanup of using -1 instead of the proper NUMA_NO_NODE

    - Export control of bdi flusher thread CPU mask and default to using
    the home node (if known) from Jeff.

    - Export unplug tracepoint for MD.

    - Core improvements from Shaohua. Reinstate the recursive merge, as
    the original bug has been fixed. Add plugging for discard and also
    fix a problem handling non pow-of-2 discard limits.

    There's a trivial merge in block/blk-exec.c due to a fix that went
    into 3.7-rc at a later point than -rc4 where this is based."

    * 'for-3.8/core' of git://git.kernel.dk/linux-block:
    block: export block_unplug tracepoint
    block: add plug for blkdev_issue_discard
    block: discard granularity might not be power of 2
    deadline: Allow 0ms deadline latency, increase the read speed
    partitions: enable EFI/GPT support by default
    bsg: Remove unused function bsg_goose_queue()
    block: Make blk_cleanup_queue() wait until request_fn finished
    block: Avoid scheduling delayed work on a dead queue
    block: Avoid that request_fn is invoked on a dead queue
    block: Let blk_drain_queue() caller obtain the queue lock
    block: Rename queue dead flag
    bdi: add a user-tunable cpu_list for the bdi flusher threads
    block: use NUMA_NO_NODE instead of -1
    block: recursive merge requests
    block CFQ: avoid moving request to different queue

    Linus Torvalds
     

06 Dec, 2012

2 commits

  • A block driver may start cleaning up resources needed by its
    request_fn as soon as blk_cleanup_queue() has finished, so request_fn
    must not be invoked after draining has finished. This is important
    when blk_run_queue() is invoked without any requests in progress.
    As an example, if blk_drain_queue() and scsi_run_queue() run in
    parallel, blk_drain_queue() may have finished all requests after
    scsi_run_queue() has taken a SCSI device off the starved list but
    before that last function has had a chance to run the queue.

    Signed-off-by: Bart Van Assche
    Cc: James Bottomley
    Cc: Mike Christie
    Cc: Chanho Min
    Acked-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • QUEUE_FLAG_DEAD is used to indicate that queuing new requests must
    stop. After this flag has been set queue draining starts. However,
    during the queue draining phase it is still safe to invoke the
    queue's request_fn, so QUEUE_FLAG_DYING is a better name for this
    flag.

    This patch has been generated by running the following command
    over the kernel source tree:

    git grep -lEw 'blk_queue_dead|QUEUE_FLAG_DEAD' |
    xargs sed -i.tmp -e 's/blk_queue_dead/blk_queue_dying/g' \
    -e 's/QUEUE_FLAG_DEAD/QUEUE_FLAG_DYING/g'; \
    sed -i.tmp -e "s/QUEUE_FLAG_DYING$(printf \\t)*5/QUEUE_FLAG_DYING$(printf \\t)5/g" \
    include/linux/blkdev.h; \
    sed -i.tmp -e 's/ DEAD/ DYING/g' -e 's/dead queue/a dying queue/' \
    -e 's/Dead queue/A dying queue/' block/blk-core.c

    Signed-off-by: Bart Van Assche
    Acked-by: Tejun Heo
    Cc: James Bottomley
    Cc: Mike Christie
    Cc: Jens Axboe
    Cc: Chanho Min
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

23 Nov, 2012

1 commit

  • After we've done __elv_add_request() and __blk_run_queue() in
    blk_execute_rq_nowait(), the request might finish and be freed
    immediately. Therefore it isn't safe to check afterwards whether the
    type is REQ_TYPE_PM_RESUME, because rq might already be gone.
    Instead, check beforehand and stash the result in a temporary.

    This fixes crashes in blk_execute_rq_nowait() I get occasionally when
    running with lots of memory debugging options enabled -- I think this
    race is usually harmless because the window for rq to be reallocated
    is so small.

    Signed-off-by: Roland Dreier
    Cc: stable@kernel.org
    Signed-off-by: Jens Axboe

    Roland Dreier
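
    A sketch of the resulting pattern (function and variable names are
    approximate, not the verbatim patch): anything needed from rq is
    captured before the request is queued and run.

        static void execute_nowait_sketch(struct request_queue *q,
                                          struct request *rq, int where)
        {
                /* Read from rq *before* it can complete and be freed. */
                bool is_pm_resume = rq->cmd_type == REQ_TYPE_PM_RESUME;

                spin_lock_irq(q->queue_lock);
                __elv_add_request(q, rq, where);
                __blk_run_queue(q);
                /* rq may already be gone; use only the stashed value. */
                if (is_pm_resume)
                        q->request_fn(q);
                spin_unlock_irq(q->queue_lock);
        }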
     

20 Jul, 2012

1 commit

  • If the queue is dead, blk_execute_rq_nowait() doesn't invoke the done()
    callback function. That will result in blk_execute_rq() being stuck
    in wait_for_completion(). Avoid this by initializing rq->end_io to the
    done() callback before we check the queue state. Also, make sure the
    queue lock is held around the invocation of the done() callback. Found
    this through source code review.

    Signed-off-by: Muthukumar Ratty
    Signed-off-by: Bart Van Assche
    Reviewed-by: Tejun Heo
    Acked-by: Jens Axboe
    Signed-off-by: James Bottomley

    Muthukumar Ratty
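
    A rough sketch of the ordering described above, as it would appear
    inside blk_execute_rq_nowait() (bd_disk, done and where being its
    arguments and locals; a paraphrase, not the verbatim fix):

        rq->rq_disk = bd_disk;
        rq->end_io = done;              /* set before checking queue state */

        spin_lock_irq(q->queue_lock);
        if (unlikely(blk_queue_dead(q))) {
                rq->errors = -ENXIO;
                /* Invoke done() under queue_lock so blk_execute_rq()'s
                 * wait_for_completion() is still signalled. */
                if (rq->end_io)
                        rq->end_io(rq, rq->errors);
                spin_unlock_irq(q->queue_lock);
                return;
        }
        __elv_add_request(q, rq, where);
        __blk_run_queue(q);
        spin_unlock_irq(q->queue_lock);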
     

14 Dec, 2011

2 commits

  • blk_insert_cloned_request(), blk_execute_rq_nowait() and
    blk_flush_plug_list() either didn't check whether the queue was dead
    or did it without holding queue_lock. Update them so that dead state
    is checked while holding queue_lock.

    AFAICS, this plugs all holes (requeue doesn't matter as the request is
    transitioning atomically from in_flight to queued).

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
     
  • There are a number of QUEUE_FLAG_DEAD tests. Add blk_queue_dead()
    macro and use it.

    This patch doesn't introduce any functional difference.

    Signed-off-by: Tejun Heo
    Signed-off-by: Jens Axboe

    Tejun Heo
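
    A sketch of the two changes above (helper names and return values are
    approximate): the repeated flag test becomes a macro, and callers test
    it only while holding queue_lock.

        #define blk_queue_dead(q) test_bit(QUEUE_FLAG_DEAD, &(q)->queue_flags)

        /* Illustrative caller, e.g. inserting a cloned request: */
        static int insert_request_sketch(struct request_queue *q,
                                         struct request *rq)
        {
                unsigned long flags;

                spin_lock_irqsave(q->queue_lock, flags);
                if (unlikely(blk_queue_dead(q))) {
                        spin_unlock_irqrestore(q->queue_lock, flags);
                        return -ENODEV;
                }
                add_request(q, rq);     /* hypothetical queueing helper */
                spin_unlock_irqrestore(q->queue_lock, flags);
                return 0;
        }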
     

22 Jul, 2011

1 commit

  • USB surprise removal of sr is triggering an oops in
    scsi_dispatch_command(). What seems to be happening is that USB is
    hanging on to a queue reference until the last close of the upper
    device, so the crash is caused by surprise remove of a mounted CD
    followed by attempted unmount.

    The problem is that USB doesn't issue its final commands as part of
    the SCSI teardown path, but on last close when the block queue is long
    gone. The long term fix is probably to make sr do the teardown in the
    same way as sd (so remove all the lower bits on ejection, but keep the
    upper disk alive until last close of user space). However, the
    current oops can be simply fixed by not allowing any commands to be
    sent to a dead queue.

    Cc: stable@kernel.org
    Signed-off-by: James Bottomley

    James Bottomley
     

06 May, 2011

1 commit


18 Apr, 2011

1 commit

  • Instead of overloading __blk_run_queue to force an offload to kblockd,
    add a new blk_run_queue_async helper to do it explicitly. I've kept
    the blk_queue_stopped check for now, but I suspect it's not needed,
    as the check we do when the workqueue item runs should be enough.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
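
    Roughly, the new helper looks like this (a sketch, not the exact
    implementation): it defers running the queue to kblockd via the
    queue's delayed work item instead of calling the request_fn inline.

        void blk_run_queue_async(struct request_queue *q)
        {
                if (likely(!blk_queue_stopped(q)))
                        queue_delayed_work(kblockd_workqueue,
                                           &q->delay_work, 0);
        }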
     

10 Mar, 2011

2 commits


24 Sep, 2010

1 commit

  • During long I/O operations, the hang_check timer may fire and
    trigger stack dumps that unnecessarily alarm the user.

    Eg. hdparm --security-erase NULL /dev/sdb ## can take *hours* to complete

    So, if hang_check is armed, we should wake up periodically
    to prevent it from triggering. This patch uses a wake-up interval
    equal to half the hang_check timer period, which keeps overhead low enough.

    Signed-off-by: Mark Lord
    Signed-off-by: Jens Axboe

    Mark Lord
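
    The wait loop described above looks roughly like this (a sketch of
    the pattern, paraphrased from the description rather than quoted
    from the patch):

        unsigned long hang_check = sysctl_hung_task_timeout_secs;

        if (hang_check)
                /* Wake every hang_check/2 seconds so the hung-task
                 * detector never sees us sleep for its full period. */
                while (!wait_for_completion_timeout(&wait,
                                                    hang_check * (HZ / 2)))
                        ;
        else
                wait_for_completion(&wait);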
     

08 Aug, 2010

1 commit


28 Apr, 2009

1 commit

  • RQ_NOMERGE_FLAGS already defines which REQ flags aren't mergeable.
    There is no reason to specify it superfluously; it only adds to
    confusion. Don't set REQ_NOMERGE for barriers and requests with
    specific queueing directives. REQ_NOMERGE is now exclusively used
    by the merging code.

    [ Impact: cleanup ]

    Signed-off-by: Tejun Heo

    Tejun Heo
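
    A sketch of the relationship (the flag names are illustrative of that
    era and the mask's exact contents are approximate): the merge code
    consults the mask, so callers never set REQ_NOMERGE "just in case".

        /* Every flag that by itself disqualifies a request from merging. */
        #define RQ_NOMERGE_FLAGS \
                (REQ_NOMERGE | REQ_STARTED | REQ_SOFTBARRIER | REQ_HARDBARRIER)

        static inline bool rq_mergeable_sketch(struct request *rq)
        {
                return !(rq->cmd_flags & RQ_NOMERGE_FLAGS);
        }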
     

09 Oct, 2008

1 commit


16 Jul, 2008

2 commits

  • All the users of blk_end_sync_rq are gone (they have been converted
    to use blk_execute_rq). This unexports blk_end_sync_rq.

    Signed-off-by: FUJITA Tomonori
    Cc: Borislav Petkov
    Signed-off-by: Jens Axboe
    Signed-off-by: Bartlomiej Zolnierkiewicz

    FUJITA Tomonori
     
  • For blk_pm_resume_request() requests (currently used only by the IDE
    subsystem) the queue is stopped, so we need to call ->request_fn
    explicitly.

    Thanks to:
    - Rafael for reporting/bisecting the bug
    - Borislav/Rafael for testing the fix

    This is a preparation for converting IDE to use blk_execute_rq().

    Cc: FUJITA Tomonori
    Cc: Borislav Petkov
    Cc: Jens Axboe
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
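
    In blk_execute_rq_nowait(), after the request has been queued and
    with the queue lock held, the explicit kick amounts to something
    like this sketch (paraphrased, not the verbatim hunk):

        /* A stopped queue is never run/unplugged on its own, so drive
         * the resume request through ->request_fn by hand. */
        if (blk_pm_resume_request(rq))
                q->request_fn(q);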
     

01 Feb, 2008

1 commit


30 Jan, 2008

1 commit