02 May, 2017

1 commit

  • Pull block layer updates from Jens Axboe:

    - Add BFQ IO scheduler under the new blk-mq scheduling framework. BFQ
    was initially a fork of CFQ, but subsequently changed to implement
    fairness based on B-WF2Q+, a modified variant of WF2Q. BFQ is meant
    to be used on desktop type single drives, providing good fairness.
    From Paolo.

    - Add Kyber IO scheduler. This is a full multiqueue aware scheduler,
    using a scalable token based algorithm that throttles IO based on
    live completion IO stats, similarly to blk-wbt. From Omar.

    - A series from Jan, moving users to separately allocated backing
    devices. This continues the work of separating backing device life
    times, solving various problems with hot removal.

    - A series of updates for lightnvm, mostly from Javier. Includes a
    'pblk' target that exposes an open channel SSD as a physical block
    device.

    - A series of fixes and improvements for nbd from Josef.

    - A series from Omar, removing queue sharing between devices on mostly
    legacy drivers. This helps us clean up other bits, if we know that a
    queue only has a single device backing. This has been overdue for
    more than a decade.

    - Fixes for the blk-stats, and improvements to unify the stats and user
    windows. This both improves blk-wbt, and enables other users to
    register a need to receive IO stats for a device. From Omar.

    - blk-throttle improvements from Shaohua. This provides a scalable
    framework for implementing scalable prioritization - particularly for
    blk-mq, but applicable to any type of block device. The interface is
    marked experimental for now.

    - Bucketized IO stats for IO polling from Stephen Bates. This improves
    efficiency of polled workloads in the presence of mixed block size
    IO.

    - A few fixes for opal, from Scott.

    - A few pulls for NVMe, including a lot of fixes for NVMe-over-fabrics.
    From a variety of folks, mostly Sagi and James Smart.

    - A series from Bart, improving our exposed info and capabilities from
    the blk-mq debugfs support.

    - A series from Christoph, cleaning up how we handle WRITE_ZEROES.

    - A series from Christoph, cleaning up the block layer handling of how
    we track errors in a request. On top of being a nice cleanup, it also
    shrinks the size of struct request a bit.

    - Removal of mg_disk and hd (sorry Linus) by Christoph. The former was
    never used by platforms, and the latter has outlived its usefulness.

    - Various little bug fixes and cleanups from a wide variety of folks.

    * 'for-4.12/block' of git://git.kernel.dk/linux-block: (329 commits)
    block: hide badblocks attribute by default
    blk-mq: unify hctx delay_work and run_work
    block: add kblock_mod_delayed_work_on()
    blk-mq: unify hctx delayed_run_work and run_work
    nbd: fix use after free on module unload
    MAINTAINERS: bfq: Add Paolo as maintainer for the BFQ I/O scheduler
    blk-mq-sched: alloate reserved tags out of normal pool
    mtip32xx: use runtime tag to initialize command header
    scsi: Implement blk_mq_ops.show_rq()
    blk-mq: Add blk_mq_ops.show_rq()
    blk-mq: Show operation, cmd_flags and rq_flags names
    blk-mq: Make blk_flags_show() callers append a newline character
    blk-mq: Move the "state" debugfs attribute one level down
    blk-mq: Unregister debugfs attributes earlier
    blk-mq: Only unregister hctxs for which registration succeeded
    blk-mq-debugfs: Rename functions for registering and unregistering the mq directory
    blk-mq: Let blk_mq_debugfs_register() look up the queue name
    blk-mq: Register /queue/mq after having registered /queue
    ide-pm: always pass 0 error to ide_complete_rq in ide_do_devset
    ide-pm: always pass 0 error to __blk_end_request_all
    ..

    Linus Torvalds
     

27 Apr, 2017

1 commit

  • mtip32xx assumes that the 'request_idx' passed to .init_request()
    is the tag of the request, and uses it as the request's tag to
    initialize the command header.

    Now that the MQ IO scheduler is in, the tag assigned to a request is
    no longer the same as the request index. This causes strange hardware
    failures on mtip32xx, and can even trigger a whole-system panic.

    This patch fixes the issue by initializing the command header via
    the request's real tag.

    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
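
The mismatch described in the commit above can be illustrated with a small userspace sketch. This is not the driver's code: `struct demo_request`, `struct demo_cmd_header` and the helper functions are hypothetical stand-ins, assuming only what the message states, namely that the header used to be set up from the allocation-time index and is now set up from the tag the request actually carries.

```c
/* Hypothetical stand-in for a per-request command header. */
struct demo_cmd_header {
    int hw_slot;   /* hardware slot the command was programmed for */
};

/* Hypothetical stand-in for a request: a fixed index in the
 * preallocated request array plus a tag assigned at dispatch. */
struct demo_request {
    int index;     /* position passed at init time ('request_idx') */
    int tag;       /* tag the request carries when it is issued */
    struct demo_cmd_header hdr;
};

/* Old behaviour: header derived from the allocation-time index. */
static void demo_init_header_from_index(struct demo_request *rq)
{
    rq->hdr.hw_slot = rq->index;
}

/* Fixed behaviour: header derived from the request's current tag. */
static void demo_init_header_from_tag(struct demo_request *rq)
{
    rq->hdr.hw_slot = rq->tag;
}

/* Returns 1 if the header agrees with the tag the hardware will see. */
static int demo_header_matches_tag(const struct demo_request *rq)
{
    return rq->hdr.hw_slot == rq->tag;
}
```

With a request whose index is 1 but whose tag is 3, the index-based setup leaves the header pointing at the wrong slot, while the tag-based setup keeps the two in agreement.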
     

21 Apr, 2017

3 commits


20 Apr, 2017

1 commit

  • The recently introduced MQ IO scheduler breaks mtip32xx in the
    following way.

    mtip32xx uses the 'request_index' passed to .init_request() as the
    hardware tag index for initializing the hardware queue, and it
    actually requires that rq->tag is always the same as the
    'request_index' passed to .init_request(). The current blk-mq IO
    scheduler can't guarantee this, so this patch passes
    BLK_MQ_F_NO_SCHED to at least keep mtip32xx working.

    This patch fixes the following strange hardware failure. The
    issue can be triggered easily when doing I/O with mq-deadline
    enabled.

    [ 186.972578] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 32993
    [ 186.972578] {1}[Hardware Error]: event severity: fatal
    [ 186.972579] {1}[Hardware Error]: Error 0, type: fatal
    [ 186.972580] {1}[Hardware Error]: section_type: PCIe error
    [ 186.972580] {1}[Hardware Error]: port_type: 0, PCIe end point
    [ 186.972581] {1}[Hardware Error]: version: 1.0
    [ 186.972581] {1}[Hardware Error]: command: 0x0407, status: 0x0010
    [ 186.972582] {1}[Hardware Error]: device_id: 0000:07:00.0
    [ 186.972582] {1}[Hardware Error]: slot: 4
    [ 186.972583] {1}[Hardware Error]: secondary_bus: 0x00
    [ 186.972583] {1}[Hardware Error]: vendor_id: 0x1344, device_id: 0x5150
    [ 186.972584] {1}[Hardware Error]: class_code: 008001
    [ 186.972585] Kernel panic - not syncing: Fatal hardware error!

    Reported-by: Jozef Mikovic
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     

09 Apr, 2017

1 commit


31 Mar, 2017

1 commit


29 Mar, 2017

1 commit

  • As the .q_usage_counter is used by both the legacy and
    mq paths, we need to block new I/O in blk_queue_enter()
    if the queue becomes dead.

    So rename it and we can use this function in both
    paths.

    Reviewed-by: Bart Van Assche
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Ming Lei
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Jens Axboe

    Ming Lei
     

01 Dec, 2016

1 commit

  • Fix bug https://bugzilla.kernel.org/show_bug.cgi?id=188531. In function
    mtip_block_initialize(), variable rv takes the return value, and its
    value should be negative on errors. rv is initialized as 0 and is not
    reset when the call to ida_pre_get() fails. So 0 may be returned.
    The return value 0 indicates that there is no error, which may be
    inconsistent with the execution status. This patch fixes the bug by
    explicitly assigning -ENOMEM to rv on the branch where ida_pre_get()
    fails.

    Signed-off-by: Pan Bian
    Signed-off-by: Jens Axboe

    Pan Bian
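
The error-path pattern this commit describes can be sketched in plain C. The functions below are hypothetical and only model the shape of the bug: `demo_pre_get()` stands in for a preallocation call that, like ida_pre_get(), returns 0 on failure, and the two init functions show the return value before and after the fix.

```c
#include <errno.h>

/* Hypothetical stand-in for a preallocation step that returns
 * 0 on failure and non-zero on success. */
static int demo_pre_get(int should_succeed)
{
    return should_succeed ? 1 : 0;
}

/* Buggy shape: rv starts at 0 and the failure branch never sets it,
 * so a failed preallocation is reported as success. */
static int demo_init_buggy(int prealloc_ok)
{
    int rv = 0;

    if (!demo_pre_get(prealloc_ok))
        goto out;
    /* ... the rest of initialization would go here ... */
out:
    return rv;
}

/* Fixed shape: the failure branch assigns a negative errno first. */
static int demo_init_fixed(int prealloc_ok)
{
    int rv = 0;

    if (!demo_pre_get(prealloc_ok)) {
        rv = -ENOMEM;
        goto out;
    }
    /* ... the rest of initialization would go here ... */
out:
    return rv;
}
```

Callers that test the result with `if (rv < 0)` only see the failure in the fixed variant.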
     

12 Nov, 2016

1 commit


15 Sep, 2016

1 commit

  • All drivers use the default, so provide an inline version of it. If we
    ever need other queue mapping we can add an optional method back,
    although supporting it will also require major changes to the queue
    setup code.

    This provides better code generation, and better debuggability as well.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Keith Busch
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

29 Aug, 2016

1 commit

  • We get 1 warning when building the kernel with W=1:
    drivers/block/mtip32xx/mtip32xx.c:3689:6: warning: no previous prototype for
    'mtip_block_release' [-Wmissing-prototypes]

    In fact, this function is only used in the file in which it is
    declared. It doesn't need a declaration and can be made static,
    so this patch marks it 'static'.

    Signed-off-by: Baoyou Xie
    Signed-off-by: Jens Axboe

    Baoyou Xie
     

28 Jun, 2016

1 commit

  • For block drivers that specify a parent device, convert them to use
    device_add_disk().

    This conversion was done with the following semantic patch:

    @@
    struct gendisk *disk;
    expression E;
    @@

    - disk->driverfs_dev = E;
    ...
    - add_disk(disk);
    + device_add_disk(E, disk);

    @@
    struct gendisk *disk;
    expression E1, E2;
    @@

    - disk->driverfs_dev = E1;
    ...
    E2 = disk;
    ...
    - add_disk(E2);
    + device_add_disk(E1, E2);

    ...plus some manual fixups for a few missed conversions.

    Cc: Jens Axboe
    Cc: Keith Busch
    Cc: Michael S. Tsirkin
    Cc: David Woodhouse
    Cc: David S. Miller
    Cc: James Bottomley
    Cc: Ross Zwisler
    Cc: Konrad Rzeszutek Wilk
    Cc: Martin K. Petersen
    Reviewed-by: Christoph Hellwig
    Reviewed-by: Johannes Thumshirn
    Signed-off-by: Dan Williams

    Dan Williams
     

08 Jun, 2016

1 commit

  • The req operation REQ_OP is separated from the rq_flag_bits
    definition. This converts the block layer drivers to
    use req_op to get the op from the request struct.

    Signed-off-by: Mike Christie
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Mike Christie
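
As a rough illustration of what reading the operation through an accessor looks like, here is a minimal sketch. The bit layout, the `DEMO_*` names and `struct demo_request` are all assumptions made for the example; the kernel's actual encoding of the operation and flags is not given in the commit message and may differ.

```c
#include <stdint.h>

/* Hypothetical layout: the operation lives in the top bits of a
 * 64-bit word, the modifier flags in the remaining low bits. */
#define DEMO_OP_BITS   3u
#define DEMO_OP_SHIFT  (64u - DEMO_OP_BITS)
#define DEMO_OP_MASK   ((((uint64_t)1 << DEMO_OP_BITS) - 1) << DEMO_OP_SHIFT)

enum demo_op { DEMO_OP_READ = 0, DEMO_OP_WRITE = 1, DEMO_OP_DISCARD = 2 };

#define DEMO_FLAG_SYNC ((uint64_t)1 << 0)
#define DEMO_FLAG_META ((uint64_t)1 << 1)

struct demo_request {
    uint64_t cmd_flags;
};

/* Store an operation and its modifier flags in one word. */
static void demo_set_op_flags(struct demo_request *rq, enum demo_op op,
                              uint64_t flags)
{
    rq->cmd_flags = ((uint64_t)op << DEMO_OP_SHIFT) | (flags & ~DEMO_OP_MASK);
}

/* Accessor in the spirit of req_op(): callers ask for the operation
 * instead of testing individual flag bits themselves. */
static enum demo_op demo_req_op(const struct demo_request *rq)
{
    return (enum demo_op)(rq->cmd_flags >> DEMO_OP_SHIFT);
}

/* Companion accessor for the modifier flags only. */
static uint64_t demo_req_flags(const struct demo_request *rq)
{
    return rq->cmd_flags & ~DEMO_OP_MASK;
}
```

A driver written this way would compare `demo_req_op(rq) == DEMO_OP_WRITE` rather than masking a write bit out of the flags, so the encoding can change without touching the callers.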
     

13 Apr, 2016

2 commits


19 Mar, 2016

1 commit


04 Mar, 2016

10 commits


22 Jan, 2016

1 commit

  • Pull block driver updates from Jens Axboe:
    "This is the block driver pull request for 4.5, with the exception of
    NVMe, which is in a separate branch and will be posted after this one.

    This pull request contains:

    - A set of bcache stability fixes, which have been acked by Kent.
    These have been used and tested for more than a year by the
    community, so it's about time that they got in.

    - A set of drbd updates from the drbd team (Andreas, Lars, Philipp)
    and Markus Elfring, Oleg Drokin.

    - A set of fixes for xen blkback/front from the usual suspects, (Bob,
    Konrad) as well as community based fixes from Kiri, Julien, and
    Peng.

    - A 2038 time fix for sx8 from Shraddha, with a fix from me.

    - A small mtip32xx cleanup from Zhu Yanjun.

    - A null_blk division fix from Arnd"

    * 'for-4.5/drivers' of git://git.kernel.dk/linux-block: (71 commits)
    null_blk: use sector_div instead of do_div
    mtip32xx: restrict variables visible in current code module
    xen/blkfront: Fix crash if backend doesn't follow the right states.
    xen/blkback: Fix two memory leaks.
    xen/blkback: make st_ statistics per ring
    xen/blkfront: Handle non-indirect grant with 64KB pages
    xen-blkfront: Introduce blkif_ring_get_request
    xen-blkback: clear PF_NOFREEZE for xen_blkif_schedule()
    xen/blkback: Free resources if connect_ring failed.
    xen/blocks: Return -EXX instead of -1
    xen/blkback: make pool of persistent grants and free pages per-queue
    xen/blkback: get the number of hardware queues/rings from blkfront
    xen/blkback: pseudo support for multi hardware queues/rings
    xen/blkback: separate ring information out of struct xen_blkif
    xen/blkfront: correct setting for xen_blkif_max_ring_order
    xen/blkfront: make persistent grants pool per-queue
    xen/blkfront: Remove duplicate setting of ->xbdev.
    xen/blkfront: Cleanup of comments, fix unaligned variables, and syntax errors.
    xen/blkfront: negotiate number of queues/rings to be used with backend
    xen/blkfront: split per device io_lock
    ...

    Linus Torvalds
     

20 Jan, 2016

1 commit

  • Pull core block updates from Jens Axboe:
    "We don't have a lot of core changes this time around, it's mostly in
    drivers, which will come in a subsequent pull.

    The core changes include:

    - blk-mq
    - Prep patch from Christoph, changing blk_mq_alloc_request() to
    take flags instead of just using gfp_t for sleep/nosleep.
    - Doc patch from me, clarifying the difference between legacy
    and blk-mq for timer usage.
    - Fixes from Raghavendra for memory-less numa nodes, and a reuse
    of CPU masks.

    - Cleanup from Geliang Tang, using offset_in_page() instead of open
    coding it.

    - From Ilya, rename request_queue slab so it reflects what it holds,
    and a fix for proper use of bdgrab/put.

    - A real fix for the split across stripe boundaries from Keith. We
    yanked a broken version of this from 4.4-rc final, this one works.

    - From Mike Krinkin, emit a trace message when we split.

    - From Wei Tang, two small cleanups, not explicitly clearing memory
    that is already cleared"

    * 'for-4.5/core' of git://git.kernel.dk/linux-block:
    block: use bd{grab,put}() instead of open-coding
    block: split bios to max possible length
    block: add call to split trace point
    blk-mq: Avoid memoryless numa node encoded in hctx numa_node
    blk-mq: Reuse hardware context cpumask for tags
    blk-mq: add a flags parameter to blk_mq_alloc_request
    Revert "blk-flush: Queue through IO scheduler when flush not required"
    block: clarify blk_add_timer() use case for blk-mq
    bio: use offset_in_page macro
    block: do not initialise statics to 0 or NULL
    block: do not initialise globals to 0 or NULL
    block: rename request_queue slab cache

    Linus Torvalds
     

09 Jan, 2016

1 commit


06 Jan, 2016

1 commit


02 Dec, 2015

1 commit


20 Nov, 2015

1 commit


07 Nov, 2015

1 commit

  • __GFP_WAIT was used to signal that the caller was in atomic context and
    could not sleep. Now it is possible to distinguish between true atomic
    context and callers that are not willing to sleep. The latter should
    clear __GFP_DIRECT_RECLAIM so kswapd will still wake. As clearing
    __GFP_WAIT behaves differently, there is a risk that people will clear the
    wrong flags. This patch renames __GFP_WAIT to __GFP_RECLAIM to clearly
    indicate what it does -- setting it allows all reclaim activity, clearing
    it prevents it.

    [akpm@linux-foundation.org: fix build]
    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Mel Gorman
    Acked-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Acked-by: Johannes Weiner
    Cc: Christoph Lameter
    Acked-by: David Rientjes
    Cc: Vitaly Wool
    Cc: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
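
The distinction drawn in this commit message can be sketched with plain bit masks. The `DEMO_GFP_*` values below are made up for the example and are not the kernel's bit assignments; the sketch assumes only what the message says: a caller that merely cannot sleep clears the direct-reclaim bit so kswapd is still woken, while clearing the combined reclaim mask prevents all reclaim activity.

```c
/* Hypothetical bit values, chosen only for this illustration. */
#define DEMO_GFP_DIRECT_RECLAIM 0x1u  /* caller may reclaim directly (may sleep) */
#define DEMO_GFP_KSWAPD_RECLAIM 0x2u  /* allocation may wake kswapd */
#define DEMO_GFP_RECLAIM \
    (DEMO_GFP_DIRECT_RECLAIM | DEMO_GFP_KSWAPD_RECLAIM)

/* Caller is not willing to sleep, but background reclaim is fine. */
static unsigned int demo_gfp_nosleep(unsigned int gfp)
{
    return gfp & ~DEMO_GFP_DIRECT_RECLAIM;
}

/* Caller wants no reclaim activity at all. */
static unsigned int demo_gfp_noreclaim(unsigned int gfp)
{
    return gfp & ~DEMO_GFP_RECLAIM;
}

/* Returns 1 if the allocation is allowed to enter direct reclaim. */
static int demo_may_direct_reclaim(unsigned int gfp)
{
    return (gfp & DEMO_GFP_DIRECT_RECLAIM) != 0;
}

/* Returns 1 if the allocation is allowed to wake kswapd. */
static int demo_wakes_kswapd(unsigned int gfp)
{
    return (gfp & DEMO_GFP_KSWAPD_RECLAIM) != 0;
}
```

The two helper masks make the risk named in the message concrete: clearing the wrong one silently changes whether background reclaim still runs.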
     

26 Aug, 2015

1 commit

  • Hi,

    After commit f70ced091707 (blk-mq: support per-distpatch_queue flush
    machinery), the mtip32xx driver may oops upon module load due to walking
    off the end of an array in mtip_init_cmd. On initialization of the
    flush_rq, init_request is called with request_index >= the maximum queue
    depth the driver supports. For mtip32xx, this value is used to index
    into an array. What this means is that the driver will walk off the end
    of the array, and either oops or cause random memory corruption.

    The problem is easily reproduced by doing modprobe/rmmod of the mtip32xx
    driver in a loop. I can typically reproduce the problem in about 30
    seconds.

    Now, in the case of mtip32xx, it actually doesn't support flush/fua, so
    I think we can simply return without doing anything. In addition, no
    other mq-enabled driver does anything with the request_index passed into
    init_request(), so no other driver is affected. However, I'm not really
    sure what is expected of drivers. Ming, what did you envision drivers
    would do when initializing the flush requests?

    Signed-off-by: Jeff Moyer
    Signed-off-by: Jens Axboe

    Jeff Moyer
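
The out-of-bounds access described above, and the "simply return without doing anything" approach the message proposes, can be sketched as follows. `DEMO_MAX_DEPTH`, `struct demo_port` and `demo_init_cmd()` are hypothetical names for this example and do not reproduce the driver's structures.

```c
#define DEMO_MAX_DEPTH 4   /* hypothetical queue depth */

/* Hypothetical per-command state kept in a fixed-size array. */
struct demo_cmd {
    int initialized;
};

struct demo_port {
    struct demo_cmd cmds[DEMO_MAX_DEPTH];
};

/* Init hook in the spirit of .init_request(). An index at or beyond
 * the queue depth (such as the one used for a flush request) is
 * skipped instead of being used to index past the end of the array. */
static int demo_init_cmd(struct demo_port *port, unsigned int request_idx)
{
    if (request_idx >= DEMO_MAX_DEPTH)
        return 0;   /* nothing to set up for this request */

    port->cmds[request_idx].initialized = 1;
    return 0;
}
```

Without the check, a call with `request_idx == DEMO_MAX_DEPTH` would write one element past `cmds`, which is the kind of access the message says leads to an oops or random memory corruption.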
     

24 Jun, 2015

1 commit

  • In mtip_pci_remove(), driver data 'dd' is accessed after freeing it. This
    is a residue of SRSI code cleanup in the patch 016a41c38821 "mtip32xx: fix
    crash on surprise removal of the drive". Removed the bit flags
    MTIP_DDF_REMOVE_DONE_BIT and MTIP_PF_SR_CLEANUP_BIT.

    Reported-by: Julia Lawall
    Signed-off-by: Vignesh Gunasekaran
    Signed-off-by: Selvan Mani
    Signed-off-by: Asai Thambi S P
    Signed-off-by: Jens Axboe

    Selvan Mani
     

16 Jun, 2015

3 commits