03 Apr, 2014

2 commits

  • Pull networking updates from David Miller:
    "Here is my initial pull request for the networking subsystem during
    this merge window:

    1) Support for ESN in AH (RFC 4302) from Fan Du.

    2) Add full kernel doc for ethtool command structures, from Ben
    Hutchings.

    3) Add BCM7xxx PHY driver, from Florian Fainelli.

    4) Export computed TCP rate information in netlink socket dumps, from
    Eric Dumazet.

    5) Allow IPSEC SA to be dumped partially using a filter, from Nicolas
    Dichtel.

    6) Convert many drivers to pci_enable_msix_range(), from Alexander
    Gordeev.

    7) Record SKB timestamps more efficiently, from Eric Dumazet.

    8) Switch to microsecond resolution for TCP round trip times, also
    from Eric Dumazet.

    9) Clean up and fix 6lowpan fragmentation handling by making use of
    the existing inet_frag api for its implementation.

    10) Add TX grant mapping to xen-netback driver, from Zoltan Kiss.

    11) Auto size SKB lengths when composing netlink messages based upon
    past message sizes used, from Eric Dumazet.

    12) qdisc dumps can take a long time, add a cond_resched(), from Eric
    Dumazet.

    13) Sanitize netpoll core and drivers wrt. SKB handling semantics.
    Get rid of never-used-in-tree netpoll RX handling. From Eric W
    Biederman.

    14) Support inter-address-family and namespace changing in VTI tunnel
    driver(s). From Steffen Klassert.

    15) Add Altera TSE driver, from Vince Bridgers.

    16) Optimizing csum_replace2() so that it doesn't adjust the checksum
    by checksumming the entire header, from Eric Dumazet.

    17) Expand BPF internal implementation for faster interpreting, more
    direct translations into JIT'd code, and much cleaner uses of BPF
    filtering in non-socket contexts. From Daniel Borkmann and Alexei
    Starovoitov"

    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1976 commits)
    netpoll: Use skb_irq_freeable to make zap_completion_queue safe.
    net: Add a test to see if a skb is freeable in irq context
    qlcnic: Fix build failure due to undefined reference to `vxlan_get_rx_port'
    net: ptp: move PTP classifier in its own file
    net: sxgbe: make "core_ops" static
    net: sxgbe: fix logical vs bitwise operation
    net: sxgbe: sxgbe_mdio_register() frees the bus
    Call efx_set_channels() before efx->type->dimension_resources()
    xen-netback: disable rogue vif in kthread context
    net/mlx4: Set proper build dependancy with vxlan
    be2net: fix build dependency on VxLAN
    mac802154: make csma/cca parameters per-wpan
    mac802154: allow only one WPAN to be up at any given time
    net: filter: minor: fix kdoc in __sk_run_filter
    netlink: don't compare the nul-termination in nla_strcmp
    can: c_can: Avoid led toggling for every packet.
    can: c_can: Simplify TX interrupt cleanup
    can: c_can: Store dlc private
    can: c_can: Reduce register access
    can: c_can: Make the code readable
    ...

    Linus Torvalds
     
  • Pull trivial tree updates from Jiri Kosina:
    "Usual rocket science -- mostly documentation and comment updates"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
    sparse: fix comment
    doc: fix double words
    isdn: capi: fix "CAPI_VERSION" comment
    doc: DocBook: Fix typos in xml and template file
    Bluetooth: add module name for btwilink
    driver core: unexport static function create_syslog_header
    mmc: core: typo fix in printk specifier
    ARM: spear: clean up editing mistake
    net-sysfs: fix comment typo 'CONFIG_SYFS'
    doc: Insert MODULE_ in module-signing macros
    Documentation: update URL to hfsplus Technote 1150
    gpio: update path to documentation
    ixgbe: Fix format string in ixgbe_fcoe.
    Kconfig: Remove useless "default N" lines
    user_namespace.c: Remove duplicated word in comment
    CREDITS: fix formatting
    treewide: Fix typo in Documentation/DocBook
    mm: Fix warning on make htmldocs caused by slab.c
    ata: ata-samsung_cf: cleanup in header file
    idr: remove unused prototype of idr_free()

    Linus Torvalds
     

02 Apr, 2014

1 commit

  • Pull core block layer updates from Jens Axboe:
    "This is the pull request for the core block IO bits for the 3.15
    kernel. It's a smaller round this time, it contains:

    - Various little blk-mq fixes and additions from Christoph and
    myself.

    - Cleanup of the IPI usage from the block layer, and associated
    helper code. From Frederic Weisbecker and Jan Kara.

    - Duplicate code cleanup in bio-integrity from Gu Zheng. This will
    give you a merge conflict, but that should be easy to resolve.

    - blk-mq notify spinlock fix for RT from Mike Galbraith.

    - A blktrace partial accounting bug fix from Roman Pen.

    - Missing REQ_SYNC detection fix for blk-mq from Shaohua Li"

    * 'for-3.15/core' of git://git.kernel.dk/linux-block: (25 commits)
    blk-mq: add REQ_SYNC early
    rt,blk,mq: Make blk_mq_cpu_notify_lock a raw spinlock
    blk-mq: support partial I/O completions
    blk-mq: merge blk_mq_insert_request and blk_mq_run_request
    blk-mq: remove blk_mq_alloc_rq
    blk-mq: don't dump CPU -> hw queue map on driver load
    blk-mq: fix wrong usage of hctx->state vs hctx->flags
    blk-mq: allow blk_mq_init_commands() to return failure
    block: remove old blk_iopoll_enabled variable
    blktrace: fix accounting of partially completed requests
    smp: Rename __smp_call_function_single() to smp_call_function_single_async()
    smp: Remove wait argument from __smp_call_function_single()
    watchdog: Simplify a little the IPI call
    smp: Move __smp_call_function_single() below its safe version
    smp: Consolidate the various smp_call_function_single() declensions
    smp: Teach __smp_call_function_single() to check for offline cpus
    smp: Remove unused list_head from csd
    smp: Iterate functions through llist_for_each_entry_safe()
    block: Stop abusing rq->csd.list in blk-softirq
    block: Remove useless IPI struct initialization
    ...

    Linus Torvalds
     

26 Mar, 2014

1 commit


21 Mar, 2014

7 commits

  • Add REQ_SYNC early, so rq_dispatched[] in blk_mq_rq_ctx_init
    is set correctly.

    Signed-off-by: Shaohua Li
    Signed-off-by: Jens Axboe

    Shaohua Li
     
  • [ 365.164040] BUG: sleeping function called from invalid context at kernel/rtmutex.c:674
    [ 365.164041] in_atomic(): 1, irqs_disabled(): 1, pid: 26, name: migration/1
    [ 365.164043] no locks held by migration/1/26.
    [ 365.164044] irq event stamp: 6648
    [ 365.164056] hardirqs last enabled at (6647): [] restore_args+0x0/0x30
    [ 365.164062] hardirqs last disabled at (6648): [] multi_cpu_stop+0x9d/0x120
    [ 365.164070] softirqs last enabled at (0): [] copy_process.part.28+0x6fc/0x1920
    [ 365.164072] softirqs last disabled at (0): [< (null)>] (null)
    [ 365.164076] CPU: 1 PID: 26 Comm: migration/1 Tainted: GF N 3.12.12-rt19-0.gcb6c4a2-rt #3
    [ 365.164078] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.S013.032920111005 03/29/2011
    [ 365.164091] 0000000000000001 ffff880a42ea7c30 ffffffff815367e6 ffffffff81a086c0
    [ 365.164099] ffff880a42ea7c40 ffffffff8108919c ffff880a42ea7c60 ffffffff8153c24f
    [ 365.164107] ffff880a42ea91f0 00000000ffffffe1 ffff880a42ea7c88 ffffffff81297ec0
    [ 365.164108] Call Trace:
    [ 365.164119] [] try_stack_unwind+0x191/0x1a0
    [ 365.164127] [] dump_trace+0x92/0x360
    [ 365.164133] [] show_trace_log_lvl+0x48/0x60
    [ 365.164138] [] show_stack_log_lvl+0xd8/0x1d0
    [ 365.164143] [] show_stack+0x20/0x50
    [ 365.164153] [] dump_stack+0x54/0x9a
    [ 365.164163] [] __might_sleep+0xfc/0x140
    [ 365.164173] [] rt_spin_lock+0x1f/0x70
    [ 365.164182] [] blk_mq_main_cpu_notify+0x20/0x70
    [ 365.164191] [] notifier_call_chain+0x4c/0x70
    [ 365.164201] [] __raw_notifier_call_chain+0x9/0x10
    [ 365.164207] [] cpu_notify+0x1e/0x40
    [ 365.164217] [] take_cpu_down+0x22/0x40
    [ 365.164223] [] multi_cpu_stop+0xd6/0x120
    [ 365.164229] [] cpu_stopper_thread+0xd7/0x1e0
    [ 365.164235] [] smpboot_thread_fn+0x203/0x380
    [ 365.164241] [] kthread+0xc8/0xd0
    [ 365.164250] [] ret_from_fork+0x7c/0xb0
    [ 365.164429] smpboot: CPU 1 is now offline

    Signed-off-by: Mike Galbraith
    Signed-off-by: Jens Axboe

    Mike Galbraith
     
  • Add a new blk_mq_end_io_partial function to partially complete requests
    as needed by the SCSI layer. We do this by reusing blk_update_request
    to advance the bio instead of having a simplified version of it in
    the blk-mq code.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • It's almost identical to blk_mq_insert_request, so fold the two into one
    slightly more generic function by making the flush special case a bit
    smarter.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • There's only one caller, which is a straight wrapper and fits the naming
    scheme of the related functions a lot better.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Commit 7982e90c3a57 ("block: fix q->flush_rq NULL pointer crash on
    dm-mpath flush") moved an allocation to blk_init_allocated_queue(), but
    neglected to free that allocation on the error paths that follow.

    Signed-off-by: Dave Jones
    Acked-by: Mike Snitzer
    Signed-off-by: Jens Axboe
    Signed-off-by: Linus Torvalds

    Dave Jones
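The leak fixed by the commit above follows a common shape: an allocation made early in an init function must be released on every later failure path. A minimal userspace sketch of that shape, assuming a hypothetical `demo_queue` structure and a `step_fails` parameter that simulates a later init step failing (neither is kernel code):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a queue that owns a preallocated flush request. */
struct demo_queue {
    void *flush_rq;
};

/*
 * Allocate flush_rq, then run a later init step.
 * step_fails simulates that later step failing.
 * Returns 0 on success, -1 on failure; on failure nothing stays allocated.
 */
static int demo_init_allocated_queue(struct demo_queue *q, int step_fails)
{
    q->flush_rq = malloc(64);
    if (!q->flush_rq)
        return -1;

    if (step_fails)
        goto fail;

    return 0;

fail:
    /* The error path must undo the allocation made above. */
    free(q->flush_rq);
    q->flush_rq = NULL;
    return -1;
}
```

Calling `demo_init_allocated_queue(&q, 1)` returns -1 and leaves `q.flush_rq` as NULL, so the caller has nothing to clean up.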
     
  • Now that we are out of initial debug/bringup mode, remove
    the verbose dump of the mapping table.

    Provide the mapping table in sysfs, under the hardware queue
    directory, in the cpu_list file.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

20 Mar, 2014

1 commit

  • BLK_MQ_F_* flags are for hctx->flags, and are non-atomic and
    set at registration time. BLK_MQ_S_* flags are dynamic and
    atomic, and are accessed through hctx->state.

    Some of the BLK_MQ_S_STOPPED uses were wrong. Additionally,
    the header file should not use a bit shift for the _S_ flags,
    as they are done through the set/test_bit functions.

    Signed-off-by: Jens Axboe

    Jens Axboe
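The distinction drawn in the commit above, masks for `flags` versus bit numbers for `state`, can be sketched in userspace C. All `demo_` names are illustrative assumptions, and the helpers below are plain, non-atomic stand-ins for the kernel's set_bit()/test_bit():

```c
#include <assert.h>
#include <stdbool.h>

/* Registration-time flags: plain masks, OR-ed into flags. */
enum {
    DEMO_F_SHOULD_MERGE = 1 << 0,
    DEMO_F_SHOULD_SORT  = 1 << 1,
};

/* Run-time state: bit numbers (not masks), passed to the bit helpers. */
enum {
    DEMO_S_STOPPED = 0,
    DEMO_S_BUSY    = 1,
};

struct demo_hctx {
    unsigned long flags; /* set once at registration time */
    unsigned long state; /* changed dynamically through the bit helpers */
};

/* Non-atomic userspace stand-in: sets bit number nr in *addr. */
static void demo_set_bit(unsigned int nr, unsigned long *addr)
{
    *addr |= 1UL << nr;
}

/* Non-atomic userspace stand-in: tests bit number nr in *addr. */
static bool demo_test_bit(unsigned int nr, const unsigned long *addr)
{
    return (*addr >> nr) & 1UL;
}
```

If a state value were defined as a mask such as `1 << 1` and then passed to a bit helper, the helper would touch bit 2 instead of bit 1. That is why the state constants are plain bit numbers.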
     

15 Mar, 2014

2 commits

  • Replace the bh safe variant with the hard irq safe variant.

    We need a hard irq safe variant to deal with netpoll transmitting
    packets from hard irq context, and we need it in most if not all of
    the places using the bh safe variant.

    Except on 32bit uni-processor the code is exactly the same so don't
    bother with a bh variant, just have a hard irq safe variant that
    everyone can use.

    Signed-off-by: "Eric W. Biederman"
    Signed-off-by: David S. Miller

    Eric W. Biederman
     
  • If drivers do dynamic allocation in the hardware command init
    path, then we need to be able to handle and return failures.

    And if they do allocations or mappings in the init command path,
    then we need a cleanup function to free up that space at exit
    time. So add blk_mq_free_commands() as the cleanup function.

    This is required for the mtip32xx driver conversion to blk-mq.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

13 Mar, 2014

1 commit

  • This was a debugging measure to toggle enabled/disabled
    when testing. But for real production setups, it's not
    safe to toggle this setting without either reloading
    drivers or quiescing IO first. Neither of which the toggle
    enforces.

    Additionally, it makes drivers deal with the conditional
    state.

    Remove it completely. It's up to the driver whether iopoll
    is enabled or not.

    Signed-off-by: Jens Axboe

    Jens Axboe
     

09 Mar, 2014

2 commits

  • Commit 18741986 inadvertently changed the rq flush insertion
    from a head to a tail insertion. Fix that back up.

    Signed-off-by: Mike Snitzer
    Signed-off-by: Jens Axboe

    Mike Snitzer
     
  • Commit 1874198 ("blk-mq: rework flush sequencing logic") switched
    ->flush_rq from being an embedded member of the request_queue structure
    to being dynamically allocated in blk_init_queue_node().

    Request-based DM multipath doesn't use blk_init_queue_node(), instead it
    uses blk_alloc_queue_node() + blk_init_allocated_queue(). Because
    commit 1874198 placed the dynamic allocation of ->flush_rq in
    blk_init_queue_node() any flush issued to a dm-mpath device would crash
    with a NULL pointer, e.g.:

    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [] blk_rq_init+0x1e/0xb0
    PGD bb3c7067 PUD bb01d067 PMD 0
    Oops: 0002 [#1] SMP
    ...
    CPU: 5 PID: 5028 Comm: dt Tainted: G W O 3.14.0-rc3.snitm+ #10
    ...
    task: ffff88032fb270e0 ti: ffff880079564000 task.ti: ffff880079564000
    RIP: 0010:[] [] blk_rq_init+0x1e/0xb0
    RSP: 0018:ffff880079565c98 EFLAGS: 00010046
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000030
    RDX: ffff880260c74048 RSI: 0000000000000000 RDI: 0000000000000000
    RBP: ffff880079565ca8 R08: ffff880260aa1e98 R09: 0000000000000001
    R10: ffff88032fa78500 R11: 0000000000000246 R12: 0000000000000000
    R13: ffff880260aa1de8 R14: 0000000000000650 R15: 0000000000000000
    FS: 00007f8d36a2a700(0000) GS:ffff88033fca0000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 0000000000000000 CR3: 0000000079b36000 CR4: 00000000000007e0
    Stack:
    0000000000000000 ffff880260c74048 ffff880079565cd8 ffffffff81257a47
    ffff880260aa1de8 ffff880260c74048 0000000000000001 0000000000000000
    ffff880079565d08 ffffffff81257c2d 0000000000000000 ffff880260aa1de8
    Call Trace:
    [] blk_flush_complete_seq+0x2d7/0x2e0
    [] blk_insert_flush+0x1dd/0x210
    [] __elv_add_request+0x1f9/0x320
    [] ? blk_account_io_start+0x111/0x190
    [] blk_queue_bio+0x25b/0x330
    [] dm_request+0x35/0x40 [dm_mod]
    [] generic_make_request+0xc0/0x100
    [] submit_bio+0x73/0x140
    [] submit_bio_wait+0x5d/0x80
    [] blkdev_issue_flush+0x78/0xa0
    [] blkdev_fsync+0x3f/0x60
    [] vfs_fsync_range+0x1e/0x20
    [] vfs_fsync+0x1c/0x20
    [] do_fsync+0x41/0x80
    [] ? SyS_lseek+0x7e/0x80
    [] SyS_fsync+0x10/0x20
    [] system_call_fastpath+0x16/0x1b

    Fix this by moving the ->flush_rq allocation from blk_init_queue_node()
    to blk_init_allocated_queue(). blk_init_queue_node() also calls
    blk_init_allocated_queue() so this change is functionally equivalent
    for all blk_init_queue_node() callers.

    Reported-by: Hannes Reinecke
    Reported-by: Christoph Hellwig
    Signed-off-by: Mike Snitzer
    Signed-off-by: Jens Axboe

    Mike Snitzer
     

07 Mar, 2014

1 commit


06 Mar, 2014

1 commit

    trace_block_rq_complete does not take into account that a request can
    be partially completed, so we can get the following incorrect output
    of blkparser:

    C R 232 + 240 [0]
    C R 240 + 232 [0]
    C R 248 + 224 [0]
    C R 256 + 216 [0]

    but should be:

    C R 232 + 8 [0]
    C R 240 + 8 [0]
    C R 248 + 8 [0]
    C R 256 + 8 [0]

    Also, the whole output summary statistics of completed requests and
    final throughput will be incorrect.

    This patch takes into account real completion size of the request and
    fixes wrong completion accounting.

    Signed-off-by: Roman Pen
    CC: Steven Rostedt
    CC: Frederic Weisbecker
    CC: Ingo Molnar
    CC: linux-kernel@vger.kernel.org
    Cc: stable@kernel.org
    Signed-off-by: Jens Axboe

    Roman Pen
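The accounting difference described above can be sketched in userspace C: each completion event should record the sectors actually completed, not the size still remaining on the request. The `demo_req` and `demo_trace` types are hypothetical and only model the sector arithmetic, not the tracepoint itself:

```c
#include <assert.h>

/* Hypothetical request: current start sector and sectors still outstanding. */
struct demo_req {
    unsigned long long sector;
    unsigned int nr_sectors;
};

/* Hypothetical trace record: one "C" line as start sector + size. */
struct demo_trace {
    unsigned long long sector;
    unsigned int nr_sectors;
};

/*
 * Complete `done` sectors of the request and return the trace record.
 * The record carries the completed size, then the request is advanced.
 */
static struct demo_trace demo_complete(struct demo_req *rq, unsigned int done)
{
    struct demo_trace ev;

    if (done > rq->nr_sectors)
        done = rq->nr_sectors;

    ev.sector = rq->sector;
    ev.nr_sectors = done; /* real completion size, not rq->nr_sectors */

    rq->sector += done;
    rq->nr_sectors -= done;
    return ev;
}
```

Starting from a request at sector 232 with 240 sectors outstanding, two 8-sector completions yield records `232 + 8` and `240 + 8`, matching the corrected output shown in the commit message.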
     

04 Mar, 2014

1 commit

  • [ 365.164040] BUG: sleeping function called from invalid context at kernel/rtmutex.c:674
    [ 365.164041] in_atomic(): 1, irqs_disabled(): 1, pid: 26, name: migration/1
    [ 365.164043] no locks held by migration/1/26.
    [ 365.164044] irq event stamp: 6648
    [ 365.164056] hardirqs last enabled at (6647): [] restore_args+0x0/0x30
    [ 365.164062] hardirqs last disabled at (6648): [] multi_cpu_stop+0x9d/0x120
    [ 365.164070] softirqs last enabled at (0): [] copy_process.part.28+0x6fc/0x1920
    [ 365.164072] softirqs last disabled at (0): [< (null)>] (null)
    [ 365.164076] CPU: 1 PID: 26 Comm: migration/1 Tainted: GF N 3.12.12-rt19-0.gcb6c4a2-rt #3
    [ 365.164078] Hardware name: QCI QSSC-S4R/QSSC-S4R, BIOS QSSC-S4R.QCI.01.00.S013.032920111005 03/29/2011
    [ 365.164091] 0000000000000001 ffff880a42ea7c30 ffffffff815367e6 ffffffff81a086c0
    [ 365.164099] ffff880a42ea7c40 ffffffff8108919c ffff880a42ea7c60 ffffffff8153c24f
    [ 365.164107] ffff880a42ea91f0 00000000ffffffe1 ffff880a42ea7c88 ffffffff81297ec0
    [ 365.164108] Call Trace:
    [ 365.164119] [] try_stack_unwind+0x191/0x1a0
    [ 365.164127] [] dump_trace+0x92/0x360
    [ 365.164133] [] show_trace_log_lvl+0x48/0x60
    [ 365.164138] [] show_stack_log_lvl+0xd8/0x1d0
    [ 365.164143] [] show_stack+0x20/0x50
    [ 365.164153] [] dump_stack+0x54/0x9a
    [ 365.164163] [] __might_sleep+0xfc/0x140
    [ 365.164173] [] rt_spin_lock+0x1f/0x70
    [ 365.164182] [] blk_mq_main_cpu_notify+0x20/0x70
    [ 365.164191] [] notifier_call_chain+0x4c/0x70
    [ 365.164201] [] __raw_notifier_call_chain+0x9/0x10
    [ 365.164207] [] cpu_notify+0x1e/0x40
    [ 365.164217] [] take_cpu_down+0x22/0x40
    [ 365.164223] [] multi_cpu_stop+0xd6/0x120
    [ 365.164229] [] cpu_stopper_thread+0xd7/0x1e0
    [ 365.164235] [] smpboot_thread_fn+0x203/0x380
    [ 365.164241] [] kthread+0xc8/0xd0
    [ 365.164250] [] ret_from_fork+0x7c/0xb0
    [ 365.164429] smpboot: CPU 1 is now offline

    Signed-off-by: Mike Galbraith
    Signed-off-by: Jens Axboe

    Mike Galbraith
     

25 Feb, 2014

4 commits

  • The name __smp_call_function_single() doesn't tell much about the
    properties of this function, especially when compared to
    smp_call_function_single().

    The comments above the implementation are also misleading. The main
    point of this function is actually not to be able to embed the csd
    in an object. This is actually a requirement that results from the
    purpose of this function which is to raise an IPI asynchronously.

    As such it can be called with interrupts disabled. And this feature
    comes at the cost of the caller who then needs to serialize the
    IPIs on this csd.

    Let's rename the function and enhance the comments so that they reflect
    these properties.

    Suggested-by: Christoph Hellwig
    Cc: Andrew Morton
    Cc: Christoph Hellwig
    Cc: Ingo Molnar
    Cc: Jan Kara
    Cc: Jens Axboe
    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Jens Axboe

    Frederic Weisbecker
     
  • The main point of calling __smp_call_function_single() is to send
    an IPI in a pure asynchronous way. By embedding a csd in an object,
    a caller can send the IPI without waiting for a previous one to complete
    as is required by smp_call_function_single() for example. As such,
    sending this kind of IPI can be safe even when irqs are disabled.

    This flexibility comes at the expense of the caller who then needs to
    synchronize the csd lifecycle by himself and make sure that IPIs on a
    single csd are serialized.

    This is how __smp_call_function_single() works when wait = 0 and this
    usecase is relevant.

    Now there doesn't seem to be any usecase with wait = 1 that can't be
    covered by smp_call_function_single() instead, which is safer. Let's look
    at the two possible scenarios:

    1) The user calls __smp_call_function_single(wait = 1) on a csd embedded
    in an object. It looks like a nice and convenient pattern at first
    sight because we can then retrieve the object from the IPI handler easily.

    But actually it is a waste of memory space in the object since the csd
    can be allocated from the stack by smp_call_function_single(wait = 1)
    and the object can be passed as the IPI argument.

    Besides that, embedding the csd in an object is more error prone
    because the caller must take care of the serialization of the IPIs
    for this csd.

    2) The user calls __smp_call_function_single(wait = 1) on a csd that
    is allocated on the stack. It's ok but smp_call_function_single()
    can do it as well and it already takes care of the allocation on the
    stack. Again it's simpler and less error prone.

    Therefore, using the underscore-prefixed API version with wait = 1
    is a bad pattern and a sign that the caller can do something safer
    and simpler.

    There was a single user of that which has just been converted.
    So let's remove this option to discourage further users.

    Cc: Andrew Morton
    Cc: Christoph Hellwig
    Cc: Ingo Molnar
    Cc: Jan Kara
    Cc: Jens Axboe
    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Jens Axboe

    Frederic Weisbecker
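The two ways a handler can reach its object, as contrasted in the commit above, can be sketched in userspace C. This is only an illustration under stated assumptions: `demo_csd`, `demo_obj`, and `demo_run()` are hypothetical, and `demo_run()` simply invokes the callback synchronously instead of raising an IPI:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical call descriptor: callback plus its argument. */
struct demo_csd {
    void (*func)(void *info);
    void *info;
};

/* Hypothetical object that embeds a descriptor (scenario 1). */
struct demo_obj {
    int hits;
    struct demo_csd csd;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define demo_container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Scenario 1: info points at the embedded descriptor. */
static void demo_handler_embedded(void *info)
{
    struct demo_obj *obj = demo_container_of(info, struct demo_obj, csd);
    obj->hits++;
}

/* Scenario 2: the object itself is passed as the argument. */
static void demo_handler_arg(void *info)
{
    struct demo_obj *obj = info;
    obj->hits++;
}

/* Stand-in for delivering the call: runs the callback right away. */
static void demo_run(struct demo_csd *csd)
{
    csd->func(csd->info);
}
```

In the second scenario the descriptor can live on the caller's stack, so the object does not need to reserve space for it.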
     
  • Abusing rq->csd.list for a list of requests to complete is rather ugly.
    We use rq->queuelist instead which is much cleaner. It is safe because
    queuelist is used by the block layer only for requests waiting to be
    submitted to a device. Thus it is unused when irq reports the request IO
    is finished.

    Signed-off-by: Jan Kara
    Cc: Andrew Morton
    Cc: Christoph Hellwig
    Cc: Ingo Molnar
    Cc: Jens Axboe
    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Jens Axboe

    Jan Kara
     
  • Block layer currently abuses rq->csd.list.next for storing fifo_time.
    That is a terrible hack and completely unnecessary as well. Union
    achieves the same space saving in a cleaner way.

    Signed-off-by: Jan Kara
    Cc: Andrew Morton
    Cc: Christoph Hellwig
    Cc: Ingo Molnar
    Cc: Jens Axboe
    Signed-off-by: Frederic Weisbecker
    Signed-off-by: Jens Axboe

    Jan Kara
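A minimal sketch of the union approach named above, assuming a simplified layout: `demo_csd` and `demo_request` are hypothetical and do not reproduce the real request structure. Two fields that are never live at the same time share storage through an anonymous union (C11):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical completion descriptor, needed only while completing. */
struct demo_csd {
    void *next;
    void (*func)(void *info);
    void *info;
};

/* Hypothetical request: csd and fifo_time occupy the same storage. */
struct demo_request {
    union {
        struct demo_csd csd;     /* used once the request completes */
        unsigned long fifo_time; /* used only while the request is queued */
    };
    int tag;
};
```

Both members start at the same offset, so the space saving matches the old trick of storing the time in a list pointer, but each use has its own correctly typed name.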
     

22 Feb, 2014

3 commits


20 Feb, 2014

1 commit


19 Feb, 2014

3 commits


15 Feb, 2014

1 commit

  • Pull block IO fixes from Jens Axboe:
    "Second round of updates and fixes for 3.14-rc2. Most of this stuff
    has been queued up for a while. The notable exception is the blk-mq
    changes, which are naturally a bit more in flux still.

    The pull request contains:

    - Two bug fixes for the new immutable vecs, causing crashes with raid
    or swap. From Kent.

    - Various blk-mq tweaks and fixes from Christoph. A fix for
    integrity bio's from Nic.

    - A few bcache fixes from Kent and Darrick Wong.

    - xen-blk{front,back} fixes from David Vrabel, Matt Rushton, Nicolas
    Swenson, and Roger Pau Monne.

    - Fix for a vec miscount with integrity vectors from Martin.

    - Minor annotations or fixes from Masanari Iida and Rashika Kheria.

    - Tweak to null_blk to do more normal FIFO processing of requests
    from Shlomo Pongratz.

    - Elevator switching bypass fix from Tejun.

    - Softlockup in blkdev_issue_discard() fix when !CONFIG_PREEMPT from
    me"

    * 'for-linus' of git://git.kernel.dk/linux-block: (31 commits)
    block: add cond_resched() to potentially long running ioctl discard loop
    xen-blkback: init persistent_purge_work work_struct
    blk-mq: pair blk_mq_start_request / blk_mq_requeue_request
    blk-mq: dont assume rq->errors is set when returning an error from ->queue_rq
    block: Fix cloning of discard/write same bios
    block: Fix type mismatch in ssize_t_blk_mq_tag_sysfs_show
    blk-mq: rework flush sequencing logic
    null_blk: use blk_complete_request and blk_mq_complete_request
    virtio_blk: use blk_mq_complete_request
    blk-mq: rework I/O completions
    fs: Add prototype declaration to appropriate header file include/linux/bio.h
    fs: Mark function as static in fs/bio-integrity.c
    block/null_blk: Fix completion processing from LIFO to FIFO
    block: Explicitly handle discard/write same segments
    block: Fix nr_vecs for inline integrity vectors
    blk-mq: Add bio_integrity setup to blk_mq_make_request
    blk-mq: initialize sg_reserved_size
    blk-mq: handle dma_drain_size
    blk-mq: divert __blk_put_request for MQ ops
    blk-mq: support at_head inserations for blk_execute_rq
    ...

    Linus Torvalds
     

13 Feb, 2014

1 commit

  • When mkfs issues a full device discard and the device only
    supports discards of a smallish size, we can loop in
    blkdev_issue_discard() for a long time. If preempt isn't enabled,
    this can turn into a soft lockup situation and the kernel will
    start complaining.

    Add an explicit cond_resched() at the end of the loop to avoid
    that.

    Cc: stable@kernel.org
    Signed-off-by: Jens Axboe

    Jens Axboe
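The loop shape described above can be sketched in userspace C. This is an illustration only: `demo_issue_discard()` is hypothetical, no I/O is submitted, and sched_yield() is used as a rough userspace stand-in for cond_resched(), which behaves differently inside the kernel:

```c
#include <assert.h>
#include <sched.h>

/*
 * Walk a discard of nr_sects sectors in chunks of at most max_chunk
 * sectors and return how many chunks were needed.
 */
static unsigned long demo_issue_discard(unsigned long long nr_sects,
                                        unsigned int max_chunk)
{
    unsigned long chunks = 0;

    if (max_chunk == 0)
        return 0;

    while (nr_sects > 0) {
        unsigned long long this_chunk =
            nr_sects < max_chunk ? nr_sects : max_chunk;

        /* A real implementation would submit one discard here. */
        nr_sects -= this_chunk;
        chunks++;

        /* Give other tasks a chance to run before the next iteration. */
        sched_yield();
    }
    return chunks;
}
```

With a small `max_chunk` and a whole-device range the iteration count grows large, which is why the yield sits at the end of every pass through the loop.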
     

12 Feb, 2014

2 commits

  • Make sure we have a proper pairing between starting and requeueing
    requests. Move the dma drain and REQ_END setup into blk_mq_start_request,
    and make sure blk_mq_requeue_request properly undoes them, giving us
    a pair of functions to prepare and unprepare a request without leaving
    side effects.

    Together this ensures we always clean up properly after
    BLK_MQ_RQ_QUEUE_BUSY returns from ->queue_rq.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • rq->errors never has been part of the communication protocol between drivers
    and the block stack and most drivers will not have initialized it.

    Return -EIO to upper layers when the driver returns BLK_MQ_RQ_QUEUE_ERROR
    unconditionally. If a driver wants to return a different error it can easily
    do so by returning success after calling blk_mq_end_io itself.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
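The rule stated above, report -EIO for a queueing error regardless of what the request's error field holds, can be sketched as follows. The enum and struct are hypothetical stand-ins for illustration, not the kernel's definitions:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical return codes from a driver's queueing callback. */
enum demo_queue_rq_ret {
    DEMO_QUEUE_OK,
    DEMO_QUEUE_BUSY,
    DEMO_QUEUE_ERROR,
};

/* Hypothetical request; errors may never have been set by the driver. */
struct demo_rq {
    int errors;
};

/*
 * Map the callback's return code to the error reported to upper layers.
 * Returns 0 when there is nothing to report at this point.
 */
static int demo_dispatch_result(enum demo_queue_rq_ret ret,
                                const struct demo_rq *rq)
{
    (void)rq; /* rq->errors is deliberately ignored */

    switch (ret) {
    case DEMO_QUEUE_ERROR:
        return -EIO;
    case DEMO_QUEUE_BUSY:
    case DEMO_QUEUE_OK:
    default:
        return 0;
    }
}
```

A driver that needs a different error code would, per the commit message, complete the request itself and return success from the callback.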
     

11 Feb, 2014

3 commits

    cppcheck detected the following format string mismatch.
    [blk-mq-tag.c:201]: (warning) %u in format string (no. 1) requires
    'unsigned int' but the argument type is 'int'.

    Change "cpu" from int to unsigned int, because the cpu
    number is never negative.

    Signed-off-by: Masanari Iida
    Signed-off-by: Jens Axboe

    Masanari Iida
     
    Switch to using a preallocated flush_rq for blk-mq similar to what's done
    with the old request path. This allows us to set up the request properly
    with a tag from the actually allowed range and ->rq_disk as needed by
    some drivers. To make life easier we also switch to dynamic allocation
    of ->flush_rq for the old path.

    This effectively reverts most of

    "blk-mq: fix for flush deadlock"

    and

    "blk-mq: Don't reserve a tag for flush request"

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Rework I/O completions to work more like the old code path. blk_mq_end_io
    now stays out of the business of deferring completions to other CPUs
    and calling blk_mark_rq_complete. The latter is very important to allow
    completing requests that have timed out and thus are already marked completed,
    the former allows using the IPI callout even for driver specific completions
    instead of having to reimplement them.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

08 Feb, 2014

2 commits

  • Immutable biovecs changed the way biovecs are interpreted - drivers no
    longer use bi_vcnt, they have to go by bi_iter.bi_size (to allow for
    using part of an existing segment without modifying it).

    This breaks with discards and write_same bios, since for those bi_size
    has nothing to do with segments in the biovec. So for now, we need a
    fairly gross hack - we fortunately know that there will never be more
    than one segment for the entire request, so we can special case
    discard/write_same.

    Signed-off-by: Kent Overstreet
    Tested-by: Hugh Dickins
    Signed-off-by: Jens Axboe

    Kent Overstreet
     
  • This patch adds the missing bio_integrity_enabled() +
    bio_integrity_prep() setup into blk_mq_make_request()
    in order to use DIF protection with scsi-mq.

    Cc: Martin K. Petersen
    Signed-off-by: Nicholas Bellinger
    Signed-off-by: Jens Axboe

    Nicholas Bellinger