04 May, 2017

1 commit

  • Pull MD updates from Shaohua Li:

    - Add the Partial Parity Log (ppl) feature found in Intel IMSM raid
    arrays, by Artur Paszkiewicz. This feature is another way to close
    the RAID5 write hole. The Linux implementation is also available for
    normal RAID5 arrays if a specific superblock bit is set.

    - A number of md-cluster fixes, and enabling of md-cluster array
    resize, from Guoqing Jiang

    - A bunch of patches from Ming Lei and Neil Brown rewriting the MD
    bio handling code. MD no longer directly accesses bio bvecs or
    bi_phys_segments, and it uses the modern bio API for bio splitting.

    - Improve the RAID5 IO pattern for better performance on hard disk
    based RAID5/6 arrays, from me.

    - Several patches from Song Liu to speed up raid5-cache recovery and
    to allow disabling the raid5-cache feature at runtime.

    - Fix a performance regression in raid1 resync from Xiao Ni.

    - Other cleanup and fixes from various people.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md: (84 commits)
    md/raid10: skip spare disk as 'first' disk
    md/raid1: Use a new variable to count flighting sync requests
    md: clear WantReplacement once disk is removed
    md/raid1/10: remove unused queue
    md: handle read-only member devices better.
    md/raid10: wait up frozen array in handle_write_completed
    uapi: fix linux/raid/md_p.h userspace compilation error
    md-cluster: Fix a memleak in an error handling path
    md: support disabling of create-on-open semantics.
    md: allow creation of mdNNN arrays via md_mod/parameters/new_array
    raid5-ppl: use a single mempool for ppl_io_unit and header_page
    md/raid0: fix up bio splitting.
    md/linear: improve bio splitting.
    md/raid5: make chunk_aligned_read() split bios more cleanly.
    md/raid10: simplify handle_read_error()
    md/raid10: simplify the splitting of requests.
    md/raid1: factor out flush_bio_list()
    md/raid1: simplify handle_read_error().
    Revert "block: introduce bio_copy_data_partial"
    md/raid1: simplify alloc_behind_master_bio()
    ...

    Linus Torvalds
     

03 May, 2017

1 commit

  • Pull documentation update from Jonathan Corbet:
    "A reasonably busy cycle for documentation this time around. There is a
    new guide for user-space API documents, rather sparsely populated at
    the moment, but it's a start. Markus improved the infrastructure for
    converting diagrams. Mauro has converted much of the USB documentation
    over to RST. Plus the usual set of fixes, improvements, and tweaks.

    There's a bit more than the usual amount of reaching out of
    Documentation/ to fix comments elsewhere in the tree; I have acks for
    those where I could get them"

    * tag 'docs-4.12' of git://git.lwn.net/linux: (74 commits)
    docs: Fix a couple typos
    docs: Fix a spelling error in vfio-mediated-device.txt
    docs: Fix a spelling error in ioctl-number.txt
    MAINTAINERS: update file entry for HSI subsystem
    Documentation: allow installing man pages to a user defined directory
    Doc/PM: Sync with intel_powerclamp code behavior
    zr364xx.rst: usb/devices is now at /sys/kernel/debug/
    usb.rst: move documentation from proc_usb_info.txt to USB ReST book
    convert philips.txt to ReST and add to media docs
    docs-rst: usb: update old usbfs-related documentation
    arm: Documentation: update a path name
    docs: process/4.Coding.rst: Fix a couple of document refs
    docs-rst: fix usb cross-references
    usb: gadget.h: be consistent at kernel doc macros
    usb: composite.h: fix two warnings when building docs
    usb: get rid of some ReST doc build errors
    usb.rst: get rid of some Sphinx errors
    usb/URB.txt: convert to ReST and update it
    usb/persist.txt: convert to ReST and add to driver-api book
    usb/hotplug.txt: convert to ReST and add to driver-api book
    ...

    Linus Torvalds
     

02 May, 2017

3 commits

  • Pull uaccess unification updates from Al Viro:
    "This is the uaccess unification pile. It's _not_ the end of uaccess
    work, but the next batch of that will go into the next cycle. This one
    mostly takes copy_from_user() and friends out of arch/* and gets the
    zero-padding behaviour in sync for all architectures.

    Dealing with the nocache/writethrough mess is for the next cycle;
    fortunately, that's x86-only. Same for cleanups in iov_iter.c (I am
    sold on access_ok() in there, BTW; just not in this pile), same for
    reducing __copy_... callsites, strn*... stuff, etc. - there will be a
    pile about as large as this one in the next merge window.

    This one sat in -next for weeks. -3KLoC"

    * 'work.uaccess' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (96 commits)
    HAVE_ARCH_HARDENED_USERCOPY is unconditional now
    CONFIG_ARCH_HAS_RAW_COPY_USER is unconditional now
    m32r: switch to RAW_COPY_USER
    hexagon: switch to RAW_COPY_USER
    microblaze: switch to RAW_COPY_USER
    get rid of padding, switch to RAW_COPY_USER
    ia64: get rid of copy_in_user()
    ia64: sanitize __access_ok()
    ia64: get rid of 'segment' argument of __do_{get,put}_user()
    ia64: get rid of 'segment' argument of __{get,put}_user_check()
    ia64: add extable.h
    powerpc: get rid of zeroing, switch to RAW_COPY_USER
    esas2r: don't open-code memdup_user()
    alpha: fix stack smashing in old_adjtimex(2)
    don't open-code kernel_setsockopt()
    mips: switch to RAW_COPY_USER
    mips: get rid of tail-zeroing in primitives
    mips: make copy_from_user() zero tail explicitly
    mips: clean and reorder the forest of macros...
    mips: consolidate __invoke_... wrappers
    ...

    Linus Torvalds
     
  • Shaohua Li
     
  • Pull block layer updates from Jens Axboe:

    - Add BFQ IO scheduler under the new blk-mq scheduling framework. BFQ
    was initially a fork of CFQ, but subsequently changed to implement
    fairness based on B-WF2Q+, a modified variant of WF2Q. BFQ is meant
    to be used on desktop type single drives, providing good fairness.
    From Paolo.

    - Add the Kyber IO scheduler. This is a fully multiqueue-aware
    scheduler, using a scalable token based algorithm that throttles IO
    based on live completion IO stats, similarly to blk-wbt. From Omar.

    - A series from Jan, moving users to separately allocated backing
    devices. This continues the work of separating backing device life
    times, solving various problems with hot removal.

    - A series of updates for lightnvm, mostly from Javier. Includes a
    'pblk' target that exposes an open channel SSD as a physical block
    device.

    - A series of fixes and improvements for nbd from Josef.

    - A series from Omar, removing queue sharing between devices on mostly
    legacy drivers. This helps us clean up other bits, if we know that a
    queue only has a single device backing. This has been overdue for
    more than a decade.

    - Fixes for the blk-stats, and improvements to unify the stats and user
    windows. This both improves blk-wbt, and enables other users to
    register a need to receive IO stats for a device. From Omar.

    - blk-throttle improvements from Shaohua. This provides a scalable
    framework for implementing prioritization - particularly for blk-mq,
    but applicable to any type of block device. The interface is marked
    experimental for now.

    - Bucketized IO stats for IO polling from Stephen Bates. This improves
    efficiency of polled workloads in the presence of mixed block size
    IO.

    - A few fixes for opal, from Scott.

    - A few pulls for NVMe, including a lot of fixes for NVMe-over-fabrics.
    From a variety of folks, mostly Sagi and James Smart.

    - A series from Bart, improving our exposed info and capabilities from
    the blk-mq debugfs support.

    - A series from Christoph, cleaning up how we handle WRITE_ZEROES.

    - A series from Christoph, cleaning up the block layer handling of how
    we track errors in a request. On top of being a nice cleanup, it also
    shrinks the size of struct request a bit.

    - Removal of mg_disk and hd (sorry Linus) by Christoph. The former
    was never used by platforms, and the latter has outlived its
    usefulness.

    - Various little bug fixes and cleanups from a wide variety of folks.

    * 'for-4.12/block' of git://git.kernel.dk/linux-block: (329 commits)
    block: hide badblocks attribute by default
    blk-mq: unify hctx delay_work and run_work
    block: add kblock_mod_delayed_work_on()
    blk-mq: unify hctx delayed_run_work and run_work
    nbd: fix use after free on module unload
    MAINTAINERS: bfq: Add Paolo as maintainer for the BFQ I/O scheduler
    blk-mq-sched: alloate reserved tags out of normal pool
    mtip32xx: use runtime tag to initialize command header
    scsi: Implement blk_mq_ops.show_rq()
    blk-mq: Add blk_mq_ops.show_rq()
    blk-mq: Show operation, cmd_flags and rq_flags names
    blk-mq: Make blk_flags_show() callers append a newline character
    blk-mq: Move the "state" debugfs attribute one level down
    blk-mq: Unregister debugfs attributes earlier
    blk-mq: Only unregister hctxs for which registration succeeded
    blk-mq-debugfs: Rename functions for registering and unregistering the mq directory
    blk-mq: Let blk_mq_debugfs_register() look up the queue name
    blk-mq: Register /queue/mq after having registered /queue
    ide-pm: always pass 0 error to ide_complete_rq in ide_do_devset
    ide-pm: always pass 0 error to __blk_end_request_all
    ...

    Linus Torvalds
     

28 Apr, 2017

4 commits

    Commit 99e6608c9e74 "block: Add badblock management for gendisks"
    allowed drivers like pmem and software raid to advertise a list of
    bad media areas. However, it inadvertently added a 'badblocks'
    attribute to all block devices. Let's clean this up by making the
    'badblocks' attribute not visible when the driver has not populated
    a 'struct badblocks' instance in the gendisk.
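    The shape of the fix can be sketched with a sysfs-style visibility
    hook; the types and the helper name below are illustrative stand-ins,
    not the kernel's symbols:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in types modeling only the relevant fields. */
struct badblocks { int count; };
struct gendisk { struct badblocks *bb; };

typedef unsigned short umode_t;

/* is_visible-style hook: report the attribute's mode only when the
 * driver populated a struct badblocks; returning 0 hides the file. */
static umode_t disk_badblocks_visible(struct gendisk *disk, umode_t mode)
{
	return disk->bb ? mode : 0;
}
```

    Drivers like pmem, which populate gendisk->bb, keep the attribute;
    everyone else no longer exposes a meaningless file.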

    Cc: Jens Axboe
    Cc: Christoph Hellwig
    Cc: Martin K. Petersen
    Reported-by: Vishal Verma
    Signed-off-by: Dan Williams
    Tested-by: Vishal Verma
    Signed-off-by: Jens Axboe

    Dan Williams
     
  • The only difference between ->run_work and ->delay_work, is that
    the latter is used to defer running a queue. This is done by
    marking the queue stopped, and scheduling ->delay_work to run
    sometime in the future. While the queue is stopped, direct runs
    or runs through ->run_work will not run the queue.

    If we combine the handlers, then we need to handle two things:

    1) If a delayed/stopped run is scheduled, then we should not run
    the queue before that has been completed.
    2) If a queue is delayed/stopped, the handler needs to restart
    the queue. Normally a run of a queue with the stopped bit set
    would be a no-op.

    Case 1 is handled by modifying a currently pending queue run
    to the deadline set by the caller of blk_mq_delay_queue().
    Subsequent attempts to queue a queue run will find the work
    item already pending, and direct runs will see a stopped queue
    as before.

    Case 2 is handled by adding a new bit, BLK_MQ_S_START_ON_RUN,
    that tells the work handler that it should clear a stopped
    queue and run the handler.
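    The two cases can be modeled as a tiny state machine. This is a toy
    sketch using the flag names from the text (BLK_MQ_S_STOPPED,
    BLK_MQ_S_START_ON_RUN); the hctx structure and helpers are
    illustrative, not the blk-mq implementation:

```c
#include <assert.h>

enum {
	BLK_MQ_S_STOPPED      = 1 << 0,
	BLK_MQ_S_START_ON_RUN = 1 << 1,
};

struct hctx { unsigned int state; int runs; };

/* Case 1: a direct run is a no-op while the queue is stopped. */
static void direct_run(struct hctx *h)
{
	if (h->state & BLK_MQ_S_STOPPED)
		return;
	h->runs++;
}

/* Case 2: the combined work handler restarts a stopped queue only
 * when a delayed run asked for it via START_ON_RUN. */
static void run_work_fn(struct hctx *h)
{
	if (h->state & BLK_MQ_S_STOPPED) {
		if (!(h->state & BLK_MQ_S_START_ON_RUN))
			return;
		h->state &= ~(BLK_MQ_S_STOPPED | BLK_MQ_S_START_ON_RUN);
	}
	h->runs++;
}

/* blk_mq_delay_queue()-style: stop the queue, arm the deferred restart. */
static void delay_queue(struct hctx *h)
{
	h->state |= BLK_MQ_S_STOPPED | BLK_MQ_S_START_ON_RUN;
}
```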

    Reviewed-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • This modifies (or adds, if not currently pending) an existing
    delayed work item.

    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • They serve the exact same purpose. Get rid of the non-delayed
    work variant, and just run it without delay for the normal case.

    Reviewed-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Reviewed-by: Ming Lei
    Signed-off-by: Jens Axboe

    Jens Axboe
     

27 Apr, 2017

10 commits

    At least one driver, mtip32xx, has a hard-coded dependency on the
    value of the reserved tag used for internal commands. While that
    should really be fixed up, for now let's ensure that we just bypass
    the scheduler tags for an allocation marked as reserved. They are
    used for housekeeping or error handling, so we can safely ignore
    them in the scheduler.
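    The bypass reduces to a one-line predicate on the allocation flags.
    A hedged sketch, where the helper is hypothetical and
    BLK_MQ_REQ_RESERVED mirrors the blk-mq allocation flag:

```c
#include <assert.h>
#include <stdbool.h>

enum { BLK_MQ_REQ_RESERVED = 1 << 0 };	/* allocation flag */

/* Reserved allocations always come from the normal (driver) tag pool,
 * even when an IO scheduler is attached. */
static bool use_sched_tags(unsigned int alloc_flags, bool has_scheduler)
{
	if (alloc_flags & BLK_MQ_REQ_RESERVED)
		return false;
	return has_scheduler;
}
```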

    Tested-by: Ming Lei
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • This new callback function will be used in the next patch to show
    more information about SCSI requests.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Omar Sandoval
    Cc: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • Show the operation name, .cmd_flags and .rq_flags as names instead
    of numbers.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Omar Sandoval
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • This patch does not change any functionality but makes it possible
    to produce a single line of output with multiple flag-to-name
    translations.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Omar Sandoval
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • Move the "state" attribute from the top level to the "mq" directory
    as requested by Omar.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Omar Sandoval
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • We currently call blk_mq_free_queue() from blk_cleanup_queue()
    before we unregister the debugfs attributes for that queue in
    blk_release_queue(). This leaves a window open during which
    accessing most of the mq debugfs attributes would cause a
    use-after-free. Additionally, the "state" attribute allows
    running the queue, which we should not do after the queue has
    entered the "dead" state. Fix both cases by unregistering the
    debugfs attributes before freeing queue resources starts.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • Hctx unregistration involves calling kobject_del(). kobject_del()
    must not be called if kobject_add() has not been called. Hence in
    the error path only unregister hctxs for which registration succeeded.
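    The unwinding rule can be sketched with stand-in stubs: on a failed
    registration, walk back only over the entries whose add succeeded.
    Names and structure are illustrative, not the kobject API:

```c
#include <assert.h>

#define NR 4

static int added[NR];
static int deleted[NR];

/* Simulate kobject_add(): entries at or past fail_from fail. */
static int kobject_add_stub(int i, int fail_from)
{
	if (i >= fail_from)
		return -1;
	added[i] = 1;
	return 0;
}

/* Simulate kobject_del(): only legal for entries that were added. */
static void kobject_del_stub(int i)
{
	deleted[i] = 1;
}

/* Register hctxs; on failure, unwind only those actually added. */
static int register_all(int fail_from)
{
	int i;

	for (i = 0; i < NR; i++) {
		if (kobject_add_stub(i, fail_from) < 0) {
			while (--i >= 0)
				kobject_del_stub(i);
			return -1;
		}
	}
	return 0;
}
```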

    Signed-off-by: Bart Van Assche
    Cc: Omar Sandoval
    Cc: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • Since the blk_mq_debugfs_*register_hctxs() functions register and
    unregister all attributes under the "mq" directory, rename these
    into blk_mq_debugfs_*register_mq().

    Signed-off-by: Bart Van Assche
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • A later patch will move the call of blk_mq_debugfs_register() to
    a function to which the queue name is not passed as an argument.
    To avoid having to add a 'name' argument to multiple callers, let
    blk_mq_debugfs_register() look up the queue name.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Omar Sandoval
    Reviewed-by: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • A later patch in this series will modify blk_mq_debugfs_register()
    such that it uses q->kobj.parent to determine the name of a
    request queue. Hence make sure that that pointer is initialized
    before blk_mq_debugfs_register() is called. To avoid lock inversion,
    protect sysfs / debugfs registration with the queue sysfs_lock
    instead of the global mutex all_q_mutex.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Hannes Reinecke
    Reviewed-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Bart Van Assche
     

24 Apr, 2017

1 commit

    When registering an integrity profile: if the template's interval_exp
    is not 0, use it; otherwise use the ilog2() of the logical block size
    of the provided gendisk.

    This fixes a long-standing DM linear target bug where it cannot pass
    integrity data to the underlying device if its logical block size
    conflicts with the underlying device's logical block size.
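    The selection rule is a one-liner. A self-contained sketch with a
    local ilog2 stand-in; the helper name pick_interval_exp is
    illustrative:

```c
#include <assert.h>

/* ilog2 for power-of-two sizes; stand-in for the kernel helper. */
static unsigned char my_ilog2(unsigned int v)
{
	unsigned char e = 0;

	while (v >>= 1)
		e++;
	return e;
}

/* Prefer the template's interval_exp when non-zero, else derive it
 * from the disk's logical block size. */
static unsigned char pick_interval_exp(unsigned char tmpl_exp,
				       unsigned int logical_block_size)
{
	return tmpl_exp ? tmpl_exp : my_ilog2(logical_block_size);
}
```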

    Cc: stable@vger.kernel.org
    Reported-by: Mikulas Patocka
    Signed-off-by: Mike Snitzer
    Acked-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Mike Snitzer
     

22 Apr, 2017

2 commits

  • Commit 25520d55cdb6 ("block: Inline blk_integrity in struct gendisk")
    introduced blk_integrity_revalidate(), which seems to assume ownership
    of the stable pages flag and unilaterally clears it if no blk_integrity
    profile is registered:

    if (bi->profile)
            disk->queue->backing_dev_info->capabilities |=
                    BDI_CAP_STABLE_WRITES;
    else
            disk->queue->backing_dev_info->capabilities &=
                    ~BDI_CAP_STABLE_WRITES;

    It's called from revalidate_disk() and rescan_partitions(), making it
    impossible to enable stable pages for drivers that support partitions
    and don't use blk_integrity: while the call in revalidate_disk() can be
    trivially worked around (see zram, which doesn't support partitions and
    hence gets away with zram_revalidate_disk()), rescan_partitions() can
    be triggered from userspace at any time. This breaks rbd, where the
    ceph messenger is responsible for generating/verifying CRCs.

    Since blk_integrity_{un,}register() "must" be used for (un)registering
    the integrity profile with the block layer, move BDI_CAP_STABLE_WRITES
    setting there. This way drivers that call blk_integrity_register() and
    use integrity infrastructure won't interfere with drivers that don't
    but still want stable pages.
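    The resulting ownership rule can be modeled minimally: only the
    explicit register/unregister calls touch the flag, and revalidation
    leaves it alone. Toy types, not the kernel API:

```c
#include <assert.h>

enum { BDI_CAP_STABLE_WRITES = 1 << 0 };

struct bdi { unsigned int caps; };

/* After the fix, only the integrity (un)register paths own the flag. */
static void integrity_register(struct bdi *b)
{
	b->caps |= BDI_CAP_STABLE_WRITES;
}

static void integrity_unregister(struct bdi *b)
{
	b->caps &= ~BDI_CAP_STABLE_WRITES;
}

/* revalidate/rescan no longer clobbers the flag for non-integrity users. */
static void revalidate(struct bdi *b)
{
	(void)b;
}
```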

    Fixes: 25520d55cdb6 ("block: Inline blk_integrity in struct gendisk")
    Cc: "Martin K. Petersen"
    Cc: Christoph Hellwig
    Cc: Mike Snitzer
    Cc: stable@vger.kernel.org # 4.4+, needs backporting
    Tested-by: Dan Williams
    Signed-off-by: Ilya Dryomov
    Signed-off-by: Jens Axboe

    Ilya Dryomov
     
  • Avoid that the following kernel bug gets triggered:

    BUG: sleeping function called from invalid context at ./include/linux/buffer_head.h:349
    in_atomic(): 1, irqs_disabled(): 0, pid: 8019, name: find
    CPU: 10 PID: 8019 Comm: find Tainted: G W I 4.11.0-rc4-dbg+ #2
    Call Trace:
    dump_stack+0x68/0x93
    ___might_sleep+0x16e/0x230
    __might_sleep+0x4a/0x80
    __ext4_get_inode_loc+0x1e0/0x4e0
    ext4_iget+0x70/0xbc0
    ext4_iget_normal+0x2f/0x40
    ext4_lookup+0xb6/0x1f0
    lookup_slow+0x104/0x1e0
    walk_component+0x19a/0x330
    path_lookupat+0x4b/0x100
    filename_lookup+0x9a/0x110
    user_path_at_empty+0x36/0x40
    vfs_statx+0x67/0xc0
    SYSC_newfstatat+0x20/0x40
    SyS_newfstatat+0xe/0x10
    entry_SYSCALL_64_fastpath+0x18/0xad

    This happens because the big if/else in blk_mq_make_request() doesn't
    have a final else section that also drops the ctx. Add one.
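    A toy model of the invariant the fix restores: every branch of the
    dispatch if/else, including the final else, must drop the ctx exactly
    once before any path that may sleep. Purely illustrative:

```c
#include <assert.h>

static int ctx_refs;

static void get_ctx(void) { ctx_refs++; }
static void put_ctx(void) { ctx_refs--; }

static void make_request_model(int branch)
{
	get_ctx();
	if (branch == 0) {
		put_ctx();	/* fast/direct-issue path */
	} else if (branch == 1) {
		put_ctx();	/* scheduler-insert path */
	} else {
		put_ctx();	/* the added final else: drop ctx here too */
	}
}
```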

    Fixes: b00c53e8f411 ("blk-mq: fix schedule-while-atomic with scheduler attached")
    Signed-off-by: Bart Van Assche
    Cc: Omar Sandoval

    Added a bit more to the commit log.

    Signed-off-by: Jens Axboe

    Bart Van Assche
     

21 Apr, 2017

13 commits

  • No point in providing and exporting this helper. There's just
    one (real) user of it, just use rq_data_dir().

    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • commit c13660a08c8b ("blk-mq-sched: change ->dispatch_requests()
    to ->dispatch_request()") removed the last user of this function.
    Hence also remove the function itself.

    Signed-off-by: Bart Van Assche
    Cc: Omar Sandoval
    Cc: Hannes Reinecke
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
  • If the caller passes in wait=true, it has to be able to block
    for a driver tag. We just had a bug where flush insertion
    would block on tag allocation, while we had preempt disabled.
    Ensure that we catch cases like that earlier next time.

    Reviewed-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Fixes an issue where the size of the poll_stat array in request_queue
    does not match the size expected by the new size based bucketing for
    IO completion polling.

    Fixes: 720b8ccc4500 ("blk-mq: Add a polling specific stats function")
    Signed-off-by: Stephen Bates
    Signed-off-by: Jens Axboe

    Stephen Bates
     
  • We must have dropped the ctx before we call
    blk_mq_sched_insert_request() with can_block=true, otherwise we risk
    that a flush request can block on insertion if we are currently out of
    tags.

    [ 47.667190] BUG: scheduling while atomic: jbd2/sda2-8/2089/0x00000002
    [ 47.674493] Modules linked in: x86_pkg_temp_thermal btrfs xor zlib_deflate raid6_pq sr_mod cdre
    [ 47.690572] Preemption disabled at:
    [ 47.690584] [] blk_mq_sched_get_request+0x6c/0x280
    [ 47.701764] CPU: 1 PID: 2089 Comm: jbd2/sda2-8 Not tainted 4.11.0-rc7+ #271
    [ 47.709630] Hardware name: Dell Inc. PowerEdge T630/0NT78X, BIOS 2.3.4 11/09/2016
    [ 47.718081] Call Trace:
    [ 47.720903] dump_stack+0x4f/0x73
    [ 47.724694] ? blk_mq_sched_get_request+0x6c/0x280
    [ 47.730137] __schedule_bug+0x6c/0xc0
    [ 47.734314] __schedule+0x559/0x780
    [ 47.738302] schedule+0x3b/0x90
    [ 47.741899] io_schedule+0x11/0x40
    [ 47.745788] blk_mq_get_tag+0x167/0x2a0
    [ 47.750162] ? remove_wait_queue+0x70/0x70
    [ 47.754901] blk_mq_get_driver_tag+0x92/0xf0
    [ 47.759758] blk_mq_sched_insert_request+0x134/0x170
    [ 47.765398] ? blk_account_io_start+0xd0/0x270
    [ 47.770679] blk_mq_make_request+0x1b2/0x850
    [ 47.775766] generic_make_request+0xf7/0x2d0
    [ 47.780860] submit_bio+0x5f/0x120
    [ 47.784979] ? submit_bio+0x5f/0x120
    [ 47.789631] submit_bh_wbc.isra.46+0x10d/0x130
    [ 47.794902] submit_bh+0xb/0x10
    [ 47.798719] journal_submit_commit_record+0x190/0x210
    [ 47.804686] ? _raw_spin_unlock+0x13/0x30
    [ 47.809480] jbd2_journal_commit_transaction+0x180a/0x1d00
    [ 47.815925] kjournald2+0xb6/0x250
    [ 47.820022] ? kjournald2+0xb6/0x250
    [ 47.824328] ? remove_wait_queue+0x70/0x70
    [ 47.829223] kthread+0x10e/0x140
    [ 47.833147] ? commit_timeout+0x10/0x10
    [ 47.837742] ? kthread_create_on_node+0x40/0x40
    [ 47.843122] ret_from_fork+0x29/0x40

    Fixes: a4d907b6a33b ("blk-mq: streamline blk_mq_make_request")
    Reviewed-by: Omar Sandoval
    Signed-off-by: Jens Axboe

    Jens Axboe
     
    Rather than bucketing IO statistics based on direction only, we also
    bucket based on the IO size. This leads to improved polling
    performance. Update the bucket callback function and use it in the
    polling latency estimation.
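    A hedged sketch of direction-plus-size bucketing; the two-bucket
    split and the 4 KiB threshold are illustrative choices, not the
    kernel's actual bucketing:

```c
#include <assert.h>

#define BUCKETS_PER_DIR 2

/* Map a completion sample to a bucket: direction selects the group,
 * size selects the slot within it. */
static int poll_stat_bucket(int is_write, unsigned int bytes)
{
	int size_idx = bytes > 4096 ? 1 : 0;

	return is_write * BUCKETS_PER_DIR + size_idx;
}
```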

    Signed-off-by: Stephen Bates
    Signed-off-by: Jens Axboe

    Stephen Bates
     
    In order to allow filtering of IO based on properties of the request
    other than direction, we allow the bucket function to return an int.

    If the bucket callback returns a negative value, the sample is not
    counted in the stats accumulation.
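    The convention can be sketched as a callback plus accumulator; the
    names and the flush-based filter are illustrative:

```c
#include <assert.h>

static int nr_samples;

/* Bucket callback: a negative return excludes the sample. */
static int bucket_by_dir(int is_write, int is_flush)
{
	if (is_flush)
		return -1;	/* filtered: not counted */
	return is_write;
}

/* Accumulator honoring the negative-means-skip convention. */
static void stat_add(int bucket)
{
	if (bucket < 0)
		return;
	nr_samples++;
}
```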

    Signed-off-by: Stephen Bates

    Fixed up Kyber scheduler stat callback.

    Signed-off-by: Jens Axboe

    Stephen Bates
     
  • If we have a scheduler attached, blk_mq_tag_to_rq() on the
    scheduled tags will return NULL if a request is no longer
    in flight. This is different than using the normal tags,
    where it will always return the fixed request. Check for
    this condition for polling, in case we happen to enter
    polling for a completed request.

    The request address remains valid, so this check and return
    should be perfectly safe.
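    A small model of the check: the tag lookup returns NULL once the
    request has completed, and the poll path returns early instead of
    dereferencing it. Stand-in types, not the blk-mq API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct request { int tag; };

/* With scheduler tags, the lookup yields NULL for a completed request. */
static struct request *tag_to_rq(struct request **tags, int tag)
{
	return tags[tag];
}

static bool poll_done(struct request **tags, int tag)
{
	struct request *rq = tag_to_rq(tags, tag);

	if (!rq)
		return true;	/* already completed: safe early return */
	return false;		/* still in flight: keep polling */
}
```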

    Fixes: bd166ef183c2 ("blk-mq-sched: add framework for MQ capable IO schedulers")
    Tested-by: Stephen Bates
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Signed-off-by: Christoph Hellwig
    Reviewed-by: Bart Van Assche
    Acked-by: Roger Pau Monné
    Reviewed-by: Konrad Rzeszutek Wilk
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Merge blk_mq_ipi_complete_request and blk_mq_stat_add into their only
    caller.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
    Now that all drivers that call blk_mq_complete_request have a
    ->complete callback, we can remove the direct call to
    blk_mq_end_request, as well as the error argument to
    blk_mq_complete_request.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • This passes on the scsi_cmnd result field to users of passthrough
    requests. Currently we abuse req->errors for this purpose, but that
    field will go away in its current form.

    Note that the old IDE code abuses the errors field in very creative
    ways and stores all kinds of different values in it. I didn't dare
    to touch this magic, so the abuses are brought forward 1:1.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Martin K. Petersen
    Reviewed-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • The function only returns -EIO if rq->errors is non-zero, which is not
    very useful and lets a large number of callers ignore the return value.

    Just let the callers figure out their error themselves.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Johannes Thumshirn
    Reviewed-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

20 Apr, 2017

5 commits

  • We trigger this warning:

    block/blk-throttle.c: In function ‘blk_throtl_bio’:
    block/blk-throttle.c:2042:6: warning: variable ‘ret’ set but not used [-Wunused-but-set-variable]
    int ret;
    ^~~

    since we only assign 'ret' if BLK_DEV_THROTTLING_LOW is off, we never
    check it.

    Reported-by: Bart Van Assche
    Reviewed-by: Bart Van Assche
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • If we don't have CGROUPS enabled, the compile ends in the
    following misery:

    In file included from ../block/bfq-iosched.c:105:0:
    ../block/bfq-iosched.h:819:22: error: array type has incomplete element type
    extern struct cftype bfq_blkcg_legacy_files[];
    ^
    ../block/bfq-iosched.h:820:22: error: array type has incomplete element type
    extern struct cftype bfq_blkg_files[];
    ^

    Move the declarations under the right ifdef.

    Reported-by: Randy Dunlap
    Signed-off-by: Jens Axboe

    Jens Axboe
     
    The call to bfq_check_ioprio_change will dereference bic; however,
    the null check for bic comes after this call. Move the null check on
    bic to before the call to avoid any potential null pointer
    dereference issues.

    Detected by CoverityScan, CID#1430138 ("Dereference before null check")

    Signed-off-by: Colin Ian King
    Signed-off-by: Jens Axboe

    Colin Ian King
     
    Since ioprio_best() translates IOPRIO_CLASS_NONE into IOPRIO_CLASS_BE,
    and since lower numerical priority values represent a higher priority,
    a simple numerical comparison is sufficient.
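    A sketch of the simplified comparison using the uapi ioprio encoding
    (priority class in the top 3 bits of a 16-bit value); illustrative,
    not the exact kernel source:

```c
#include <assert.h>

#define IOPRIO_CLASS_SHIFT 13
#define IOPRIO_CLASS_NONE  0
#define IOPRIO_CLASS_BE    2
#define IOPRIO_NORM        4
#define IOPRIO_PRIO_VALUE(cls, data) (((cls) << IOPRIO_CLASS_SHIFT) | (data))
#define IOPRIO_PRIO_CLASS(p) ((p) >> IOPRIO_CLASS_SHIFT)

/* Map NONE to the BE default, then the lower numerical value wins. */
static unsigned short ioprio_best(unsigned short aprio, unsigned short bprio)
{
	if (IOPRIO_PRIO_CLASS(aprio) == IOPRIO_CLASS_NONE)
		aprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_NORM);
	if (IOPRIO_PRIO_CLASS(bprio) == IOPRIO_CLASS_NONE)
		bprio = IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, IOPRIO_NORM);

	return aprio < bprio ? aprio : bprio;
}
```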

    Signed-off-by: Bart Van Assche
    Reviewed-by: Adam Manzanares
    Tested-by: Adam Manzanares
    Reviewed-by: Christoph Hellwig
    Cc: Matias Bjørling
    Signed-off-by: Jens Axboe

    Bart Van Assche
     
    Since only a single caller remains, inline blk_rq_set_prio().
    Initialize req->ioprio even if no I/O priority has been set in
    either the bio or the I/O context.

    Signed-off-by: Bart Van Assche
    Reviewed-by: Adam Manzanares
    Tested-by: Adam Manzanares
    Reviewed-by: Christoph Hellwig
    Cc: Matias Bjørling
    Signed-off-by: Jens Axboe

    Bart Van Assche