17 Apr, 2015

2 commits

  • Commit 889fa31f00b2 was a bit too eager in reducing the loop count,
    so we ended up missing queues in some configurations. Ensure that
    our division rounds up, so that's not the case.
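
    For illustration, a minimal user-space sketch of the difference between
    truncating and round-up division when sizing a per-hctx map (the macro
    mirrors the kernel's DIV_ROUND_UP; the counts here are made up):

    #include <stdio.h>

    #define DIV_ROUND_UP(n, d)  (((n) + (d) - 1) / (d))

    int main(void)
    {
        unsigned int nr_ctx = 10;        /* mapped software queues */
        unsigned int bits_per_word = 8;  /* ctx map word width */

        /* truncating division drops the partially used last word */
        printf("floor: %u words\n", nr_ctx / bits_per_word);
        /* rounding up covers every mapped software queue */
        printf("ceil:  %u words\n", DIV_ROUND_UP(nr_ctx, bits_per_word));
        return 0;
    }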

    Reported-by: Guenter Roeck
    Fixes: 889fa31f00b2 ("blk-mq: reduce unnecessary software queue looping")
    Signed-off-by: Jens Axboe

    Jens Axboe
     
  • Pull block layer core bits from Jens Axboe:
    "This is the core pull request for 4.1. Not a lot of stuff in here for
    this round, mostly little fixes or optimizations. This pull request
    contains:

    - An optimization that speeds up queue runs on blk-mq, especially for
    the case where there's a large difference between nr_cpu_ids and
    the actual mapped software queues on a hardware queue. From Chong
    Yuan.

    - Honor node local allocations for requests on legacy devices. From
    David Rientjes.

    - Cleanup of blk_mq_rq_to_pdu() from me.

    - exit_aio() fixup from me, greatly speeding up exiting multiple IO
    contexts off exit_group(). For my particular test case, fio exit
    took ~6 seconds. A typical case of both exposing RCU grace periods
    to user space, and serializing exit of them.

    - Make blk_mq_queue_enter() honor the gfp mask passed in, so we only
    wait if __GFP_WAIT is set. From Keith Busch.

    - blk-mq exports and two added helpers from Mike Snitzer, which will
    be used by the dm-mq code.

    - Cleanups of blk-mq queue init from Wei Fang and Xiaoguang Wang"

    * 'for-4.1/core' of git://git.kernel.dk/linux-block:
    blk-mq: reduce unnecessary software queue looping
    aio: fix serial draining in exit_aio()
    blk-mq: cleanup blk_mq_rq_to_pdu()
    blk-mq: put blk_queue_rq_timeout together in blk_mq_init_queue()
    block: remove redundant check about 'set->nr_hw_queues' in blk_mq_alloc_tag_set()
    block: allocate request memory local to request queue
    blk-mq: don't wait in blk_mq_queue_enter() if __GFP_WAIT isn't set
    blk-mq: export blk_mq_run_hw_queues
    blk-mq: add blk_mq_init_allocated_queue and export blk_mq_register_disk

    Linus Torvalds
     

16 Apr, 2015

1 commit

  • In flush_busy_ctxs() and blk_mq_hctx_has_pending(), regardless of how many
    ctxs are assigned to one hctx, they will always loop hctx->ctx_map.map_size
    times. Here hctx->ctx_map.map_size is a constant, ALIGN(nr_cpu_ids, 8) / 8.
    flush_busy_ctxs() in particular is in the hot code path, and the extra
    looping is unnecessary. Change ->map_size to cover only the actually mapped
    software queues, so we loop for only as many iterations as we have to.

    And remove cpumask setting and nr_ctx count in blk_mq_init_cpu_queues()
    since they are all re-done in blk_mq_map_swqueue().
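
    As a rough user-space sketch of the sizing involved (ALIGN and
    DIV_ROUND_UP mirror the kernel macros; the CPU and queue counts are
    made up for the example):

    #include <stdio.h>

    #define ALIGN(x, a)         (((x) + (a) - 1) & ~((a) - 1))
    #define DIV_ROUND_UP(n, d)  (((n) + (d) - 1) / (d))

    int main(void)
    {
        unsigned int nr_cpu_ids = 240;  /* possible CPU ids */
        unsigned int nr_ctx = 4;        /* software queues actually mapped */

        /* old: every pass walks ALIGN(nr_cpu_ids, 8) / 8 map words */
        printf("old map_size: %u\n", ALIGN(nr_cpu_ids, 8U) / 8);
        /* new: only walk as many words as the mapped ctxs occupy */
        printf("new map_size: %u\n", DIV_ROUND_UP(nr_ctx, 8));
        return 0;
    }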

    Signed-off-by: Chong Yuan
    Reviewed-by: Wenbo Wang

    Updated by me for formatting and commenting.

    Signed-off-by: Jens Axboe

    Chong Yuan
     

15 Apr, 2015

1 commit

  • Pull vfs update from Al Viro:
    "Part one:

    - struct filename-related cleanups

    - saner iov_iter_init() replacements (and switching the syscalls to
    use of those)

    - ntfs switch to ->write_iter() (Anton)

    - aio cleanups and splitting iocb into common and async parts
    (Christoph)

    - assorted fixes (me, bfields, Andrew Elble)

    There's a lot more, including the completion of switchover to
    ->{read,write}_iter(), d_inode/d_backing_inode annotations, f_flags
    race fixes, etc, but that goes after #for-davem merge. David has
    pulled it, and once it's in I'll send the next vfs pull request"

    * 'for-linus-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (35 commits)
    sg_start_req(): use import_iovec()
    sg_start_req(): make sure that there's not too many elements in iovec
    blk_rq_map_user(): use import_single_range()
    sg_io(): use import_iovec()
    process_vm_access: switch to {compat_,}import_iovec()
    switch keyctl_instantiate_key_common() to iov_iter
    switch {compat_,}do_readv_writev() to {compat_,}import_iovec()
    aio_setup_vectored_rw(): switch to {compat_,}import_iovec()
    vmsplice_to_user(): switch to import_iovec()
    kill aio_setup_single_vector()
    aio: simplify arguments of aio_setup_..._rw()
    aio: lift iov_iter_init() into aio_setup_..._rw()
    lift iov_iter into {compat_,}do_readv_writev()
    NFS: fix BUG() crash in notify_change() with patch to chown_common()
    dcache: return -ESTALE not -EBUSY on distributed fs race
    NTFS: Version 2.1.32 - Update file write from aio_write to write_iter.
    VFS: Add iov_iter_fault_in_multipages_readable()
    drop bogus check in file_open_root()
    switch security_inode_getattr() to struct path *
    constify tomoyo_realpath_from_path()
    ...

    Linus Torvalds
     

12 Apr, 2015

3 commits

  • Signed-off-by: Al Viro

    Al Viro
     
  • ... and don't skip access_ok() validation.

    Signed-off-by: Al Viro

    Al Viro
     
  • Jan Engelhardt reports a strange oops with an invalid ->sense_buffer
    pointer in scsi_init_cmd_errh() with the blk-mq code.

    The sense_buffer pointer should have been initialized by the call to
    scsi_init_request() from blk_mq_init_rq_map(), but there seems to be
    some non-repeatable memory corruptor.

    This patch makes sure we initialize the whole struct request allocation
    (and the associated 'struct scsi_cmnd' for the SCSI case) to zero, by
    using __GFP_ZERO in the allocation. The old code initialized a couple
    of individual fields, leaving the rest undefined (although many of them
    are then initialized in later phases, like blk_mq_rq_ctx_init() etc.).

    It's not entirely clear why this matters, but it's the right thing to do
    regardless, and with 4.0 imminent this is the defensive "let's just make
    sure everything is initialized properly" patch.
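
    As a loose user-space analogue of the change (calloc() stands in for a
    __GFP_ZERO allocation; the struct and fields are invented for the
    example):

    #include <stdlib.h>

    struct fake_cmnd {
        void *sense_buffer;
        int   result;
    };

    int main(void)
    {
        /* old style: set a couple of fields by hand and leave the rest
           (e.g. sense_buffer) undefined until some later init phase */
        struct fake_cmnd *a = malloc(sizeof(*a));
        if (a)
            a->result = 0;

        /* defensive style: zero the whole allocation up front, so any
           field that is never explicitly set is at least NULL/0 */
        struct fake_cmnd *b = calloc(1, sizeof(*b));

        free(a);
        free(b);
        return 0;
    }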

    Tested-by: Jan Engelhardt
    Acked-by: Jens Axboe
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

31 Mar, 2015

1 commit

  • Linux 3.19 commit 69c953c ("lib/lcm.c: lcm(n,0)=lcm(0,n) is 0, not n")
    caused blk_stack_limits() to not properly stack queue_limits for stacked
    devices (e.g. DM).

    Fix this regression by establishing lcm_not_zero() and switching
    blk_stack_limits() over to using it.

    DM uses blk_set_stacking_limits() to establish the initial top-level
    queue_limits that are then built up based on underlying devices' limits
    using blk_stack_limits(). In the case of optimal_io_size (io_opt)
    blk_set_stacking_limits() establishes a default value of 0. With commit
    69c953c, lcm(0, n) is no longer n, which compromises proper stacking of
    the underlying devices' io_opt.

    Test:
    $ modprobe scsi_debug dev_size_mb=10 num_tgts=1 opt_blks=1536
    $ cat /sys/block/sde/queue/optimal_io_size
    786432
    $ dmsetup create node --table "0 100 linear /dev/sde 0"

    Before this fix:
    $ cat /sys/block/dm-5/queue/optimal_io_size
    0

    After this fix:
    $ cat /sys/block/dm-5/queue/optimal_io_size
    786432
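
    A small user-space sketch of the arithmetic involved (gcd/lcm/lcm_not_zero
    here follow the behaviour described above rather than being copies of
    lib/lcm.c; 786432 is the io_opt value from the test):

    #include <stdio.h>

    static unsigned long gcd(unsigned long a, unsigned long b)
    {
        while (b) {
            unsigned long t = a % b;
            a = b;
            b = t;
        }
        return a;
    }

    /* since commit 69c953c, lcm(0, n) is 0, which wipes out a stacked limit */
    static unsigned long lcm(unsigned long a, unsigned long b)
    {
        if (a && b)
            return (a / gcd(a, b)) * b;
        return 0;
    }

    /* lcm_not_zero() falls back to the non-zero argument instead */
    static unsigned long lcm_not_zero(unsigned long a, unsigned long b)
    {
        unsigned long l = lcm(a, b);

        return l ? l : (a ? a : b);
    }

    int main(void)
    {
        printf("lcm(0, 786432)          = %lu\n", lcm(0, 786432));
        printf("lcm_not_zero(0, 786432) = %lu\n", lcm_not_zero(0, 786432));
        return 0;
    }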

    Signed-off-by: Mike Snitzer
    Cc: stable@vger.kernel.org # 3.19+
    Acked-by: Martin K. Petersen
    Signed-off-by: Jens Axboe

    Mike Snitzer
     

30 Mar, 2015

2 commits


25 Mar, 2015

1 commit

  • blk_init_rl() allocates a mempool using mempool_create_node() with node
    local memory. This only allocates the mempool and element list locally
    to the request queue's node.

    What we really want to do is allocate the request itself local to the
    queue. To do this, we need our own alloc and free functions that will
    allocate from request_cachep and pass the request queue node in to prefer
    node local memory.

    Acked-by: Tejun Heo
    Signed-off-by: David Rientjes
    Signed-off-by: Jens Axboe

    David Rientjes
     

20 Mar, 2015

1 commit

  • Use the right array index to reference the last
    element of rq->biotail->bi_io_vec[]
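
    As a one-line reminder of the off-by-one involved (plain C, not the
    actual bio code):

    #include <stdio.h>

    int main(void)
    {
        int vec[4] = { 10, 20, 30, 40 };
        unsigned int cnt = sizeof(vec) / sizeof(vec[0]);

        /* the last element lives at index cnt - 1; vec[cnt] is past the end */
        printf("last = %d\n", vec[cnt - 1]);
        return 0;
    }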

    Signed-off-by: Wenbo Wang
    Reviewed-by: Chong Yuan
    Fixes: 66cb45aa41315 ("block: add support for limiting gaps in SG lists")
    Cc: stable@kernel.org
    Signed-off-by: Jens Axboe

    Wenbo Wang
     

19 Mar, 2015

1 commit

  • When allocating from the reserved tags pool, bt_get() is called with
    a NULL hctx. If all tags are in use, the hw queue is kicked to push
    out any pending IO, potentially freeing tags, and tag allocation is
    retried. The problem is that blk_mq_run_hw_queue() doesn't check for
    a NULL hctx. So we avoid it with a simple NULL hctx test.

    Tested by hammering mtip32xx with concurrent smartctl/hdparm.

    Signed-off-by: Sam Bradshaw
    Signed-off-by: Selvan Mani
    Fixes: b32232073e80 ("blk-mq: fix hang in bt_get()")
    Cc: stable@kernel.org

    Added appropriate comment.

    Signed-off-by: Jens Axboe

    Sam Bradshaw
     

13 Mar, 2015

4 commits

  • Return -EBUSY if we're unable to enter a queue immediately when
    allocating a blk-mq request without __GFP_WAIT.
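
    A rough user-space sketch of the control flow (the names and the bool
    flag standing in for the __GFP_WAIT test are invented):

    #include <errno.h>
    #include <stdbool.h>
    #include <stdio.h>

    static bool queue_ready;    /* stand-in for the queue being enterable */

    /* only sleep when the caller's flags allow it, otherwise fail fast */
    static int queue_enter(bool may_wait)
    {
        if (queue_ready)
            return 0;
        if (!may_wait)
            return -EBUSY;
        /* ... the real code would wait here for the queue to become usable ... */
        return 0;
    }

    int main(void)
    {
        printf("non-blocking attempt: %d\n", queue_enter(false));  /* -EBUSY */
        return 0;
    }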

    Signed-off-by: Keith Busch
    Signed-off-by: Mike Snitzer
    Signed-off-by: Jens Axboe

    Keith Busch
     
  • Rename blk_mq_run_queues to blk_mq_run_hw_queues, add async argument,
    and export it.

    DM's suspend support must be able to run the queue without starting
    stopped hw queues.

    Signed-off-by: Mike Snitzer
    Signed-off-by: Jens Axboe

    Mike Snitzer
     
  • Add a variant of blk_mq_init_queue that allows a previously allocated
    queue to be initialized. blk_mq_init_allocated_queue models
    blk_init_allocated_queue -- which was also created for DM's use.

    DM's approach to device creation requires a placeholder request_queue be
    allocated for use with alloc_dev() but the decision about what type of
    request_queue will be ultimately created is deferred until all component
    devices referenced in the DM table are processed to determine the table
    type (request-based, blk-mq request-based, or bio-based).

    Also, because of DM's late finalization of the request_queue type,
    the call to blk_mq_register_disk() doesn't happen during alloc_dev().
    Must export blk_mq_register_disk() so that DM can backfill the 'mq' dir
    once the blk-mq queue is fully allocated.

    Signed-off-by: Mike Snitzer
    Reviewed-by: Ming Lei
    Signed-off-by: Jens Axboe

    Mike Snitzer
     
  • If percpu_ref_init() fails the allocated q and hctxs must get cleaned
    up; using 'err_map' doesn't allow that to happen.

    Signed-off-by: Mike Snitzer
    Reviewed-by: Ming Lei
    Cc: stable@kernel.org
    Signed-off-by: Jens Axboe

    Mike Snitzer
     

21 Feb, 2015

1 commit

  • When reading blkio.throttle.io_serviced in a recently created blkio
    cgroup, it's possible to race against the creation of a throttle policy,
    which delays the allocation of stats_cpu.

    Like other functions in the throttle code, just checking for a NULL
    stats_cpu prevents the following oops caused by that race.

    [ 1117.285199] Unable to handle kernel paging request for data at address 0x7fb4d0020
    [ 1117.285252] Faulting instruction address: 0xc0000000003efa2c
    [ 1137.733921] Oops: Kernel access of bad area, sig: 11 [#1]
    [ 1137.733945] SMP NR_CPUS=2048 NUMA PowerNV
    [ 1137.734025] Modules linked in: bridge stp llc kvm_hv kvm binfmt_misc autofs4
    [ 1137.734102] CPU: 3 PID: 5302 Comm: blkcgroup Not tainted 3.19.0 #5
    [ 1137.734132] task: c000000f1d188b00 ti: c000000f1d210000 task.ti: c000000f1d210000
    [ 1137.734167] NIP: c0000000003efa2c LR: c0000000003ef9f0 CTR: c0000000003ef980
    [ 1137.734202] REGS: c000000f1d213500 TRAP: 0300 Not tainted (3.19.0)
    [ 1137.734230] MSR: 9000000000009032 CR: 42008884 XER: 20000000
    [ 1137.734325] CFAR: 0000000000008458 DAR: 00000007fb4d0020 DSISR: 40000000 SOFTE: 0
    GPR00: c0000000003ed3a0 c000000f1d213780 c000000000c59538 0000000000000000
    GPR04: 0000000000000800 0000000000000000 0000000000000000 0000000000000000
    GPR08: ffffffffffffffff 00000007fb4d0020 00000007fb4d0000 c000000000780808
    GPR12: 0000000022000888 c00000000fdc0d80 0000000000000000 0000000000000000
    GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    GPR20: 000001003e120200 c000000f1d5b0cc0 0000000000000200 0000000000000000
    GPR24: 0000000000000001 c000000000c269e0 0000000000000020 c000000f1d5b0c80
    GPR28: c000000000ca3a08 c000000000ca3dec c000000f1c667e00 c000000f1d213850
    [ 1137.734886] NIP [c0000000003efa2c] .tg_prfill_cpu_rwstat+0xac/0x180
    [ 1137.734915] LR [c0000000003ef9f0] .tg_prfill_cpu_rwstat+0x70/0x180
    [ 1137.734943] Call Trace:
    [ 1137.734952] [c000000f1d213780] [d000000005560520] 0xd000000005560520 (unreliable)
    [ 1137.734996] [c000000f1d2138a0] [c0000000003ed3a0] .blkcg_print_blkgs+0xe0/0x1a0
    [ 1137.735039] [c000000f1d213960] [c0000000003efb50] .tg_print_cpu_rwstat+0x50/0x70
    [ 1137.735082] [c000000f1d2139e0] [c000000000104b48] .cgroup_seqfile_show+0x58/0x150
    [ 1137.735125] [c000000f1d213a70] [c0000000002749dc] .kernfs_seq_show+0x3c/0x50
    [ 1137.735161] [c000000f1d213ae0] [c000000000218630] .seq_read+0xe0/0x510
    [ 1137.735197] [c000000f1d213bd0] [c000000000275b04] .kernfs_fop_read+0x164/0x200
    [ 1137.735240] [c000000f1d213c80] [c0000000001eb8e0] .__vfs_read+0x30/0x80
    [ 1137.735276] [c000000f1d213cf0] [c0000000001eb9c4] .vfs_read+0x94/0x1b0
    [ 1137.735312] [c000000f1d213d90] [c0000000001ebb38] .SyS_read+0x58/0x100
    [ 1137.735349] [c000000f1d213e30] [c000000000009218] syscall_exit+0x0/0x98
    [ 1137.735383] Instruction dump:
    [ 1137.735405] 7c6307b4 7f891800 409d00b8 60000000 60420000 3d420004 392a63b0 786a1f24
    [ 1137.735471] 7d49502a e93e01c8 7d495214 7d2ad214 e9090008 e9490010 e9290018

    And here is some code that allows one to easily reproduce this, although
    the problem was first found by running docker. (The includes and the
    constant values below are assumptions added to make the reproducer
    self-contained.)

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <malloc.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define BUFFER_ALIGN 4096
    #define BUFFER_SIZE  4096
    #define CGPATH       "/sys/fs/cgroup/blkio"
    #define NR_TESTS     10000

    void run(pid_t pid)
    {
        int n;
        int status;
        int fd;
        char *buffer;

        buffer = memalign(BUFFER_ALIGN, BUFFER_SIZE);
        n = snprintf(buffer, BUFFER_SIZE, "%d\n", pid);

        /* move this task into the freshly created test cgroup */
        fd = open(CGPATH "/test/tasks", O_WRONLY);
        write(fd, buffer, n);
        close(fd);

        if (fork() > 0) {
            /* parent: issue direct IO so the group accrues throttle stats */
            fd = open("/dev/sda", O_RDONLY | O_DIRECT);
            read(fd, buffer, 512);
            close(fd);
            wait(&status);
        } else {
            /* child: read the stats, racing against policy creation */
            fd = open(CGPATH "/test/blkio.throttle.io_serviced", O_RDONLY);
            n = read(fd, buffer, BUFFER_SIZE);
            close(fd);
        }

        free(buffer);
        exit(0);
    }

    void test(void)
    {
        int status;

        mkdir(CGPATH "/test", 0666);
        if (fork() > 0)
            wait(&status);
        else
            run(getpid());
        rmdir(CGPATH "/test");
    }

    int main(int argc, char **argv)
    {
        int i;

        for (i = 0; i < NR_TESTS; i++)
            test();
        return 0;
    }

    Reported-by: Ricardo Marin Matinata
    Signed-off-by: Thadeu Lima de Souza Cascardo
    Cc: stable@vger.kernel.org
    Signed-off-by: Jens Axboe

    Thadeu Lima de Souza Cascardo
     

13 Feb, 2015

3 commits

  • Pull block driver changes from Jens Axboe:
    "This contains:

    - The 4k/partition fixes for brd from Boaz/Matthew.

    - A few xen front/back block fixes from David Vrabel and Roger Pau
    Monne.

    - Floppy changes from Takashi, cleaning the device file creation.

    - Switching libata to use the new blk-mq tagging policy, removing
    code (and a suboptimal implementation) from libata. This will
    throw you a merge conflict, since a bug in the original libata
    tagging code was fixed since this code was branched. Trivial.
    From Shaohua.

    - Conversion of loop to blk-mq, from Ming Lei.

    - Cleanup of the io_schedule() handling in bsg from Peter Zijlstra.
    He claims it improves on unreadable code, which will cost him a
    beer.

    - Maintainer update for NBD, now handled by Markus Pargmann.

    - NVMe:
    - Optimization from me that avoids a kmalloc/kfree per IO for
    smaller sized IO
    ..."

    block: loop: don't handle REQ_FUA explicitly
    block: loop: introduce lo_discard() and lo_req_flush()
    block: loop: say goodby to bio
    block: loop: improve performance via blk-mq

    Linus Torvalds
     
  • Pull core block IO changes from Jens Axboe:
    "This contains:

    - A series from Christoph that cleans up and refactors various parts
    of the REQ_BLOCK_PC handling. Contributions in that series from
    Dongsu Park and Kent Overstreet as well.

    - CFQ:
    - A bug fix for cfq for realtime IO scheduling from Jeff Moyer.
    - A stable patch fixing a potential crash in CFQ in OOM
    situations. From Konstantin Khlebnikov.

    - blk-mq:
    - Add support for tag allocation policies, from Shaohua. This is
    a prep patch enabling libata (and other SCSI parts) to use the
    blk-mq tagging, instead of rolling their own.
    - Various little tweaks from Keith and Mike, in preparation for
    DM blk-mq support.
    - Minor little fixes or tweaks from me.
    - A double free error fix from Tony Battersby.

    - The partition 4k issue fixes from Matthew and Boaz.

    - Add support for zero+unprovision for blkdev_issue_zeroout() from
    Martin"

    * 'for-3.20/core' of git://git.kernel.dk/linux-block: (27 commits)
    block: remove unused function blk_bio_map_sg
    block: handle the null_mapped flag correctly in blk_rq_map_user_iov
    blk-mq: fix double-free in error path
    block: prevent request-to-request merging with gaps if not allowed
    blk-mq: make blk_mq_run_queues() static
    dm: fix multipath regression due to initializing wrong request
    cfq-iosched: handle failure of cfq group allocation
    block: Quiesce zeroout wrapper
    block: rewrite and split __bio_copy_iov()
    block: merge __bio_map_user_iov into bio_map_user_iov
    block: merge __bio_map_kern into bio_map_kern
    block: pass iov_iter to the BLOCK_PC mapping functions
    block: add a helper to free bio bounce buffer pages
    block: use blk_rq_map_user_iov to implement blk_rq_map_user
    block: simplify bio_map_kern
    block: mark blk-mq devices as stackable
    block: keep established cmd_flags when cloning into a blk-mq request
    block: add blk-mq support to blk_insert_cloned_request()
    block: require blk_rq_prep_clone() be given an initialized clone request
    blk-mq: add tag allocation policy
    ...

    Linus Torvalds
     
  • Pull backing device changes from Jens Axboe:
    "This contains a cleanup of how the backing device is handled, in
    preparation for a rework of the life time rules. In this part, the
    most important change is to split the unrelated nommu mmap flags from
    it, but also removing a backing_dev_info pointer from the
    address_space (and inode), and a cleanup of other various minor bits.

    Christoph did all the work here, I just fixed an oops with pages that
    have a swap backing. Arnd fixed a missing export, and Oleg killed the
    lustre backing_dev_info from staging. Last patch was from Al,
    unexporting parts that are now no longer needed outside"

    * 'for-3.20/bdi' of git://git.kernel.dk/linux-block:
    Make super_blocks and sb_lock static
    mtd: export new mtd_mmap_capabilities
    fs: make inode_to_bdi() handle NULL inode
    staging/lustre/llite: get rid of backing_dev_info
    fs: remove default_backing_dev_info
    fs: don't reassign dirty inodes to default_backing_dev_info
    nfs: don't call bdi_unregister
    ceph: remove call to bdi_unregister
    fs: remove mapping->backing_dev_info
    fs: export inode_to_bdi and use it in favor of mapping->backing_dev_info
    nilfs2: set up s_bdi like the generic mount_bdev code
    block_dev: get bdev inode bdi directly from the block device
    block_dev: only write bdev inode on close
    fs: introduce f_op->mmap_capabilities for nommu mmap support
    fs: kill BDI_CAP_SWAP_BACKED
    fs: deduplicate noop_backing_dev_info

    Linus Torvalds
     

12 Feb, 2015

4 commits


11 Feb, 2015

1 commit


10 Feb, 2015

2 commits

  • Pull EFI updates from Ingo Molnar:
    "Main changes:

    - Move efivarfs from the misc filesystem section to pseudo filesystem

    - Expose firmware platform size in sysfs

    - Improve robustness of get_memory_map() by removing assumptions on
    the size of efi_memory_desc_t.

    - various cleanups and fixes

    The biggest risk is the get_memory_map() change, which changes the way
    that both the arm64 and x86 EFI boot stub build the early memory map.
    There are no known regressions with it at the moment, but YMMV"

    * 'x86-efi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    efi: Don't look for chosen@0 node on DT platforms
    firmware: efi: Remove unneeded guid unparse
    efi/libstub: Call get_memory_map() to obtain map and desc sizes
    efi: Small leak on error in runtime map code
    efi: rtc-efi: Mark UIE as unsupported
    arm64/efi: efistub: Apply __init annotation
    efi: Expose underlying UEFI firmware platform size to userland
    efi: Rename efi_guid_unparse to efi_guid_to_str
    efi: Update the URLs for efibootmgr
    fs: Make efivarfs a pseudo filesystem, built by default with EFI

    Linus Torvalds
     
  • cfq_lookup_create_cfqg() allocates struct blkcg_gq using GFP_ATOMIC.
    In cfq_find_alloc_queue() a possible allocation failure is not handled.
    As a result, the kernel oopses on a NULL pointer dereference when
    cfq_link_cfqq_cfqg() calls cfqg_get() on the NULL pointer.

    Bug was introduced in v3.5 in commit cd1604fab4f9 ("blkcg: factor
    out blkio_group creation"). Prior to that commit cfq group lookup
    had returned pointer to root group as fallback.

    This patch handles this error using existing fallback oom_cfqq.
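
    A tiny user-space sketch of the fallback pattern (names invented; in cfq
    the statically embedded oom_cfqq plays the role of the fallback object):

    #include <stdio.h>
    #include <stdlib.h>

    struct io_group { int id; };

    static struct io_group oom_group = { .id = -1 };  /* always available */

    /* fall back instead of handing a NULL pointer to callers */
    static struct io_group *lookup_create_group(void)
    {
        struct io_group *g = malloc(sizeof(*g));  /* may fail under pressure */

        if (!g)
            return &oom_group;
        g->id = 1;
        return g;
    }

    int main(void)
    {
        printf("group id: %d\n", lookup_create_group()->id);
        return 0;
    }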

    Signed-off-by: Konstantin Khlebnikov
    Acked-by: Tejun Heo
    Acked-by: Vivek Goyal
    Fixes: cd1604fab4f9 ("blkcg: factor out blkio_group creation")
    Cc: stable@kernel.org
    Signed-off-by: Jens Axboe

    Konstantin Khlebnikov
     

06 Feb, 2015

8 commits

  • blkdev_issue_zeroout() printed a warning if a device failed a discard or
    write same request despite advertising support for these. That's fine
    for SCSI since we'll disable these commands if we get an error back from
    the disk saying that they are not supported. And consequently the
    warning only gets printed once.

    There are other types of block devices that support discard, however,
    and these may return -EOPNOTSUPP for each command but leave discard
    enabled in the queue limits. This will cause a warning message for every
    blkdev_issue_zeroout() invocation.

    Remove the offending warning messages.

    Reported-by: Sedat Dilek
    Signed-off-by: Martin K. Petersen
    Tested-by: Sedat Dilek
    Signed-off-by: Jens Axboe

    Martin K. Petersen
     
  • Rewrite __bio_copy_iov using the copy_page_{from,to}_iter helpers, and
    split it into two simpler functions.

    This commit should contain only literal replacements, without
    functional changes.

    Cc: Kent Overstreet
    Cc: Jens Axboe
    Cc: Al Viro
    Signed-off-by: Dongsu Park
    [hch: removed the __bio_copy_iov wrapper]
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Ming Lei
    Signed-off-by: Jens Axboe

    Dongsu Park
     
  • And also remove the unused bdev argument.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Ming Lei
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • This saves a little code, and allows us to simplify the error handling.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Ming Lei
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Make use of a new interface provided by iov_iter, backed by
    scatter-gather list of iovec, instead of the old interface based on
    sg_iovec. Also use iov_iter_advance() instead of manual iteration.

    This commit should contain only literal replacements, without
    functional changes.

    Cc: Christoph Hellwig
    Cc: Jens Axboe
    Cc: Doug Gilbert
    Cc: "James E.J. Bottomley"
    Signed-off-by: Kent Overstreet
    [dpark: add more description in commit message]
    Signed-off-by: Dongsu Park
    [hch: fixed to do a deep clone of the iov_iter, and to properly use
    the iov_iter direction]
    Signed-off-by: Christoph Hellwig
    Reviewed-by: Ming Lei
    Signed-off-by: Jens Axboe

    Kent Overstreet
     
  • The code snippet that walks all bio_vecs and frees their pages is open-coded
    in way too many places, so factor it into a helper. Also convert the slightly
    more complex cases in bio_kern_endio and __bio_copy_iov, where we break
    the freeing out of an existing loop into a separate one.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Ming Lei
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Signed-off-by: Christoph Hellwig
    Reviewed-by: Ming Lei
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     
  • Just open code the trivial mapping from a kernel virtual address to
    a bio instead of going through the complex user address mapping
    machinery.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Ming Lei
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

05 Feb, 2015

1 commit

  • It took me a few tries to figure out what this code did; let's rewrite
    it into a more regular form.

    The thing that makes this one 'special' is the BSG_F_BLOCK flag: if
    that is not set, we're not supposed/allowed to block and should
    spin-wait for completion.

    The (new) io_wait_event() will never see a false condition in case of
    the spinning and we will therefore not block.

    Cc: Linus Torvalds
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Jens Axboe

    Peter Zijlstra
     

30 Jan, 2015

3 commits

  • Pull EFI updates from Matt Fleming:

    " - Move efivarfs from the misc filesystem section to pseudo filesystem,
    since that's a more logical and accurate place - Leif Lindholm

    - Update efibootmgr URL in Kconfig help - Peter Jones

    - Improve accuracy of EFI guid function names - Borislav Petkov

    - Expose firmware platform size in sysfs for the benefit of EFI boot
    loader installers and other utilities - Steve McIntyre

    - Cleanup __init annotations for arm64/efi code - Ard Biesheuvel

    - Mark the UIE as unsupported for rtc-efi - Ard Biesheuvel

    - Fix memory leak in error code path of runtime map code - Dan Carpenter

    - Improve robustness of get_memory_map() by removing assumptions on the
    size of efi_memory_desc_t (which could change in future spec
    versions) and querying the firmware instead of guessing about the
    memmap size - Ard Biesheuvel

    - Remove superfluous guid unparse calls - Ivan Khoronzhuk

    - Delete unnecessary chosen@0 DT node FDT code, since it was duplicated
    from code in drivers/of and is entirely unnecessary - Leif Lindholm

    There's nothing super scary, mainly cleanups, and a merge from Ricardo who
    kindly picked up some patches from the linux-efi mailing list while I
    was out on annual leave in December.

    Perhaps the biggest risk is the get_memory_map() change from Ard, which
    changes the way that both the arm64 and x86 EFI boot stub build the
    early memory map. It would be good to have it bake in linux-next for a
    while.
    "

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • The kobject memory inside blk-mq hctx/ctx shouldn't have been freed
    before the kobject is released because driver core can access it freely
    before its release.

    We can't do that in all ctx/hctx/mq_kobj's release handler because
    it can be run before blk_cleanup_queue().

    Given mq_kobj shouldn't have been introduced, this patch simply moves
    mq's release into blk_release_queue().

    Reported-by: Sasha Levin
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei
     
  • This reverts commit 76d697d10769048e5721510100bf3a9413a56385.

    The commit 76d697d10769048 causes general protection fault
    reported from Bart Van Assche:

    https://lkml.org/lkml/2015/1/28/334

    Reported-by: Bart Van Assche
    Signed-off-by: Ming Lei
    Signed-off-by: Jens Axboe

    Ming Lei