Eric Lee / smarc-fsl-linux-kernel

10 Feb, 2017

1 commit

5584ea250 xen: modify xenstore watch event interface

Today a Xenstore watch event is delivered via a callback function
declared as:

    void (*callback)(struct xenbus_watch *,
                     const char **vec, unsigned int len);

As all watch events only ever come with two parameters (path and token)
changing the prototype to:

    void (*callback)(struct xenbus_watch *,
                     const char *path, const char *token);

is the natural thing to do.

Apply this change and adapt all users.
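
A rough userspace sketch of what the prototype change means for a watch user. The struct layout, the `last_path` field, the index names and `deliver_event()` are stand-ins invented for this illustration; they are not the kernel definitions:

```c
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for the kernel's struct xenbus_watch. */
struct xenbus_watch {
    const char *node;
    /* New-style callback: path and token arrive as plain arguments. */
    void (*callback)(struct xenbus_watch *watch,
                     const char *path, const char *token);
    char last_path[64]; /* illustration only: remembers the last event */
};

/* Positions of the two strings in an old-style event vector (assumed). */
enum { WATCH_PATH = 0, WATCH_TOKEN = 1 };

/* Example user: no vec[] indexing and no length check in the callback. */
static void backend_changed(struct xenbus_watch *watch,
                            const char *path, const char *token)
{
    (void)token;
    snprintf(watch->last_path, sizeof(watch->last_path), "%s", path);
}

/* Hypothetical dispatcher: unpacks the vector once, in one place. */
static void deliver_event(struct xenbus_watch *watch,
                          const char **vec, unsigned int len)
{
    if (len < 2)
        return;
    watch->callback(watch, vec[WATCH_PATH], vec[WATCH_TOKEN]);
}
```

With this shape, every callback receives exactly the two strings it needs, and the vector handling lives only in the dispatcher.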

Cc: konrad.wilk@oracle.com
Cc: roger.pau@citrix.com
Cc: wei.liu2@citrix.com
Cc: paul.durrant@citrix.com
Cc: netdev@vger.kernel.org

Signed-off-by: Juergen Gross
Reviewed-by: Paul Durrant
Reviewed-by: Wei Liu
Reviewed-by: Roger Pau Monné
Reviewed-by: Boris Ostrovsky
Signed-off-by: Boris Ostrovsky

Juergen Gross
2017-02-10 00:26:49 +0800

14 Dec, 2016

1 commit

aa3ecf388 Merge tag 'for-linus-4.10-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from Juergen Gross:
"Xen features and fixes for 4.10

These are some fixes, a move of some arm related headers to share them
between arm and arm64 and a series introducing a helper to make code
more readable.

The most notable change is David stepping down as maintainer of the
Xen hypervisor interface. This results in me sending you the pull
requests for Xen related code from now on"

* tag 'for-linus-4.10-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (29 commits)
xen/balloon: Only mark a page as managed when it is released
xenbus: fix deadlock on writes to /proc/xen/xenbus
xen/scsifront: don't request a slot on the ring until request is ready
xen/x86: Increase xen_e820_map to E820_X_MAX possible entries
x86: Make E820_X_MAX unconditionally larger than E820MAX
xen/pci: Bubble up error and fix description.
xen: xenbus: set error code on failure
xen: set error code on failures
arm/xen: Use alloc_percpu rather than __alloc_percpu
arm/arm64: xen: Move shared architecture headers to include/xen/arm
xen/events: use xen_vcpu_id mapping for EVTCHNOP_status
xen/gntdev: Use VM_MIXEDMAP instead of VM_IO to avoid NUMA balancing
xen-scsifront: Add a missing call to kfree
MAINTAINERS: update XEN HYPERVISOR INTERFACE
xenfs: Use proc_create_mount_point() to create /proc/xen
xen-platform: use builtin_pci_driver
xen-netback: fix error handling output
xen: make use of xenbus_read_unsigned() in xenbus
xen: make use of xenbus_read_unsigned() in xen-pciback
xen: make use of xenbus_read_unsigned() in xen-fbfront
...

Linus Torvalds
2016-12-14 08:07:55 +0800

07 Nov, 2016

1 commit

8235777b2 xen: make use of xenbus_read_unsigned() in xen-blkback

Use xenbus_read_unsigned() instead of xenbus_scanf() when possible.
This requires changing the type of one read from int to unsigned,
but that case was already wrong before: negative values are not allowed
for the modified case.
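
A minimal userspace sketch of the "read an unsigned value or fall back to a default" pattern described above. `fake_store_read()` and `read_unsigned()` are hypothetical stand-ins for this example and do not reproduce the kernel helper's implementation; the node contents are made up:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a xenstore lookup: returns the stored string
 * for a node, or NULL if the node does not exist. */
static const char *fake_store_read(const char *node)
{
    if (strcmp(node, "max-ring-page-order") == 0)
        return "4";
    if (strcmp(node, "broken-node") == 0)
        return "-1";
    return NULL;
}

/* Return the node's value as unsigned, or default_val when the node is
 * missing, negative, or not a number. */
static unsigned int read_unsigned(const char *node, unsigned int default_val)
{
    const char *s = fake_store_read(node);
    unsigned int val;

    /* sscanf("%u") would silently wrap "-1", so reject a sign explicitly. */
    if (!s || s[0] == '-' || sscanf(s, "%u", &val) != 1)
        return default_val;
    return val;
}
```

The caller no longer needs a separate error path for a missing node; it just supplies the default it would have used anyway.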

Cc: konrad.wilk@oracle.com
Cc: roger.pau@citrix.com

Signed-off-by: Juergen Gross
Acked-by: David Vrabel

Juergen Gross
2016-11-07 20:55:07 +0800

01 Nov, 2016

1 commit

70fd76140 block,fs: use REQ_* flags directly

Remove the WRITE_* and READ_SYNC wrappers, and just use the flags
directly. Where applicable this also drops usage of the
bio_set_op_attrs wrapper.

Signed-off-by: Christoph Hellwig
Signed-off-by: Jens Axboe

Christoph Hellwig
2016-11-01 23:43:26 +0800

28 Jul, 2016

1 commit

08fd8c176 Merge tag 'for-linus-4.8-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from David Vrabel:
"Features and fixes for 4.8-rc0:

- ACPI support for guests on ARM platforms.
- Generic steal time support for arm and x86.
- Support cases where kernel cpu is not Xen VCPU number (e.g., if
in-guest kexec is used).
- Use the system workqueue instead of a custom workqueue in various
places"

* tag 'for-linus-4.8-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (47 commits)
xen: add static initialization of steal_clock op to xen_time_ops
xen/pvhvm: run xen_vcpu_setup() for the boot CPU
xen/evtchn: use xen_vcpu_id mapping
xen/events: fifo: use xen_vcpu_id mapping
xen/events: use xen_vcpu_id mapping in events_base
x86/xen: use xen_vcpu_id mapping when pointing vcpu_info to shared_info
x86/xen: use xen_vcpu_id mapping for HYPERVISOR_vcpu_op
xen: introduce xen_vcpu_id mapping
x86/acpi: store ACPI ids from MADT for future usage
x86/xen: update cpuid.h from Xen-4.7
xen/evtchn: add IOCTL_EVTCHN_RESTRICT
xen-blkback: really don't leak mode property
xen-blkback: constify instance of "struct attribute_group"
xen-blkfront: prefer xenbus_scanf() over xenbus_gather()
xen-blkback: prefer xenbus_scanf() over xenbus_gather()
xen: support runqueue steal time on xen
arm/xen: add support for vm_assist hypercall
xen: update xen headers
xen-pciback: drop superfluous variables
xen-pciback: short-circuit read path used for merging write values
...

Linus Torvalds
2016-07-28 02:35:37 +0800

22 Jul, 2016

3 commits

aea305e11 xen-blkback: really don't leak mode property

Commit 9d092603cc ("xen-blkback: do not leak mode property") left one
path unfixed; correct this.

Acked-by: Jens Axboe
Acked-by: Roger Pau Monné
Signed-off-by: Jan Beulich
Signed-off-by: Konrad Rzeszutek Wilk

Jan Beulich
2016-07-22 20:24:43 +0800

530439484 xen-blkback: constify instance of "struct attribute_group"

The functions these get passed to have been taking pointers to const
since at least 2.6.16.

Acked-by: Jens Axboe
Acked-by: Roger Pau Monné
Signed-off-by: Jan Beulich
Signed-off-by: Konrad Rzeszutek Wilk

Jan Beulich
2016-07-22 20:23:52 +0800

6694389af xen-blkback: prefer xenbus_scanf() over xenbus_gather()

... for single items being collected: It is more typesafe (as the
compiler can check format string and to-be-written-to variable match)
and requires one less parameter to be passed.

Signed-off-by: Jan Beulich
Signed-off-by: Konrad Rzeszutek Wilk
Acked-by: Roger Pau Monné
Acked-by: Jens Axboe

Jan Beulich
2016-07-22 20:23:38 +0800

09 Jun, 2016

1 commit

288dab8a3 block: add a separate operation type for secure erase

Instead of overloading the discard support with the REQ_SECURE flag.
Use the opportunity to rename the queue flag as well, and remove the
dead checks for this flag in the RAID 1 and RAID 10 drivers that don't
claim support for secure erase.

Signed-off-by: Christoph Hellwig
Signed-off-by: Jens Axboe

Christoph Hellwig
2016-06-09 23:52:25 +0800

08 Jun, 2016

2 commits

a022606e5 xen: use bio op accessors

Separate the op from the rq_flag_bits and have xen
set/get the bio using bio_set_op_attrs/bio_op.

Signed-off-by: Mike Christie
Reviewed-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
Signed-off-by: Jens Axboe

Mike Christie
2016-06-08 03:41:38 +0800

4e49ea4a3 block/fs/drivers: remove rw argument from submit_bio

This has callers of submit_bio/submit_bio_wait set bio->bi_rw
instead of passing it in. This makes their use consistent with
generic_make_request and with how we set the other bio fields.

Signed-off-by: Mike Christie

Fixed up fs/ext4/crypto.c

Signed-off-by: Jens Axboe

Mike Christie
2016-06-08 03:41:38 +0800

14 Apr, 2016

1 commit

c888a8f95 block: kill off q->flush_flags

Now that we converted everything to the newer block write cache
interface, kill off the queue flush_flags and queueable flush
entries.

Signed-off-by: Jens Axboe

Jens Axboe
2016-04-14 03:33:19 +0800

04 Mar, 2016

2 commits

fa3184b89 xen/blback: Fit the important information of the thread in 17 characters

Process names are truncated to 17 characters, while we formatted the
name with a length of 20. This meant that although we filled it out
with various details, the last 3 characters (which held the queue
number) never surfaced to user space.

To simplify this and be able to fit the device name, domain id,
and the queue number we remove the 'blkback' from the name.

Prior to this patch the device name is "blkback.<domain id>.<device name>",
for example: blkback.8.xvda, blkback.11.hda.

With the multiqueue block backend we add "-%d" for the queue.
But sadly this is already way past the limit so it gets stripped.

Possible solution had been identified by Ian:
http://lists.xenproject.org/archives/html/xen-devel/2015-05/msg03516.html

"
If you are pressed for space then the "xvd" is probably a bit redundant
in a string which starts blkbk.

The guest may not even call the device xvdN (iirc BSD has another
prefix) any how, so having blkback say so seems of limited use anyway.

Since this seems to not include a partition number how does this work in
the split partition scheme? (i.e. one where the guest is given xvda1 and
xvda2 rather than xvda with a partition table)

[It will be 'blkback.8.xvda1', and 'blkback.11.xvda2']

Perhaps something derived from one of the schemes in
http://xenbits.xen.org/docs/unstable/misc/vbd-interface.txt might be a
better fit?

After a bit of discussion (see
http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg01588.html)
we settled on dropping the "blkback" part.

This will make it possible to have the <domain id>.<device name>-<queue number> form:

[1.xvda-0]
[1.xvda-1]

And we have enough space to make it go up to:

[32100.xvdfg9-5]
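
A small sketch of the length arithmetic behind this change. The 17-character limit is taken from the commit text above rather than from kernel headers, and both helper names are invented for the example:

```c
#include <stdio.h>

/* Limit quoted in the commit message (assumed here, not read from headers). */
#define NAME_LIMIT 17

/* Length of the old-style name: "blkback.<domid>.<dev>-<queue>". */
static int old_name_len(unsigned int domid, const char *dev, unsigned int queue)
{
    return snprintf(NULL, 0, "blkback.%u.%s-%u", domid, dev, queue);
}

/* Length of the new-style name: "<domid>.<dev>-<queue>". */
static int new_name_len(unsigned int domid, const char *dev, unsigned int queue)
{
    return snprintf(NULL, 0, "%u.%s-%u", domid, dev, queue);
}
```

For the largest example above (domain 32100, device xvdfg9, queue 5) the old form needs 22 characters and loses its queue suffix, while the new form needs 14 and fits within the limit.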

Acked-by: Roger Pau Monné
Reported-by: Jan Beulich
Signed-off-by: Konrad Rzeszutek Wilk

Konrad Rzeszutek Wilk
2016-03-04 05:45:54 +0800

5a7058450 xen-blkback: advertise indirect segment support earlier

There's no reason to defer this until the connect phase, and in fact
there are frontend implementations expecting this to be available
earlier. Move it into the probe function.

Acked-by: Roger Pau Monné
Signed-off-by: Jan Beulich
Cc: Bob Liu
Signed-off-by: Konrad Rzeszutek Wilk

Jan Beulich
2016-03-04 05:45:53 +0800

22 Jan, 2016

1 commit

641203549 Merge branch 'for-4.5/drivers' of git://git.kernel.dk/linux-block

Pull block driver updates from Jens Axboe:
"This is the block driver pull request for 4.5, with the exception of
NVMe, which is in a separate branch and will be posted after this one.

This pull request contains:

- A set of bcache stability fixes, which have been acked by Kent.
These have been used and tested for more than a year by the
community, so it's about time that they got in.

- A set of drbd updates from the drbd team (Andreas, Lars, Philipp)
and Markus Elfring, Oleg Drokin.

- A set of fixes for xen blkback/front from the usual suspects, (Bob,
Konrad) as well as community based fixes from Kiri, Julien, and
Peng.

- A 2038 time fix for sx8 from Shraddha, with a fix from me.

- A small mtip32xx cleanup from Zhu Yanjun.

- A null_blk division fix from Arnd"

* 'for-4.5/drivers' of git://git.kernel.dk/linux-block: (71 commits)
null_blk: use sector_div instead of do_div
mtip32xx: restrict variables visible in current code module
xen/blkfront: Fix crash if backend doesn't follow the right states.
xen/blkback: Fix two memory leaks.
xen/blkback: make st_ statistics per ring
xen/blkfront: Handle non-indirect grant with 64KB pages
xen-blkfront: Introduce blkif_ring_get_request
xen-blkback: clear PF_NOFREEZE for xen_blkif_schedule()
xen/blkback: Free resources if connect_ring failed.
xen/blocks: Return -EXX instead of -1
xen/blkback: make pool of persistent grants and free pages per-queue
xen/blkback: get the number of hardware queues/rings from blkfront
xen/blkback: pseudo support for multi hardware queues/rings
xen/blkback: separate ring information out of struct xen_blkif
xen/blkfront: correct setting for xen_blkif_max_ring_order
xen/blkfront: make persistent grants pool per-queue
xen/blkfront: Remove duplicate setting of ->xbdev.
xen/blkfront: Cleanup of comments, fix unaligned variables, and syntax errors.
xen/blkfront: negotiate number of queues/rings to be used with backend
xen/blkfront: split per device io_lock
...

Linus Torvalds
2016-01-22 10:19:38 +0800

05 Jan, 2016

9 commits

93bb277f9 xen/blkback: Fix two memory leaks.

This patch fixes two memory leaks:
backtrace:
[] kmemleak_alloc+0x28/0x50
[] kmem_cache_alloc+0xbb/0x1d0
[] xen_blkbk_probe+0x58/0x230
[] xenbus_dev_probe+0x76/0x130
[] driver_probe_device+0x166/0x2c0
[] __device_attach_driver+0xac/0xb0
[] bus_for_each_drv+0x67/0x90
[] __device_attach+0xc7/0x120
[] device_initial_probe+0x13/0x20
[] bus_probe_device+0x9a/0xb0
[] device_add+0x3b1/0x5c0
[] device_register+0x1e/0x30
[] xenbus_probe_node+0x158/0x170
[] xenbus_dev_changed+0x1af/0x1c0
[] backend_changed+0x1b/0x20
[] xenwatch_thread+0xb6/0x160
unreferenced object 0xffff880007ba8ef8 (size 224):

backtrace:
[] kmemleak_alloc+0x28/0x50
[] __kmalloc+0xd3/0x1e0
[] frontend_changed+0x2c7/0x580
[] xenbus_otherend_changed+0xa2/0xb0
[] frontend_changed+0x10/0x20
[] xenwatch_thread+0xb6/0x160
[] kthread+0xd7/0xf0
[] ret_from_fork+0x3f/0x70
[] 0xffffffffffffffff
unreferenced object 0xffff8800048dcd38 (size 224):

The first leak is caused by not dropping (put) the be->blkif reference
that we obtained in xen_blkif_alloc(), while the second is caused by
not freeing blkif->rings in the right place.

Signed-off-by: Bob Liu
Reported-and-Tested-by: Konrad Rzeszutek Wilk
Signed-off-by: Konrad Rzeszutek Wilk

Bob Liu
2016-01-05 01:21:26 +0800

db6fbc106 xen/blkback: make st_ statistics per ring

Make st_* statistics per ring and have the VBD sysfs code iterate over all
the rings.

Note: xenvbd_sysfs_delif() is called in xen_blkbk_remove() before all rings
are torn down, so it's safe.

Signed-off-by: Bob Liu
Signed-off-by: Konrad Rzeszutek Wilk
---
v2: Aligned the variables on the same column.

Bob Liu
2016-01-05 01:21:25 +0800

a6e7af128 xen-blkback: clear PF_NOFREEZE for xen_blkif_schedule()

xen_blkif_schedule() kthread calls try_to_freeze() at the beginning of
every attempt to purge the LRU. This operation can't ever succeed though,
as the kthread hasn't marked itself as freezable.

Before (hopefully eventually) kthread freezing gets converted to filesystem
freezing, we'd rather mark xen_blkif_schedule() freezable (as it can
generate I/O during suspend).

Signed-off-by: Jiri Kosina
Signed-off-by: Konrad Rzeszutek Wilk

Jiri Kosina
2016-01-05 01:21:24 +0800

2d0382fac xen/blkback: Free resources if connect_ring failed.

With the multi-queue support we could fail at setting up
some of the rings and fail the connection. That meant that
all resources tied to rings[0..n-1] (where n is the ring
that failed to be set up) stayed allocated. Eventually the
frontend will switch states and we will call xen_blkif_disconnect.

However we do not want to be at the mercy of the frontend
deciding when to change states. This allows us to do the
cleanup right away and freeing resources.
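
The cleanup-on-partial-failure idea can be sketched in userspace as follows. All names here (`struct ring`, `ring_setup()`, `connect_rings()`, the `fail_at` knob) are hypothetical and exist only to make the unwinding visible:

```c
#include <stdlib.h>

struct ring {
    void *buf; /* stands in for everything a ring owns */
};

/* Hypothetical setup step; fails for index fail_at to simulate an error. */
static int ring_setup(struct ring *r, unsigned int idx, unsigned int fail_at)
{
    if (idx == fail_at)
        return -1;
    r->buf = malloc(64);
    return r->buf ? 0 : -1;
}

static void ring_teardown(struct ring *r)
{
    free(r->buf);
    r->buf = NULL;
}

/* Returns 0 on success. On failure, every ring that was already set up is
 * released before returning, instead of waiting for a later disconnect. */
static int connect_rings(struct ring *rings, unsigned int nr,
                         unsigned int fail_at)
{
    for (unsigned int i = 0; i < nr; i++) {
        if (ring_setup(&rings[i], i, fail_at)) {
            while (i--)
                ring_teardown(&rings[i]);
            return -1;
        }
    }
    return 0;
}
```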

Signed-off-by: Konrad Rzeszutek Wilk

Konrad Rzeszutek Wilk
2016-01-05 01:21:07 +0800

bde21f73b xen/blocks: Return -EXX instead of -1

Let's return sensible values instead of -1.

Signed-off-by: Konrad Rzeszutek Wilk

Konrad Rzeszutek Wilk
2016-01-05 01:21:07 +0800

d4bf0065b xen/blkback: make pool of persistent grants and free pages per-queue

Make pool of persistent grants and free pages per-queue/ring instead of
per-device to get better scalability.

Test was done based on null_blk driver:
dom0: v4.2-rc8 16vcpus 10GB "modprobe null_blk"
domu: v4.2-rc8 16vcpus 10GB

[test]
rw=read
direct=1
ioengine=libaio
bs=4k
time_based
runtime=30
filename=/dev/xvdb
numjobs=16
iodepth=64
iodepth_batch=64
iodepth_batch_complete=64
group_reporting

Results:
iops1: After patch "xen/blkfront: make persistent grants per-queue".
iops2: After this patch.

Queues:        1     4            8            16
Iops orig(k):  810   1064         780          700
Iops1(k):      810   1230(~20%)   1024(~20%)   850(~20%)
Iops2(k):      810   1410(~35%)   1354(~75%)   1440(~100%)

With 4 queues after this commit we can get ~75% increase in IOPS, and
performance won't drop if increasing queue numbers.

Please find the respective chart in this link:
https://www.dropbox.com/s/agrcy2pbzbsvmwv/iops.png?dl=0

Signed-off-by: Bob Liu
Signed-off-by: Konrad Rzeszutek Wilk

Bob Liu
2016-01-05 01:21:06 +0800

d62d86000 xen/blkback: get the number of hardware queues/rings from blkfront

The backend advertises "multi-queue-max-queues" to the frontend and reads
the negotiated number from "multi-queue-num-queues", written by blkfront.
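
A minimal sketch of such a negotiation check. The function name, the `have_request` flag, and the choice to fall back to a single ring when the frontend wrote nothing are assumptions made for this example; they are not taken from the patch itself:

```c
/* Returns the number of rings to use, or 0 if the frontend's request is
 * invalid. have_request is 0 when the frontend did not write
 * "multi-queue-num-queues" at all. */
static unsigned int negotiate_queues(unsigned int backend_max,
                                     int have_request,
                                     unsigned int requested)
{
    if (!have_request)
        return 1; /* assumption: an old frontend gets a single ring */
    if (requested == 0 || requested > backend_max)
        return 0; /* reject values outside what was advertised */
    return requested;
}
```

The point of the check is that the backend never trusts the frontend's number beyond the maximum it advertised itself.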

Signed-off-by: Bob Liu
Signed-off-by: Konrad Rzeszutek Wilk

Bob Liu
2016-01-05 01:21:06 +0800

2fb1ef4f1 xen/blkback: pseudo support for multi hardware queues/rings

Preparatory patch for multiple hardware queues (rings). The number of
rings is unconditionally set to 1, larger number will be enabled in
"xen/blkback: get the number of hardware queues/rings from blkfront".

Signed-off-by: Arianna Avanzini
Signed-off-by: Bob Liu
Signed-off-by: Konrad Rzeszutek Wilk
---
v2: Align variables in the structures.

Konrad Rzeszutek Wilk
2016-01-05 01:21:05 +0800

597957000 xen/blkback: separate ring information out of struct xen_blkif

Split per-ring information into a new structure "xen_blkif_ring", so that one vbd
device can be associated with one or more rings/hardware queues.

Introduce 'pers_gnts_lock' to protect the pool of persistent grants since we
may have multiple backend threads.

This patch is a preparation for supporting multi hardware queues/rings.

Signed-off-by: Arianna Avanzini
Signed-off-by: Bob Liu
Signed-off-by: Konrad Rzeszutek Wilk
---
v2: Align the variables in the structure.

Bob Liu
2016-01-05 01:21:05 +0800

18 Dec, 2015

2 commits

187791491 xen-blkback: read from indirect descriptors only once

Since indirect descriptors are in memory shared with the frontend, the
frontend could alter the first_sect and last_sect values after they have
been validated but before they are recorded in the request. This may
result in I/O requests that overflow the foreign page, possibly
overwriting local pages when the I/O request is executed.

When parsing indirect descriptors, only read first_sect and last_sect
once.
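
The "read once, then validate the private copy" rule can be sketched like this. The struct, the `SECTORS_PER_PAGE` value and `copy_segment()` are simplified stand-ins for illustration and do not mirror the blkif structures:

```c
#include <stdint.h>

struct segment {
    uint8_t first_sect;
    uint8_t last_sect;
};

/* Assumed for the sketch: a 4 KiB page holds 8 sectors of 512 bytes. */
#define SECTORS_PER_PAGE 8

/* Read each shared field exactly once through a volatile access, then
 * validate and use only the private copy. A peer that rewrites the shared
 * memory afterwards can no longer bypass the range check. */
static int copy_segment(const volatile struct segment *shared,
                        struct segment *local)
{
    uint8_t first = shared->first_sect;
    uint8_t last = shared->last_sect;

    if (last >= SECTORS_PER_PAGE || last < first)
        return -1;

    local->first_sect = first;
    local->last_sect = last;
    return 0;
}
```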

This is part of XSA155.

CC: stable@vger.kernel.org
Signed-off-by: Roger Pau Monné
Signed-off-by: David Vrabel
Signed-off-by: Konrad Rzeszutek Wilk

Roger Pau Monné
2015-12-18 23:00:37 +0800

1f13d75cc xen-blkback: only read request operation from shared ring once

A compiler may load a switch statement value multiple times, which could
be bad when the value is in memory shared with the frontend.

When converting a non-native request to a native one, ensure that
src->operation is only loaded once by using READ_ONCE().

This is part of XSA155.

CC: stable@vger.kernel.org
Signed-off-by: Roger Pau Monné
Signed-off-by: David Vrabel
Signed-off-by: Konrad Rzeszutek Wilk

Roger Pau Monné
2015-12-18 23:00:32 +0800

23 Oct, 2015

2 commits

9cce2914e xen/xenbus: Rename *RING_PAGE* to *RING_GRANT*

Linux may use a different page size than the size of a grant. So make
clear that the order is actually in number of grants.

Signed-off-by: Julien Grall
Signed-off-by: David Vrabel

Julien Grall
2015-10-23 21:20:46 +0800

67de5dfbc block/xen-blkback: Make it running on 64KB page granularity

The PV block protocol is using 4KB page granularity. The goal of this
patch is to allow a Linux using 64KB page granularity to behave as a
block backend on a non-modified Xen.

It's only necessary to adapt the ring size and the number of requests per
indirect frame. The rest of the code is relying on the grant table
code.

Note that the grant table code is allocating a Linux page per grant,
which wastes 60KB for every grant when Linux is using 64KB
page granularity. This could be improved by sharing the page between
multiple grants.
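
The arithmetic behind that note, as a tiny sketch (helper names are made up for the example; sizes are in KB):

```c
/* Memory left unused when one grant occupies a whole Linux page. */
static unsigned int wasted_kb_per_grant(unsigned int page_kb,
                                        unsigned int grant_kb)
{
    return page_kb - grant_kb;
}

/* How many grants one Linux page could hold if the page were shared. */
static unsigned int grants_per_page(unsigned int page_kb,
                                    unsigned int grant_kb)
{
    return page_kb / grant_kb;
}
```

With 64KB pages and 4KB grants this gives 60KB wasted per grant, and 16 grants per page if sharing were implemented.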

Signed-off-by: Julien Grall
Acked-by: "Roger Pau Monné"
Signed-off-by: David Vrabel

Julien Grall
2015-10-23 21:20:40 +0800

24 Sep, 2015

2 commits

adbe734b2 Merge branch 'stable/for-jens-4.3' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus

Konrad writes:

It has one fix that should go in and also be put in stable tree (I've
added the CC already).

It is a fix for a memory leak that can be exposed by using the UEFI
xen-blkfront driver.

Jens Axboe
2015-09-24 00:59:44 +0800

f929d42ce xen/blkback: free requests on disconnection

This is due to commit 86839c56dee28c315a4c19b7bfee450ccd84cd25
"xen/block: add multi-page ring support"

When using a guest under UEFI - after the domain is destroyed
the following warning comes from blkback.

------------[ cut here ]------------
WARNING: CPU: 2 PID: 95 at
/home/julien/works/linux/drivers/block/xen-blkback/xenbus.c:274
xen_blkif_deferred_free+0x1f4/0x1f8()
Modules linked in:
CPU: 2 PID: 95 Comm: kworker/2:1 Tainted: G W 4.2.0 #85
Hardware name: APM X-Gene Mustang board (DT)
Workqueue: events xen_blkif_deferred_free
Call trace:
[] dump_backtrace+0x0/0x124
[] show_stack+0x10/0x1c
[] dump_stack+0x78/0x98
[] warn_slowpath_common+0x9c/0xd4
[] warn_slowpath_null+0x14/0x20
[] xen_blkif_deferred_free+0x1f0/0x1f8
[] process_one_work+0x160/0x3b4
[] worker_thread+0x140/0x494
[] kthread+0xd8/0xf0
---[ end trace 6f859b7883c88cdd ]---

Request allocation has been moved to connect_ring, which is called every
time blkback connects to the frontend (this can happen multiple times during
a blkback instance life cycle). On the other hand, request freeing has not
been moved, so it's only called when destroying the backend instance. Due to
this mismatch, blkback can allocate the request pool multiple times, without
freeing it.

In order to fix it, move the freeing of requests to xen_blkif_disconnect to
restore the symmetry between request allocation and freeing.
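
A userspace sketch of the symmetry being restored. `struct backend`, its counters, and both functions are invented for illustration; the real driver manages a request pool, not a single buffer:

```c
#include <stdlib.h>

struct backend {
    void *req_pool;      /* stands in for the request pool */
    unsigned int allocs; /* bookkeeping for the example only */
    unsigned int frees;
};

/* Called every time the backend connects to the frontend. */
static int backend_connect(struct backend *be)
{
    be->req_pool = malloc(256);
    if (!be->req_pool)
        return -1;
    be->allocs++;
    return 0;
}

/* Called on every disconnect, so each allocation has a matching free even
 * when connect/disconnect happens several times per instance. */
static void backend_disconnect(struct backend *be)
{
    if (be->req_pool) {
        free(be->req_pool);
        be->req_pool = NULL;
        be->frees++;
    }
}
```

If the free only happened when the instance was destroyed, the second `backend_connect()` would overwrite `req_pool` and leak the first allocation.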

Reported-by: Julien Grall
Signed-off-by: Roger Pau Monné
Tested-by: Julien Grall
Cc: Konrad Rzeszutek Wilk
Cc: Boris Ostrovsky
Cc: David Vrabel
Cc: xen-devel@lists.xenproject.org
CC: stable@vger.kernel.org # 4.2
Signed-off-by: Konrad Rzeszutek Wilk

Roger Pau Monne
2015-09-24 00:09:19 +0800

03 Sep, 2015

1 commit

1081230b7 Merge branch 'for-4.3/core' of git://git.kernel.dk/linux-block

Pull core block updates from Jens Axboe:
"This first core part of the block IO changes contains:

- Cleanup of the bio IO error signaling from Christoph. We used to
rely on the uptodate bit and passing around of an error, now we
store the error in the bio itself.

- Improvement of the above from myself, by shrinking the bio size
down again to fit in two cachelines on x86-64.

- Revert of the max_hw_sectors cap removal from a revision again,
from Jeff Moyer. This caused performance regressions in various
tests. Reinstate the limit, bump it to a more reasonable size
instead.

- Make /sys/block/<dev>/queue/discard_max_bytes writeable, by me.
Most devices have huge trim limits, which can cause nasty latencies
when deleting files. Enable the admin to configure the size down.
We will look into having a more sane default instead of UINT_MAX
sectors.

- Improvement of the SG gaps logic from Keith Busch.

- Enable the block core to handle arbitrarily sized bios, which
enables a nice simplification of bio_add_page() (which is an IO hot
path). From Kent.

- Improvements to the partition io stats accounting, making it
faster. From Ming Lei.

- Also from Ming Lei, a basic fixup for overflow of the sysfs pending
file in blk-mq, as well as a fix for a blk-mq timeout race
condition.

- Ming Lin has been carrying Kent's above-mentioned patches forward
for a while, and testing them. Ming also did a few fixes around
that.

- Sasha Levin found and fixed a use-after-free problem introduced by
the bio->bi_error changes from Christoph.

- Small blk cgroup cleanup from Viresh Kumar"

* 'for-4.3/core' of git://git.kernel.dk/linux-block: (26 commits)
blk: Fix bio_io_vec index when checking bvec gaps
block: Replace SG_GAPS with new queue limits mask
block: bump BLK_DEF_MAX_SECTORS to 2560
Revert "block: remove artifical max_hw_sectors cap"
blk-mq: fix race between timeout and freeing request
blk-mq: fix buffer overflow when reading sysfs file of 'pending'
Documentation: update notes in biovecs about arbitrarily sized bios
block: remove bio_get_nr_vecs()
fs: use helper bio_add_page() instead of open coding on bi_io_vec
block: kill merge_bvec_fn() completely
md/raid5: get rid of bio_fits_rdev()
md/raid5: split bio for chunk_aligned_read
block: remove split code in blkdev_issue_{discard,write_same}
btrfs: remove bio splitting and merge_bvec_fn() calls
bcache: remove driver private bio splitting code
block: simplify bio_add_page()
block: make generic_make_request handle arbitrarily sized bios
blk-cgroup: Drop unlikely before IS_ERR(_OR_NULL)
block: don't access bio->bi_error after bio_put()
block: shrink struct bio down to 2 cache lines again
...

Linus Torvalds
2015-09-03 04:10:25 +0800

29 Jul, 2015

1 commit

4246a0b63 block: add a bi_error field to struct bio

Currently we have two different ways to signal an I/O error on a BIO:

(1) by clearing the BIO_UPTODATE flag
(2) by returning a Linux errno value to the bi_end_io callback

The first one has the drawback of only communicating a single possible
error (-EIO), and the second one has the drawback of not being persistent
when bios are queued up, and are not passed along from child to parent
bio in the ever more popular chaining scenario. Having both mechanisms
available has the additional drawback of utterly confusing driver authors
and introducing bugs where various I/O submitters only deal with one of
them, and the others have to add boilerplate code to deal with both kinds
of error returns.

So add a new bi_error field to store an errno value directly in struct
bio and remove the existing mechanisms to clean all this up.
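
A simplified mock of the idea, to show why storing the errno in the object helps with chaining. This `struct bio` is not the kernel structure: the `parent` pointer and this `bio_endio()` are stand-ins written for the example:

```c
#include <errno.h>
#include <stddef.h>

/* Mock bio: carries its own error instead of a flag plus a callback arg. */
struct bio {
    int bi_error;                       /* 0 or a negative errno value */
    struct bio *parent;                 /* hypothetical chaining link */
    void (*bi_end_io)(struct bio *bio); /* completion sees bio->bi_error */
};

/* Complete a bio: the stored error survives queueing and is handed up to
 * the parent, keeping the first error that was recorded there. */
static void bio_endio(struct bio *bio)
{
    if (bio->parent && bio->bi_error && !bio->parent->bi_error)
        bio->parent->bi_error = bio->bi_error;
    if (bio->bi_end_io)
        bio->bi_end_io(bio);
}
```

Because the error lives in the bio, a specific value such as -ENOSPC reaches the parent unchanged rather than collapsing into a generic -EIO.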

Signed-off-by: Christoph Hellwig
Reviewed-by: Hannes Reinecke
Reviewed-by: NeilBrown
Signed-off-by: Jens Axboe

Christoph Hellwig
2015-07-29 22:55:15 +0800

28 Jul, 2015

1 commit

e162b219a Merge branch 'stable/for-jens-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus

Konrad writes:

"There are three bugs that have been found in the xen-blkfront (and
backend). Two of them have the stable tree CC-ed. They have been found
where a guest is migrating to a host that is missing
'feature-persistent' support (from one that has it enabled). We end up
hitting a BUG() in the driver code."

Jens Axboe
2015-07-28 01:58:41 +0800

24 Jul, 2015

1 commit

53bc7dc00 xen-blkback: replace work_pending with work_busy in purge_persistent_gnt()

The BUG_ON() in purge_persistent_gnt() will be triggered when the previous
purge work hasn't finished.

There is a work_pending() check before this BUG_ON, but it doesn't account
for work that is still running.
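
The difference between the two checks can be modelled in a few lines. This is a toy model with invented names (`struct work`, `pending()`, `busy()`), not the workqueue API:

```c
/* Flags reported by the "busy" check in this toy model. */
enum {
    BUSY_PENDING = 1 << 0, /* queued, not yet started */
    BUSY_RUNNING = 1 << 1, /* currently executing */
};

struct work {
    int queued;
    int running;
};

/* "Pending" only answers: is the item still waiting in the queue? */
static int pending(const struct work *w)
{
    return w->queued;
}

/* "Busy" also covers an item that has left the queue but is executing. */
static int busy(const struct work *w)
{
    return (w->queued ? BUSY_PENDING : 0) |
           (w->running ? BUSY_RUNNING : 0);
}
```

An item that has started executing is no longer pending, so a pending-only check lets the caller proceed while the previous purge is still in progress.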

CC: stable@vger.kernel.org
Acked-by: Roger Pau Monné
Signed-off-by: Bob Liu
Signed-off-by: Konrad Rzeszutek Wilk

Bob Liu
2015-07-24 21:09:49 +0800

02 Jul, 2015

1 commit

7adf12b87 Merge tag 'for-linus-4.2-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip

Pull xen updates from David Vrabel:
"Xen features and cleanups for 4.2-rc0:

- add "make xenconfig" to assist in generating configs for Xen guests

- preparatory cleanups necessary for supporting 64 KiB pages in ARM
guests

- automatically use hvc0 as the default console in ARM guests"

* tag 'for-linus-4.2-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
block/xen-blkback: s/nr_pages/nr_segs/
block/xen-blkfront: Remove invalid comment
block/xen-blkfront: Remove unused macro MAXIMUM_OUTSTANDING_BLOCK_REQS
arm/xen: Drop duplicate define mfn_to_virt
xen/grant-table: Remove unused macro SPP
xen/xenbus: client: Fix call of virt_to_mfn in xenbus_grant_ring
xen: Include xen/page.h rather than asm/xen/page.h
kconfig: add xenconfig defconfig helper
kconfig: clarify kvmconfig is for kvm
xen/pcifront: Remove usage of struct timeval
xen/tmem: use BUILD_BUG_ON() in favor of BUG_ON()
hvc_xen: avoid uninitialized variable warning
xenbus: avoid uninitialized variable warning
xen/arm: allow console=hvc0 to be omitted for guests
arm,arm64/xen: move Xen initialization earlier
arm/xen: Correctly check if the event channel interrupt is present

Linus Torvalds
2015-07-02 02:53:46 +0800

28 Jun, 2015

1 commit

6443af985 Merge branch 'stable/for-jens-4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen into for-linus

Jens Axboe
2015-06-28 01:47:07 +0800

17 Jun, 2015

1 commit

6684fa1cd block/xen-blkback: s/nr_pages/nr_segs/

Make the code less confusing to read now that Linux may not have the
same page size as Xen.

Signed-off-by: Julien Grall
Acked-by: Roger Pau Monné
Cc: Konrad Rzeszutek Wilk
Signed-off-by: David Vrabel

Julien Grall
2015-06-17 23:35:19 +0800

06 Jun, 2015

2 commits

86839c56d xen/block: add multi-page ring support

Extend xen/block to support multi-page ring, so that more requests can be
issued by using more than one page as the request ring between blkfront
and backend.
As a result, the performance can be improved significantly.

We got some impressive improvements on our highend iscsi storage cluster
backend. If using 64 pages as the ring, the IOPS increased about 15 times
for the throughput testing and above doubled for the latency testing.

The reason was that the limit on outstanding requests is 32 when using only
a one-page ring, but in our case the iscsi lun was spread across about 100
physical drives, so 32 was really not enough to keep them busy.
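
A sketch of where a limit like 32 can come from: the number of slots is the usable ring space divided by the slot size, rounded down to a power of two. The 64-byte header and 112-byte slot used in the note below are assumed values for illustration, and both helpers are invented for this example:

```c
/* Largest power of two that is <= n (returns 0 for n == 0). */
static unsigned int round_down_pow2(unsigned int n)
{
    unsigned int p = 1;

    if (n == 0)
        return 0;
    while (p * 2 <= n)
        p *= 2;
    return p;
}

/* Number of request slots in a shared ring of ring_bytes total size. */
static unsigned int ring_entries(unsigned int ring_bytes,
                                 unsigned int header_bytes,
                                 unsigned int slot_bytes)
{
    return round_down_pow2((ring_bytes - header_bytes) / slot_bytes);
}
```

Under those assumed sizes, one 4096-byte page yields 32 slots, and a ring of 64 such pages yields 2048.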

Changes in v2:
- Rebased to 4.0-rc6.
- Document how the multi-page ring feature works in Linux io/blkif.h.

Changes in v3:
- Remove changes to linux io/blkif.h and follow the protocol defined
in io/blkif.h of XEN tree.
- Rebased to 4.1-rc3

Changes in v4:
- Turn to use 'ring-page-order' and 'max-ring-page-order'.
- A few comments from Roger.

Changes in v5:
- Clarify with 4k granularity to comment
- Address more comments from Roger

Signed-off-by: Bob Liu
Signed-off-by: Konrad Rzeszutek Wilk

Bob Liu
2015-06-06 09:14:05 +0800

69b91ede5 drivers: xen-blkback: delay pending_req allocation to connect_ring

This is a pre-patch for the multi-page ring feature.
In connect_ring we know exactly how many pages are used for the shared
ring, so delay pending_req allocation until then to avoid wasting memory.

Signed-off-by: Bob Liu
Signed-off-by: Konrad Rzeszutek Wilk

Bob Liu
2015-06-06 09:14:05 +0800

27 Apr, 2015

1 commit

b44166cd4 xen/grant: introduce func gnttab_unmap_refs_sync()

There are several places that use gnttab async unmap and wait for
completion, so move the common code to a function
gnttab_unmap_refs_sync().

Signed-off-by: Bob Liu
Acked-by: Roger Pau Monné
Acked-by: Konrad Rzeszutek Wilk
Signed-off-by: David Vrabel

Bob Liu
2015-04-27 18:41:12 +0800