24 Oct, 2020
1 commit
-
…art_in_flight_rw") into android-mailine
Bisection on the way to 5.10-rc1
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I2cd6d1d4deabeb7f9a17cf558aebd589449a37da
02 Sep, 2020
1 commit
-
This comment was added before the multiqueue I/O scheduler framework
was introduced; multiqueue has support for I/O scheduling now, so this
obsolete comment can be removed.Signed-off-by: Danny Lin
Signed-off-by: Jens Axboe
07 Aug, 2020
1 commit
-
…into android-mainline
Steps on the way to 5.9-rc1
Resolves conflicts in:
drivers/irqchip/qcom-pdc.c
include/linux/device.h
net/xfrm/xfrm_state.c
security/lsm_audit.cSigned-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I4aeb3d04f4717714a421721eb3ce690c099bb30a
08 Jul, 2020
1 commit
-
Add support for NVM Express Zoned Namespaces (ZNS) Command Set defined
in NVM Express TP4053. Zoned namespaces are discovered based on their
Command Set Identifier reported in the namespaces Namespace
Identification Descriptor list. A successfully discovered Zoned
Namespace will be registered with the block layer as a host managed
zoned block device with Zone Append command support. A namespace that
does not support append is not supported by the driver.Reviewed-by: Martin K. Petersen
Reviewed-by: Johannes Thumshirn
Reviewed-by: Hannes Reinecke
Reviewed-by: Sagi Grimberg
Reviewed-by: Javier González
Reviewed-by: Himanshu Madhani
Signed-off-by: Hans Holmberg
Signed-off-by: Dmitry Fomichev
Signed-off-by: Ajay Joshi
Signed-off-by: Aravind Ramesh
Signed-off-by: Niklas Cassel
Signed-off-by: Matias Bjørling
Signed-off-by: Damien Le Moal
Signed-off-by: Keith Busch
Signed-off-by: Christoph Hellwig
25 Jun, 2020
1 commit
-
Linux 5.8-rc1
Signed-off-by: Greg Kroah-Hartman
Change-Id: I00f2168bc9b6fd8e48c7c0776088d2c6cb8e1629
18 Jun, 2020
1 commit
-
…rnel.dk/linux-block") into android-mainline
Conflicts:
block/blk-core.c
block/blk-crypto-fallback.c
block/blk-crypto.c
block/keyslot-manager.c
drivers/md/dm.c
include/linux/blk-crypto.h
include/linux/blk_types.h
include/linux/keyslot-manager.hChange-Id: Ie757c41fa41e6a9aacdf123d82d4f681623a02a8
Signed-off-by: Eric Biggers <ebiggers@google.com>
14 Jun, 2020
1 commit
-
Since commit 84af7a6194e4 ("checkpatch: kconfig: prefer 'help' over
'---help---'"), the number of '---help---' has been gradually
decreasing, but there are still more than 2400 instances.This commit finishes the conversion. While I touched the lines,
I also fixed the indentation.There are a variety of indentation styles found.
a) 4 spaces + '---help---'
b) 7 spaces + '---help---'
c) 8 spaces + '---help---'
d) 1 space + 1 tab + '---help---'
e) 1 tab + '---help---' (correct indentation)
f) 1 tab + 1 space + '---help---'
g) 1 tab + 2 spaces + '---help---'In order to convert all of them to 1 tab + 'help', I ran the
following commend:$ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'
Signed-off-by: Masahiro Yamada
14 May, 2020
2 commits
-
Blk-crypto delegates crypto operations to inline encryption hardware
when available. The separately configurable blk-crypto-fallback contains
a software fallback to the kernel crypto API - when enabled, blk-crypto
will use this fallback for en/decryption when inline encryption hardware
is not available.This lets upper layers not have to worry about whether or not the
underlying device has support for inline encryption before deciding to
specify an encryption context for a bio. It also allows for testing
without actual inline encryption hardware - in particular, it makes it
possible to test the inline encryption code in ext4 and f2fs simply by
running xfstests with the inlinecrypt mount option, which in turn allows
for things like the regular upstream regression testing of ext4 to cover
the inline encryption code paths.For more details, refer to Documentation/block/inline-encryption.rst.
Signed-off-by: Satya Tangirala
Reviewed-by: Eric Biggers
Signed-off-by: Jens Axboe -
Inline Encryption hardware allows software to specify an encryption context
(an encryption key, crypto algorithm, data unit num, data unit size) along
with a data transfer request to a storage device, and the inline encryption
hardware will use that context to en/decrypt the data. The inline
encryption hardware is part of the storage device, and it conceptually sits
on the data path between system memory and the storage device.Inline Encryption hardware implementations often function around the
concept of "keyslots". These implementations often have a limited number
of "keyslots", each of which can hold a key (we say that a key can be
"programmed" into a keyslot). Requests made to the storage device may have
a keyslot and a data unit number associated with them, and the inline
encryption hardware will en/decrypt the data in the requests using the key
programmed into that associated keyslot and the data unit number specified
with the request.As keyslots are limited, and programming keys may be expensive in many
implementations, and multiple requests may use exactly the same encryption
contexts, we introduce a Keyslot Manager to efficiently manage keyslots.We also introduce a blk_crypto_key, which will represent the key that's
programmed into keyslots managed by keyslot managers. The keyslot manager
also functions as the interface that upper layers will use to program keys
into inline encryption hardware. For more information on the Keyslot
Manager, refer to documentation found in block/keyslot-manager.c and
linux/keyslot-manager.h.Co-developed-by: Eric Biggers
Signed-off-by: Eric Biggers
Signed-off-by: Satya Tangirala
Reviewed-by: Eric Biggers
Reviewed-by: Christoph Hellwig
Signed-off-by: Jens Axboe
01 May, 2020
1 commit
-
On each IO completion, iocost decides whether the IO met or missed its latency
target. Currently, the targets are fixed numbers per IO type. While this can be
good enough for loose latency targets way higher than typical completion
latencies, the effect of IO size makes it difficult to tighten the latency
target - a target adequate for 4k IOs might be too tight for 512k IOs and
vice-versa.iocost already has all the necessary information to account for different IO
sizes when testing whether the latency target is met as iocost can calculate the
size vtime cost of a given IO. This patch updates the completion path to
calculate the size vtime cost of the IO, deduct the nsec equivalent from the
observed latency and use the adjusted value to decide whether the target is met.This makes latency targets independent from IO size and enables determining
adequate latency targets with fixed size fio runs.Signed-off-by: Tejun Heo
Cc: Andy Newell
Signed-off-by: Jens Axboe
30 Jan, 2020
2 commits
-
…g/pub/scm/linux/kernel/git/kdave/linux") into android-mainline
Baby steps in the 5.6-rc1 merge cycle to make things easier to review
and debug.Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I005e68433be6b1d66bd56d7e1c8f44ab8e78bebe -
Bug: 139431025
Test: Treehugger
Change-Id: Iba414b03a9ee1544036c4153bf52e8c8edbb9c17
Signed-off-by: Ram Muthiah
13 Jan, 2020
1 commit
-
Changes v5 => v6:
- Blk-crypto's kernel crypto API fallback is no longer restricted to
8-byte DUNs. It's also now separately configurable from blk-crypto, and
can be disabled entirely, while still allowing the kernel to use inline
encryption hardware. Further, struct bio_crypt_ctx takes up less space,
and no longer contains the information needed by the crypto API
fallback - the fallback allocates the required memory when necessary.
- Blk-crypto now supports all file content encryption modes supported by
fscrypt.
- Fixed bio merging logic in blk-merge.c
- Fscrypt now supports inline encryption with the direct key policy, since
blk-crypto now has support for larger DUNs.
- Keyslot manager now uses a hashtable to lookup which keyslot contains
any particular key (thanks Eric!)
- Fscrypt support for inline encryption now handles filesystems with
multiple underlying block devices (thanks Eric!)
- Numerous cleanupsBug: 137270441
Test: refer to I26376479ee38259b8c35732cb3a1d7e15f9b05a3
Change-Id: I13e2e327e0b4784b394cb1e7cf32a04856d95f01
Link: https://lore.kernel.org/linux-block/20191218145136.172774-1-satyat@google.com/
Signed-off-by: Satya Tangirala
07 Jan, 2020
1 commit
-
Currently t10-pi can only be built into the block layer which via
crc-t10dif pulls in a whole chunk of the Crypto API. In fact all
users of t10-pi work as modules and there is no reason for it to
always be built-in.This patch adds a new hidden option for t10-pi that is selected
automatically based on BLK_DEV_INTEGRITY and whether the users
of t10-pi are built-in or not.Signed-off-by: Herbert Xu
Signed-off-by: Jens Axboe
09 Dec, 2019
1 commit
-
Linux 5.5-rc1
Signed-off-by: Greg Kroah-Hartman
Change-Id: I6f952ebdd40746115165a2f99bab340482f5c237
08 Nov, 2019
1 commit
-
blkg_rwstat is now only used by bfq-iosched and blk-throtl when on
cgroup1. Let's move it into its own files and gate it behind a config
option.Signed-off-by: Tejun Heo
Signed-off-by: Jens Axboe
31 Oct, 2019
2 commits
-
We introduce blk-crypto, which manages programming keyslots for struct
bios. With blk-crypto, filesystems only need to call bio_crypt_set_ctx with
the encryption key, algorithm and data_unit_num; they don't have to worry
about getting a keyslot for each encryption context, as blk-crypto handles
that. Blk-crypto also makes it possible for layered devices like device
mapper to make use of inline encryption hardware.Blk-crypto delegates crypto operations to inline encryption hardware when
available, and also contains a software fallback to the kernel crypto API.
For more details, refer to Documentation/block/inline-encryption.rst.Bug: 137270441
Test: tested as series; see Ie1b77f7615d6a7a60fdc9105c7ab2200d17636a8
Change-Id: I7df59fef0c1e90043b1899c5a95973e23afac0c5
Signed-off-by: Satya Tangirala
Link: https://patchwork.kernel.org/patch/11214731/ -
Inline Encryption hardware allows software to specify an encryption context
(an encryption key, crypto algorithm, data unit num, data unit size, etc.)
along with a data transfer request to a storage device, and the inline
encryption hardware will use that context to en/decrypt the data. The
inline encryption hardware is part of the storage device, and it
conceptually sits on the data path between system memory and the storage
device.Inline Encryption hardware implementations often function around the
concept of "keyslots". These implementations often have a limited number
of "keyslots", each of which can hold an encryption context (we say that
an encryption context can be "programmed" into a keyslot). Requests made
to the storage device may have a keyslot associated with them, and the
inline encryption hardware will en/decrypt the data in the requests using
the encryption context programmed into that associated keyslot. As
keyslots are limited, and programming keys may be expensive in many
implementations, and multiple requests may use exactly the same encryption
contexts, we introduce a Keyslot Manager to efficiently manage keyslots.
The keyslot manager also functions as the interface that upper layers will
use to program keys into inline encryption hardware. For more information
on the Keyslot Manager, refer to documentation found in
block/keyslot-manager.c and linux/keyslot-manager.h.Bug: 137270441
Test: tested as series; see Ie1b77f7615d6a7a60fdc9105c7ab2200d17636a8
Change-Id: Iea1ee5a7eec46cb50d33cf1e2d20dfb7335af4ed
Signed-off-by: Satya Tangirala
Link: https://patchwork.kernel.org/patch/11214713/
29 Aug, 2019
2 commits
-
This patchset implements IO cost model based work-conserving
proportional controller.While io.latency provides the capability to comprehensively prioritize
and protect IOs depending on the cgroups, its protection is binary -
the lowest latency target cgroup which is suffering is protected at
the cost of all others. In many use cases including stacking multiple
workload containers in a single system, it's necessary to distribute
IO capacity with better granularity.One challenge of controlling IO resources is the lack of trivially
observable cost metric. The most common metrics - bandwidth and iops
- can be off by orders of magnitude depending on the device type and
IO pattern. However, the cost isn't a complete mystery. Given
several key attributes, we can make fairly reliable predictions on how
expensive a given stream of IOs would be, at least compared to other
IO patterns.The function which determines the cost of a given IO is the IO cost
model for the device. This controller distributes IO capacity based
on the costs estimated by such model. The more accurate the cost
model the better but the controller adapts based on IO completion
latency and as long as the relative costs across differents IO
patterns are consistent and sensible, it'll adapt to the actual
performance of the device.Currently, the only implemented cost model is a simple linear one with
a few sets of default parameters for different classes of device.
This covers most common devices reasonably well. All the
infrastructure to tune and add different cost models is already in
place and a later patch will also allow using bpf progs for cost
models.Please see the top comment in blk-iocost.c and documentation for
more details.v2: Rebased on top of RQ_ALLOC_TIME changes and folded in Rik's fix
for a divide-by-zero bug in current_hweight() triggered by zero
inuse_sum.Signed-off-by: Tejun Heo
Cc: Andy Newell
Cc: Josef Bacik
Cc: Rik van Riel
Signed-off-by: Jens Axboe -
There are currently two start time timestamps - start_time_ns and
io_start_time_ns. The former marks the request allocation and and the
second issue-to-device time. The planned io.weight controller needs
to measure the total time bios take to execute after it leaves rq_qos
including the time spent waiting for request to become available,
which can easily dominate on saturated devices.This patch adds request->alloc_time_ns which records when the request
allocation attempt started. As it isn't used for the usual stats,
make it optional behind CONFIG_BLK_RQ_ALLOC_TIME and
QUEUE_FLAG_RQ_ALLOC_TIME so that it can be compiled out when there are
no users and it's active only on queues which need it even when
compiled in.v2: s/pre_start_time/alloc_time/ and add CONFIG_BLK_RQ_ALLOC_TIME
gating as suggested by Jens.Signed-off-by: Tejun Heo
Signed-off-by: Jens Axboe
15 Jul, 2019
2 commits
-
Those files belong to the admin guide, so add them.
Signed-off-by: Mauro Carvalho Chehab
-
Rename the block documentation files to ReST, add an
index for them and adjust in order to produce a nice html
output via the Sphinx build system.At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.Signed-off-by: Mauro Carvalho Chehab
09 Jul, 2019
1 commit
-
Pull cgroup updates from Tejun Heo:
"Documentation updates and the addition of cgroup_parse_float() which
will be used by new controllers including blk-iocost"* 'for-5.3' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
docs: cgroup-v1: convert docs to ReST and rename to *.rst
cgroup: Move cgroup_parse_float() implementation out of CONFIG_SYSFS
cgroup: add cgroup_parse_float()
15 Jun, 2019
1 commit
-
Convert the cgroup-v1 files to ReST format, in order to
allow a later addition to the admin-guide.The conversion is actually:
- add blank lines and identation in order to identify paragraphs;
- fix tables markups;
- add some lists markups;
- mark literal blocks;
- adjust title markups.At its new index.rst, let's add a :orphan: while this is not linked to
the main index.rst file, in order to avoid build warnings.Signed-off-by: Mauro Carvalho Chehab
Acked-by: Tejun Heo
Signed-off-by: Tejun Heo
13 Jun, 2019
1 commit
-
In most use cases of zoned block devices (aka SMR disks), the
mq-deadline scheduler is mandatory as it implements sequential write
command processing guarantees with zone write locking. So make sure that
this scheduler is always enabled if CONFIG_BLK_DEV_ZONED is selected.Tested-by: Chaitanya Kulkarni
Reviewed-by: Chaitanya Kulkarni
Signed-off-by: Damien Le Moal
Reviewed-by: Ming Lei
Signed-off-by: Jens Axboe
07 Apr, 2019
1 commit
-
Currently support for 64-bit sector_t and blkcnt_t is optional on 32-bit
architectures. These types are required to support block device and/or
file sizes larger than 2 TiB, and have generally defaulted to on for
a long time. Enabling the option only increases the i386 tinyconfig
size by 145 bytes, and many data structures already always use
64-bit values for their in-core and on-disk data structures anyway,
so there should not be a large change in dynamic memory usage either.Dropping this option removes a somewhat weird non-default config that
has cause various bugs or compiler warnings when actually used.Signed-off-by: Christoph Hellwig
Signed-off-by: Jens Axboe
30 Dec, 2018
1 commit
-
Pull Kconfig updates from Masahiro Yamada:
- support -y option for merge_config.sh to avoid downgrading =y to =m
- remove S_OTHER symbol type, and touch include/config/*.h files correctly
- fix file name and line number in lexer warnings
- fix memory leak when EOF is encountered in quotation
- resolve all shift/reduce conflicts of the parser
- warn no new line at end of file
- make 'source' statement more strict to take only string literal
- rewrite the lexer and remove the keyword lookup table
- convert to SPDX License Identifier
- compile C files independently instead of including them from zconf.y
- fix various warnings of gconfig
- misc cleanups
* tag 'kconfig-v4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (39 commits)
kconfig: surround dbg_sym_flags with #ifdef DEBUG to fix gconf warning
kconfig: split images.c out of qconf.cc/gconf.c to fix gconf warnings
kconfig: add static qualifiers to fix gconf warnings
kconfig: split the lexer out of zconf.y
kconfig: split some C files out of zconf.y
kconfig: convert to SPDX License Identifier
kconfig: remove keyword lookup table entirely
kconfig: update current_pos in the second lexer
kconfig: switch to ASSIGN_VAL state in the second lexer
kconfig: stop associating kconf_id with yylval
kconfig: refactor end token rules
kconfig: stop supporting '.' and '/' in unquoted words
treewide: surround Kconfig file paths with double quotes
microblaze: surround string default in Kconfig with double quotes
kconfig: use T_WORD instead of T_VARIABLE for variables
kconfig: use specific tokens instead of T_ASSIGN for assignments
kconfig: refactor scanning and parsing "option" properties
kconfig: use distinct tokens for type and default properties
kconfig: remove redundant token defines
kconfig: rename depends_list to comment_option_list
...
21 Dec, 2018
1 commit
-
The Kconfig lexer supports special characters such as '.' and '/' in
the parameter context. In my understanding, the reason is just to
support bare file paths in the source statement.I do not see a good reason to complicate Kconfig for the room of
ambiguity.The majority of code already surrounds file paths with double quotes,
and it makes sense since file paths are constant string literals.Make it treewide consistent now.
Signed-off-by: Masahiro Yamada
Acked-by: Wolfram Sang
Acked-by: Geert Uytterhoeven
Acked-by: Ingo Molnar
08 Nov, 2018
1 commit
-
Everything is blk-mq at this point, so it doesn't make any sense
to have this option available as it does nothing.Reviewed-by: Hannes Reinecke
Tested-by: Ming Lei
Reviewed-by: Omar Sandoval
Signed-off-by: Jens Axboe
11 Oct, 2018
1 commit
-
'default n' is the default value for any bool or tristate Kconfig
setting so there is no need to write it explicitly.Also since commit f467c5640c29 ("kconfig: only write '# CONFIG_FOO
is not set' for visible symbols") the Kconfig behavior is the same
regardless of 'default n' being present or not:...
One side effect of (and the main motivation for) this change is making
the following two definitions behave exactly the same:config FOO
boolconfig FOO
bool
default nWith this change, neither of these will generate a
'# CONFIG_FOO is not set' line (assuming FOO isn't selected/implied).
That might make it clearer to people that a bare 'default n' is
redundant.
...Signed-off-by: Bartlomiej Zolnierkiewicz
Signed-off-by: Jens Axboe
27 Sep, 2018
1 commit
-
Move the code for runtime power management from blk-core.c into the
new source file blk-pm.c. Move the corresponding declarations from
into . For CONFIG_PM=n, leave out
the declarations of the functions that are not used in that mode.
This patch not only reduces the number of #ifdefs in the block layer
core code but also reduces the size of header file
and hence should help to reduce the build time of the Linux kernel
if CONFIG_PM is not defined.Signed-off-by: Bart Van Assche
Reviewed-by: Ming Lei
Reviewed-by: Christoph Hellwig
Cc: Jianchao Wang
Cc: Hannes Reinecke
Cc: Johannes Thumshirn
Cc: Alan Stern
Signed-off-by: Jens Axboe
09 Jul, 2018
2 commits
-
Current IO controllers for the block layer are less than ideal for our
use case. The io.max controller is great at hard limiting, but it is
not work conserving. This patch introduces io.latency. You provide a
latency target for your group and we monitor the io in short windows to
make sure we are not exceeding those latency targets. This makes use of
the rq-qos infrastructure and works much like the wbt stuff. There are
a few differences from wbt- It's bio based, so the latency covers the whole block layer in addition to
the actual io.
- We will throttle all IO types that comes in here if we need to.
- We use the mean latency over the 100ms window. This is because writes can
be particularly fast, which could give us a false sense of the impact of
other workloads on our protected workload.
- By default there's no throttling, we set the queue_depth to INT_MAX so that
we can have as many outstanding bio's as we're allowed to. Only at
throttle time do we pay attention to the actual queue depth.
- We backcharge cgroups for root cg issued IO and induce artificial
delays in order to deal with cases like metadata only or swap heavy
workloads.In testing this has worked out relatively well. Protected workloads
will throttle noisy workloads down to 1 io at time if they are doing
normal IO on their own, or induce up to a 1 second delay per syscall if
they are doing a lot of root issued IO (metadata/swap IO).Our testing has revolved mostly around our production web servers where
we have hhvm (the web server application) in a protected group and
everything else in another group. We see slightly higher requests per
second (RPS) on the test tier vs the control tier, and much more stable
RPS across all machines in the test tier vs the control tier.Another test we run is a slow memory allocator in the unprotected group.
Before this would eventually push us into swap and cause the whole box
to die and not recover at all. With these patches we see slight RPS
drops (usually 10-15%) before the memory consumer is properly killed and
things recover within seconds.Signed-off-by: Josef Bacik
Acked-by: Tejun Heo
Signed-off-by: Jens Axboe -
Exclude zoned block device members from struct request_queue for
CONFIG_BLK_DEV_ZONED == n. Avoid breaking the build by only building
the code that uses these struct request_queue members if
CONFIG_BLK_DEV_ZONED != n.Signed-off-by: Bart Van Assche
Reviewed-by: Damien Le Moal
Cc: Matias Bjorling
Cc: Christoph Hellwig
Signed-off-by: Jens Axboe
16 Jun, 2018
1 commit
-
As we move stuff around, some doc references are broken. Fix some of
them via this script:
./scripts/documentation-file-ref-check --fixManually checked if the produced result is valid, removing a few
false-positives.Acked-by: Takashi Iwai
Acked-by: Masami Hiramatsu
Acked-by: Stephen Boyd
Acked-by: Charles Keepax
Acked-by: Mathieu Poirier
Reviewed-by: Coly Li
Signed-off-by: Mauro Carvalho Chehab
Acked-by: Jonathan Corbet
02 Nov, 2017
1 commit
-
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.By default all files without license information are under the default
license of the kernel, which is GPL version 2.Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier. The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.How this work was done:
Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
- file had no licensing information it it.
- file was a */uapi/* one with no licensing information in it,
- file was a */uapi/* one with existing licensing information,Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne. Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed. Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.Criteria used to select files for SPDX license identifier tagging was:
- Files considered eligible had to be source code files.
- Make and config files were included as candidates if they contained >5
lines of source
- File already had some variant of a license header in it (even if
Reviewed-by: Philippe Ombredanne
Reviewed-by: Thomas Gleixner
Signed-off-by: Greg Kroah-Hartman
09 Aug, 2017
1 commit
-
Like pci and virtio, we add a rdma helper for affinity
spreading. This achieves optimal mq affinity assignments
according to the underlying rdma device affinity maps.Reviewed-by: Jens Axboe
Reviewed-by: Christoph Hellwig
Reviewed-by: Max Gurtovoy
Signed-off-by: Sagi Grimberg
Signed-off-by: Doug Ledford
13 May, 2017
1 commit
-
Pull libnvdimm fixes from Dan Williams:
"Incremental fixes and a small feature addition on top of the main
libnvdimm 4.12 pull request:- Geert noticed that tinyconfig was bloated by BLOCK selecting DAX.
The size regression is fixed by moving all dax helpers into the
dax-core and only specifying "select DAX" for FS_DAX and
dax-capable drivers. He also asked for clarification of the
NR_DEV_DAX config option which, on closer look, does not need to be
a config option at all. Mike also throws in a DEV_DAX_PMEM fixup
for good measure.- Ben's attention to detail on -stable patch submissions caught a
case where the recent fixes to arch_copy_from_iter_pmem() missed a
condition where we strand dirty data in the cache. This is tagged
for -stable and will also be included in the rework of the pmem api
to a proposed {memcpy,copy_user}_flushcache() interface for 4.13.- Vishal adds a feature that missed the initial pull due to pending
review feedback. It allows the kernel to clear media errors when
initializing a BTT (atomic sector update driver) instance on a pmem
namespace.- Ross noticed that the dax_device + dax_operations conversion broke
__dax_zero_page_range(). The nvdimm unit tests fail to check this
path, but xfstests immediately trips over it. No excuse for missing
this before submitting the 4.12 pull request.These all pass the nvdimm unit tests and an xfstests spot check. The
set has received a build success notification from the kbuild robot"* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
filesystem-dax: fix broken __dax_zero_page_range() conversion
libnvdimm, btt: ensure that initializing metadata clears poison
libnvdimm: add an atomic vs process context flag to rw_bytes
x86, pmem: Fix cache flushing for iovec write < 8 bytes
device-dax: kill NR_DEV_DAX
block, dax: move "select DAX" from BLOCK to FS_DAX
device-dax: Tell kbuild DEV_DAX_PMEM depends on DEV_DAX
09 May, 2017
1 commit
-
For configurations that do not enable DAX filesystems or drivers, do not
require the DAX core to be built.Given that the 'direct_access' method has been removed from
'block_device_operations', we can also go ahead and remove the
block-related dax helper functions from fs/block_dev.c to
drivers/dax/super.c. This keeps dax details out of the block layer and
lets the DAX core be built as a module in the FS_DAX=n case.Filesystems need to include dax.h to call bdev_dax_supported().
Cc: linux-xfs@vger.kernel.org
Cc: Jens Axboe
Cc: "Theodore Ts'o"
Cc: Matthew Wilcox
Cc: Alexander Viro
Cc: "Darrick J. Wong"
Cc: Ross Zwisler
Reviewed-by: Jan Kara
Reported-by: Geert Uytterhoeven
Signed-off-by: Dan Williams
06 May, 2017
1 commit
-
Pull libnvdimm updates from Dan Williams:
"The bulk of this has been in multiple -next releases. There were a few
late breaking fixes and small features that got added in the last
couple days, but the whole set has received a build success
notification from the kbuild robot.Change summary:
- Region media error reporting: A libnvdimm region device is the
parent to one or more namespaces. To date, media errors have been
reported via the "badblocks" attribute attached to pmem block
devices for namespaces in "raw" or "memory" mode. Given that
namespaces can be in "device-dax" or "btt-sector" mode this new
interface reports media errors generically, i.e. independent of
namespace modes or state.This subsequently allows userspace tooling to craft "ACPI 6.1
Section 9.20.7.6 Function Index 4 - Clear Uncorrectable Error"
requests and submit them via the ioctl path for NVDIMM root bus
devices.- Introduce 'struct dax_device' and 'struct dax_operations': Prompted
by a request from Linus and feedback from Christoph this allows for
dax capable drivers to publish their own custom dax operations.
This fixes the broken assumption that all dax operations are
related to a persistent memory device, and makes it easier for
other architectures and platforms to add customized persistent
memory support.- 'libnvdimm' core updates: A new "deep_flush" sysfs attribute is
available for storage appliance applications to manually trigger
memory controllers to drain write-pending buffers that would
otherwise be flushed automatically by the platform ADR
(asynchronous-DRAM-refresh) mechanism at a power loss event.
Support for "locked" DIMMs is included to prevent namespaces from
surfacing when the namespace label data area is locked. Finally,
fixes for various reported deadlocks and crashes, also tagged for
-stable.- ACPI / nfit driver updates: General updates of the nfit driver to
add DSM command overrides, ACPI 6.1 health state flags support, DSM
payload debug available by default, and various fixes.Acknowledgements that came after the branch was pushed:
- commmit 565851c972b5 "device-dax: fix sysfs attribute deadlock":
Tested-by: Yi Zhang- commit 23f498448362 "libnvdimm: rework region badblocks clearing"
Tested-by: Toshi Kani "* tag 'libnvdimm-for-4.12' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (52 commits)
libnvdimm, pfn: fix 'npfns' vs section alignment
libnvdimm: handle locked label storage areas
libnvdimm: convert NDD_ flags to use bitops, introduce NDD_LOCKED
brd: fix uninitialized use of brd->dax_dev
block, dax: use correct format string in bdev_dax_supported
device-dax: fix sysfs attribute deadlock
libnvdimm: restore "libnvdimm: band aid btt vs clear poison locking"
libnvdimm: fix nvdimm_bus_lock() vs device_lock() ordering
libnvdimm: rework region badblocks clearing
acpi, nfit: kill ACPI_NFIT_DEBUG
libnvdimm: fix clear length of nvdimm_forget_poison()
libnvdimm, pmem: fix a NULL pointer BUG in nd_pmem_notify
libnvdimm, region: sysfs trigger for nvdimm_flush()
libnvdimm: fix phys_addr for nvdimm_clear_poison
x86, dax, pmem: remove indirection around memcpy_from_pmem()
block: remove block_device_operations ->direct_access()
block, dax: convert bdev_dax_supported() to dax_direct_access()
filesystem-dax: convert to dax_direct_access()
Revert "block: use DAX for partition table reads"
ext2, ext4, xfs: retrieve dax_device for iomap operations
...
21 Apr, 2017
1 commit
-
Replace bdev_direct_access() with dax_direct_access() that uses
dax_device and dax_operations instead of a block_device and
block_device_operations for dax. Once all consumers of the old api have
been converted bdev_direct_access() will be deleted.Given that block device partitioning decisions can cause dax page
alignment constraints to be violated this also introduces the
bdev_dax_pgoff() helper. It handles calculating a logical pgoff relative
to the dax_device and also checks for page alignment.Signed-off-by: Dan Williams