06 Apr, 2019
1 commit
-
[ Upstream commit 0ff06c44efeede4acd068847d3bf8cf894b6c664 ]
Prior to dma unmap/free operations the ism driver tries to ensure
that the memory is no longer accessed by the HW. When errors
during deregistration of memory regions from the HW occur the ism
driver will not unmap/free this memory.When we receive notification from the hypervisor that a PCI function
has been detached we can no longer access the device and would never
unmap/free these memory regions which led to complaints by the DMA
debug API.Treat this kind of errors during the deregistration of memory regions
from the HW as success since it is already ensured that the memory
is no longer accessed by HW.Reported-by: Karsten Graul
Reported-by: Hans Wippel
Signed-off-by: Sebastian Ott
Signed-off-by: Martin Schwidefsky
Signed-off-by: Sasha Levin
03 Apr, 2019
3 commits
-
commit 242ec1455151267fe35a0834aa9038e4c4670884 upstream.
Suppose more than one non-NPIV FCP device is active on the same channel.
Send I/O to storage and have some of the pending I/O run into a SCSI
command timeout, e.g. due to bit errors on the fibre. Now the error
situation stops. However, we saw FCP requests continue to timeout in the
channel. The abort will be successful, but the subsequent TUR fails.
Scsi_eh starts. The LUN reset fails. The target reset fails. The host
reset only did an FCP device recovery. However, for non-NPIV FCP devices,
this does not close and reopen ports on the SAN-side if other non-NPIV FCP
device(s) share the same open ports.In order to resolve the continuing FCP request timeouts, we need to
explicitly close and reopen ports on the SAN-side.This was missing since the beginning of zfcp in v2.6.0 history commit
ea127f975424 ("[PATCH] s390 (7/7): zfcp host adapter.").Note: The FSF requests for forced port reopen could run into FSF request
timeouts due to other reasons. This would trigger an internal FCP device
recovery. Pending forced port reopen recoveries would get dismissed. So
some ports might not get fully reopened during this host reset handler.
However, subsequent I/O would trigger the above described escalation and
eventually all ports would be forced reopen to resolve any continuing FCP
request timeouts due to earlier bit errors.Signed-off-by: Steffen Maier
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: #3.0+
Reviewed-by: Jens Remus
Reviewed-by: Benjamin Block
Signed-off-by: Martin K. Petersen
Signed-off-by: Greg Kroah-Hartman -
commit fe67888fc007a76b81e37da23ce5bd8fb95890b0 upstream.
An already deleted SCSI device can exist on the Scsi_Host and remain there
because something still holds a reference. A new SCSI device with the same
H:C:T:L and FCP device, target port WWPN, and FCP LUN can be created. When
we try to unblock an rport, we still find the deleted SCSI device and
return early because the zfcp_scsi_dev of that SCSI device is not
ZFCP_STATUS_COMMON_UNBLOCKED. Hence we miss to unblock the rport, even if
the new proper SCSI device would be in good state.Therefore, skip deleted SCSI devices when iterating the sdevs of the shost.
[cf. __scsi_device_lookup{_by_target}() or scsi_device_get()]The following abbreviated trace sequence can indicate such problem:
Area : REC
Tag : ersfs_3
LUN : 0x4045400300000000
WWPN : 0x50050763031bd327
LUN status : 0x40000000 not ZFCP_STATUS_COMMON_UNBLOCKED
Ready count : n not incremented yet
Running count : 0x00000000
ERP want : 0x01
ERP need : 0xc1 ZFCP_ERP_ACTION_NONEArea : REC
Tag : ersfs_3
LUN : 0x4045400300000000
WWPN : 0x50050763031bd327
LUN status : 0x41000000
Ready count : n+1
Running count : 0x00000000
ERP want : 0x01
ERP need : 0x01...
Area : REC
Level : 4 only with increased trace level
Tag : ertru_l
LUN : 0x4045400300000000
WWPN : 0x50050763031bd327
LUN status : 0x40000000
Request ID : 0x0000000000000000
ERP status : 0x01800000
ERP step : 0x1000
ERP action : 0x01
ERP count : 0x00NOT followed by a trace record with tag "scpaddy"
for WWPN 0x50050763031bd327.Signed-off-by: Steffen Maier
Fixes: 6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race with LUN recovery")
Cc: #2.6.32+
Reviewed-by: Jens Remus
Reviewed-by: Benjamin Block
Signed-off-by: Martin K. Petersen
Signed-off-by: Greg Kroah-Hartman -
commit 50b7f1b7236bab08ebbbecf90521e84b068d7a17 upstream.
When we get an interrupt for a channel program, it is not
necessarily the final interrupt; for example, the issuing
guest may request an intermediate interrupt by specifying
the program-controlled-interrupt flag on a ccw.We must not switch the state to idle if the interrupt is not
yet final; even more importantly, we must not free the translated
channel program if the interrupt is not yet final, or the host
can crash during cp rewind.Fixes: e5f84dbaea59 ("vfio: ccw: return I/O results asynchronously")
Cc: stable@vger.kernel.org # v4.12+
Reviewed-by: Eric Farman
Signed-off-by: Cornelia Huck
Signed-off-by: Greg Kroah-Hartman
24 Mar, 2019
2 commits
-
commit 3438b2c039b4bf26881786a1f3450f016d66ad11 upstream.
A queue with a capacity of zero is clearly not a valid virtio queue.
Some emulators report zero queue size if queried with an invalid queue
index. Instead of crashing in this case let us just return -ENOENT. To
make that work properly, let us fix the notifier cleanup logic as well.Cc: stable@vger.kernel.org
Signed-off-by: Halil Pasic
Signed-off-by: Cornelia Huck
Signed-off-by: Michael S. Tsirkin
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 4a8ef6999bce998fa5813023a9a6b56eea329dba ]
Dan Carpenter reported the following:
The patch 52898025cf7d: "[S390] dasd: security and PSF update patch
for EMC CKD ioctl" from Mar 8, 2010, leads to the following static
checker warning:drivers/s390/block/dasd_eckd.c:4486 dasd_symm_io()
error: using offset into zero size array 'psf_data[]'drivers/s390/block/dasd_eckd.c
4458 /* Copy parms from caller */
4459 rc = -EFAULT;
4460 if (copy_from_user(&usrparm, argp, sizeof(usrparm)))
^^^^^^^
The user can specify any "usrparm.psf_data_len". They choose zero by
mistake.4461 goto out;
4462 if (is_compat_task()) {
4463 /* Make sure pointers are sane even on 31 bit. */
4464 rc = -EINVAL;
4465 if ((usrparm.psf_data >> 32) != 0)
4466 goto out;
4467 if ((usrparm.rssd_result >> 32) != 0)
4468 goto out;
4469 usrparm.psf_data &= 0x7fffffffULL;
4470 usrparm.rssd_result &= 0x7fffffffULL;
4471 }
4472 /* alloc I/O data area */
4473 psf_data = kzalloc(usrparm.psf_data_len, GFP_KERNEL
| GFP_DMA);
4474 rssd_result = kzalloc(usrparm.rssd_result_len, GFP_KERNEL
| GFP_DMA);
4475 if (!psf_data || !rssd_result) {kzalloc() returns a ZERO_SIZE_PTR (0x16).
4476 rc = -ENOMEM;
4477 goto out_free;
4478 }
4479
4480 /* get syscall header from user space */
4481 rc = -EFAULT;
4482 if (copy_from_user(psf_data,
4483 (void __user *)(unsigned long)
usrparm.psf_data,
4484 usrparm.psf_data_len))That all works great.
4485 goto out_free;
4486 psf0 = psf_data[0];
4487 psf1 = psf_data[1];But now we're assuming that "->psf_data_len" was at least 2 bytes.
Fix this by checking the user specified length psf_data_len.
Fixes: 52898025cf7d ("[S390] dasd: security and PSF update patch for EMC CKD ioctl")
Reported-by: Dan Carpenter
Signed-off-by: Stefan Haberland
Signed-off-by: Martin Schwidefsky
Signed-off-by: Sasha Levin
14 Mar, 2019
3 commits
-
[ Upstream commit c2780c1a3fb724560b1d44f7976e0de17bf153c7 ]
A card's close_dev work is scheduled on a driver-wide workqueue. If the
card is removed and freed while the work is still active, this causes a
use-after-free.
So make sure that the work is completed before freeing the card.Fixes: 0f54761d167f ("qeth: Support VEPA mode")
Signed-off-by: Julian Wiedmann
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin -
[ Upstream commit afa0c5904ba16d59b0454f7ee4c807dae350f432 ]
The error path in qeth_alloc_qdio_buffers() that takes care of
cleaning up the Output Queues is buggy. It first frees the queue, but
then calls qeth_clear_outq_buffers() with that very queue struct.Make the call to qeth_clear_outq_buffers() part of the free action
(in the correct order), and while at it fix the naming of the helper.Fixes: 0da9581ddb0f ("qeth: exploit asynchronous delivery of storage blocks")
Signed-off-by: Julian Wiedmann
Reviewed-by: Alexandra Winter
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin -
[ Upstream commit 5065b2dd3e5f9247a6c9d67974bc0472bf561b9d ]
Whenever we fail before/while starting an IO, make sure to release the
IO buffer. Usually qeth_irq() would do this for us, but if the IO
doesn't even start we obviously won't get an interrupt for it either.Signed-off-by: Julian Wiedmann
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin
20 Feb, 2019
1 commit
-
commit 8f9aca0c45322a807a343fc32f95f2500f83b9ae upstream.
The older machines don't have the QCI instruction available.
With support for up to 256 crypto cards the probing of each
card has been extended to check card ids from 0 up to 255.
For machines with QCI support there is a filter limiting the
range of probed cards. The older machines (z196 and older)
don't have this filter and so since support for 256 cards is
in the driver all cards are probed. However, these machines
also require to have the card id fit into 6 bits. Exceeding
this limit results in a specification exception which happens
on every kernel startup even when there is no crypto configured
and used at all.This fix limits the range of probed crypto cards to 64 if
there is no QCI instruction available to obey to the older
ap architecture and so fixes the specification exceptions
on z196 machines.Cc: stable@vger.kernel.org # v4.17+
Fixes: af4a72276d49 ("s390/zcrypt: Support up to 256 crypto adapters.")
Signed-off-by: Harald Freudenberger
Signed-off-by: Martin Schwidefsky
Signed-off-by: Greg Kroah-Hartman
13 Feb, 2019
1 commit
-
[ Upstream commit be534791011100d204602e2e0496e9e6ce8edf63 ]
There exist very few ap messages which need to have the 'special' flag
enabled. This flag tells the firmware layer to do some pre- and maybe
postprocessing. However, it may happen that this special flag is
enabled but the firmware is unable to deal with this kind of message
and thus returns with reply code 0x41. For example older firmware may
not know the newest messages triggered by the zcrypt device driver and
thus react with reject and the named reply code. Unfortunately this
reply code is not known to the zcrypt error routines and thus default
behavior is to switch the ap queue offline.This patch now makes the ap error routine aware of the reply code and
so userspace is informed about the bad processing result but the queue
is not switched to offline state any more.Signed-off-by: Harald Freudenberger
Signed-off-by: Martin Schwidefsky
Signed-off-by: Sasha Levin
31 Jan, 2019
1 commit
-
commit b7cb707c373094ce4008d4a6ac9b6b366ec52da5 upstream.
smp_rescan_cpus() is called without the device_hotplug_lock, which can lead
to a dedlock when a new CPU is found and immediately set online by a udev
rule.This was observed on an older kernel version, where the cpu_hotplug_begin()
loop was still present, and it resulted in hanging chcpu and systemd-udev
processes. This specific deadlock will not show on current kernels. However,
there may be other possible deadlocks, and since smp_rescan_cpus() can still
trigger a CPU hotplug operation, the device_hotplug_lock should be held.For reference, this was the deadlock with the old cpu_hotplug_begin() loop:
chcpu (rescan) systemd-udevd
echo 1 > /sys/../rescan
-> smp_rescan_cpus()
-> (*) get_online_cpus()
(increases refcount)
-> smp_add_present_cpu()
(new CPU found)
-> register_cpu()
-> device_add()
-> udev "add" event triggered -----------> udev rule sets CPU online
-> echo 1 > /sys/.../online
-> lock_device_hotplug_sysfs()
(this is missing in rescan path)
-> device_online()
-> (**) device_lock(new CPU dev)
-> cpu_up()
-> cpu_hotplug_begin()
(loops until refcount == 0)
-> deadlock with (*)
-> bus_probe_device()
-> device_attach()
-> device_lock(new CPU dev)
-> deadlock with (**)Fix this by taking the device_hotplug_lock in the CPU rescan path.
Cc:
Signed-off-by: Gerald Schaefer
Signed-off-by: Martin Schwidefsky
Signed-off-by: Greg Kroah-Hartman
13 Jan, 2019
1 commit
-
commit 60a161b7e5b2a252ff0d4c622266a7d8da1120ce upstream.
Suppose adapter (open) recovery is between opened QDIO queues and before
(the end of) initial posting of status read buffers (SRBs). This time
window can be seconds long due to FSF_PROT_HOST_CONNECTION_INITIALIZING
causing by design looping with exponential increase sleeps in the function
performing exchange config data during recovery
[zfcp_erp_adapter_strat_fsf_xconf()]. Recovery triggered by local link up.Suppose an event occurs for which the FCP channel would send an unsolicited
notification to zfcp by means of a previously posted SRB. We saw it with
local cable pull (link down) in multi-initiator zoning with multiple
NPIV-enabled subchannels of the same shared FCP channel.As soon as zfcp_erp_adapter_strategy_open_fsf() starts posting the initial
status read buffers from within the adapter's ERP thread, the channel does
send an unsolicited notification.Since v2.6.27 commit d26ab06ede83 ("[SCSI] zfcp: receiving an unsolicted
status can lead to I/O stall"), zfcp_fsf_status_read_handler() schedules
adapter->stat_work to re-fill the just consumed SRB from a work item.Now the ERP thread and the work item post SRBs in parallel. Both contexts
call the helper function zfcp_status_read_refill(). The tracking of
missing (to be posted / re-filled) SRBs is not thread-safe due to separate
atomic_read() and atomic_dec(), in order to depend on posting
success. Hence, both contexts can see
atomic_read(&adapter->stat_miss) == 1. One of the two contexts posts
one too many SRB. Zfcp gets QDIO_ERROR_SLSB_STATE on the output queue
(trace tag "qdireq1") leading to zfcp_erp_adapter_shutdown() in
zfcp_qdio_handler_error().An obvious and seemingly clean fix would be to schedule stat_work from the
ERP thread and wait for it to finish. This would serialize all SRB
re-fills. However, we already have another work item wait on the ERP
thread: adapter->scan_work runs zfcp_fc_scan_ports() which calls
zfcp_fc_eval_gpn_ft(). The latter calls zfcp_erp_wait() to wait for all the
open port recoveries during zfcp auto port scan, but in fact it waits for
any pending recovery including an adapter recovery. This approach leads to
a deadlock. [see also v3.19 commit 18f87a67e6d6 ("zfcp: auto port scan
resiliency"); v2.6.37 commit d3e1088d6873
("[SCSI] zfcp: No ERP escalation on gpn_ft eval");
v2.6.28 commit fca55b6fb587
("[SCSI] zfcp: fix deadlock between wq triggered port scan and ERP")
fixing v2.6.27 commit c57a39a45a76
("[SCSI] zfcp: wait until adapter is finished with ERP during auto-port");
v2.6.27 commit cc8c282963bd
("[SCSI] zfcp: Automatically attach remote ports")]Instead make the accounting of missing SRBs atomic for parallel execution
in both the ERP thread and adapter->stat_work.Signed-off-by: Steffen Maier
Fixes: d26ab06ede83 ("[SCSI] zfcp: receiving an unsolicted status can lead to I/O stall")
Cc: #2.6.27+
Reviewed-by: Jens Remus
Signed-off-by: Martin K. Petersen
Signed-off-by: Greg Kroah-Hartman
17 Dec, 2018
2 commits
-
[ Upstream commit b89e242eee8d4cd8261d8d821c62c5d1efc454d0 ]
Direct returns from within a loop are rude, but it doesn't mean it gets
to avoid releasing the memory acquired beforehand.Signed-off-by: Eric Farman
Message-Id:
Reviewed-by: Farhan Ali
Reviewed-by: Pierre Morel
Acked-by: Halil Pasic
Signed-off-by: Cornelia Huck
Signed-off-by: Sasha Levin -
[ Upstream commit 806212f91c874b24cf9eb4a9f180323671b6c5ed ]
If pfn_array_alloc fails somehow, we need to release the pfn_array_table
that was malloc'd earlier.Signed-off-by: Eric Farman
Message-Id:
Acked-by: Halil Pasic
Signed-off-by: Cornelia Huck
Signed-off-by: Sasha Levin
13 Dec, 2018
3 commits
-
commit 78b1a52e05c9db11d293342e8d6d8a230a04b4e7 upstream.
While ccw_io_helper() seems like intended to be exclusive in a sense that
it is supposed to facilitate I/O for at most one thread at any given
time, there is actually nothing ensuring that threads won't pile up at
vcdev->wait_q. If they do, all threads get woken up and see the status
that belongs to some other request than their own. This can lead to bugs.
For an example see:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1788432This race normally does not cause any problems. The operations provided
by struct virtio_config_ops are usually invoked in a well defined
sequence, normally don't fail, and are normally used quite infrequent
too.Yet, if some of the these operations are directly triggered via sysfs
attributes, like in the case described by the referenced bug, userspace
is given an opportunity to force races by increasing the frequency of the
given operations.Let us fix the problem by ensuring, that for each device, we finish
processing the previous request before starting with a new one.Signed-off-by: Halil Pasic
Reported-by: Colin Ian King
Cc: stable@vger.kernel.org
Message-Id:
Signed-off-by: Cornelia Huck
Signed-off-by: Michael S. Tsirkin
Signed-off-by: Greg Kroah-Hartman -
commit 2448a299ec416a80f699940a86f4a6d9a4f643b1 upstream.
Currently we have a race on vcdev->config in virtio_ccw_get_config() and
in virtio_ccw_set_config().This normally does not cause problems, as these are usually infrequent
operations. However, for some devices writing to/reading from the config
space can be triggered through sysfs attributes. For these, userspace can
force the race by increasing the frequency.Signed-off-by: Halil Pasic
Cc: stable@vger.kernel.org
Message-Id:
Signed-off-by: Cornelia Huck
Signed-off-by: Michael S. Tsirkin
Signed-off-by: Greg Kroah-Hartman -
[ Upstream commit 007b656851ed7f94ba0fa358ac3e5d7705da6846 ]
SMC-D stress workload showed connection stalls. Since the firmware
decides to skip raising an interrupt if the SBA DMBE mask bit is
still set, this SBA DMBE mask bit should be cleared before the
IRQ handling in the SMC code runs. Otherwise there are small windows
possible with missing interrupts for incoming data.
SMC-D currently does not care about the old value of the SBA DMBE
mask.Acked-by: Sebastian Ott
Signed-off-by: Ursula Braun
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin
06 Dec, 2018
1 commit
-
[ Upstream commit 9a764c1e59684c0358e16ccaafd870629f2cfe67 ]
The response for a SNMP request can consist of multiple parts, which
the cmd callback stages into a kernel buffer until all parts have been
received. If the callback detects that the staging buffer provides
insufficient space, it bails out with error.
This processing is buggy for the first part of the response - while it
initially checks for a length of 'data_len', it later copies an
additional amount of 'offsetof(struct qeth_snmp_cmd, data)' bytes.Fix the calculation of 'data_len' for the first part of the response.
This also nicely cleans up the memcpy code.Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Julian Wiedmann
Reviewed-by: Ursula Braun
Signed-off-by: David S. Miller
Signed-off-by: Greg Kroah-Hartman
27 Nov, 2018
2 commits
-
[ Upstream commit 30356d08159d7899438e94503ae322a8b881e205 ]
qeth only registers its netdevice when the qeth device is first set
online. Thus a device that has never been set online will trigger
a WARN ("network todo 'hsi%d' but state 0") in unregister_netdev() when
removed.Fix this by protecting the unregister step, just like we already protect
against repeated registering of the netdevice.Fixes: d3d1b205e89f ("s390/qeth: allocate netdevice early")
Reported-by: Karsten Graul
Signed-off-by: Julian Wiedmann
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin -
[ Upstream commit bd74a7f9cc033cf4d405788f80292268987dc0c5 ]
Sniffing mode for L3 HiperSockets requires that no IP addresses are
registered with the HW. The preferred way to achieve this is for
userspace to delete all the IPs on the interface. But qeth is expected
to also tolerate a configuration where that is not the case, by skipping
the IP registration when in sniffer mode.
Since commit 5f78e29ceebf ("qeth: optimize IP handling in rx_mode callback")
reworked the IP registration logic in the L3 subdriver, this no longer
works. When the qeth device is set online, qeth_l3_recover_ip() now
unconditionally registers all unicast addresses from our internal
IP table.While we could fix this particular problem by skipping
qeth_l3_recover_ip() on a sniffer device, the more future-proof change
is to skip the IP address registration at the lowest level. This way we
a) catch any future code path that attempts to register an IP address
without considering the sniffer scenario, and
b) continue to build up our internal IP table, so that if sniffer mode
is switched off later we can operate just like normal.Fixes: 5f78e29ceebf ("qeth: optimize IP handling in rx_mode callback")
Signed-off-by: Julian Wiedmann
Signed-off-by: David S. Miller
Signed-off-by: Sasha Levin
10 Oct, 2018
1 commit
-
Martin writes:
"s390 fixes for 4.19-rc8Four more patches for 4.19:
- Fix resume after suspend-to-disk if resume-CPU != suspend-CPU
- Fix vfio-ccw check for pinned pages
- Two patches to avoid a usercopy-whitelist warning in vfio-ccw"* tag 's390-4.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/cio: Fix how vfio-ccw checks pinned pages
s390/cio: Refactor alloc of ccw_io_region
s390/cio: Convert ccw_io_region to pointer
s390/hibernate: fix error handling when suspend cpu != resume cpu
02 Oct, 2018
1 commit
-
We have two nested loops to check the entries within the pfn_array_table
arrays. But we mistakenly use the outer array as an index in our check,
and completely ignore the indexing performed by the inner loop.Cc: stable@vger.kernel.org
Signed-off-by: Eric Farman
Message-Id:
Signed-off-by: Cornelia Huck
29 Sep, 2018
2 commits
-
Functions qeth_get_ipa_msg and qeth_get_ipa_cmd_name are modifying
the last member of global arrays without any locking that I can see.
If two instances of either function are running at the same time,
it could cause a race ultimately leading to an array overrun (the
contents of the last entry of the array is the only guarantee that
the loop will ever stop).Performing the lookups without modifying the arrays is admittedly
slower (two comparisons per iteration instead of one) but these
are operations which are rare (should only be needed in error
cases or when debugging, not during successful operation) and it
seems still less costly than introducing a mutex to protect the
arrays in question.As a side bonus, it allows us to declare both arrays as const data.
Signed-off-by: Jean Delvare
Cc: Julian Wiedmann
Cc: Ursula Braun
Cc: Martin Schwidefsky
Cc: Heiko Carstens
Signed-off-by: Julian Wiedmann
Signed-off-by: David S. Miller -
Use the common code ARRAY_SIZE macro instead of a private implementation.
Reviewed-by: Jean Delvare
Signed-off-by: zhong jiang
Signed-off-by: Martin Schwidefsky
Signed-off-by: Julian Wiedmann
Signed-off-by: David S. Miller
27 Sep, 2018
2 commits
-
If I attach a vfio-ccw device to my guest, I get the following warning
on the host when the host kernel is CONFIG_HARDENED_USERCOPY=y[250757.595325] Bad or missing usercopy whitelist? Kernel memory overwrite attempt detected to SLUB object 'dma-kmalloc-512' (offset 64, size 124)!
[250757.595365] WARNING: CPU: 2 PID: 10958 at mm/usercopy.c:81 usercopy_warn+0xac/0xd8
[250757.595369] Modules linked in: kvm vhost_net vhost tap xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c devlink tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables sunrpc dm_multipath s390_trng crc32_vx_s390 ghash_s390 prng aes_s390 des_s390 des_generic sha512_s390 sha1_s390 eadm_sch tape_3590 tape tape_class qeth_l2 qeth ccwgroup vfio_ccw vfio_mdev zcrypt_cex4 mdev vfio_iommu_type1 zcrypt vfio sha256_s390 sha_common zfcp scsi_transport_fc qdio dasd_eckd_mod dasd_mod
[250757.595424] CPU: 2 PID: 10958 Comm: CPU 2/KVM Not tainted 4.18.0-derp #2
[250757.595426] Hardware name: IBM 3906 M05 780 (LPAR)
...snip regs...
[250757.595523] Call Trace:
[250757.595529] ([] usercopy_warn+0xa8/0xd8)
[250757.595535] [] __check_heap_object+0xfa/0x160
[250757.595540] [] __check_object_size+0x156/0x1d0
[250757.595547] [] vfio_ccw_mdev_write+0x74/0x148 [vfio_ccw]
[250757.595552] [] __vfs_write+0x3a/0x188
[250757.595556] [] vfs_write+0xa8/0x1b8
[250757.595559] [] ksys_pwrite64+0x86/0xc0
[250757.595568] [] system_call+0xdc/0x2b0
[250757.595570] Last Breaking-Event-Address:
[250757.595573] [] usercopy_warn+0xa8/0xd8While vfio_ccw_mdev_{write|read} validates that the input position/count
does not run over the ccw_io_region struct, the usercopy code that does
copy_{to|from}_user doesn't necessarily know this. It sees the variable
length and gets worried that it's affecting a normal kmalloc'd struct,
and generates the above warning.Adjust how the ccw_io_region is alloc'd with a whitelist to remove this
warning. The boundary checking will continue to do its thing.Signed-off-by: Eric Farman
Message-Id:
Signed-off-by: Cornelia Huck -
In the event that we want to change the layout of the ccw_io_region in the
future[1], it might be easier to work with it as a pointer within the
vfio_ccw_private struct rather than an embedded struct.[1] https://patchwork.kernel.org/comment/22228541/
Signed-off-by: Eric Farman
Message-Id:
Signed-off-by: Cornelia Huck
20 Sep, 2018
1 commit
-
The resume code checks if the resume cpu is the same as the suspend cpu.
If not, and if it is also not possible to switch to the suspend cpu, an
error message should be printed and the resume process should be stopped
by loading a disabled wait psw.The current logic is broken in multiple ways, the message is never printed,
and the disabled wait psw never loaded because the kernel panics before that:
- sam31 and SIGP_SET_ARCHITECTURE to ESA mode is wrong, this will break
on the first 64bit instruction in sclp_early_printk().
- The init stack should be used, but the stack pointer is not set up correctly
(missing aghi %r15,-STACK_FRAME_OVERHEAD).
- __sclp_early_printk() checks the sclp_init_state. If it is not
sclp_init_state_uninitialized, it simply returns w/o printing anything.
In the resumed kernel however, sclp_init_state will never be uninitialized.This patch fixes those issues by removing the sam31/ESA logic, adding a
correct init stack pointer, and also introducing sclp_early_printk_force()
to allow using sclp_early_printk() even when sclp_init_state is not
uninitialized.Reviewed-by: Heiko Carstens
Signed-off-by: Gerald Schaefer
Signed-off-by: Martin Schwidefsky
14 Sep, 2018
1 commit
-
Pull s390 fixes from Martin Schwidefsky:
- One fix for the zcrypt driver to correctly handle incomplete
encryption/decryption operations.- A cleanup for the aqmask/apmask parsing to avoid variable length
arrays on the stack.* tag 's390-4.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390/zcrypt: remove VLA usage from the AP bus
s390/crypto: Fix return code checking in cbc_paes_crypt()
13 Sep, 2018
4 commits
-
For inbound data with an unsupported HW header format, only dump the
actual HW header. We have no idea how much payload follows it, and what
it contains. Worst case, we dump past the end of the Inbound Buffer and
access whatever is located next in memory.Signed-off-by: Julian Wiedmann
Signed-off-by: David S. Miller -
qeth_query_oat_command() currently allocates the kernel buffer for
the SIOC_QETH_QUERY_OAT ioctl with kzalloc. So on systems with
fragmented memory, large allocations may fail (eg. the qethqoat tool by
default uses 132KB).Solve this issue by using vzalloc, backing the allocation with
non-contiguous memory.Signed-off-by: Wenjia Zhang
Reviewed-by: Julian Wiedmann
Signed-off-by: Julian Wiedmann
Signed-off-by: David S. Miller -
Scatter-gather transmit brings a nice performance boost. Considering the
rather large MTU sizes at play, it's also totally the Right Thing To Do.Signed-off-by: Julian Wiedmann
Signed-off-by: David S. Miller -
Bailing out on allocation error is nice, but we also need to tell the
ccwgroup core that creating the qeth groupdev failed.Fixes: d3d1b205e89f ("s390/qeth: allocate netdevice early")
Signed-off-by: Julian Wiedmann
Signed-off-by: David S. Miller
12 Sep, 2018
1 commit
-
The use of variable length arrays on the stack is deprecated.
git commit 3d8f60d38e249f989a7fca9c2370c31c3d5487e1
"s390/zcrypt: hex string mask improvements for apmask and aqmask."
added three new VLA arrays. Remove them again.Reviewed-by: Harald Freudenberger
Signed-off-by: Martin Schwidefsky
26 Aug, 2018
1 commit
-
Pull libnvdimm updates from Dave Jiang:
"Collection of misc libnvdimm patches for 4.19 submission:- Adding support to read locked nvdimm capacity.
- Change test code to make DSM failure code injection an override.
- Add support for calculate maximum contiguous area for namespace.
- Add support for queueing a short ARS when there is on going ARS for
nvdimm.- Allow NULL to be passed in to ->direct_access() for kaddr and pfn
params.- Improve smart injection support for nvdimm emulation testing.
- Fix test code that supports for emulating controller temperature.
- Fix hang on error before devm_memremap_pages()
- Fix a bug that causes user memory corruption when data returned to
user for ars_status.- Maintainer updates for Ross Zwisler emails and adding Jan Kara to
fsdax"* tag 'libnvdimm-for-4.19_misc' of gitolite.kernel.org:pub/scm/linux/kernel/git/nvdimm/nvdimm:
libnvdimm: fix ars_status output length calculation
device-dax: avoid hang on error before devm_memremap_pages()
tools/testing/nvdimm: improve emulation of smart injection
filesystem-dax: Do not request kaddr and pfn when not required
md/dm-writecache: Don't request pointer dummy_addr when not required
dax/super: Do not request a pointer kaddr when not required
tools/testing/nvdimm: kaddr and pfn can be NULL to ->direct_access()
s390, dcssblk: kaddr and pfn can be NULL to ->direct_access()
libnvdimm, pmem: kaddr and pfn can be NULL to ->direct_access()
acpi/nfit: queue issuing of ars when an uc error notification comes in
libnvdimm: Export max available extent
libnvdimm: Use max contiguous area for namespace size
MAINTAINERS: Add Jan Kara for filesystem DAX
MAINTAINERS: update Ross Zwisler's email address
tools/testing/nvdimm: Fix support for emulating controller temperature
tools/testing/nvdimm: Make DSM failure code injection an override
acpi, nfit: Prefer _DSM over _LSR for namespace label reads
libnvdimm: Introduce locked DIMM capacity support
25 Aug, 2018
1 commit
-
Pull s390 updates from Martin Schwidefsky:
- A couple of patches for the zcrypt driver:
+ Add two masks to determine which AP cards and queues are host
devices, this will be useful for KVM AP device passthrough
+ Add-on patch to improve the parsing of the new apmask and aqmask
+ Some code beautification- Second try to reenable the GCC plugins, the first patch set had a
patch to do this but the merge somehow missed this- Remove the s390 specific GCC version check and use the generic one
- Three patches for kdump, two bug fixes and one cleanup
- Three patches for the PCI layer, one bug fix and two cleanups
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
s390: remove gcc version check (4.3 or newer)
s390/zcrypt: hex string mask improvements for apmask and aqmask.
s390/zcrypt: AP bus support for alternate driver(s)
s390/zcrypt: code beautify
s390/zcrypt: switch return type to bool for ap_instructions_available()
s390/kdump: Remove kzalloc_panic
s390/kdump: Fix memleak in nt_vmcoreinfo
s390/kdump: Make elfcorehdr size calculation ABI compliant
s390/pci: remove fmb address from debug output
s390/pci: remove stale rc
s390/pci: fix out of bounds access during irq setup
s390/zcrypt: fix ap_instructions_available() returncodes
s390: reenable gcc plugins for real
21 Aug, 2018
1 commit
-
The sysfs attributes /sys/bus/ap/apmask and /sys/bus/ap/aqmask
and the kernel command line arguments ap.apm and ap.aqm get
an improvement of the value parsing with this patch:The mask values are bitmaps in big endian order starting with bit 0.
So adapter number 0 is the leftmost bit, mask is 0x8000... The sysfs
attributes and the kernel command line accept 2 different formats:
- Absolute hex string starting with 0x like "0x12345678" does set
the mask starting from left to right. If the given string is shorter
than the mask it is padded with 0s on the right. If the string is
longer than the mask an error comes back (EINVAL).
- Relative format - a concatenation (done with ',') of the terms
+[-] or -[-]. may be any
valid number (hex, decimal or octal) in the range 0...255.
Here are some examples:
"+0-15,+32,-128,-0xFF"
"-0-255,+1-16,+0x128"Signed-off-by: Harald Freudenberger
Signed-off-by: Martin Schwidefsky
20 Aug, 2018
2 commits
-
The current AP bus, AP devices and AP device drivers implementation
uses a clearly defined mapping for binding AP devices to AP device
drivers. So for example a CEX6C queue will always be bound to the
cex4queue device driver.The Linux Device Driver model has no sensitivity for more than one
device driver eligible for one device type. If there exist more than
one drivers matching to the device type, simple all drivers are tried
consecutively. There is no way to determine and influence the probing
order of the drivers.With KVM there is a need to provide additional device drivers matching
to the very same type of AP devices. With a simple implementation the
KVM drivers run in competition to the regular drivers. Whichever
'wins' a device depends on build order and implementation details
within the common Linux Device Driver Model and is not
deterministic. However, a userspace process could figure out which
device should be bound to which driver and sort out the correct
binding by manipulating attributes in the sysfs.If for security reasons a AP device must not get bound to the 'wrong'
device driver the sorting out has to be done within the Linux kernel
by the AP bus code. This patch modifies the behavior of the AP bus
for probing drivers for devices in a way that two sets of drivers are
usable. Two new bitmasks 'apmask' and 'aqmask' are used to mark a
subset of the APQN range for 'usable by the ap bus and the default
drivers' or 'not usable by the default drivers and thus available for
alternate drivers like vfio-xxx'. So an APQN which is addressed by
this masking only the default drivers will be probed. In contrary an
APQN which is not addressed by the masks will never be probed and
bound to default drivers but onny to alternate drivers.Eventually the two masks give a way to divide the range of APQNs into
two pools: one pool of APQNs used by the AP bus and the default
drivers and thus via zcrypt drivers available to the userspace of the
system. And another pool where no zcrypt drivers are bound to and
which can be used by alternate drivers (like vfio-xxx) for their
needs. This division is hot-plug save and makes sure a APQN assigned
to an alternate driver is at no time somehow exploitable by the wrong
party.The two masks are located in sysfs at /sys/bus/ap/apmask and
/sys/bus/ap/aqmask. The mask syntax is exactly the same as the
already existing mask attributes in the /sys/bus/ap directory (for
example ap_usage_domain_mask and ap_control_domain_mask).By default all APQNs belong to the ap bus and the default drivers:
cat /sys/bus/ap/apmask
0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
cat /sys/bus/ap/aqmask
0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffThe masks can be changed at boot time with the kernel command line
like this:... ap.apmask=0xffff ap.aqmask=0x40
This would give these two pools:
default drivers pool: adapter 0 - 15, domain 1
alternate drivers pool: adapter 0 - 15, all but domain 1
adapter 16-255, all domainsThe sysfs attributes for this two masks are writeable and an
administrator is able to reconfigure the assignements on the fly by
writing new mask values into. With changing the mask(s) a revision of
the existing queue to driver bindings is done. So all APQNs which are
bound to the 'wrong' driver are reprobed via kernel function
device_reprobe() and thus the new correct driver will be assigned with
respect of the changed apmask and aqmask bits.The mask values are bitmaps in big endian order starting with bit 0.
So adapter number 0 is the leftmost bit, mask is 0x8000... The sysfs
attributes accept 2 different formats:
- Absolute hex string starting with 0x like "0x12345678" does set
the mask starting from left to right. If the given string is shorter
than the mask it is padded with 0s on the right. If the string is
longer than the mask an error comes back (EINVAL).
- '+' or '-' followed by a numerical value. Valid examples are "+1",
"-13", "+0x41", "-0xff" and even "+0" and "-0". Only the addressed
bit in the mask is switched on ('+') or off ('-').This patch will also be the base for an upcoming extension to the
zcrypt drivers to be able to provide additional zcrypt device nodes
with filtering based on ap and aq masks.Signed-off-by: Harald Freudenberger
Signed-off-by: Martin Schwidefsky -
Code beautify by following most of the checkpatch suggestions:
- SPDX license identifier line complains by checkpatch
- missing space or newline complains by checkpatch
- octal numbers for permssions complains by checkpatch
- renaming of static sysfs functions complains by checkpatch
- fix of block comment complains by checkpatch
- fix printf like calls where function name instead of %s __func__
was used
- __packed instead of __attribute__((packed))
- init to zero for static variables removed
- use of DEVICE_ATTR_RO and DEVICE_ATTR_RW macrosNo functional code changes or API changes!
Signed-off-by: Harald Freudenberger
Signed-off-by: Martin Schwidefsky
19 Aug, 2018
1 commit
-
Pull tty/serial driver updates from Greg KH:
"Here is the big tty and serial driver pull request for 4.19-rc1.It's not all that big, just a number of small serial driver updates
and fixes, along with some better vt handling for unicode characters
for those using braille terminals.All of these patches have been in linux-next for a long time with no
reported issues"* tag 'tty-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (73 commits)
tty: serial: 8250: Revert NXP SC16C2552 workaround
serial: 8250_exar: Read INT0 from slave device, too
tty: rocket: Fix possible buffer overwrite on register_PCI
serial: 8250_dw: Add ACPI support for uart on Broadcom SoC
serial: 8250_dw: always set baud rate in dw8250_set_termios
dt-bindings: serial: Add binding for uartlite
tty: serial: uartlite: Add support for suspend and resume
tty: serial: uartlite: Add clock adaptation
tty: serial: uartlite: Add structure for private data
serial: sh-sci: Improve support for separate TEI and DRI interrupts
serial: sh-sci: Remove SCIx_RZ_SCIFA_REGTYPE
serial: sh-sci: Allow for compressed SCIF address
serial: sh-sci: Improve interrupts description
serial: 8250: Use cached port name directly in messages
serial: 8250_exar: Drop unused variable in pci_xr17v35x_setup()
vt: drop unused struct vt_struct
vt: avoid a VLA in the unicode screen scroll function
vt: add /dev/vcsu* to devices.txt
vt: coherence validation code for the unicode screen buffer
vt: selection: take screen contents from uniscr if available
...