06 Apr, 2019

1 commit

  • [ Upstream commit 0ff06c44efeede4acd068847d3bf8cf894b6c664 ]

    Prior to dma unmap/free operations the ism driver tries to ensure
    that the memory is no longer accessed by the HW. When errors
    during deregistration of memory regions from the HW occur the ism
    driver will not unmap/free this memory.

    When we receive notification from the hypervisor that a PCI function
    has been detached we can no longer access the device and would never
    unmap/free these memory regions which led to complaints by the DMA
    debug API.

    Treat this kind of error during the deregistration of memory regions
    from the HW as success, since it is already ensured that the memory
    is no longer accessed by HW.

    Reported-by: Karsten Graul
    Reported-by: Hans Wippel
    Signed-off-by: Sebastian Ott
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Sasha Levin

    Sebastian Ott
     

03 Apr, 2019

3 commits

  • commit 242ec1455151267fe35a0834aa9038e4c4670884 upstream.

    Suppose more than one non-NPIV FCP device is active on the same channel.
    Send I/O to storage and have some of the pending I/O run into a SCSI
    command timeout, e.g. due to bit errors on the fibre. Now the error
    situation stops. However, we saw FCP requests continue to time out in the
    channel. The abort will be successful, but the subsequent TUR fails.
    Scsi_eh starts. The LUN reset fails. The target reset fails. The host
    reset only did an FCP device recovery. However, for non-NPIV FCP devices,
    this does not close and reopen ports on the SAN-side if other non-NPIV FCP
    device(s) share the same open ports.

    In order to resolve the continuing FCP request timeouts, we need to
    explicitly close and reopen ports on the SAN-side.

    This was missing since the beginning of zfcp in v2.6.0 history commit
    ea127f975424 ("[PATCH] s390 (7/7): zfcp host adapter.").

    Note: The FSF requests for forced port reopen could run into FSF request
    timeouts due to other reasons. This would trigger an internal FCP device
    recovery. Pending forced port reopen recoveries would get dismissed. So
    some ports might not get fully reopened during this host reset handler.
    However, subsequent I/O would trigger the above described escalation and
    eventually all ports would be forced to reopen to resolve any continuing FCP
    request timeouts due to earlier bit errors.

    Signed-off-by: Steffen Maier
    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Cc: #3.0+
    Reviewed-by: Jens Remus
    Reviewed-by: Benjamin Block
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Steffen Maier
     
  • commit fe67888fc007a76b81e37da23ce5bd8fb95890b0 upstream.

    An already deleted SCSI device can exist on the Scsi_Host and remain there
    because something still holds a reference. A new SCSI device with the same
    H:C:T:L and FCP device, target port WWPN, and FCP LUN can be created. When
    we try to unblock an rport, we still find the deleted SCSI device and
    return early because the zfcp_scsi_dev of that SCSI device is not
    ZFCP_STATUS_COMMON_UNBLOCKED. Hence we fail to unblock the rport, even if
    the new proper SCSI device would be in good state.

    Therefore, skip deleted SCSI devices when iterating the sdevs of the shost.
    [cf. __scsi_device_lookup{_by_target}() or scsi_device_get()]

    The following abbreviated trace sequence can indicate such a problem:

    Area : REC
    Tag : ersfs_3
    LUN : 0x4045400300000000
    WWPN : 0x50050763031bd327
    LUN status : 0x40000000 not ZFCP_STATUS_COMMON_UNBLOCKED
    Ready count : n not incremented yet
    Running count : 0x00000000
    ERP want : 0x01
    ERP need : 0xc1 ZFCP_ERP_ACTION_NONE

    Area : REC
    Tag : ersfs_3
    LUN : 0x4045400300000000
    WWPN : 0x50050763031bd327
    LUN status : 0x41000000
    Ready count : n+1
    Running count : 0x00000000
    ERP want : 0x01
    ERP need : 0x01

    ...

    Area : REC
    Level : 4 only with increased trace level
    Tag : ertru_l
    LUN : 0x4045400300000000
    WWPN : 0x50050763031bd327
    LUN status : 0x40000000
    Request ID : 0x0000000000000000
    ERP status : 0x01800000
    ERP step : 0x1000
    ERP action : 0x01
    ERP count : 0x00

    NOT followed by a trace record with tag "scpaddy"
    for WWPN 0x50050763031bd327.

    Signed-off-by: Steffen Maier
    Fixes: 6f2ce1c6af37 ("scsi: zfcp: fix rport unblock race with LUN recovery")
    Cc: #2.6.32+
    Reviewed-by: Jens Remus
    Reviewed-by: Benjamin Block
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Steffen Maier
     
  • commit 50b7f1b7236bab08ebbbecf90521e84b068d7a17 upstream.

    When we get an interrupt for a channel program, it is not
    necessarily the final interrupt; for example, the issuing
    guest may request an intermediate interrupt by specifying
    the program-controlled-interrupt flag on a ccw.

    We must not switch the state to idle if the interrupt is not
    yet final; even more importantly, we must not free the translated
    channel program if the interrupt is not yet final, or the host
    can crash during cp rewind.
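
    The rule above can be sketched as a small, self-contained C model. The
    flag names and values, struct io_request and both helpers are
    hypothetical stand-ins chosen for illustration; they are not the
    vfio-ccw code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical activity-control bits; the values are illustrative only. */
#define ACTL_DEVICE_ACTIVE     0x01u
#define ACTL_SUBCHANNEL_ACTIVE 0x02u

enum io_state { IO_STATE_IDLE, IO_STATE_BUSY };

struct io_request {
    enum io_state state;
    bool cp_allocated;   /* stands in for the translated channel program */
};

/* Assumption: an interrupt is final once nothing is active any more. */
static bool irq_is_final(uint32_t actl)
{
    return (actl & (ACTL_DEVICE_ACTIVE | ACTL_SUBCHANNEL_ACTIVE)) == 0;
}

static void handle_irq(struct io_request *req, uint32_t actl)
{
    /* Intermediate interrupt: keep the channel program and stay busy. */
    if (!irq_is_final(actl))
        return;

    /* Final interrupt: only now release the program and go idle. */
    req->cp_allocated = false;
    req->state = IO_STATE_IDLE;
}
```

    In this model, calling handle_irq() with an activity bit still set
    leaves the request busy and its channel program intact.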

    Fixes: e5f84dbaea59 ("vfio: ccw: return I/O results asynchronously")
    Cc: stable@vger.kernel.org # v4.12+
    Reviewed-by: Eric Farman
    Signed-off-by: Cornelia Huck
    Signed-off-by: Greg Kroah-Hartman

    Cornelia Huck
     

24 Mar, 2019

2 commits

  • commit 3438b2c039b4bf26881786a1f3450f016d66ad11 upstream.

    A queue with a capacity of zero is clearly not a valid virtio queue.
    Some emulators report zero queue size if queried with an invalid queue
    index. Instead of crashing in this case let us just return -ENOENT. To
    make that work properly, let us fix the notifier cleanup logic as well.
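
    A minimal sketch of the check described above, assuming a hypothetical
    helper check_queue_size() that receives whatever size the device
    reported. It only illustrates rejecting a zero-sized queue with
    -ENOENT and is not the virtio_ccw code.

```c
#include <errno.h>

/* Hypothetical helper: validate the queue size reported by the device. */
static int check_queue_size(int reported_size)
{
    if (reported_size < 0)
        return reported_size;   /* the query itself failed, pass it on */
    if (reported_size == 0)
        return -ENOENT;         /* no such queue, do not try to set it up */
    return 0;
}
```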

    Cc: stable@vger.kernel.org
    Signed-off-by: Halil Pasic
    Signed-off-by: Cornelia Huck
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Greg Kroah-Hartman

    Halil Pasic
     
  • [ Upstream commit 4a8ef6999bce998fa5813023a9a6b56eea329dba ]

    Dan Carpenter reported the following:

    The patch 52898025cf7d: "[S390] dasd: security and PSF update patch
    for EMC CKD ioctl" from Mar 8, 2010, leads to the following static
    checker warning:

    drivers/s390/block/dasd_eckd.c:4486 dasd_symm_io()
    error: using offset into zero size array 'psf_data[]'

    drivers/s390/block/dasd_eckd.c
    4458 /* Copy parms from caller */
    4459 rc = -EFAULT;
    4460 if (copy_from_user(&usrparm, argp, sizeof(usrparm)))
    ^^^^^^^
    The user can specify any "usrparm.psf_data_len". They choose zero by
    mistake.

    4461 goto out;
    4462 if (is_compat_task()) {
    4463 /* Make sure pointers are sane even on 31 bit. */
    4464 rc = -EINVAL;
    4465 if ((usrparm.psf_data >> 32) != 0)
    4466 goto out;
    4467 if ((usrparm.rssd_result >> 32) != 0)
    4468 goto out;
    4469 usrparm.psf_data &= 0x7fffffffULL;
    4470 usrparm.rssd_result &= 0x7fffffffULL;
    4471 }
    4472 /* alloc I/O data area */
    4473 psf_data = kzalloc(usrparm.psf_data_len, GFP_KERNEL
    | GFP_DMA);
    4474 rssd_result = kzalloc(usrparm.rssd_result_len, GFP_KERNEL
    | GFP_DMA);
    4475 if (!psf_data || !rssd_result) {

    kzalloc() returns a ZERO_SIZE_PTR (0x16).

    4476 rc = -ENOMEM;
    4477 goto out_free;
    4478 }
    4479
    4480 /* get syscall header from user space */
    4481 rc = -EFAULT;
    4482 if (copy_from_user(psf_data,
    4483 (void __user *)(unsigned long)
    usrparm.psf_data,
    4484 usrparm.psf_data_len))

    That all works great.

    4485 goto out_free;
    4486 psf0 = psf_data[0];
    4487 psf1 = psf_data[1];

    But now we're assuming that "->psf_data_len" was at least 2 bytes.

    Fix this by checking the user specified length psf_data_len.
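
    The kind of length check meant here can be sketched as follows. The
    helper read_psf_header() and its parameters are hypothetical; only the
    "at least 2 bytes" requirement comes from the report above.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper: read the first two bytes only if they exist. */
static int read_psf_header(const unsigned char *psf_data, size_t psf_data_len,
                           unsigned char *psf0, unsigned char *psf1)
{
    /* Reject buffers too short to hold psf_data[0] and psf_data[1]. */
    if (psf_data == NULL || psf_data_len < 2)
        return -EINVAL;

    *psf0 = psf_data[0];
    *psf1 = psf_data[1];
    return 0;
}
```

    With a user-supplied length of zero, this sketch returns -EINVAL before
    any element of the buffer is touched.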

    Fixes: 52898025cf7d ("[S390] dasd: security and PSF update patch for EMC CKD ioctl")
    Reported-by: Dan Carpenter
    Signed-off-by: Stefan Haberland
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Sasha Levin

    Stefan Haberland
     

14 Mar, 2019

3 commits

  • [ Upstream commit c2780c1a3fb724560b1d44f7976e0de17bf153c7 ]

    A card's close_dev work is scheduled on a driver-wide workqueue. If the
    card is removed and freed while the work is still active, this causes a
    use-after-free.
    So make sure that the work is completed before freeing the card.

    Fixes: 0f54761d167f ("qeth: Support VEPA mode")
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Julian Wiedmann
     
  • [ Upstream commit afa0c5904ba16d59b0454f7ee4c807dae350f432 ]

    The error path in qeth_alloc_qdio_buffers() that takes care of
    cleaning up the Output Queues is buggy. It first frees the queue, but
    then calls qeth_clear_outq_buffers() with that very queue struct.

    Make the call to qeth_clear_outq_buffers() part of the free action
    (in the correct order), and while at it fix the naming of the helper.

    Fixes: 0da9581ddb0f ("qeth: exploit asynchronous delivery of storage blocks")
    Signed-off-by: Julian Wiedmann
    Reviewed-by: Alexandra Winter
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Julian Wiedmann
     
  • [ Upstream commit 5065b2dd3e5f9247a6c9d67974bc0472bf561b9d ]

    Whenever we fail before/while starting an IO, make sure to release the
    IO buffer. Usually qeth_irq() would do this for us, but if the IO
    doesn't even start we obviously won't get an interrupt for it either.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Julian Wiedmann
     

20 Feb, 2019

1 commit

  • commit 8f9aca0c45322a807a343fc32f95f2500f83b9ae upstream.

    The older machines don't have the QCI instruction available.
    With support for up to 256 crypto cards the probing of each
    card has been extended to check card ids from 0 up to 255.
    For machines with QCI support there is a filter limiting the
    range of probed cards. The older machines (z196 and older)
    don't have this filter, so since the driver gained support for
    256 cards, all cards are probed. However, these machines
    also require the card id to fit into 6 bits. Exceeding
    this limit results in a specification exception which happens
    on every kernel startup even when there is no crypto configured
    and used at all.

    This fix limits the range of probed crypto cards to 64 if
    there is no QCI instruction available, to comply with the older
    AP architecture, and so fixes the specification exceptions
    on z196 machines.
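
    As a rough illustration of the limit, assuming a hypothetical helper
    max_probed_adapter_id() and constant names that are not taken from the
    driver:

```c
#include <stdbool.h>

#define MAX_ADAPTER_ID_WITH_QCI    255  /* 256 cards, ids 0..255 */
#define MAX_ADAPTER_ID_WITHOUT_QCI  63  /* the id must fit into 6 bits */

/* Hypothetical helper: highest card id that is safe to probe. */
static int max_probed_adapter_id(bool has_qci)
{
    return has_qci ? MAX_ADAPTER_ID_WITH_QCI : MAX_ADAPTER_ID_WITHOUT_QCI;
}
```

    A probing loop would then run from 0 up to and including the returned
    id, so a machine without QCI never sees an id above 63.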

    Cc: stable@vger.kernel.org # v4.17+
    Fixes: af4a72276d49 ("s390/zcrypt: Support up to 256 crypto adapters.")
    Signed-off-by: Harald Freudenberger
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Harald Freudenberger
     

13 Feb, 2019

1 commit

  • [ Upstream commit be534791011100d204602e2e0496e9e6ce8edf63 ]

    There exist very few ap messages which need to have the 'special' flag
    enabled. This flag tells the firmware layer to do some pre- and maybe
    postprocessing. However, it may happen that this special flag is
    enabled but the firmware is unable to deal with this kind of message
    and thus returns with reply code 0x41. For example older firmware may
    not know the newest messages triggered by the zcrypt device driver and
    thus react with reject and the named reply code. Unfortunately this
    reply code is not known to the zcrypt error routines and thus default
    behavior is to switch the ap queue offline.

    This patch now makes the ap error routine aware of the reply code and
    so userspace is informed about the bad processing result but the queue
    is not switched to offline state any more.
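
    The behavior change can be modeled with a small sketch. The constant
    name, the enum and classify_reply() are hypothetical, and the error
    values are assumptions; only the reply code 0x41 comes from the text
    above.

```c
#include <errno.h>

/* Hypothetical name; only the value 0x41 is taken from the description. */
#define REPLY_INVALID_SPECIAL_CMD 0x41

enum queue_action { QUEUE_KEEP_ONLINE, QUEUE_SET_OFFLINE };

/* Hypothetical helper: decide what to do with the queue for a reply code. */
static enum queue_action classify_reply(unsigned int reply_code, int *rc)
{
    switch (reply_code) {
    case REPLY_INVALID_SPECIAL_CMD:
        *rc = -EINVAL;            /* report the failure to the caller */
        return QUEUE_KEEP_ONLINE; /* but keep the queue usable */
    default:
        *rc = -EAGAIN;            /* assumed value for unknown codes */
        return QUEUE_SET_OFFLINE;
    }
}
```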

    Signed-off-by: Harald Freudenberger
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Sasha Levin

    Harald Freudenberger
     

31 Jan, 2019

1 commit

  • commit b7cb707c373094ce4008d4a6ac9b6b366ec52da5 upstream.

    smp_rescan_cpus() is called without the device_hotplug_lock, which can lead
    to a deadlock when a new CPU is found and immediately set online by a udev
    rule.

    This was observed on an older kernel version, where the cpu_hotplug_begin()
    loop was still present, and it resulted in hanging chcpu and systemd-udev
    processes. This specific deadlock will not show on current kernels. However,
    there may be other possible deadlocks, and since smp_rescan_cpus() can still
    trigger a CPU hotplug operation, the device_hotplug_lock should be held.

    For reference, this was the deadlock with the old cpu_hotplug_begin() loop:

    chcpu (rescan)                             systemd-udevd

    echo 1 > /sys/../rescan
    -> smp_rescan_cpus()
    -> (*) get_online_cpus()
       (increases refcount)
    -> smp_add_present_cpu()
       (new CPU found)
    -> register_cpu()
    -> device_add()
    -> udev "add" event triggered -----------> udev rule sets CPU online
                                               -> echo 1 > /sys/.../online
                                               -> lock_device_hotplug_sysfs()
                                                  (this is missing in rescan path)
                                               -> device_online()
                                               -> (**) device_lock(new CPU dev)
                                               -> cpu_up()
                                               -> cpu_hotplug_begin()
                                                  (loops until refcount == 0)
                                                  -> deadlock with (*)
    -> bus_probe_device()
    -> device_attach()
    -> device_lock(new CPU dev)
       -> deadlock with (**)

    Fix this by taking the device_hotplug_lock in the CPU rescan path.

    Cc:
    Signed-off-by: Gerald Schaefer
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Greg Kroah-Hartman

    Gerald Schaefer
     

13 Jan, 2019

1 commit

  • commit 60a161b7e5b2a252ff0d4c622266a7d8da1120ce upstream.

    Suppose adapter (open) recovery is between opened QDIO queues and before
    (the end of) initial posting of status read buffers (SRBs). This time
    window can be seconds long because FSF_PROT_HOST_CONNECTION_INITIALIZING
    causes, by design, looping with exponentially increasing sleeps in the
    function performing exchange config data during recovery
    [zfcp_erp_adapter_strat_fsf_xconf()]. The recovery was triggered by a
    local link up.

    Suppose an event occurs for which the FCP channel would send an unsolicited
    notification to zfcp by means of a previously posted SRB. We saw it with
    local cable pull (link down) in multi-initiator zoning with multiple
    NPIV-enabled subchannels of the same shared FCP channel.

    As soon as zfcp_erp_adapter_strategy_open_fsf() starts posting the initial
    status read buffers from within the adapter's ERP thread, the channel does
    send an unsolicited notification.

    Since v2.6.27 commit d26ab06ede83 ("[SCSI] zfcp: receiving an unsolicted
    status can lead to I/O stall"), zfcp_fsf_status_read_handler() schedules
    adapter->stat_work to re-fill the just consumed SRB from a work item.

    Now the ERP thread and the work item post SRBs in parallel. Both contexts
    call the helper function zfcp_status_read_refill(). The tracking of
    missing (to be posted / re-filled) SRBs is not thread-safe due to separate
    atomic_read() and atomic_dec(), in order to depend on posting
    success. Hence, both contexts can see
    atomic_read(&adapter->stat_miss) == 1. One of the two contexts posts
    one SRB too many. Zfcp gets QDIO_ERROR_SLSB_STATE on the output queue
    (trace tag "qdireq1") leading to zfcp_erp_adapter_shutdown() in
    zfcp_qdio_handler_error().

    An obvious and seemingly clean fix would be to schedule stat_work from the
    ERP thread and wait for it to finish. This would serialize all SRB
    re-fills. However, we already have another work item wait on the ERP
    thread: adapter->scan_work runs zfcp_fc_scan_ports() which calls
    zfcp_fc_eval_gpn_ft(). The latter calls zfcp_erp_wait() to wait for all the
    open port recoveries during zfcp auto port scan, but in fact it waits for
    any pending recovery including an adapter recovery. This approach leads to
    a deadlock. [see also v3.19 commit 18f87a67e6d6 ("zfcp: auto port scan
    resiliency"); v2.6.37 commit d3e1088d6873
    ("[SCSI] zfcp: No ERP escalation on gpn_ft eval");
    v2.6.28 commit fca55b6fb587
    ("[SCSI] zfcp: fix deadlock between wq triggered port scan and ERP")
    fixing v2.6.27 commit c57a39a45a76
    ("[SCSI] zfcp: wait until adapter is finished with ERP during auto-port");
    v2.6.27 commit cc8c282963bd
    ("[SCSI] zfcp: Automatically attach remote ports")]

    Instead make the accounting of missing SRBs atomic for parallel execution
    in both the ERP thread and adapter->stat_work.
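
    One way to make such accounting atomic can be sketched in userspace
    with C11 atomics. This is an illustration of the idea under stated
    assumptions, not the zfcp code: claim_missing_srb() and
    return_missing_srb() are hypothetical names, and the kernel has its own
    atomic_t helpers.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Atomically take one entry from the "missing" counter, but only if it
 * is still positive. Read and decrement happen as one step, so two
 * contexts cannot both act on the same remaining entry.
 */
static bool claim_missing_srb(atomic_int *missing)
{
    int old = atomic_load(missing);

    while (old > 0) {
        if (atomic_compare_exchange_weak(missing, &old, old - 1))
            return true;
        /* On failure, old was reloaded with the current value; retry. */
    }
    return false;
}

/* Undo a claim when posting the buffer failed afterwards. */
static void return_missing_srb(atomic_int *missing)
{
    atomic_fetch_add(missing, 1);
}
```

    With the counter at 1, the first caller of claim_missing_srb() gets
    true and every later caller gets false until an entry is returned.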

    Signed-off-by: Steffen Maier
    Fixes: d26ab06ede83 ("[SCSI] zfcp: receiving an unsolicted status can lead to I/O stall")
    Cc: #2.6.27+
    Reviewed-by: Jens Remus
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Steffen Maier
     

17 Dec, 2018

2 commits

  • [ Upstream commit b89e242eee8d4cd8261d8d821c62c5d1efc454d0 ]

    Direct returns from within a loop are rude, but it doesn't mean it gets
    to avoid releasing the memory acquired beforehand.

    Signed-off-by: Eric Farman
    Message-Id:
    Reviewed-by: Farhan Ali
    Reviewed-by: Pierre Morel
    Acked-by: Halil Pasic
    Signed-off-by: Cornelia Huck
    Signed-off-by: Sasha Levin

    Eric Farman
     
  • [ Upstream commit 806212f91c874b24cf9eb4a9f180323671b6c5ed ]

    If pfn_array_alloc fails somehow, we need to release the pfn_array_table
    that was malloc'd earlier.

    Signed-off-by: Eric Farman
    Message-Id:
    Acked-by: Halil Pasic
    Signed-off-by: Cornelia Huck
    Signed-off-by: Sasha Levin

    Eric Farman
     

13 Dec, 2018

3 commits

  • commit 78b1a52e05c9db11d293342e8d6d8a230a04b4e7 upstream.

    While ccw_io_helper() seems intended to be exclusive, in the sense that
    it is supposed to facilitate I/O for at most one thread at any given
    time, there is actually nothing ensuring that threads won't pile up at
    vcdev->wait_q. If they do, all threads get woken up and see the status
    that belongs to some other request than their own. This can lead to bugs.
    For an example see:
    https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1788432

    This race normally does not cause any problems. The operations provided
    by struct virtio_config_ops are usually invoked in a well defined
    sequence, normally don't fail, and are normally used quite infrequently
    too.

    Yet, if some of these operations are directly triggered via sysfs
    attributes, like in the case described by the referenced bug, userspace
    is given an opportunity to force races by increasing the frequency of the
    given operations.

    Let us fix the problem by ensuring that, for each device, we finish
    processing the previous request before starting with a new one.

    Signed-off-by: Halil Pasic
    Reported-by: Colin Ian King
    Cc: stable@vger.kernel.org
    Message-Id:
    Signed-off-by: Cornelia Huck
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Greg Kroah-Hartman

    Halil Pasic
     
  • commit 2448a299ec416a80f699940a86f4a6d9a4f643b1 upstream.

    Currently we have a race on vcdev->config in virtio_ccw_get_config() and
    in virtio_ccw_set_config().

    This normally does not cause problems, as these are usually infrequent
    operations. However, for some devices writing to/reading from the config
    space can be triggered through sysfs attributes. For these, userspace can
    force the race by increasing the frequency.

    Signed-off-by: Halil Pasic
    Cc: stable@vger.kernel.org
    Message-Id:
    Signed-off-by: Cornelia Huck
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: Greg Kroah-Hartman

    Halil Pasic
     
  • [ Upstream commit 007b656851ed7f94ba0fa358ac3e5d7705da6846 ]

    SMC-D stress workload showed connection stalls. Since the firmware
    decides to skip raising an interrupt if the SBA DMBE mask bit is
    still set, this SBA DMBE mask bit should be cleared before the
    IRQ handling in the SMC code runs. Otherwise there are small windows
    possible with missing interrupts for incoming data.
    SMC-D currently does not care about the old value of the SBA DMBE
    mask.

    Acked-by: Sebastian Ott
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Ursula Braun
     

06 Dec, 2018

1 commit

  • [ Upstream commit 9a764c1e59684c0358e16ccaafd870629f2cfe67 ]

    The response for a SNMP request can consist of multiple parts, which
    the cmd callback stages into a kernel buffer until all parts have been
    received. If the callback detects that the staging buffer provides
    insufficient space, it bails out with error.
    This processing is buggy for the first part of the response - while it
    initially checks for a length of 'data_len', it later copies an
    additional amount of 'offsetof(struct qeth_snmp_cmd, data)' bytes.

    Fix the calculation of 'data_len' for the first part of the response.
    This also nicely cleans up the memcpy code.
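
    The underlying rule is that the length checked against the remaining
    space must be the same length that is copied afterwards. A minimal
    sketch, assuming a hypothetical struct staging and helper stage_part()
    that are not part of qeth:

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical staging buffer; offset never exceeds len. */
struct staging {
    unsigned char *buf;
    size_t len;
    size_t offset;
};

/*
 * Append one response part. copy_len is the exact number of bytes that
 * will be copied, so the space check and the memcpy cannot disagree.
 */
static int stage_part(struct staging *s, const void *src, size_t copy_len)
{
    if (s->len - s->offset < copy_len)
        return -ENOMEM;

    memcpy(s->buf + s->offset, src, copy_len);
    s->offset += copy_len;
    return 0;
}
```

    Any header bytes that belong to the first part would have to be
    included in copy_len by the caller before this check runs.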

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Signed-off-by: Julian Wiedmann
    Reviewed-by: Ursula Braun
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Julian Wiedmann
     

27 Nov, 2018

2 commits

  • [ Upstream commit 30356d08159d7899438e94503ae322a8b881e205 ]

    qeth only registers its netdevice when the qeth device is first set
    online. Thus a device that has never been set online will trigger
    a WARN ("network todo 'hsi%d' but state 0") in unregister_netdev() when
    removed.

    Fix this by protecting the unregister step, just like we already protect
    against repeated registering of the netdevice.

    Fixes: d3d1b205e89f ("s390/qeth: allocate netdevice early")
    Reported-by: Karsten Graul
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Julian Wiedmann
     
  • [ Upstream commit bd74a7f9cc033cf4d405788f80292268987dc0c5 ]

    Sniffing mode for L3 HiperSockets requires that no IP addresses are
    registered with the HW. The preferred way to achieve this is for
    userspace to delete all the IPs on the interface. But qeth is expected
    to also tolerate a configuration where that is not the case, by skipping
    the IP registration when in sniffer mode.
    Since commit 5f78e29ceebf ("qeth: optimize IP handling in rx_mode callback")
    reworked the IP registration logic in the L3 subdriver, this no longer
    works. When the qeth device is set online, qeth_l3_recover_ip() now
    unconditionally registers all unicast addresses from our internal
    IP table.

    While we could fix this particular problem by skipping
    qeth_l3_recover_ip() on a sniffer device, the more future-proof change
    is to skip the IP address registration at the lowest level. This way we
    a) catch any future code path that attempts to register an IP address
    without considering the sniffer scenario, and
    b) continue to build up our internal IP table, so that if sniffer mode
    is switched off later we can operate just like normal.

    Fixes: 5f78e29ceebf ("qeth: optimize IP handling in rx_mode callback")
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Julian Wiedmann
     

10 Oct, 2018

1 commit

  • Martin writes:
    "s390 fixes for 4.19-rc8

    Four more patches for 4.19:
    - Fix resume after suspend-to-disk if resume-CPU != suspend-CPU
    - Fix vfio-ccw check for pinned pages
    - Two patches to avoid a usercopy-whitelist warning in vfio-ccw"

    * tag 's390-4.19-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
    s390/cio: Fix how vfio-ccw checks pinned pages
    s390/cio: Refactor alloc of ccw_io_region
    s390/cio: Convert ccw_io_region to pointer
    s390/hibernate: fix error handling when suspend cpu != resume cpu

    Greg Kroah-Hartman
     

02 Oct, 2018

1 commit

  • We have two nested loops to check the entries within the pfn_array_table
    arrays. But we mistakenly use the outer array as an index in our check,
    and completely ignore the indexing performed by the inner loop.
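
    The intended indexing can be sketched with simplified stand-in types.
    struct page_array, struct page_array_table and table_contains() are
    hypothetical and only model a table of arrays; they are not the
    vfio-ccw definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for a table of arrays. */
struct page_array {
    const unsigned long *iova;
    size_t nr;
};

struct page_array_table {
    const struct page_array *arrays;
    size_t nr;
};

static bool table_contains(const struct page_array_table *t,
                           unsigned long iova)
{
    for (size_t i = 0; i < t->nr; i++) {
        const struct page_array *pa = &t->arrays[i];

        /* The inner loop must index with j, not with the outer i. */
        for (size_t j = 0; j < pa->nr; j++)
            if (pa->iova[j] == iova)
                return true;
    }
    return false;
}
```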

    Cc: stable@vger.kernel.org
    Signed-off-by: Eric Farman
    Message-Id:
    Signed-off-by: Cornelia Huck

    Eric Farman
     

29 Sep, 2018

2 commits

  • Functions qeth_get_ipa_msg and qeth_get_ipa_cmd_name are modifying
    the last member of global arrays without any locking that I can see.
    If two instances of either function are running at the same time,
    it could cause a race ultimately leading to an array overrun (the
    contents of the last entry of the array is the only guarantee that
    the loop will ever stop).

    Performing the lookups without modifying the arrays is admittedly
    slower (two comparisons per iteration instead of one) but these
    are operations which are rare (should only be needed in error
    cases or when debugging, not during successful operation) and it
    seems still less costly than introducing a mutex to protect the
    arrays in question.

    As a side bonus, it allows us to declare both arrays as const data.
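
    A lookup of this kind, with a bound check in place of a rewritten
    sentinel entry, might look like the sketch below. The table contents
    and the names msg_table and lookup_msg() are made up for illustration
    and are not the qeth tables.

```c
#include <stddef.h>

struct msg_entry {
    int rc;
    const char *msg;
};

/* Illustrative codes only; the table can be const because it is never
 * modified by the lookup. */
static const struct msg_entry msg_table[] = {
    { 0x00, "success" },
    { 0x01, "command not supported" },
    { 0x04, "out of memory" },
    { -1,   "unknown return code" },   /* fallback, must stay last */
};

#define MSG_TABLE_SIZE (sizeof(msg_table) / sizeof(msg_table[0]))

static const char *lookup_msg(int rc)
{
    size_t i = 0;

    /* Two comparisons per iteration: the bound check and the match. */
    while (i < MSG_TABLE_SIZE - 1 && msg_table[i].rc != rc)
        i++;
    return msg_table[i].msg;
}
```

    An unknown code stops at the last entry because of the bound check, so
    concurrent callers share the table without any locking.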

    Signed-off-by: Jean Delvare
    Cc: Julian Wiedmann
    Cc: Ursula Braun
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Jean Delvare
     
  • Use the common code ARRAY_SIZE macro instead of a private implementation.

    Reviewed-by: Jean Delvare
    Signed-off-by: zhong jiang
    Signed-off-by: Martin Schwidefsky
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    zhong jiang
     

27 Sep, 2018

2 commits

  • If I attach a vfio-ccw device to my guest, I get the following warning
    on the host when the host kernel is CONFIG_HARDENED_USERCOPY=y

    [250757.595325] Bad or missing usercopy whitelist? Kernel memory overwrite attempt detected to SLUB object 'dma-kmalloc-512' (offset 64, size 124)!
    [250757.595365] WARNING: CPU: 2 PID: 10958 at mm/usercopy.c:81 usercopy_warn+0xac/0xd8
    [250757.595369] Modules linked in: kvm vhost_net vhost tap xt_CHECKSUM iptable_mangle ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c devlink tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables sunrpc dm_multipath s390_trng crc32_vx_s390 ghash_s390 prng aes_s390 des_s390 des_generic sha512_s390 sha1_s390 eadm_sch tape_3590 tape tape_class qeth_l2 qeth ccwgroup vfio_ccw vfio_mdev zcrypt_cex4 mdev vfio_iommu_type1 zcrypt vfio sha256_s390 sha_common zfcp scsi_transport_fc qdio dasd_eckd_mod dasd_mod
    [250757.595424] CPU: 2 PID: 10958 Comm: CPU 2/KVM Not tainted 4.18.0-derp #2
    [250757.595426] Hardware name: IBM 3906 M05 780 (LPAR)
    ...snip regs...
    [250757.595523] Call Trace:
    [250757.595529] ([] usercopy_warn+0xa8/0xd8)
    [250757.595535] [] __check_heap_object+0xfa/0x160
    [250757.595540] [] __check_object_size+0x156/0x1d0
    [250757.595547] [] vfio_ccw_mdev_write+0x74/0x148 [vfio_ccw]
    [250757.595552] [] __vfs_write+0x3a/0x188
    [250757.595556] [] vfs_write+0xa8/0x1b8
    [250757.595559] [] ksys_pwrite64+0x86/0xc0
    [250757.595568] [] system_call+0xdc/0x2b0
    [250757.595570] Last Breaking-Event-Address:
    [250757.595573] [] usercopy_warn+0xa8/0xd8

    While vfio_ccw_mdev_{write|read} validates that the input position/count
    does not run over the ccw_io_region struct, the usercopy code that does
    copy_{to|from}_user doesn't necessarily know this. It sees the variable
    length and gets worried that it's affecting a normal kmalloc'd struct,
    and generates the above warning.

    Adjust how the ccw_io_region is alloc'd with a whitelist to remove this
    warning. The boundary checking will continue to do its thing.

    Signed-off-by: Eric Farman
    Message-Id:
    Signed-off-by: Cornelia Huck

    Eric Farman
     
  • In the event that we want to change the layout of the ccw_io_region in the
    future[1], it might be easier to work with it as a pointer within the
    vfio_ccw_private struct rather than an embedded struct.

    [1] https://patchwork.kernel.org/comment/22228541/

    Signed-off-by: Eric Farman
    Message-Id:
    Signed-off-by: Cornelia Huck

    Eric Farman
     

20 Sep, 2018

1 commit

  • The resume code checks if the resume cpu is the same as the suspend cpu.
    If not, and if it is also not possible to switch to the suspend cpu, an
    error message should be printed and the resume process should be stopped
    by loading a disabled wait psw.

    The current logic is broken in multiple ways, the message is never printed,
    and the disabled wait psw never loaded because the kernel panics before that:
    - sam31 and SIGP_SET_ARCHITECTURE to ESA mode is wrong, this will break
    on the first 64bit instruction in sclp_early_printk().
    - The init stack should be used, but the stack pointer is not set up correctly
    (missing aghi %r15,-STACK_FRAME_OVERHEAD).
    - __sclp_early_printk() checks the sclp_init_state. If it is not
    sclp_init_state_uninitialized, it simply returns w/o printing anything.
    In the resumed kernel however, sclp_init_state will never be uninitialized.

    This patch fixes those issues by removing the sam31/ESA logic, adding a
    correct init stack pointer, and also introducing sclp_early_printk_force()
    to allow using sclp_early_printk() even when sclp_init_state is not
    uninitialized.

    Reviewed-by: Heiko Carstens
    Signed-off-by: Gerald Schaefer
    Signed-off-by: Martin Schwidefsky

    Gerald Schaefer
     

14 Sep, 2018

1 commit


13 Sep, 2018

4 commits

  • For inbound data with an unsupported HW header format, only dump the
    actual HW header. We have no idea how much payload follows it, and what
    it contains. Worst case, we dump past the end of the Inbound Buffer and
    access whatever is located next in memory.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
     
  • qeth_query_oat_command() currently allocates the kernel buffer for
    the SIOC_QETH_QUERY_OAT ioctl with kzalloc. So on systems with
    fragmented memory, large allocations may fail (e.g. the qethqoat tool by
    default uses 132KB).

    Solve this issue by using vzalloc, backing the allocation with
    non-contiguous memory.

    Signed-off-by: Wenjia Zhang
    Reviewed-by: Julian Wiedmann
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Wenjia Zhang
     
  • Scatter-gather transmit brings a nice performance boost. Considering the
    rather large MTU sizes at play, it's also totally the Right Thing To Do.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
     
  • Bailing out on allocation error is nice, but we also need to tell the
    ccwgroup core that creating the qeth groupdev failed.

    Fixes: d3d1b205e89f ("s390/qeth: allocate netdevice early")
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller

    Julian Wiedmann
     

12 Sep, 2018

1 commit

  • The use of variable length arrays on the stack is deprecated.
    git commit 3d8f60d38e249f989a7fca9c2370c31c3d5487e1
    "s390/zcrypt: hex string mask improvements for apmask and aqmask."
    added three new VLA arrays. Remove them again.

    Reviewed-by: Harald Freudenberger
    Signed-off-by: Martin Schwidefsky

    Martin Schwidefsky
     

26 Aug, 2018

1 commit

  • Pull libnvdimm updates from Dave Jiang:
    "Collection of misc libnvdimm patches for 4.19 submission:

    - Adding support to read locked nvdimm capacity.

    - Change test code to make DSM failure code injection an override.

    - Add support for calculate maximum contiguous area for namespace.

    - Add support for queueing a short ARS when there is on going ARS for
    nvdimm.

    - Allow NULL to be passed in to ->direct_access() for kaddr and pfn
    params.

    - Improve smart injection support for nvdimm emulation testing.

    - Fix test code that supports emulating controller temperature.

    - Fix hang on error before devm_memremap_pages()

    - Fix a bug that causes user memory corruption when data returned to
    user for ars_status.

    - Maintainer updates for Ross Zwisler emails and adding Jan Kara to
    fsdax"

    * tag 'libnvdimm-for-4.19_misc' of gitolite.kernel.org:pub/scm/linux/kernel/git/nvdimm/nvdimm:
    libnvdimm: fix ars_status output length calculation
    device-dax: avoid hang on error before devm_memremap_pages()
    tools/testing/nvdimm: improve emulation of smart injection
    filesystem-dax: Do not request kaddr and pfn when not required
    md/dm-writecache: Don't request pointer dummy_addr when not required
    dax/super: Do not request a pointer kaddr when not required
    tools/testing/nvdimm: kaddr and pfn can be NULL to ->direct_access()
    s390, dcssblk: kaddr and pfn can be NULL to ->direct_access()
    libnvdimm, pmem: kaddr and pfn can be NULL to ->direct_access()
    acpi/nfit: queue issuing of ars when an uc error notification comes in
    libnvdimm: Export max available extent
    libnvdimm: Use max contiguous area for namespace size
    MAINTAINERS: Add Jan Kara for filesystem DAX
    MAINTAINERS: update Ross Zwisler's email address
    tools/testing/nvdimm: Fix support for emulating controller temperature
    tools/testing/nvdimm: Make DSM failure code injection an override
    acpi, nfit: Prefer _DSM over _LSR for namespace label reads
    libnvdimm: Introduce locked DIMM capacity support

    Linus Torvalds
     

25 Aug, 2018

1 commit

  • Pull s390 updates from Martin Schwidefsky:

    - A couple of patches for the zcrypt driver:
    + Add two masks to determine which AP cards and queues are host
    devices, this will be useful for KVM AP device passthrough
    + Add-on patch to improve the parsing of the new apmask and aqmask
    + Some code beautification

    - Second try to reenable the GCC plugins, the first patch set had a
    patch to do this but the merge somehow missed this

    - Remove the s390 specific GCC version check and use the generic one

    - Three patches for kdump, two bug fixes and one cleanup

    - Three patches for the PCI layer, one bug fix and two cleanups

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
    s390: remove gcc version check (4.3 or newer)
    s390/zcrypt: hex string mask improvements for apmask and aqmask.
    s390/zcrypt: AP bus support for alternate driver(s)
    s390/zcrypt: code beautify
    s390/zcrypt: switch return type to bool for ap_instructions_available()
    s390/kdump: Remove kzalloc_panic
    s390/kdump: Fix memleak in nt_vmcoreinfo
    s390/kdump: Make elfcorehdr size calculation ABI compliant
    s390/pci: remove fmb address from debug output
    s390/pci: remove stale rc
    s390/pci: fix out of bounds access during irq setup
    s390/zcrypt: fix ap_instructions_available() returncodes
    s390: reenable gcc plugins for real

    Linus Torvalds
     

21 Aug, 2018

1 commit

  • The sysfs attributes /sys/bus/ap/apmask and /sys/bus/ap/aqmask
    and the kernel command line arguments ap.apm and ap.aqm get
    an improvement of the value parsing with this patch:

    The mask values are bitmaps in big endian order starting with bit 0.
    So adapter number 0 is the leftmost bit, mask is 0x8000... The sysfs
    attributes and the kernel command line accept 2 different formats:
    - Absolute hex string starting with 0x like "0x12345678" does set
    the mask starting from left to right. If the given string is shorter
    than the mask it is padded with 0s on the right. If the string is
    longer than the mask an error comes back (EINVAL).
    - Relative format - a concatenation (done with ',') of the terms
    +<bitnr>[-<bitnr>] or -<bitnr>[-<bitnr>]. <bitnr> may be any
    valid number (hex, decimal or octal) in the range 0...255.
    Here are some examples:
    "+0-15,+32,-128,-0xFF"
    "-0-255,+1-16,+0x128"

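    The relative format described above can be sketched in userspace
    Python. This is only an illustration of the stated rules, not the
    kernel parser: the names parse_c_number, apply_relative and MASK_BITS
    are hypothetical, and the mask is modelled as a set of bit numbers.
    The sketch enforces the stated 0...255 range, so a term such as
    +0x128 (296 decimal) raises ValueError here.

```python
MASK_BITS = 256  # assumption: bit numbers 0...255, as stated above


def parse_c_number(text):
    """Parse a hex (0x...), octal (leading 0) or decimal number, C style."""
    lowered = text.lower()
    if lowered.startswith("0x"):
        return int(lowered[2:], 16)
    if len(lowered) > 1 and lowered.startswith("0"):
        return int(lowered, 8)
    return int(lowered, 10)


def apply_relative(mask, spec, nbits=MASK_BITS):
    """Return a copy of `mask` (a set of bit numbers) with `spec` applied."""
    result = set(mask)
    for term in spec.split(","):
        if not term or term[0] not in "+-":
            raise ValueError(f"bad term: {term!r}")
        # After the sign, a term is either "<bitnr>" or "<bitnr>-<bitnr>".
        first, sep, last = term[1:].partition("-")
        lo = parse_c_number(first)
        hi = parse_c_number(last) if sep else lo
        if not 0 <= lo <= hi < nbits:
            raise ValueError(f"bit range out of bounds: {term!r}")
        bits = set(range(lo, hi + 1))
        # '+' switches the addressed bits on, '-' switches them off.
        result = result | bits if term[0] == "+" else result - bits
    return result
```

    Terms are applied left to right, so a later term can undo part of an
    earlier one, as in "-0-255,+1-16".
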
    Signed-off-by: Harald Freudenberger
    Signed-off-by: Martin Schwidefsky

    Harald Freudenberger
     

20 Aug, 2018

2 commits

  • The current AP bus, AP devices and AP device drivers implementation
    uses a clearly defined mapping for binding AP devices to AP device
    drivers. So for example a CEX6C queue will always be bound to the
    cex4queue device driver.

    The Linux device driver model has no notion of more than one device
    driver being eligible for one device type. If more than one driver
    matches the device type, all drivers are simply tried consecutively.
    There is no way to determine or influence the probing order of the
    drivers.

    With KVM there is a need to provide additional device drivers matching
    to the very same type of AP devices. With a simple implementation the
    KVM drivers run in competition to the regular drivers. Whichever
    'wins' a device depends on build order and implementation details
    within the common Linux Device Driver Model and is not
    deterministic. However, a userspace process could figure out which
    device should be bound to which driver and sort out the correct
    binding by manipulating attributes in the sysfs.

    If for security reasons an AP device must not get bound to the 'wrong'
    device driver, the sorting out has to be done within the Linux kernel
    by the AP bus code. This patch modifies the behavior of the AP bus
    for probing drivers for devices so that two sets of drivers are
    usable. Two new bitmasks 'apmask' and 'aqmask' are used to mark a
    subset of the APQN range as 'usable by the ap bus and the default
    drivers' or 'not usable by the default drivers and thus available for
    alternate drivers like vfio-xxx'. So for an APQN which is addressed by
    these masks only the default drivers will be probed. In contrast, an
    APQN which is not addressed by the masks will never be probed by and
    bound to default drivers but only to alternate drivers.

    Ultimately the two masks give a way to divide the range of APQNs into
    two pools: one pool of APQNs used by the AP bus and the default
    drivers and thus available to the userspace of the system via the
    zcrypt drivers, and another pool to which no zcrypt drivers are bound
    and which can be used by alternate drivers (like vfio-xxx) for their
    needs. This division is hot-plug safe and makes sure an APQN assigned
    to an alternate driver is at no time somehow exploitable by the wrong
    party.

    The two masks are located in sysfs at /sys/bus/ap/apmask and
    /sys/bus/ap/aqmask. The mask syntax is exactly the same as the
    already existing mask attributes in the /sys/bus/ap directory (for
    example ap_usage_domain_mask and ap_control_domain_mask).

    By default all APQNs belong to the ap bus and the default drivers:

    cat /sys/bus/ap/apmask
    0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff
    cat /sys/bus/ap/aqmask
    0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff

    The masks can be changed at boot time with the kernel command line
    like this:

    ... ap.apmask=0xffff ap.aqmask=0x40

    This would give these two pools:

    default drivers pool:    adapter 0 - 15, domain 1
    alternate drivers pool:  adapter 0 - 15, all but domain 1
                             adapter 16-255, all domains
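
    How the example masks lead to these two pools can be checked with a
    small userspace sketch in Python. It is an illustration only and not
    the kernel code: parse_absolute_mask, pool and MASK_BITS are
    hypothetical names, and a 256-bit mask is assumed (matching the
    64 hex digits shown by cat above). Bit 0 is taken as the leftmost bit
    and a short hex string is padded with zeros on the right.

```python
MASK_BITS = 256  # assumption: 256 adapters and 256 domains

HEX_DIGITS = "0123456789abcdefABCDEF"


def parse_absolute_mask(text, nbits=MASK_BITS):
    """Return the set of bit numbers switched on by an absolute hex mask."""
    if not text.lower().startswith("0x"):
        raise ValueError("absolute mask must start with 0x")
    digits = text[2:]
    if not digits or any(c not in HEX_DIGITS for c in digits):
        raise ValueError("mask must consist of hex digits")
    if 4 * len(digits) > nbits:
        raise ValueError("mask string is longer than the mask")
    # Pad with zeros on the right so the first hex digit covers bits 0-3.
    value = int(digits, 16) << (nbits - 4 * len(digits))
    # Bit 0 is the leftmost (most significant) bit of the bitmap.
    return {bit for bit in range(nbits) if (value >> (nbits - 1 - bit)) & 1}


def pool(adapter, domain, apm, aqm):
    """An APQN is in the default pool only if both masks address it."""
    return "default" if adapter in apm and domain in aqm else "alternate"


apm = parse_absolute_mask("0xffff")  # adapters 0-15
aqm = parse_absolute_mask("0x40")    # domain 1
```

    With these values pool(3, 1, apm, aqm) yields "default", while
    pool(3, 2, apm, aqm) and pool(16, 1, apm, aqm) both yield "alternate".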

    The sysfs attributes for these two masks are writable, and an
    administrator is able to reconfigure the assignments on the fly by
    writing new mask values into them. Changing the mask(s) triggers a
    revision of the existing queue to driver bindings. So all APQNs which
    are bound to the 'wrong' driver are reprobed via the kernel function
    device_reprobe() and thus the new correct driver will be assigned with
    respect to the changed apmask and aqmask bits.

    The mask values are bitmaps in big endian order starting with bit 0.
    So adapter number 0 is the leftmost bit, mask is 0x8000... The sysfs
    attributes accept 2 different formats:
    - Absolute hex string starting with 0x like "0x12345678" does set
    the mask starting from left to right. If the given string is shorter
    than the mask it is padded with 0s on the right. If the string is
    longer than the mask an error comes back (EINVAL).
    - '+' or '-' followed by a numerical value. Valid examples are "+1",
    "-13", "+0x41", "-0xff" and even "+0" and "-0". Only the addressed
    bit in the mask is switched on ('+') or off ('-').

    This patch will also be the base for an upcoming extension to the
    zcrypt drivers to be able to provide additional zcrypt device nodes
    with filtering based on ap and aq masks.

    Signed-off-by: Harald Freudenberger
    Signed-off-by: Martin Schwidefsky

    Harald Freudenberger
     
  • Code beautify by following most of the checkpatch suggestions:
    - SPDX license identifier line complaints by checkpatch
    - missing space or newline complaints by checkpatch
    - octal numbers for permissions complaints by checkpatch
    - renaming of static sysfs functions complaints by checkpatch
    - fix of block comment complaints by checkpatch
    - fix printf like calls where function name instead of %s __func__
    was used
    - __packed instead of __attribute__((packed))
    - init to zero for static variables removed
    - use of DEVICE_ATTR_RO and DEVICE_ATTR_RW macros

    No functional code changes or API changes!

    Signed-off-by: Harald Freudenberger
    Signed-off-by: Martin Schwidefsky

    Harald Freudenberger
     

19 Aug, 2018

1 commit

  • Pull tty/serial driver updates from Greg KH:
    "Here is the big tty and serial driver pull request for 4.19-rc1.

    It's not all that big, just a number of small serial driver updates
    and fixes, along with some better vt handling for unicode characters
    for those using braille terminals.

    All of these patches have been in linux-next for a long time with no
    reported issues"

    * tag 'tty-4.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (73 commits)
    tty: serial: 8250: Revert NXP SC16C2552 workaround
    serial: 8250_exar: Read INT0 from slave device, too
    tty: rocket: Fix possible buffer overwrite on register_PCI
    serial: 8250_dw: Add ACPI support for uart on Broadcom SoC
    serial: 8250_dw: always set baud rate in dw8250_set_termios
    dt-bindings: serial: Add binding for uartlite
    tty: serial: uartlite: Add support for suspend and resume
    tty: serial: uartlite: Add clock adaptation
    tty: serial: uartlite: Add structure for private data
    serial: sh-sci: Improve support for separate TEI and DRI interrupts
    serial: sh-sci: Remove SCIx_RZ_SCIFA_REGTYPE
    serial: sh-sci: Allow for compressed SCIF address
    serial: sh-sci: Improve interrupts description
    serial: 8250: Use cached port name directly in messages
    serial: 8250_exar: Drop unused variable in pci_xr17v35x_setup()
    vt: drop unused struct vt_struct
    vt: avoid a VLA in the unicode screen scroll function
    vt: add /dev/vcsu* to devices.txt
    vt: coherence validation code for the unicode screen buffer
    vt: selection: take screen contents from uniscr if available
    ...

    Linus Torvalds