01 Oct, 2020

3 commits

  • commit f7e80983f0cf470bb82036e73bff4d5a7daf8fc2 upstream.

    reqcnt is a u32 pointer but we copy sizeof(reqcnt), which is the
    size of the pointer. This means we only copy 8 bytes. Let us copy
    the full monty.

    Signed-off-by: Christian Borntraeger
    Cc: Harald Freudenberger
    Cc: stable@vger.kernel.org
    Fixes: af4a72276d49 ("s390/zcrypt: Support up to 256 crypto adapters.")
    Reviewed-by: Harald Freudenberger
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Greg Kroah-Hartman

    Christian Borntraeger
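
The pointer-size mistake described above can be reproduced in plain userspace C. This is a minimal sketch, not the driver code: `NUM_CARDS`, `copy_counts_buggy()` and `copy_counts_fixed()` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_CARDS 256 /* hypothetical element count, for illustration only */

/* Buggy variant: sizeof(src) is the size of the pointer itself,
 * so only the first few bytes of the array are copied. */
static size_t copy_counts_buggy(uint32_t *dst, const uint32_t *src)
{
    size_t n = sizeof(src);

    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n);
    return n;
}

/* Fixed variant: size of one element times the number of elements. */
static size_t copy_counts_fixed(uint32_t *dst, const uint32_t *src)
{
    size_t n = sizeof(*src) * NUM_CARDS;

    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n);
    return n;
}
```

Both functions return the number of bytes they copied, so the difference is easy to observe: the buggy variant reports the pointer size, the fixed one the size of the whole array.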
     
  • commit 709192d531e5b0a91f20aa14abfe2fc27ddd47af upstream.

    A discard request that writes zeros using the global kernel internal
    ZERO_PAGE will fail for machines with more than 2GB of memory due to the
    location of the ZERO_PAGE.

    Fix this by using a driver owned global zero page allocated with GFP_DMA
    flag set.

    Fixes: 28b841b3a7cb ("s390/dasd: Add discard support for FBA devices")
    Signed-off-by: Jan Höppner
    Reviewed-by: Stefan Haberland
    Cc: # 4.14+
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Jan Höppner
     
  • [ Upstream commit 8719b6d29d2851fa84c4074bb2e5adc022911ab8 ]

    request_irq() is preferred over setup_irq(). Invocations of setup_irq()
    occur after memory allocators are ready.

    Per tglx[1], setup_irq() existed in olden days when allocators were not
    ready by the time early interrupts were initialized.

    Hence replace setup_irq() by request_irq().

    [1] https://lkml.kernel.org/r/alpine.DEB.2.20.1710191609480.1971@nanos

    Signed-off-by: afzal mohammed
    Message-Id:
    [heiko.carstens@de.ibm.com: replace pr_err with panic]
    Signed-off-by: Heiko Carstens
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Sasha Levin

    afzal mohammed
     

23 Sep, 2020

1 commit

  • commit b6186d7fb53349efd274263a45f0b08749ccaa2d upstream.

    Tests showed that under stress conditions the kernel may
    temporarily fail to allocate 256k with kmalloc. This fix
    reworks the related code in the cca_findcard2() function
    to use kvmalloc instead.

    Signed-off-by: Harald Freudenberger
    Reviewed-by: Ingo Franzki
    Cc: Stable
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Greg Kroah-Hartman

    Harald Freudenberger
     

03 Sep, 2020

1 commit

  • [ Upstream commit 0b8eb2ee9da1e8c9b8082f404f3948aa82a057b2 ]

    Scanning through subchannels while handling an event can take a
    significant amount of time on platforms with lots of known
    subchannels. This might result in higher scheduling latencies
    for other tasks, especially on systems with a single CPU. Add a
    cond_resched() call, as the loop in slow_eval_known_fn() can be
    executed for a longer duration.

    Reviewed-by: Peter Oberparleiter
    Signed-off-by: Vineeth Vijayan
    Signed-off-by: Heiko Carstens
    Signed-off-by: Sasha Levin

    Vineeth Vijayan
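
A userspace analogue of the idea: a long loop that periodically gives other tasks a chance to run. This sketch uses POSIX `sched_yield()` as a stand-in for the kernel's `cond_resched()`; the batching via `YIELD_INTERVAL` and the names `scan_with_yield()` and `noop_visit()` are assumptions made for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>

#define YIELD_INTERVAL 64 /* hypothetical batch size between yields */

typedef int (*visit_fn)(size_t index);

/* Visit `count` items and yield the CPU after every YIELD_INTERVAL
 * items. Returns how often the loop yielded. */
static size_t scan_with_yield(size_t count, visit_fn visit)
{
    size_t yields = 0;

    assert(visit != NULL);
    for (size_t i = 0; i < count; i++) {
        visit(i);
        if ((i + 1) % YIELD_INTERVAL == 0) {
            sched_yield(); /* let other runnable tasks make progress */
            yields++;
        }
    }
    return yields;
}

/* Trivial visitor used to exercise the loop. */
static int noop_visit(size_t index)
{
    (void)index;
    return 0;
}
```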
     

26 Aug, 2020

1 commit

  • commit 2d9a2c5f581be3991ba67fa9e7497c711220ea8e upstream.

    Before v4.15 commit 75492a51568b ("s390/scsi: Convert timers to use
    timer_setup()"), we intentionally only passed zfcp_adapter as context
    argument to zfcp_fsf_request_timeout_handler(). Since we only trigger
    adapter recovery, it was unnecessary to sync against races between timeout
    and (late) completion. Likewise, we only passed zfcp_erp_action as context
    argument to zfcp_erp_timeout_handler(). Since we only wakeup an ERP action,
    it was unnecessary to sync against races between timeout and (late)
    completion.

    Meanwhile the timeout handlers get timer_list as context argument and do a
    timer-specific container-of to zfcp_fsf_req which can have been freed.

    Fix it by making sure that any request timeout handlers, that might just
    have started before del_timer(), are completed by using del_timer_sync()
    instead. This ensures the request free happens afterwards.

    Space time diagram of potential use-after-free:

    Basic idea is to have 2 or more pending requests whose timeouts run out at
    almost the same time.

    req 1 timeout                         ERP thread                req 2 timeout
    ------------------------------------  ------------------------  ---------------------------------------
    zfcp_fsf_request_timeout_handler
    fsf_req = from_timer(fsf_req, t, timer)
    adapter = fsf_req->adapter
    zfcp_qdio_siosl(adapter)
    zfcp_erp_adapter_reopen(adapter,...)
                                          zfcp_erp_strategy
                                          ...
                                          zfcp_fsf_req_dismiss_all
                                          list_for_each_entry_safe
                                          zfcp_fsf_req_complete 1
                                          del_timer 1
                                          zfcp_fsf_req_free 1
                                          zfcp_fsf_req_complete 2
                                                                    zfcp_fsf_request_timeout_handler
                                          del_timer 2
                                                                    fsf_req = from_timer(fsf_req, t, timer)
                                          zfcp_fsf_req_free 2
                                                                    adapter = fsf_req->adapter
                                                                              ^^^^^^^ already freed

    Link: https://lore.kernel.org/r/20200813152856.50088-1-maier@linux.ibm.com
    Fixes: 75492a51568b ("s390/scsi: Convert timers to use timer_setup()")
    Cc: #4.15+
    Suggested-by: Julian Wiedmann
    Reviewed-by: Julian Wiedmann
    Signed-off-by: Steffen Maier
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Steffen Maier
     

19 Aug, 2020

2 commits

  • commit 9f4aa52387c68049403b59939df5c0dd8e3872cc upstream.

    During initialization of the DASD DIAG driver a request is issued
    that has a bio structure that resides on the stack. With virtually
    mapped kernel stacks this bio address might be in virtual storage
    which is unsuitable for usage with the diag250 call.
    In this case the device can not be set online using the DIAG
    discipline and fails with -EOPNOTSUPP.
    In the system journal the following error message is presented:

    dasd: X.X.XXXX Setting the DASD online with discipline DIAG failed
    with rc=-95

    Fix by allocating the bio structure instead of having it on the stack.

    Fixes: ce3dc447493f ("s390: add support for virtually mapped kernel stacks")
    Signed-off-by: Stefan Haberland
    Reviewed-by: Peter Oberparleiter
    Cc: stable@vger.kernel.org #4.20
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Stefan Haberland
     
  • [ Upstream commit 02472e28b9a45471c6d8729ff2c7422baa9be46a ]

    Discard events that don't contain any entries. This shouldn't happen,
    but subsequent code relies on being able to use entry 0. So better
    be safe than accessing garbage.

    Fixes: b4d72c08b358 ("qeth: bridgeport support - basic control")
    Signed-off-by: Julian Wiedmann
    Reviewed-by: Alexandra Winter
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Julian Wiedmann
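
The guard described above boils down to checking the entry count before touching entry 0. A minimal sketch with hypothetical types (`struct event`, `struct event_entry`) that are not the qeth structures:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct event_entry {
    uint8_t state;
};

struct event {
    size_t num_entries;
    struct event_entry entries[4];
};

/* Returns 0 if the event was handled, -1 if it was discarded
 * because it carries no entries. */
static int handle_event(const struct event *ev, uint8_t *state_out)
{
    assert(ev != NULL && state_out != NULL);

    if (ev->num_entries == 0)
        return -1; /* nothing to look at: entries[0] would be garbage */

    *state_out = ev->entries[0].state;
    return 0;
}
```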
     

01 Jul, 2020

2 commits

  • [ Upstream commit e2dfcfba00ba4a414617ef4c5a8501fe21567eb3 ]

    Current(?) OSA devices also store their cmd-specific return codes for
    SET_ACCESS_CONTROL cmds into the top-level cmd->hdr.return_code.
    So once we added stricter checking for the top-level field a while ago,
    none of the error logic that rolls back the user's configuration to its
    old state is applied any longer.

    For this specific cmd, go back to the old model where we peek into the
    cmd structure even though the top-level field indicated an error.

    Fixes: 686c97ee29c8 ("s390/qeth: fix error handling in adapter command callbacks")
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Julian Wiedmann
     
  • commit 936e6b85da0476dd2edac7c51c68072da9fb4ba2 upstream.

    Suppose that, for unrelated reasons, FSF requests on behalf of recovery are
    very slow and can run into the ERP timeout.

    In the case at hand, we did adapter recovery to a large degree. However
    due to the slowness a LUN open is pending so the corresponding fc_rport
    remains blocked. After fast_io_fail_tmo we trigger close physical port
    recovery for the port under which the LUN should have been opened. The new
    higher order port recovery dismisses the pending LUN open ERP action and
    dismisses the pending LUN open FSF request. Such dismissal decouples the
    ERP action from the pending corresponding FSF request by setting
    zfcp_fsf_req->erp_action to NULL (among other things)
    [zfcp_erp_strategy_check_fsfreq()].

    If now the ERP timeout for the pending open LUN request runs out, we must
    not use zfcp_fsf_req->erp_action in the ERP timeout handler. This is a
    problem since v4.15 commit 75492a51568b ("s390/scsi: Convert timers to use
    timer_setup()"). Before that we intentionally only passed zfcp_erp_action
    as context argument to zfcp_erp_timeout_handler().

    Note: The lifetime of the corresponding zfcp_fsf_req object continues until
    a (late) response or an (unrelated) adapter recovery.

    Just like the regular response path ignores dismissed requests
    [zfcp_fsf_req_complete() => zfcp_fsf_protstatus_eval() => return early] the
    ERP timeout handler now needs to ignore dismissed requests. So simply
    return early in the ERP timeout handler if the FSF request is marked as
    dismissed in its status flags. To protect against the race where
    zfcp_erp_strategy_check_fsfreq() dismisses and sets
    zfcp_fsf_req->erp_action to NULL after our previous status flag check,
    return early if zfcp_fsf_req->erp_action is NULL. After all, the former
    ERP action does not need to be woken up as that was already done as part of
    the dismissal above [zfcp_erp_action_dismiss()].

    This fixes the following panic due to kernel page fault in IRQ context:

    Unable to handle kernel pointer dereference in virtual kernel address space
    Failing address: 0000000000000000 TEID: 0000000000000483
    Fault in home space mode while using kernel ASCE.
    AS:000009859238c00b R2:00000e3e7ffd000b R3:00000e3e7ffcc007 S:00000e3e7ffd7000 P:000000000000013d
    Oops: 0004 ilc:2 [#1] SMP
    Modules linked in: ...
    CPU: 82 PID: 311273 Comm: stress Kdump: loaded Tainted: G E X ...
    Hardware name: IBM 8561 T01 701 (LPAR)
    Krnl PSW : 0404c00180000000 001fffff80549be0 (zfcp_erp_notify+0x40/0xc0 [zfcp])
    R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
    Krnl GPRS: 0000000000000080 00000e3d00000000 00000000000000f0 0000000000030000
    000000010028e700 000000000400a39c 000000010028e700 00000e3e7cf87e02
    0000000010000000 0700098591cb67f0 0000000000000000 0000000000000000
    0000033840e9a000 0000000000000000 001fffe008d6bc18 001fffe008d6bbc8
    Krnl Code: 001fffff80549bd4: a7180000 lhi %r1,0
    001fffff80549bd8: 4120a0f0 la %r2,240(%r10)
    #001fffff80549bdc: a53e0003 llilh %r3,3
    >001fffff80549be0: ba132000 cs %r1,%r3,0(%r2)
    001fffff80549be4: a7740037 brc 7,1fffff80549c52
    001fffff80549be8: e320b0180004 lg %r2,24(%r11)
    001fffff80549bee: e31020e00004 lg %r1,224(%r2)
    001fffff80549bf4: 412020e0 la %r2,224(%r2)
    Call Trace:
    [] zfcp_erp_notify+0x40/0xc0 [zfcp]
    [] call_timer_fn+0x38/0x190
    [] expire_timers+0xfc/0x190
    [] run_timer_softirq+0xec/0x218
    [] __do_softirq+0x144/0x398
    [] do_softirq_own_stack+0x72/0x88
    [] irq_exit+0xb0/0xb8
    [] do_IRQ+0x82/0xb0
    [] ext_int_handler+0x128/0x12c
    [] clear_subpage.constprop.13+0x38/0x60
    ([] clear_huge_page+0xec/0x250)
    [] do_huge_pmd_anonymous_page+0x32a/0x768
    [] __handle_mm_fault+0x88a/0x900
    [] handle_mm_fault+0xd8/0x1b0
    [] do_dat_exception+0x136/0x3e8
    [] pgm_check_handler+0x1c8/0x220
    Last Breaking-Event-Address:
    [] zfcp_erp_timeout_handler+0x10/0x18 [zfcp]
    Kernel panic - not syncing: Fatal exception in interrupt

    Link: https://lore.kernel.org/r/20200623140242.98864-1-maier@linux.ibm.com
    Fixes: 75492a51568b ("s390/scsi: Convert timers to use timer_setup()")
    Cc: #4.15+
    Reviewed-by: Julian Wiedmann
    Signed-off-by: Steffen Maier
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Steffen Maier
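
The two early returns described in the message can be sketched as follows. The types and the `REQ_STATUS_DISMISSED` flag are hypothetical simplifications, and the sketch is single-threaded, so it only shows the order of the checks and does not model the race itself.

```c
#include <assert.h>
#include <stddef.h>

#define REQ_STATUS_DISMISSED 0x1u /* hypothetical status flag */

struct erp_action {
    int woken;
};

struct fsf_req {
    unsigned int status;
    struct erp_action *erp_action;
};

/* Returns 1 if the ERP action was notified, 0 if the timeout was ignored. */
static int erp_timeout_handler(struct fsf_req *req)
{
    struct erp_action *act;

    assert(req != NULL);

    if (req->status & REQ_STATUS_DISMISSED)
        return 0; /* request was dismissed: nothing left to wake up */

    act = req->erp_action; /* read the pointer once, then test it */
    if (act == NULL)
        return 0; /* dismissal decoupled the action after the flag check */

    act->woken = 1;
    return 1;
}
```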
     

24 Jun, 2020

1 commit

  • [ Upstream commit 75e82bec6b2622c6f455b7a543fb5476a5d0eed7 ]

    qdio_establish() calls qdio_setup_thinint() via qdio_setup_irq().
    If the subsequent qdio_establish_thinint() fails, we never put the
    DSCI again. Thus the DSCI isn't available for re-use. Given enough of
    such errors, we could end up with having only the shared DSCI available.

    Merge qdio_setup_thinint() into qdio_establish_thinint(), and deal with
    such an error internally.

    Fixes: 779e6e1c724d ("[S390] qdio: new qdio driver.")
    Signed-off-by: Julian Wiedmann
    Reviewed-by: Benjamin Block
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Sasha Levin

    Julian Wiedmann
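
The leak pattern is the usual "acquire, then fail, then forget to release". A minimal sketch with a hypothetical fixed-size pool standing in for the DSCIs; the fallback to a shared indicator is left out, and `establish()`, `hw_ok()` and `hw_fail()` are illustrative names only:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define POOL_SIZE 4 /* hypothetical number of dedicated indicators */

struct indicator_pool {
    bool in_use[POOL_SIZE];
};

typedef int (*hw_setup_fn)(int idx);

static int get_indicator(struct indicator_pool *p)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!p->in_use[i]) {
            p->in_use[i] = true;
            return i;
        }
    }
    return -1;
}

static void put_indicator(struct indicator_pool *p, int idx)
{
    if (idx >= 0 && idx < POOL_SIZE)
        p->in_use[idx] = false;
}

static size_t free_count(const struct indicator_pool *p)
{
    size_t n = 0;

    for (int i = 0; i < POOL_SIZE; i++)
        n += p->in_use[i] ? 0 : 1;
    return n;
}

/* Acquire an indicator and set up the hardware in one function,
 * so the error path can release the indicator again. */
static int establish(struct indicator_pool *p, hw_setup_fn hw_setup, int *idx_out)
{
    int idx, rc;

    assert(p != NULL && hw_setup != NULL && idx_out != NULL);

    idx = get_indicator(p);
    if (idx < 0)
        return -1;

    rc = hw_setup(idx);
    if (rc) {
        put_indicator(p, idx); /* the step that was missing */
        return rc;
    }

    *idx_out = idx;
    return 0;
}

static int hw_ok(int idx)   { (void)idx; return 0; }
static int hw_fail(int idx) { (void)idx; return -5; }
```

Keeping acquisition and hardware setup in the same function mirrors the merge described in the message: the code that can fail is also the code that knows what to undo.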
     

20 May, 2020

1 commit

  • [ Upstream commit 29b74cb75e3572d83708745e81e24d37837415f9 ]

    Fix to return negative error code -ENOMEM from the smcd_alloc_dev()
    error handling case instead of 0, as done elsewhere in this function.

    Fixes: 684b89bc39ce ("s390/ism: add device driver for internal shared memory")
    Reported-by: Hulk Robot
    Signed-off-by: Wei Yongjun
    Signed-off-by: Ursula Braun
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Wei Yongjun
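
The bug class here is an error path that jumps to cleanup while the return variable still holds 0 from an earlier successful step. A minimal sketch; `probe()`, `alloc_ok()` and `alloc_fail()` are hypothetical and only model the control flow:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct device {
    int id;
};

typedef struct device *(*alloc_fn)(void);

static int probe(alloc_fn alloc_dev, struct device **out)
{
    int ret = 0; /* an earlier setup step succeeded and left ret at 0 */
    struct device *dev;

    assert(alloc_dev != NULL && out != NULL);

    dev = alloc_dev();
    if (!dev) {
        ret = -ENOMEM; /* without this assignment the caller would see 0 */
        goto err;
    }

    *out = dev;
    return 0;

err:
    *out = NULL;
    return ret;
}

static struct device *alloc_fail(void) { return NULL; }
static struct device *alloc_ok(void)   { return calloc(1, sizeof(struct device)); }
```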
     

29 Apr, 2020

2 commits

  • [ Upstream commit 05ce3e53f375295c2940390b2b429e506e07655c ]

    The common I/O layer delays the ADD uevent for subchannels and
    delegates generating this uevent to the individual subchannel
    drivers. The io_subchannel driver will do so when the associated
    ccw_device has been registered -- but unconditionally, so more
    ADD uevents will be generated if a subchannel has been unbound
    from the io_subchannel driver and later rebound.

    To fix this, only generate the ADD event if uevents were still
    suppressed for the device.

    Fixes: fa1a8c23eb7d ("s390: cio: Delay uevents for subchannels")
    Message-Id:
    Reported-by: Boris Fiuczynski
    Reviewed-by: Peter Oberparleiter
    Reviewed-by: Boris Fiuczynski
    Signed-off-by: Cornelia Huck
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Sasha Levin

    Cornelia Huck
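
The fix amounts to a "fire once" check on a suppression flag. A minimal sketch with a hypothetical `struct subchannel` that counts emitted events instead of sending real uevents:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct subchannel {
    bool uevent_suppressed; /* true until the delayed ADD event was sent */
    int add_events;         /* number of ADD events emitted so far */
};

/* Emit the delayed ADD event only if it is still pending. */
static void announce_add(struct subchannel *sch)
{
    assert(sch != NULL);

    if (sch->uevent_suppressed) {
        sch->uevent_suppressed = false;
        sch->add_events++;
    }
}
```

Calling `announce_add()` again after a rebind leaves the counter unchanged, which is the behaviour the commit asks for.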
     
  • [ Upstream commit 2bc55eaeb88d30accfc1b6ac2708d4e4b81ca260 ]

    The common I/O layer delays the ADD uevent for subchannels and
    delegates generating this uevent to the individual subchannel
    drivers. The vfio-ccw I/O subchannel driver, however, did not
    do that, and will not generate an ADD uevent for subchannels
    that had not been bound to a different driver (or none at all,
    which also triggers the uevent).

    Generate the ADD uevent at the end of the probe function if
    uevents were still suppressed for the device.

    Message-Id:
    Fixes: 63f1934d562d ("vfio: ccw: basic implementation for vfio_ccw driver")
    Reviewed-by: Eric Farman
    Signed-off-by: Cornelia Huck
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Sasha Levin

    Cornelia Huck
     

17 Apr, 2020

1 commit

  • commit 819732be9fea728623e1ed84eba28def7384ad1f upstream.

    v2.6.27 commit cc8c282963bd ("[SCSI] zfcp: Automatically attach remote
    ports") introduced zfcp automatic port scan.

    Before that, the user had to use the sysfs attribute "port_add" of an FCP
    device (adapter) to add and open remote (target) ports, even for the remote
    peer port in point-to-point topology. That code path did a proper port open
    recovery trigger taking the erp_lock.

    Since above commit, a new helper function zfcp_erp_open_ptp_port()
    performed an UNlocked port open recovery trigger. This can race with other
    parallel recovery triggers. In zfcp_erp_action_enqueue() this could corrupt
    e.g. adapter->erp_total_count or adapter->erp_ready_head.

    As already found for fabric topology in v4.17 commit fa89adba1941 ("scsi:
    zfcp: fix infinite iteration on ERP ready list"), there was an endless loop
    during tracing of rport (un)block. A subsequent v4.18 commit 9e156c54ace3
    ("scsi: zfcp: assert that the ERP lock is held when tracing a recovery
    trigger") introduced a lockdep assertion for that case.

    As a side effect, that lockdep assertion now uncovered the unlocked code
    path for PtP. It is from within an adapter ERP action:

    zfcp_erp_strategy[1479]                  intentionally DROPs erp lock around
                                             zfcp_erp_strategy_do_action()
    zfcp_erp_strategy_do_action[1441]        NO erp lock
    zfcp_erp_adapter_strategy[876]           NO erp lock
    zfcp_erp_adapter_strategy_open[855]      NO erp lock
    zfcp_erp_adapter_strategy_open_fsf[806]  NO erp lock
    zfcp_erp_adapter_strat_fsf_xconf[772]    erp lock only around
                                             zfcp_erp_action_to_running(),
                                             BUT *_not_* around
                                             zfcp_erp_enqueue_ptp_port()
    zfcp_erp_enqueue_ptp_port[728]           BUG: *_not_* taking erp lock
    _zfcp_erp_port_reopen[432]               assumes to be called with erp lock
    zfcp_erp_action_enqueue[314]             assumes to be called with erp lock
    zfcp_dbf_rec_trig[288]                   _checks_ to be called with erp lock:
                                             lockdep_assert_held(&adapter->erp_lock);

    It causes the following lockdep warning:

    WARNING: CPU: 2 PID: 775 at drivers/s390/scsi/zfcp_dbf.c:288
    zfcp_dbf_rec_trig+0x16a/0x188
    no locks held by zfcperp0.0.17c0/775.

    Fix this by using the proper locked recovery trigger helper function.

    Link: https://lore.kernel.org/r/20200312174505.51294-2-maier@linux.ibm.com
    Fixes: cc8c282963bd ("[SCSI] zfcp: Automatically attach remote ports")
    Cc: #v2.6.27+
    Reviewed-by: Jens Remus
    Reviewed-by: Benjamin Block
    Signed-off-by: Steffen Maier
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Steffen Maier
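
The split between a helper that assumes the lock is held and a wrapper that takes it can be sketched in userspace with a pthread mutex. The `lock_held` field is a crude stand-in for lockdep, and all names are hypothetical:

```c
#include <assert.h>
#include <pthread.h>

struct adapter {
    pthread_mutex_t erp_lock;
    int lock_held;       /* debugging aid standing in for lockdep */
    int erp_total_count; /* state that must only change under the lock */
};

/* Must be called with erp_lock held. */
static void port_reopen_unlocked(struct adapter *a)
{
    assert(a->lock_held); /* plays the role of lockdep_assert_held() */
    a->erp_total_count++;
}

/* Locked helper: the variant callers without the lock should use. */
static void port_reopen(struct adapter *a)
{
    pthread_mutex_lock(&a->erp_lock);
    a->lock_held = 1;
    port_reopen_unlocked(a);
    a->lock_held = 0;
    pthread_mutex_unlock(&a->erp_lock);
}
```

Calling `port_reopen_unlocked()` directly from a context that does not hold the lock trips the assertion, which is the userspace counterpart of the lockdep warning quoted above.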
     

01 Apr, 2020

2 commits

  • [ Upstream commit 17413852804d7e86e6f0576cca32c1541817800e ]

    qeth_init_qdio_queues() fills the RX ring with an initial set of
    RX buffers. If qeth_init_input_buffer() fails to back one of the RX
    buffers with memory, we need to bail out and report the error.

    Fixes: 4a71df50047f ("qeth: new qeth device driver")
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Julian Wiedmann
     
  • [ Upstream commit 240c1948491b81cfe40f84ea040a8f2a4966f101 ]

    When an OSA device in prio-queue setup is reduced to 1 TX queue due to
    HW restrictions, we reset its default_out_queue to 0.

    In the old code this was needed so that qeth_get_priority_queue() gets
    the queue selection right. But with proper multiqueue support we already
    reduced dev->real_num_tx_queues to 1, and so the stack puts all traffic
    on txq 0 without even calling .ndo_select_queue.

    Thus we can preserve the user's configuration, and apply it if the OSA
    device later re-gains support for multiple TX queues.

    Fixes: 73dc2daf110f ("s390/qeth: add TX multiqueue support for OSA devices")
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Julian Wiedmann
     

18 Mar, 2020

1 commit

  • commit 5e6bdd37c5526ef01326df5dabb93011ee89237e upstream.

    Devices are formatted in multiples of tracks.
    For an Extent Space Efficient (ESE) volume we get errors when accessing
    unformatted tracks. In this case the driver either formats the track on
    the fly for write requests or returns zero data for read requests.

    In case a request spans multiple tracks, the indication of an unformatted
    track presented for the first track is incorrectly applied to all tracks
    covered by the request. As a result, tracks containing data will be handled
    as empty, resulting in zero data being returned on read, or overwriting
    existing data with zero on write.

    Fix by determining the track that gets the NRF error.
    For write requests only format the track that is surely not formatted.
    For read requests all tracks before have returned valid data and should not
    be touched.
    All tracks after the unformatted track might be formatted or not. Those are
    returned to the block layer to build a new request.

    When using alias devices there is a chance that multiple write requests
    trigger a format of the same track which might lead to data loss. Ensure
    that a track is formatted only once by maintaining a list of currently
    processed tracks.

    Fixes: 5e2b17e712cf ("s390/dasd: Add dynamic formatting support for ESE volumes")
    Cc: stable@vger.kernel.org # 5.3+
    Signed-off-by: Stefan Haberland
    Reviewed-by: Jan Hoeppner
    Reviewed-by: Peter Oberparleiter
    Signed-off-by: Jens Axboe
    Signed-off-by: Greg Kroah-Hartman

    Stefan Haberland
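
The per-track handling can be pictured as splitting a multi-track request around the one track that reported the error. This is an illustrative sketch only: `struct nrf_split` and `split_on_nrf()` are hypothetical and do not reflect the driver's data structures.

```c
#include <assert.h>
#include <stddef.h>

struct nrf_split {
    unsigned long before;      /* tracks ahead of the failing one; left alone */
    unsigned long unformatted; /* the single track that reported the error */
    unsigned long after;       /* tracks handed back to build a new request */
};

/* Split the track range [first, last] around the track `nrf`.
 * Returns 0 on success, -1 if `nrf` lies outside the request. */
static int split_on_nrf(unsigned long first, unsigned long last,
                        unsigned long nrf, struct nrf_split *out)
{
    assert(out != NULL);

    if (first > last || nrf < first || nrf > last)
        return -1;

    out->before = nrf - first;
    out->unformatted = nrf;
    out->after = last - nrf;
    return 0;
}
```

For a request covering tracks 10 to 14 with the error on track 12, two tracks precede the failing one, track 12 is the only one treated as unformatted, and two tracks are returned for a new request.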
     

12 Mar, 2020

2 commits

  • [ Upstream commit e9091ffd6a0aaced111b5d6ead5eaab5cd7101bc ]

    As the comment says, sl->sbal holds an absolute address. qeth currently
    solves this through wild casting, while zfcp doesn't care.

    Handle this properly in the code that actually builds the SL.

    Signed-off-by: Julian Wiedmann
    Reviewed-by: Alexandra Winter
    Reviewed-by: Steffen Maier [for qdio]
    Reviewed-by: Benjamin Block
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Sasha Levin

    Julian Wiedmann
     
  • [ Upstream commit 8b101a5e14f2161869636ff9cb4907b7749dc0c2 ]

    If the seq_file .next function does not change the position index,
    a read after some lseek can generate unexpected output.

    Link: https://bugzilla.kernel.org/show_bug.cgi?id=206283
    Link: https://lore.kernel.org/r/d44c53a7-9bc1-15c7-6d4a-0c10cb9dffce@virtuozzo.com
    Reviewed-by: Cornelia Huck
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Vasily Averin
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Sasha Levin

    Vasily Averin
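
The contract can be modelled in userspace with a tiny iterator: the "next" step advances the position even when it reaches the end. This sketch only imitates the shape of the seq_file callbacks; `items_start()`, `items_next()` and `count_from()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_ITEMS 3

static const int items[NUM_ITEMS] = { 10, 20, 30 };

/* Return the element at `pos`, or NULL if `pos` is out of range. */
static const int *items_start(long pos)
{
    return (pos >= 0 && pos < NUM_ITEMS) ? &items[pos] : NULL;
}

/* Advance *pos unconditionally, then return the element there,
 * or NULL once the end has been reached. */
static const int *items_next(long *pos)
{
    assert(pos != NULL);

    ++*pos; /* the position moves even when NULL is returned */
    return items_start(*pos);
}

/* Count the elements visited when iterating from `pos`. */
static int count_from(long pos)
{
    int n = 0;

    for (const int *it = items_start(pos); it; it = items_next(&pos))
        n++;
    return n;
}
```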
     

05 Mar, 2020

2 commits

  • commit 6f3846f0955308b6d1b219419da42b8de2c08845 upstream.

    When getting or setting VNICC parameters, the error code EOPNOTSUPP
    should have precedence over EBUSY.

    EBUSY is used because vnicc feature and bridgeport feature are mutually
    exclusive, which is a temporary condition.
    Whereas EOPNOTSUPP indicates that the HW does not support all or parts of
    the vnicc feature.
    This issue causes the vnicc sysfs params to show 'blocked by bridgeport'
    for HW that does not support VNICC at all.

    Fixes: caa1f0b10d18 ("s390/qeth: add VNICC enable/disable support")
    Signed-off-by: Alexandra Winter
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Alexandra Winter
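
The precedence rule is a matter of check order: test the permanent condition before the temporary one. A minimal sketch with a hypothetical `struct card`:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

struct card {
    bool vnicc_supported;   /* permanent property of the hardware */
    bool bridgeport_active; /* temporary, mutually exclusive feature */
};

/* Returns 0 if VNICC parameters may be accessed, else a negative errno. */
static int vnicc_check(const struct card *c)
{
    assert(c != NULL);

    if (!c->vnicc_supported)
        return -EOPNOTSUPP; /* checked first: can never succeed on this HW */
    if (c->bridgeport_active)
        return -EBUSY;      /* may succeed later, once bridgeport is off */
    return 0;
}
```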
     
  • [ Upstream commit fcd98d4002539f1e381916fc1b6648938c1eac76 ]

    The internal statistic counters for the total number of
    requests processed per card and per queue used integers. So they do
    wrap after a rather huge amount of crypto requests processed. This
    patch introduces uint64 counters which should hold much longer but
    still may wrap. The sysfs attributes request_count for card and queue
    also used only %ld and now display the counter value with %llu.

    This is not a security relevant fix. The int overflow which happened
    is not in any way exploitable as a security breach.

    Signed-off-by: Harald Freudenberger
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Sasha Levin

    Harald Freudenberger
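
A small illustration of why the wider counter helps. This sketch uses unsigned 32-bit arithmetic to show the wrap without relying on signed overflow, which is an assumption made for portability and not how the driver declared its counters; `PRIu64` is the userspace counterpart of printing with %llu.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* A 32-bit counter wraps to 0 after UINT32_MAX. */
static uint32_t bump32(uint32_t count)
{
    return count + 1;
}

/* A 64-bit counter keeps counting far beyond that point. */
static uint64_t bump64(uint64_t count)
{
    return count + 1;
}

/* Format a 64-bit counter; returns the number of characters written. */
static int format_count(char *buf, size_t len, uint64_t count)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%" PRIu64, count);
}
```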
     

20 Feb, 2020

1 commit

  • commit aab73d278d49c718b722ff5052e16c9cddf144d4 upstream.

    The pkey ioctl call PKEY_SEC2PROTK updates a struct pkey_protkey
    on return. The protected key and the protected key type are
    stored in it, but the len information was not updated. This patch
    fixes this so that the len field is updated to reflect
    the actual size of the protected key value returned.

    Fixes: efc598e6c8a9 ("s390/zcrypt: move cca misc functions to new code file")
    Cc: Stable
    Signed-off-by: Harald Freudenberger
    Reported-by: Christian Rund
    Suggested-by: Ingo Franzki
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Greg Kroah-Hartman

    Harald Freudenberger
     

06 Feb, 2020

1 commit

  • [ Upstream commit 0c874cd04292c7ee22d70eefc341fa2648f41f46 ]

    This patch moves the reset invocation of an ap device when
    freshly detected from the ap bus to the probe() function of
    the driver responsible for this device.

    The virtualisation of ap devices makes it necessary to
    remove unconditional resets on freshly appearing apqn devices.
    It may be that such a device is already enabled for guest
    usage. So there may be a race condition between host ap bus
    and guest ap bus doing the reset. This patch moves the
    reset from the ap bus to the zcrypt drivers. So if there
    is no zcrypt driver bound to an ap device - for example
    the ap device is bound to the vfio device driver - the
    ap device is passed untouched to the vfio device driver.

    Signed-off-by: Harald Freudenberger
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Sasha Levin

    Harald Freudenberger
     

26 Jan, 2020

2 commits

  • [ Upstream commit f9e50b02a99c3ebbaa30690e8d5be28a5c2624eb ]

    The cio layer's intparm logic does not align itself well with how qeth
    manages cmd IOs. When an active IO gets terminated via halt/clear, the
    corresponding IRQ's intparm does not reflect the cmd buffer but rather
    the intparm that was passed to ccw_device_halt() / ccw_device_clear().
    This behaviour was recently clarified in
    commit b91d9e67e50b ("s390/cio: fix intparm documentation").

    As a result, qeth_irq() currently doesn't cancel a cmd that was
    terminated via halt/clear. This primarily causes us to leak
    card->read_cmd after the qeth device is removed, since our IO path still
    holds a refcount for this cmd.

    For qeth this means that we need to keep track of which IO is pending on
    a device ('active_cmd'), and use this as the intparm when calling
    halt/clear. Otherwise qeth_irq() can't match the subsequent IRQ to its
    cmd buffer.
    Since we now keep track of the _expected_ intparm, we can also detect
    any mismatch; this would constitute a bug somewhere in the lower layers.
    In this case cancel the active cmd - we effectively "lost" the IRQ and
    should not expect any further notification for this IO.

    Fixes: 405548959cc7 ("s390/qeth: add support for dynamically allocated cmds")
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Julian Wiedmann
     
  • commit f9cac4fd8878929c6ebff0bd272317905d77c38a upstream.

    Fixes: f2bbc96e7cfad ("s390/pkey: add CCA AES cipher key support")
    Reported-by: Markus Elfring
    Reported-by: Christian Borntraeger
    Signed-off-by: Heiko Carstens
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Greg Kroah-Hartman

    Heiko Carstens
     

23 Jan, 2020

1 commit

  • commit 94dd3bada53ee77b80d0aeee5571eeb83654d156 upstream.

    Regression tests showed that the CCA cipher key function which
    generates a CCA cipher key with a given clear key value does not work
    correctly. When parsing the reply CPRB, two limits are calculated
    wrongly, resulting in rejecting the reply as invalid with s390dbf message
    "_ip_cprb_helper reply with invalid or unknown key block".

    Fixes: f2bbc96e7cfa ("s390/pkey: add CCA AES cipher key support")
    Cc: Stable
    Signed-off-by: Harald Freudenberger
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Greg Kroah-Hartman

    Harald Freudenberger
     

18 Jan, 2020

6 commits

  • [ Upstream commit 5b6c7b55cfe26224b0f41b1c226d3534c542787f ]

    qeth_l3_dev_hsuid_store() initially checks the card state, but doesn't
    take the conf_mutex to ensure that the card stays in this state while
    being reconfigured.

    Rework the code to take this lock, and drop a redundant state check in a
    helper function.

    Fixes: b333293058aa ("qeth: add support for af_iucv HiperSockets transport")
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Julian Wiedmann
     
  • commit 0b698c838e84149b690c7e979f78cccb6f8aa4b9 upstream.

    I stumbled over an old OSA model that claims to support DIAG_ASSIST,
    but then rejects the cmd to query its DIAG capabilities.

    In the old code this was ok, as the returned raw error code was > 0.
    Now that we translate the raw codes to errnos, the "rc < 0" causes us
    to fail the initialization of the device.

    The fix is trivial: don't bail out when the DIAG query fails. Such an
    error is not critical, we can still use the device (with a slightly
    reduced set of features).

    Fixes: 742d4d40831d ("s390/qeth: convert remaining legacy cmd callbacks")
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Julian Wiedmann
     
  • commit d1b9ae1864fc3c000e0eb4af8482d78c63e0915a upstream.

    During vnicc_init wanted_char should be compared to cur_char and not
    to QETH_VNICC_DEFAULT. Without this patch there is no way to enforce
    the default values as desired values.

    Note that it is expected that a card comes online with default values.
    This patch was tested with private card firmware.

    Fixes: caa1f0b10d18 ("s390/qeth: add VNICC enable/disable support")
    Signed-off-by: Alexandra Winter
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Alexandra Winter
     
  • commit e8a66d800471e2df7f0b484e2e46898b21d1fa82 upstream.

    Symptom: After vnicc/rx_bcast has been manually set to 0,
    bridge_* sysfs parameters can still be set or written.
    Only occurs on HiperSockets, as OSA doesn't support changing rx_bcast.

    Vnic characteristics and bridgeport settings are mutually exclusive.
    rx_bcast defaults to 1, so manually setting it to 0 should disable
    bridge_* parameters.

    Instead it makes sense here to check the supported mask. If the card
    does not support vnicc at all, bridge commands are always allowed.

    Fixes: caa1f0b10d18 ("s390/qeth: add VNICC enable/disable support")
    Signed-off-by: Alexandra Winter
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Alexandra Winter
     
  • commit 68c57bfd52836e31bff33e5e1fc64029749d2c35 upstream.

    Symptom: Error message "Configuring the VNIC characteristics failed"
    in dmesg whenever an OSA interface on z15 is set online.

    The VNIC characteristics get re-programmed when setting a L2 device
    online. This follows the selected 'wanted' characteristics - with the
    exception that the INVISIBLE characteristic unconditionally gets
    switched off.

    For devices that don't support INVISIBLE (ie. OSA), the resulting
    IO failure raises a noisy error message
    ("Configuring the VNIC characteristics failed").
    For IQD, INVISIBLE is off by default anyways.

    So don't unnecessarily special-case the INVISIBLE characteristic, and
    thereby suppress the misleading error message on OSA devices.

    Fixes: caa1f0b10d18 ("s390/qeth: add VNICC enable/disable support")
    Signed-off-by: Alexandra Winter
    Reviewed-by: Julian Wiedmann
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Alexandra Winter
     
  • commit 8b5026bc16938920e4780b9094c3bf20e1e0939d upstream.

    qeth_l?_set_online() goes through a number of initialization steps, and
    on any error uses qeth_l?_stop_card() to tear down the residual state.

    The first initialization step is qeth_core_hardsetup_card(). When this
    fails after having established a QDIO context on the device
    (ie. somewhere after qeth_mpc_initialize()), qeth_l?_stop_card() doesn't
    shut down this QDIO context again (since the card state hasn't
    progressed from DOWN at this stage).

    Even worse, we then call qdio_free() as final teardown step to free the
    QDIO data structures - while some of them are still hooked into wider
    QDIO infrastructure such as the IRQ list. This is inevitably followed by
    use-after-frees and other nastiness.

    Fix this by unconditionally calling qeth_qdio_clear_card() to shut down
    the QDIO context, and also to halt/clear any pending activity on the
    various IO channels.
    Remove the naive attempt at handling the teardown in
    qeth_mpc_initialize(); it clearly doesn't suffice, and we're handling it
    properly now in the wider teardown code.

    Fixes: 4a71df50047f ("qeth: new qeth device driver")
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller
    Signed-off-by: Greg Kroah-Hartman

    Julian Wiedmann
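
The teardown problem described above can be modelled in a few lines. This is a minimal sketch, not the qeth code: the `toy_` types and the `always_clear_qdio` switch are hypothetical and exist only to contrast the old, state-dependent teardown with the unconditional one.

```c
#include <stdbool.h>

/* Hypothetical, simplified card states; the real driver has more. */
enum toy_state { TOY_DOWN, TOY_UP };

struct toy_qeth {
    enum toy_state state;
    bool qdio_established;  /* QDIO context still hooked into shared lists */
};

/*
 * Toy teardown. With always_clear_qdio == false it mimics the old logic,
 * which only shut down QDIO once the card state had left DOWN.
 */
static void toy_stop_card(struct toy_qeth *card, bool always_clear_qdio)
{
    if (always_clear_qdio || card->state != TOY_DOWN)
        card->qdio_established = false;  /* shut down the QDIO context */
    card->state = TOY_DOWN;
}

/* Freeing the QDIO data is only safe once the context has been shut down. */
static bool toy_safe_to_free(const struct toy_qeth *card)
{
    return !card->qdio_established;
}
```

A card that failed during hard setup is still in the DOWN state but already owns a QDIO context, which is exactly the case the state-dependent check misses.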
     

12 Jan, 2020

5 commits

  • [ Upstream commit 39bdbf3e648d801596498a5a625fbc9fc1c0002f ]

    ENOTSUPP is not uapi, use EOPNOTSUPP instead.

    Fixes: d66cb37e9664 ("qeth: Add new priority queueing options")
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Julian Wiedmann
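
For context, ENOTSUPP (524) is defined only in the kernel-internal include/linux/errno.h, so user space has no name or message for it, whereas EOPNOTSUPP comes from the exported errno headers. The sketch below is illustrative only: the helper `to_uapi_errno()` is hypothetical, and the actual patch simply returns a different code rather than translating one.

```c
#include <errno.h>

/* Kernel-internal value of ENOTSUPP; not visible through user space headers. */
#define KERNEL_ENOTSUPP 524

/*
 * Hypothetical helper: replace the kernel-internal code with the one that
 * user space can interpret. All other codes pass through unchanged.
 */
static int to_uapi_errno(int err)
{
    return err == KERNEL_ENOTSUPP ? EOPNOTSUPP : err;
}
```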
     
  • [ Upstream commit 0f399305cd31e5c813086eaa264f7f47e205c10e ]

    When managing the promiscuous mode during an RX modeset, qeth caches the
    current HW state to avoid repeated programming of the same state on each
    modeset.

    But while tearing down a device, we forget to clear the cached state. So
    when the device is later set online again, the initial RX modeset
    doesn't program the promiscuous mode since we believe it is already
    enabled.
    Fix this by clearing the cached state in the tear-down path.

    Note that for the SBP variant of promiscuous mode, this accidentally
    works right now because we unconditionally restore the SBP role while
    re-initializing.

    Fixes: 4a71df50047f ("qeth: new qeth device driver")
    Signed-off-by: Julian Wiedmann
    Reviewed-by: Alexandra Winter
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Julian Wiedmann
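
The stale-cache effect described above can be reproduced with a toy model. This is a sketch under stated assumptions, not the driver code: the struct, its fields and the `clear_cache` switch are hypothetical, and the "hardware" is just a boolean.

```c
#include <stdbool.h>

struct toy_card {
    bool promisc_cached;     /* what the driver believes the HW state is */
    bool promisc_hw;         /* what the simulated HW actually has */
    unsigned int hw_writes;  /* number of times the HW was programmed */
};

/* Program promiscuous mode, skipping the HW access if the cache matches. */
static void toy_set_promisc(struct toy_card *card, bool enable)
{
    if (card->promisc_cached == enable)
        return;
    card->promisc_hw = enable;
    card->hw_writes++;
    card->promisc_cached = enable;
}

/*
 * Tear down the device. The simulated HW always loses its state; the cached
 * copy is only reset when clear_cache is set (the fixed behaviour).
 */
static void toy_teardown(struct toy_card *card, bool clear_cache)
{
    card->promisc_hw = false;
    if (clear_cache)
        card->promisc_cached = false;
}
```

Without the reset, enabling promiscuous mode after a teardown is skipped as redundant and the hardware stays in non-promiscuous mode.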
     
  • [ Upstream commit 2e3d7fa5d29b7ab649fdf8f9533ae0c0888a7fac ]

    Along with z/VM NICs, there are additional device types that only support
    a specific transport mode (e.g. external-bridged IQD).
    Identify the corresponding error code, and raise a fitting error message
    so that the user knows to adjust their device configuration.

    On top of that also fix the subsequent error path, so that the rejected
    cmd doesn't need to wait for a timeout but gets cancelled straight away.

    Fixes: 4a71df50047f ("qeth: new qeth device driver")
    Signed-off-by: Julian Wiedmann
    Signed-off-by: David S. Miller
    Signed-off-by: Sasha Levin

    Julian Wiedmann
     
  • [ Upstream commit 00b39f698a4f1ee897227cace2e3937fc4412270 ]

    If for whatever reason the dasd_eckd_check_characteristics() function
    exits after at least some paths have their configuration data
    allocated, that data is never freed again. In the error case the
    device->private pointer is set to NULL and dasd_eckd_uncheck_device()
    will exit without freeing the path data because of this NULL pointer.

    Fix by calling dasd_eckd_clear_conf_data() for error cases.

    Also use dasd_eckd_clear_conf_data() in dasd_eckd_uncheck_device()
    to avoid code duplication.

    Reported-by: Qian Cai
    Reviewed-by: Jan Hoeppner
    Signed-off-by: Stefan Haberland
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin

    Stefan Haberland
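
The shape of this fix, releasing per-path data on every error exit through one shared helper, can be sketched as follows. All `toy_` names are hypothetical stand-ins, and `fail_at` is an artificial hook to force an error partway through; the real functions operate on DASD device structures.

```c
#include <stdlib.h>

#define TOY_MAX_PATHS 8

struct toy_device {
    void *conf_data[TOY_MAX_PATHS];  /* per-path configuration data */
};

/* Shared cleanup helper, usable from both the error path and device removal. */
static void toy_clear_conf_data(struct toy_device *dev)
{
    for (int i = 0; i < TOY_MAX_PATHS; i++) {
        free(dev->conf_data[i]);
        dev->conf_data[i] = NULL;
    }
}

/* Count the paths that currently hold allocated configuration data. */
static int toy_paths_allocated(const struct toy_device *dev)
{
    int n = 0;

    for (int i = 0; i < TOY_MAX_PATHS; i++)
        if (dev->conf_data[i])
            n++;
    return n;
}

/* Allocate data for each path; fail_at (or a failed malloc) aborts early. */
static int toy_check_characteristics(struct toy_device *dev, int fail_at)
{
    for (int i = 0; i < TOY_MAX_PATHS; i++) {
        if (i == fail_at)
            goto out_err;
        dev->conf_data[i] = malloc(16);
        if (!dev->conf_data[i])
            goto out_err;
    }
    return 0;

out_err:
    /* Without this call, the paths allocated so far would leak. */
    toy_clear_conf_data(dev);
    return -1;
}
```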
     
  • [ Upstream commit dd4b3c83b9efac10d48a94c61372119fc555a077 ]

    The max data count (mdc) is an unsigned 16-bit integer value as per AR
    documentation and is received via ccw_device_get_mdc() for a specific
    path mask from the CIO layer. The function itself also always returns a
    positive mdc value or 0 in case mdc isn't supported or couldn't be
    determined.

    However, the comment for this function describes a negative return value
    to indicate failures.

    As a result, the DASD device driver interprets the return value of
    ccw_device_get_mdc() incorrectly. The error case is essentially a dead
    code path.

    To fix this behaviour, check explicitly for a return value of 0 and
    change the comment for ccw_device_get_mdc() accordingly.

    This fix merely enables the error code path in the DASD functions
    get_fcx_max_data() and verify_fcx_max_data(). The actual functionality
    stays the same and is still correct.

    Reviewed-by: Cornelia Huck
    Signed-off-by: Jan Höppner
    Acked-by: Peter Oberparleiter
    Reviewed-by: Stefan Haberland
    Signed-off-by: Stefan Haberland
    Signed-off-by: Jens Axboe
    Signed-off-by: Sasha Levin

    Jan Höppner
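
Why the old check was dead code can be seen in a small model. This is a sketch only: `toy_get_mdc()` is a hypothetical stand-in for ccw_device_get_mdc() that follows the contract described above (a positive value, or 0 when mdc is unsupported or undetermined), and the two predicates stand for the old and new tests in the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the CIO call: never returns a negative value. */
static int toy_get_mdc(bool supported, uint16_t value)
{
    return supported ? (int)value : 0;
}

/* Old check: can never be true, because the value comes from a u16 or is 0. */
static bool toy_mdc_failed_old(int mdc)
{
    return mdc < 0;
}

/* New check: matches what the function actually returns on failure. */
static bool toy_mdc_failed_new(int mdc)
{
    return mdc == 0;
}
```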
     

05 Jan, 2020

1 commit

  • [ Upstream commit 6733775a92eacd612ac88afa0fd922e4ffeb2bc7 ]

    This patch introduces support for a new architected reply
    code 0x8B indicating that a hypervisor layer (if any) has
    rejected an AP message.

    Linux may run as a guest on top of a hypervisor like z/VM
    or KVM. So the crypto hardware seen by the AP bus may be
    restricted by the hypervisor; for example, only a subset such
    as clear key crypto requests may be supported. Other
    requests will be filtered out, i.e. rejected by the hypervisor.
    The new reply code 0x8B will appear in such cases and needs
    to get recognized by the AP bus and zcrypt device driver zoo.

    Signed-off-by: Harald Freudenberger
    Signed-off-by: Vasily Gorbik
    Signed-off-by: Sasha Levin

    Harald Freudenberger
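
Recognizing the new code amounts to one more case when classifying a reply. The sketch below is hypothetical throughout: only the value 0x8B comes from the commit message, while the macro names, the verdict enum and the assumption that 0x00 means success are placeholders. How the real driver maps the code to an errno is not shown here.

```c
/* Hypothetical classification of an AP reply code. */
enum toy_verdict {
    TOY_ACCEPTED,
    TOY_REJECTED_BY_HYPERVISOR,
    TOY_UNRECOGNIZED
};

#define TOY_REPLY_OK        0x00  /* assumed success value, for illustration */
#define TOY_REPLY_FILTERED  0x8B  /* reply code named in the commit message */

static enum toy_verdict toy_classify_reply(unsigned char code)
{
    switch (code) {
    case TOY_REPLY_OK:
        return TOY_ACCEPTED;
    case TOY_REPLY_FILTERED:
        /* The hypervisor filtered the request; retrying will not help. */
        return TOY_REJECTED_BY_HYPERVISOR;
    default:
        return TOY_UNRECOGNIZED;
    }
}
```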
     

18 Dec, 2019

1 commit

  • commit 100843f176109af94600e500da0428e21030ca7f upstream.

    While v2.6.26 commit b75db73159cc ("[SCSI] zfcp: Add qtcb dump to hba debug
    trace") is right that we don't want to flood the (payload) trace ring
    buffer, we don't trace successful FCP command responses by default. So we
    can include the channel log for problem determination with failed responses
    of any FSF request type.

    Fixes: b75db73159cc ("[SCSI] zfcp: Add qtcb dump to hba debug trace")
    Fixes: a54ca0f62f95 ("[SCSI] zfcp: Redesign of the debug tracing for HBA records.")
    Cc: #2.6.38+
    Link: https://lore.kernel.org/r/e37597b5c4ae123aaa85fd86c23a9f71e994e4a9.1572018132.git.bblock@linux.ibm.com
    Reviewed-by: Benjamin Block
    Signed-off-by: Steffen Maier
    Signed-off-by: Benjamin Block
    Signed-off-by: Martin K. Petersen
    Signed-off-by: Greg Kroah-Hartman

    Steffen Maier