06 Apr, 2020

2 commits

  • Polling drivers in a configuration with 1 Input Queue currently keep
    their DSCI armed all the way through the poll cycle, until
    qdio_start_irq() clears it.

    _Any_ intermittent QDIO interrupt delivered to tiqdio_thinint_handler()
    will thus cause
    1) the 'adapter_int' statistic to be incremented,
    2) a call to tiqdio_call_inq_handlers() for this device, and then
    3) the 'int_discarded' statistic to be incremented.

    This causes overhead & complexity in the IRQ path, along with ambiguity
    in the statistics.
    On the other hand the device should be in IRQ avoidance mode during a
    poll cycle, so there won't be a lot of DSCI ping-pong that this
    micro-optimization could prevent.

    So align the DSCI handling with what we already do for devices with
    multiple Input Queues: clear it right away while processing the IRQ.

    For the non-polling path this means that we no longer need to handle
    the 1-queue case separately.

    Signed-off-by: Julian Wiedmann
    Reviewed-by: Benjamin Block
    Signed-off-by: Vasily Gorbik

    Julian Wiedmann
     
  • This is just prep work for a subsequent patch, no functional change.

    For the non-polling path we can pull the code chunk in front of the
    for-loop, since it only evaluates to true for a 1-queue configuration.

    Signed-off-by: Julian Wiedmann
    Reviewed-by: Benjamin Block
    Signed-off-by: Vasily Gorbik

    Julian Wiedmann
     

05 Apr, 2020

1 commit

  • Pull s390 updates from Vasily Gorbik:

    - Update maintainers. Niklas Schnelle takes over zpci and Vineeth
    Vijayan common io code.

    - Extend cpuinfo to include topology information.

    - Add new extended counters for IBM z15 and sampling buffer allocation
    rework in perf code.

    - Add control over zeroing out memory during system restart.

    - CCA protected key block version 2 support and other
    fixes/improvements in crypto code.

    - Convert to new fallthrough; annotations.

    - Replace zero-length arrays with flexible-arrays.

    - QDIO debugfs and other small improvements.

    - Drop 2-level paging support optimization for compat tasks. Various mm
    cleanups.

    - Remove broken and unused hibernate / power management support.

    - Remove fake numa support which does not bring any benefits.

    - Exclude offline CPUs from CPU topology masks to be more consistent
    with other architectures.

    - Prevent last branching instruction address leaking to userspace.

    - Other small various fixes and improvements all over the code.

    * tag 's390-5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (57 commits)
    s390/mm: cleanup init_new_context() callback
    s390/mm: cleanup virtual memory constants usage
    s390/mm: remove page table downgrade support
    s390/qdio: set qdio_irq->cdev at allocation time
    s390/qdio: remove unused function declarations
    s390/ccwgroup: remove pm support
    s390/ap: remove power management code from ap bus and drivers
    s390/zcrypt: use kvmalloc instead of kmalloc for 256k alloc
    s390/mm: cleanup arch_get_unmapped_area() and friends
    s390/ism: remove pm support
    s390/cio: use fallthrough;
    s390/vfio: use fallthrough;
    s390/zcrypt: use fallthrough;
    s390: use fallthrough;
    s390/cpum_sf: Fix wrong page count in error message
    s390/diag: fix display of diagnose call statistics
    s390/ap: Remove ap device suspend and resume callbacks
    s390/pci: Improve handling of unset UID
    s390/pci: Fix zpci_alloc_domain() over allocation
    s390/qdio: pass ISC as parameter to chsc_sadc()
    ...

    Linus Torvalds
     

26 Mar, 2020

1 commit

  • When the support for polling drivers was initially added, it only
    considered Input Queue 0. But as QDIO interrupts are actually for the
    full device and not a single queue, this doesn't really fit for
    configurations where multiple Input Queues are used.

    Rework the qdio code so that interrupts for a polling driver are not
    split up into actions for each queue. Instead deliver the interrupt as
    a single event, and let the driver decide which queue needs what action.

    When re-enabling the QDIO interrupt via qdio_start_irq(), this means
    that the qdio code needs to
    (1) put _all_ eligible queues back into a state where they raise IRQs,
    (2) and afterwards check _all_ eligible queues for new work to bridge
    the race window.
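
    The two steps above can be sketched in plain userspace C. This is only
    an illustration under stated assumptions: struct poll_dev_sketch,
    start_irq_sketch() and the return convention (0 = IRQs enabled, 1 = work
    found, keep polling) are hypothetical names chosen for this example, not
    the qdio code itself.

```c
#include <stdbool.h>

#define NR_INPUT_QS 2

/* Hypothetical model of a polled device: one IRQ switch, several queues. */
struct poll_dev_sketch {
	bool irq_enabled;
	int pending[NR_INPUT_QS];	/* buffers waiting on each Input Queue */
};

/*
 * Re-enable the interrupt, then re-check every queue. Work that arrived
 * while the interrupt was still off would otherwise never be signalled.
 * Returns 0 if interrupts are on and nothing is pending, 1 if the caller
 * should keep polling (interrupts are switched off again in that case).
 */
static int start_irq_sketch(struct poll_dev_sketch *dev)
{
	int i;

	/* (1) put the device back into a state where it raises IRQs */
	dev->irq_enabled = true;

	/* (2) afterwards check all queues to bridge the race window */
	for (i = 0; i < NR_INPUT_QS; i++) {
		if (dev->pending[i] > 0) {
			dev->irq_enabled = false;
			return 1;
		}
	}
	return 0;
}
```

    A poll routine built on this sketch would re-schedule itself whenever
    start_irq_sketch() returns 1, and go idle otherwise.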

    On the qeth side of things (as the only qdio polling driver), we can now
    add CQ polling support to the main NAPI poll routine. It doesn't consume
    NAPI budget, and to avoid hogging the CPU we yield control after
    completing one full queue worth of buffers.
    The subsequent qdio_start_irq() will check for any additional work, and
    have us re-schedule the NAPI instance accordingly.

    Signed-off-by: Julian Wiedmann
    Acked-by: Heiko Carstens
    Signed-off-by: David S. Miller

    Julian Wiedmann
     

23 Mar, 2020

1 commit


01 Nov, 2019

3 commits

  • On an interrupt, tiqdio_thinint_handler() walks a list of all objects
    that might require attention, and checks their DSCI. This list is
    awkwardly built from Input Queues, even though the IRQs are per-device
    and the queue is then only used to dereference its qdio_irq parent.

    To simplify the logic, change the code so that tiq_list contains
    qdio_irq entries.

    Signed-off-by: Julian Wiedmann
    Reviewed-by: Benjamin Block
    Signed-off-by: Vasily Gorbik

    Julian Wiedmann
     
  • qperf_inc() takes a queue as input, but actually updates the statistics
    in its qdio_irq parent.
    In some contexts we already have access to the qdio_irq struct, and can
    avoid the additional dereference.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: Vasily Gorbik

    Julian Wiedmann
     
  • Shift the definition of tiqdio_airq around, so that it doesn't require a
    forward declaration for tiqdio_thinint_handler().

    Signed-off-by: Julian Wiedmann
    Reviewed-by: Benjamin Block
    Signed-off-by: Vasily Gorbik

    Julian Wiedmann
     

02 Jul, 2019

2 commits

  • Current code sets the dsci to 0x00000080, which doesn't make any sense
    as the indicator area is located in the _left-most_ byte.

    Worse: if the dsci is the _shared_ indicator, this potentially clears
    the indication of activity for a _different_ device.
    tiqdio_thinint_handler() will then have no reason to call that device's
    IRQ handler, and the device ends up stalling.
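
    To make the byte position concrete, here is a small userspace sketch.
    It assumes a 4-byte indicator word stored in big-endian order (as on
    s390) and a simplified rule that activity is indicated when the
    left-most byte is non-zero; struct dsci_sketch and both helpers are
    hypothetical names for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: byte[0] is the left-most byte of the indicator word. */
struct dsci_sketch {
	uint8_t byte[4];
};

/* Store a 32-bit value the way a big-endian machine lays it out in memory. */
static void dsci_store(struct dsci_sketch *d, uint32_t value)
{
	d->byte[0] = (uint8_t)(value >> 24);
	d->byte[1] = (uint8_t)(value >> 16);
	d->byte[2] = (uint8_t)(value >> 8);
	d->byte[3] = (uint8_t)value;
}

/* Simplified assumption: only the left-most byte carries the indication. */
static bool dsci_indicates_activity(const struct dsci_sketch *d)
{
	return d->byte[0] != 0;
}
```

    Storing 0x00000080 into a word whose left-most byte was already set by
    another device leaves 0x80 in the right-most byte and zeroes the
    left-most one, so the pending indication is lost.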

    Fixes: d0c9d4a89fff ("[S390] qdio: set correct bit in dsci")
    Cc:
    Signed-off-by: Julian Wiedmann
    Signed-off-by: Vasily Gorbik

    Julian Wiedmann
     
  • When tiqdio_remove_input_queues() removes a queue from the tiq_list as
    part of qdio_shutdown(), it doesn't re-initialize the queue's list entry
    and the prev/next pointers go stale.

    If a subsequent qdio_establish() fails while sending the ESTABLISH cmd,
    it calls qdio_shutdown() again in QDIO_IRQ_STATE_ERR state and
    tiqdio_remove_input_queues() will attempt to remove the queue entry a
    second time. This dereferences the stale pointers, and bad things ensue.
    Fix this by re-initializing the list entry after removing it from the
    list.

    For good practice also initialize the list entry when the queue is first
    allocated, and remove the quirky checks that papered over this omission.
    Note that prior to
    commit e521813468f7 ("s390/qdio: fix access to uninitialized qdio_q fields"),
    these checks were bogus anyway.

    setup_queues_misc() clears the whole queue struct, and thus needs to
    re-init the prev/next pointers as well.
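
    Why re-initializing the entry makes a second removal harmless can be
    shown with a minimal userspace list. This is a sketch, not the kernel's
    list implementation: struct list_node and the helpers below are
    hypothetical, and the real fix may order its steps differently.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical circular doubly-linked list node. */
struct list_node {
	struct list_node *prev, *next;
};

/* An initialized node points at itself. */
static void list_node_init(struct list_node *n)
{
	n->prev = n;
	n->next = n;
}

static void list_node_add_tail(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

/*
 * Unlink the node and point it back at itself. A repeated call then only
 * rewrites the node's own pointers instead of following stale ones.
 */
static void list_node_del_init(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_node_init(n);
}

static bool list_node_unlinked(const struct list_node *n)
{
	return n->next == n && n->prev == n;
}
```

    Without the list_node_init() call in the removal helper, a second
    removal would write through the old neighbours' addresses and could
    corrupt whatever the list looks like by then.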

    Fixes: 779e6e1c724d ("[S390] qdio: new qdio driver.")
    Cc:
    Signed-off-by: Julian Wiedmann
    Signed-off-by: Vasily Gorbik

    Julian Wiedmann
     

07 Jun, 2019

1 commit


29 Apr, 2019

1 commit


13 Jun, 2018

1 commit

  • The kzalloc() function has a 2-factor argument form, kcalloc(). This
    patch replaces cases of:

    kzalloc(a * b, gfp)

    with:

    kcalloc(a, b, gfp)

    as well as handling cases of:

    kzalloc(a * b * c, gfp)

    with:

    kzalloc(array3_size(a, b, c), gfp)

    as it's slightly less ugly than:

    kcalloc(array_size(a, b), c, gfp)

    This does, however, attempt to ignore constant size factors like:

    kzalloc(4 * 1024, gfp)

    though any constants defined via macros get caught up in the conversion.

    Any factors with a sizeof() of "unsigned char", "char", and "u8" were
    dropped, since they're redundant.
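
    The motivation for the 2-factor form is that the multiplication can be
    checked for overflow before anything is allocated. A userspace sketch
    of that idea, assuming hypothetical helpers checked_calloc() and
    checked_array3_size() that stand in for the kernel interfaces named
    above:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical stand-in for a 2-factor zeroing allocator: refuse the
 * request if n * size would wrap around instead of allocating too little.
 */
static void *checked_calloc(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;
	return calloc(n, size);
}

/*
 * Hypothetical 3-factor size helper: saturate to SIZE_MAX on overflow, so
 * that a following allocation is bound to fail rather than come up short.
 */
static size_t checked_array3_size(size_t a, size_t b, size_t c)
{
	size_t ab;

	if (b != 0 && a > SIZE_MAX / b)
		return SIZE_MAX;
	ab = a * b;
	if (c != 0 && ab > SIZE_MAX / c)
		return SIZE_MAX;
	return ab * c;
}
```

    An open-coded a * b passed as a single size argument gives the
    allocator no chance to notice that the product wrapped.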

    The Coccinelle script used for this was:

    // Fix redundant parens around sizeof().
    @@
    type TYPE;
    expression THING, E;
    @@

    (
    kzalloc(
    - (sizeof(TYPE)) * E
    + sizeof(TYPE) * E
    , ...)
    |
    kzalloc(
    - (sizeof(THING)) * E
    + sizeof(THING) * E
    , ...)
    )

    // Drop single-byte sizes and redundant parens.
    @@
    expression COUNT;
    typedef u8;
    typedef __u8;
    @@

    (
    kzalloc(
    - sizeof(u8) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(__u8) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(char) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(unsigned char) * (COUNT)
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(u8) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(__u8) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(char) * COUNT
    + COUNT
    , ...)
    |
    kzalloc(
    - sizeof(unsigned char) * COUNT
    + COUNT
    , ...)
    )

    // 2-factor product with sizeof(type/expression) and identifier or constant.
    @@
    type TYPE;
    expression THING;
    identifier COUNT_ID;
    constant COUNT_CONST;
    @@

    (
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (COUNT_ID)
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * COUNT_ID
    + COUNT_ID, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (COUNT_CONST)
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * COUNT_CONST
    + COUNT_CONST, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (COUNT_ID)
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * COUNT_ID
    + COUNT_ID, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (COUNT_CONST)
    + COUNT_CONST, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * COUNT_CONST
    + COUNT_CONST, sizeof(THING)
    , ...)
    )

    // 2-factor product, only identifiers.
    @@
    identifier SIZE, COUNT;
    @@

    - kzalloc
    + kcalloc
    (
    - SIZE * COUNT
    + COUNT, SIZE
    , ...)

    // 3-factor product with 1 sizeof(type) or sizeof(expression), with
    // redundant parens removed.
    @@
    expression THING;
    identifier STRIDE, COUNT;
    type TYPE;
    @@

    (
    kzalloc(
    - sizeof(TYPE) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(TYPE))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * (COUNT) * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * (COUNT) * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * COUNT * (STRIDE)
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    |
    kzalloc(
    - sizeof(THING) * COUNT * STRIDE
    + array3_size(COUNT, STRIDE, sizeof(THING))
    , ...)
    )

    // 3-factor product with 2 sizeof(variable), with redundant parens removed.
    @@
    expression THING1, THING2;
    identifier COUNT;
    type TYPE1, TYPE2;
    @@

    (
    kzalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(TYPE2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
    , ...)
    |
    kzalloc(
    - sizeof(THING1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(THING1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(THING1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * COUNT
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    |
    kzalloc(
    - sizeof(TYPE1) * sizeof(THING2) * (COUNT)
    + array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
    , ...)
    )

    // 3-factor product, only identifiers, with redundant parens removed.
    @@
    identifier STRIDE, SIZE, COUNT;
    @@

    (
    kzalloc(
    - (COUNT) * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * (STRIDE) * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * STRIDE * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - (COUNT) * (STRIDE) * (SIZE)
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    |
    kzalloc(
    - COUNT * STRIDE * SIZE
    + array3_size(COUNT, STRIDE, SIZE)
    , ...)
    )

    // Any remaining multi-factor products, first at least 3-factor products,
    // when they're not all constants...
    @@
    expression E1, E2, E3;
    constant C1, C2, C3;
    @@

    (
    kzalloc(C1 * C2 * C3, ...)
    |
    kzalloc(
    - (E1) * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - (E1) * (E2) * E3
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - (E1) * (E2) * (E3)
    + array3_size(E1, E2, E3)
    , ...)
    |
    kzalloc(
    - E1 * E2 * E3
    + array3_size(E1, E2, E3)
    , ...)
    )

    // And then all remaining 2 factors products when they're not all constants,
    // keeping sizeof() as the second factor argument.
    @@
    expression THING, E1, E2;
    type TYPE;
    constant C1, C2, C3;
    @@

    (
    kzalloc(sizeof(THING) * C2, ...)
    |
    kzalloc(sizeof(TYPE) * C2, ...)
    |
    kzalloc(C1 * C2 * C3, ...)
    |
    kzalloc(C1 * C2, ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * (E2)
    + E2, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(TYPE) * E2
    + E2, sizeof(TYPE)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * (E2)
    + E2, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - sizeof(THING) * E2
    + E2, sizeof(THING)
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - (E1) * E2
    + E1, E2
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - (E1) * (E2)
    + E1, E2
    , ...)
    |
    - kzalloc
    + kcalloc
    (
    - E1 * E2
    + E1, E2
    , ...)
    )

    Signed-off-by: Kees Cook

    Kees Cook
     

14 Nov, 2017

1 commit

  • Pull s390 updates from Heiko Carstens:
    "Since Martin is on vacation you get the s390 pull request for the
    v4.15 merge window this time from me.

    Besides a lot of cleanups and bug fixes these are the most important
    changes:

    - a new regset for runtime instrumentation registers

    - hardware accelerated AES-GCM support for the aes_s390 module

    - support for the new CEX6S crypto cards

    - support for FORTIFY_SOURCE

    - addition of missing z13 and new z14 instructions to the in-kernel
    disassembler

    - generate opcode tables for the in-kernel disassembler out of a
    simple text file instead of having to manually maintain those
    tables

    - fast memset16, memset32 and memset64 implementations

    - removal of named saved segment support

    - hardware counter support for z14

    - queued spinlocks and queued rwlocks implementations for s390

    - use the stack_depth tracking feature for s390 BPF JIT

    - a new s390_sthyi system call which emulates the sthyi (store
    hypervisor information) instruction

    - removal of the old KVM virtio transport

    - an s390 specific CPU alternatives implementation which is used in
    the new spinlock code"

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (88 commits)
    MAINTAINERS: add virtio-ccw.h to virtio/s390 section
    s390/noexec: execute kexec datamover without DAT
    s390: fix transactional execution control register handling
    s390/bpf: take advantage of stack_depth tracking
    s390: simplify transactional execution elf hwcap handling
    s390/zcrypt: Rework struct ap_qact_ap_info.
    s390/virtio: remove unused header file kvm_virtio.h
    s390: avoid undefined behaviour
    s390/disassembler: generate opcode tables from text file
    s390/disassembler: remove insn_to_mnemonic()
    s390/dasd: avoid calling do_gettimeofday()
    s390: vfio-ccw: Do not attempt to free no-op, test and tic cda.
    s390: remove named saved segment support
    s390/archrandom: Reconsider s390 arch random implementation
    s390/pci: do not require AIS facility
    s390/qdio: sanitize put_indicator
    s390/qdio: use atomic_cmpxchg
    s390/nmi: avoid using long-displacement facility
    s390: pass endianness info to sparse
    s390/decompressor: remove informational messages
    ...

    Linus Torvalds
     

03 Nov, 2017

2 commits

  • qdio maintains an array of struct indicator_t. put_indicator takes a pointer
    to a member of a struct indicator_t within that array, calculates the index,
    and uses the array and the index to get the struct indicator_t.

    Simply use the pointer directly.

    Although the pointer happens to point to the first member of that struct,
    use the container_of macro.
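
    A userspace sketch of the pointer-to-container step, assuming a
    hypothetical struct indicator_sketch and a locally defined
    container_of_sketch() macro built on offsetof(); the kernel's own
    container_of() additionally type-checks its arguments.

```c
#include <stddef.h>

/* Simplified container_of: step back from a member to its enclosing struct. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical layout: the indicator word is the first member. */
struct indicator_sketch {
	unsigned int ind;	/* indicator word handed out to users */
	int count;		/* use count */
};

/*
 * Recover the enclosing struct from the member pointer directly, without
 * computing an array index first. This stays correct even if 'ind' is
 * later moved away from the start of the struct.
 */
static struct indicator_sketch *indicator_from_addr(unsigned int *addr)
{
	return container_of_sketch(addr, struct indicator_sketch, ind);
}
```

    Casting the pointer would give the same address today, but only as long
    as the member stays first in the struct.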

    Signed-off-by: Sebastian Ott
    Acked-by: Ursula Braun
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     
  • qdio uses atomic_read to find an unused indicator and atomic_set to
    flag it as used. This could lead to multiple users getting the same
    indicator. Use atomic_cmpxchg instead.
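
    The difference between "read, then set" and a compare-and-exchange can
    be sketched with C11 atomics. struct indicator_slot, claim_indicator()
    and release_indicator() are hypothetical names for this example;
    atomic_compare_exchange_strong() plays the role that atomic_cmpxchg()
    plays in the kernel.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical slot: count == 0 means the indicator is unused. */
struct indicator_slot {
	atomic_int count;
};

/*
 * Claim the first unused slot. The compare-and-exchange only succeeds for
 * the one caller that actually flips the count from 0 to 1, so two callers
 * racing for the same slot cannot both win it.
 * Returns the slot index, or -1 if every slot is in use.
 */
static int claim_indicator(struct indicator_slot *slots, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		int expected = 0;

		if (atomic_compare_exchange_strong(&slots[i].count,
						   &expected, 1))
			return (int)i;
	}
	return -1;
}

/* Hand a slot back so that a later claim can reuse it. */
static void release_indicator(struct indicator_slot *slot)
{
	atomic_store(&slot->count, 0);
}
```

    With a separate read and set, two callers could both observe 0 before
    either writes 1, and both would walk away with the same slot.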

    Signed-off-by: Sebastian Ott
    Acked-by: Ursula Braun
    Signed-off-by: Martin Schwidefsky

    Sebastian Ott
     

02 Nov, 2017

1 commit

  • Many source files in the tree are missing licensing information, which
    makes it harder for compliance tools to determine the correct license.

    By default all files without license information are under the default
    license of the kernel, which is GPL version 2.

    Update the files which contain no license information with the 'GPL-2.0'
    SPDX license identifier. The SPDX identifier is a legally binding
    shorthand, which can be used instead of the full boiler plate text.

    This patch is based on work done by Thomas Gleixner and Kate Stewart and
    Philippe Ombredanne.

    How this work was done:

    Patches were generated and checked against linux-4.14-rc6 for a subset of
    the use cases:
    - file had no licensing information in it,
    - file was a */uapi/* one with no licensing information in it,
    - file was a */uapi/* one with existing licensing information.

    Further patches will be generated in subsequent months to fix up cases
    where non-standard license headers were used, and references to license
    had to be inferred by heuristics based on keywords.

    The analysis to determine which SPDX License Identifier should be applied
    to a file was done in a spreadsheet of side-by-side results from the
    output of two independent scanners (ScanCode & Windriver) producing SPDX
    tag:value files created by Philippe Ombredanne. Philippe prepared the
    base worksheet, and did an initial spot review of a few thousand files.

    The 4.13 kernel was the starting point of the analysis with 60,537 files
    assessed. Kate Stewart did a file by file comparison of the scanner
    results in the spreadsheet to determine which SPDX license identifier(s)
    to be applied to the file. She confirmed any determination that was not
    immediately clear with lawyers working with the Linux Foundation.

    Criteria used to select files for SPDX license identifier tagging were:
    - Files considered eligible had to be source code files.
    - Make and config files were included as candidates if they contained >5
    lines of source
    - File already had some variant of a license header in it (even if
    Reviewed-by: Philippe Ombredanne
    Reviewed-by: Thomas Gleixner
    Signed-off-by: Greg Kroah-Hartman

    Greg Kroah-Hartman
     

02 Mar, 2017

1 commit


03 Feb, 2017

3 commits

  • Missed in commit f4eae94f7137
    ("s390/airq: simplify adapter interrupt code")

    Signed-off-by: Julian Wiedmann
    Reviewed-by: Steffen Maier
    Signed-off-by: Martin Schwidefsky

    Julian Wiedmann
     
  • In tiqdio_call_inq_handlers(), we're looping over all
    input queues on the *same* irq. So instead of using the
    queues' back pointer, we can just access the irq directly.

    No functional change.

    Signed-off-by: Julian Wiedmann
    Signed-off-by: Martin Schwidefsky

    Julian Wiedmann
     
  • For devices with multiple input queues, tiqdio_call_inq_handlers()
    iterates over all input queues and clears the device's DSCI
    during each iteration. If the DSCI is re-armed during one
    of the later iterations, we therefore do not scan the previous
    queues again.
    The re-arming also raises a new adapter interrupt. But its
    handler does not trigger a rescan for the device, as the DSCI
    has already been erroneously cleared.
    This can result in queue stalls on devices with multiple
    input queues.

    Fix it by clearing the DSCI just once, prior to scanning the queues.

    As the code is moved in front of the loop, we also need to access
    the DSCI directly (i.e. irq->dsci) instead of going via each queue's
    parent pointer to the same irq. This is not a functional change,
    and a follow-up patch will clean up the other users.
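
    The lost indication can be reproduced in a small deterministic model.
    Everything here is a hypothetical sketch: struct irq_sketch, the
    rearm_during_q test hook (which simulates the hardware adding work to
    queue 0 and re-arming the DSCI while a given queue is being scanned)
    and both handler variants are invented for illustration.

```c
#include <stdbool.h>

#define NR_INPUT_QS 2

struct irq_sketch {
	unsigned int dsci;		/* device state change indicator */
	int pending[NR_INPUT_QS];	/* buffers waiting per input queue */
	int rearm_during_q;		/* test hook, -1 to disable */
};

static void scan_queue(struct irq_sketch *irq, int i)
{
	irq->pending[i] = 0;
	if (irq->rearm_during_q == i) {
		/* simulated hardware: new work on queue 0, DSCI re-armed */
		irq->pending[0]++;
		irq->dsci = 1;
		irq->rearm_during_q = -1;
	}
}

/* Old behaviour: a later iteration wipes out the re-armed DSCI. */
static void call_handlers_clear_per_queue(struct irq_sketch *irq)
{
	for (int i = 0; i < NR_INPUT_QS; i++) {
		irq->dsci = 0;
		scan_queue(irq, i);
	}
}

/* Fixed behaviour: clear once, so a re-arm during the scan survives. */
static void call_handlers_clear_once(struct irq_sketch *irq)
{
	irq->dsci = 0;
	for (int i = 0; i < NR_INPUT_QS; i++)
		scan_queue(irq, i);
}

/* Work is pending, but nothing tells the next interrupt to look for it. */
static bool would_stall(const struct irq_sketch *irq)
{
	for (int i = 0; i < NR_INPUT_QS; i++)
		if (irq->pending[i] > 0 && irq->dsci == 0)
			return true;
	return false;
}
```

    With the hook set to queue 0, the per-queue variant ends with work
    pending and the DSCI cleared, while the clear-once variant leaves the
    DSCI set for the next interrupt to pick up.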

    In practice, this bug only affects CQ-enabled HiperSockets devices,
    i.e. devices with sysfs-attribute "hsuid" set. Setting a hsuid is
    needed for AF_IUCV socket applications that use HiperSockets
    communication.

    Fixes: 104ea556ee7f ("qdio: support asynchronous delivery of storage blocks")
    Cc: # v3.2+
    Reviewed-by: Ursula Braun
    Signed-off-by: Julian Wiedmann
    Signed-off-by: Martin Schwidefsky

    Julian Wiedmann