30 Dec, 2020

40 commits

  • commit 85b8350ae99d1300eb6dc072459246c2649a8e50 upstream.

    CAN0 and CAN1 instances share the same message ram configured
    at 0x210000 on sama5d2 Linux systems.
    According to current configuration of CAN0, we need 0x1c00 bytes
    so that CAN1 doesn't overlap its message ram:
    64 x RX FIFO0 elements => 64 x 72 bytes
    32 x TXE (TX Event FIFO) elements => 32 x 8 bytes
    32 x TXB (TX Buffer) elements => 32 x 72 bytes
    So a total of 7168 bytes (0x1C00).
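
    The size arithmetic above can be checked with a small sketch. The
    element counts and sizes come from the text; the enum and function
    names are illustrative only and do not appear in the driver or the DT.

```c
#include <assert.h>
#include <stddef.h>

/* Element counts and per-element sizes as listed in the commit text. */
enum {
    RXF0_ELEMS = 64, RXF0_ELEM_BYTES = 72,  /* RX FIFO0 */
    TXE_ELEMS  = 32, TXE_ELEM_BYTES  = 8,   /* TX Event FIFO */
    TXB_ELEMS  = 32, TXB_ELEM_BYTES  = 72   /* TX Buffers */
};

/* Compile-time check that the three regions add up to 0x1C00 bytes. */
static_assert(RXF0_ELEMS * RXF0_ELEM_BYTES +
              TXE_ELEMS * TXE_ELEM_BYTES +
              TXB_ELEMS * TXB_ELEM_BYTES == 0x1C00,
              "message RAM total must match the commit text");

/* Message RAM needed by one CAN instance, in bytes. */
static size_t can_mram_bytes(void)
{
    return (size_t)RXF0_ELEMS * RXF0_ELEM_BYTES
         + (size_t)TXE_ELEMS  * TXE_ELEM_BYTES
         + (size_t)TXB_ELEMS  * TXB_ELEM_BYTES;
}

/*
 * Smallest offset (relative to the start of the first instance's region)
 * at which a second instance can begin without overlapping the first.
 */
static size_t second_instance_min_offset(void)
{
    return can_mram_bytes();
}
```

    The three terms are 4608, 256 and 2304 bytes, which sum to 7168.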

    Fix offset to match this needed size.
    Make the CAN0 message ram ioremap match exactly this size so that it is
    easily understandable. Adapt CAN1 size accordingly.

    Fixes: bc6d5d7666b7 ("ARM: dts: at91: sama5d2: add m_can nodes")
    Reported-by: Dan Sneddon
    Signed-off-by: Nicolas Ferre
    Signed-off-by: Alexandre Belloni
    Tested-by: Cristian Birsan
    Cc: stable@vger.kernel.org # v4.13+
    Link: https://lore.kernel.org/r/20201203091949.9015-1-nicolas.ferre@microchip.com
    Signed-off-by: Greg Kroah-Hartman

    Nicolas Ferre
     
  • commit df9dbaf2c415cd94ad520067a1eccfee62f00a33 upstream.

    The pinmux control register offset passed to OMAP4_IOPAD is odd.

    Fixes: ab9a13665e7c ("ARM: dts: pandaboard: add gpio user button")
    Cc: stable@vger.kernel.org
    Signed-off-by: H. Nikolaus Schaller
    Signed-off-by: Tony Lindgren
    Signed-off-by: Greg Kroah-Hartman

    H. Nikolaus Schaller
     
  • commit f9081b8ff5934b8d69c748d0200e844cadd2c667 upstream.

    The firmware found in some Qualcomm platforms intercepts writes to S2CR
    in order to replace bypass type streams with fault; and ignore S2CR
    updates of type fault.

    Detect this behavior and implement a custom write_s2cr function in order
    to trick the firmware into supporting bypass streams by the means of
    configuring the stream for translation using a reserved and disabled
    context bank.

    Also circumvent the problem of configuring faulting streams by
    configuring the stream as bypass.

    Cc:
    Signed-off-by: Bjorn Andersson
    Tested-by: Steev Klimaszewski
    Acked-by: Robin Murphy
    Link: https://lore.kernel.org/r/20201019182323.3162386-4-bjorn.andersson@linaro.org
    Signed-off-by: Will Deacon
    Signed-off-by: Greg Kroah-Hartman

    Bjorn Andersson
     
  • commit 07a7f2caaa5a2619934491bab3c47b261c554fb0 upstream.

    The Qualcomm boot loader configures stream mapping for the peripherals
    that it accesses and in particular it sets up the stream mapping for the
    display controller to be allowed to scan out a splash screen or EFI
    framebuffer.

    Read back the stream mappings during initialization and make the
    arm-smmu driver maintain the streams in bypass mode.

    Cc:
    Signed-off-by: Bjorn Andersson
    Tested-by: Steev Klimaszewski
    Acked-by: Robin Murphy
    Link: https://lore.kernel.org/r/20201019182323.3162386-3-bjorn.andersson@linaro.org
    Signed-off-by: Will Deacon
    Signed-off-by: Greg Kroah-Hartman

    Bjorn Andersson
     
  • commit 56b75b51ed6d5e7bffda59440404409bca2dff00 upstream.

    The firmware found in some Qualcomm platforms intercepts writes to the
    S2CR register in order to replace the BYPASS type with FAULT. Furthermore,
    it treats faults at this level as catastrophic and restarts the
    device.

    Add support for providing implementation specific versions of the S2CR
    write function, to allow the Qualcomm driver to work around this
    behavior.
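
    The idea of an implementation-specific write hook can be sketched in
    plain C as an optional function pointer with a generic fallback. All
    types and names below are simplified stand-ins chosen for illustration;
    they are not the arm-smmu driver's real structures, and the quirk shown
    is only a placeholder for the Qualcomm workaround.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stream-to-context types; stand-ins, not the driver's enum. */
enum s2cr_type { S2CR_TRANS = 0, S2CR_BYPASS = 1, S2CR_FAULT = 2 };

#define NUM_S2CR 4

struct smmu;

struct smmu_impl {
    /* Optional override; NULL means "use the generic write". */
    void (*write_s2cr)(struct smmu *smmu, int idx, enum s2cr_type type);
};

struct smmu {
    const struct smmu_impl *impl;
    enum s2cr_type regs[NUM_S2CR];   /* pretend register file */
};

static void generic_write_s2cr(struct smmu *smmu, int idx, enum s2cr_type type)
{
    smmu->regs[idx] = type;
}

/* Dispatch: prefer the implementation hook when one is provided. */
static void smmu_write_s2cr(struct smmu *smmu, int idx, enum s2cr_type type)
{
    assert(idx >= 0 && idx < NUM_S2CR);
    if (smmu->impl && smmu->impl->write_s2cr)
        smmu->impl->write_s2cr(smmu, idx, type);
    else
        generic_write_s2cr(smmu, idx, type);
}

/* Hypothetical quirk: never let BYPASS reach the "register". */
static void quirk_write_s2cr(struct smmu *smmu, int idx, enum s2cr_type type)
{
    if (type == S2CR_BYPASS)
        type = S2CR_TRANS;   /* placeholder for a platform workaround */
    generic_write_s2cr(smmu, idx, type);
}

static const struct smmu_impl quirk_impl = { .write_s2cr = quirk_write_s2cr };
```

    Callers always go through smmu_write_s2cr(), so platforms without a
    hook keep the generic behavior unchanged.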

    Cc:
    Signed-off-by: Bjorn Andersson
    Tested-by: Steev Klimaszewski
    Reviewed-by: Robin Murphy
    Link: https://lore.kernel.org/r/20201019182323.3162386-2-bjorn.andersson@linaro.org
    Signed-off-by: Will Deacon
    Signed-off-by: Greg Kroah-Hartman

    Bjorn Andersson
     
  • commit 9d4747d02376aeb8de38afa25430de79129c5799 upstream.

    When both KVM support and the CCP driver are built into the kernel instead
    of as modules, KVM initialization can happen before CCP initialization. As
    a result, sev_platform_status() will return a failure when it is called
    from sev_hardware_setup(), when this isn't really an error condition.

    Since sev_platform_status() doesn't need to be called at this time anyway,
    remove the invocation from sev_hardware_setup().

    Signed-off-by: Tom Lendacky
    Message-Id:
    Cc: stable@vger.kernel.org
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Greg Kroah-Hartman

    Tom Lendacky
     
  • commit 39485ed95d6b83b62fa75c06c2c4d33992e0d971 upstream.

    Until commit e7c587da1252 ("x86/speculation: Use synthetic bits for
    IBRS/IBPB/STIBP"), KVM was testing both Intel and AMD CPUID bits before
    allowing the guest to write MSR_IA32_SPEC_CTRL and MSR_IA32_PRED_CMD.
    Testing only Intel bits on VMX processors, or only AMD bits on SVM
    processors, fails if the guests are created with the "opposite" vendor
    as the host.

    While at it, also tweak the host CPU check to use the vendor-agnostic
    feature bit X86_FEATURE_IBPB, since we only care about the availability
    of the MSR on the host here and not about specific CPUID bits.
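
    A minimal sketch of the "accept either vendor's bits" check follows.
    The flag names and bit positions are invented for illustration and do
    not correspond to the kernel's feature-bit numbering or to KVM's
    helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative guest CPUID flags; hypothetical bit assignments. */
#define GUEST_INTEL_SPEC_CTRL (1u << 0)
#define GUEST_AMD_IBRS        (1u << 1)
#define GUEST_AMD_IBPB        (1u << 2)

static_assert((GUEST_INTEL_SPEC_CTRL & (GUEST_AMD_IBRS | GUEST_AMD_IBPB)) == 0,
              "vendor flags must not overlap");

/* Allow the SPEC_CTRL MSR if either vendor's enumeration is present. */
static bool guest_has_spec_ctrl_msr(uint32_t guest_cpuid)
{
    return (guest_cpuid & (GUEST_INTEL_SPEC_CTRL | GUEST_AMD_IBRS)) != 0;
}

/* Allow the PRED_CMD MSR if either vendor's enumeration is present. */
static bool guest_has_pred_cmd_msr(uint32_t guest_cpuid)
{
    return (guest_cpuid & (GUEST_INTEL_SPEC_CTRL | GUEST_AMD_IBPB)) != 0;
}
```

    With this shape, a guest that exposes only the "opposite" vendor's bit
    still passes the check, regardless of the host vendor.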

    Fixes: e7c587da1252 ("x86/speculation: Use synthetic bits for IBRS/IBPB/STIBP")
    Cc: stable@vger.kernel.org
    Reported-by: Denis V. Lunev
    Signed-off-by: Paolo Bonzini
    Signed-off-by: Greg Kroah-Hartman

    Paolo Bonzini
     
  • commit ca4e514774930f30b66375a974b5edcbebaf0e7e upstream.

    ARMv8.2 introduced TTBCR2, which shares TCR_EL1 with TTBCR.
    Gracefully handle traps to this register when HCR_EL2.TVM is set.

    Cc: stable@vger.kernel.org
    Reported-by: James Morse
    Signed-off-by: Marc Zyngier
    Signed-off-by: Greg Kroah-Hartman

    Marc Zyngier
     
  • commit f43cadef2df260101497a6aace05e24201f00202 upstream.

    FW has to configure devices' StreamIDs so that SMMU is able to look up
    context and do proper translation later on. For Armada 7040 & 8040 and
    publicly available FW, most of the devices are configured properly,
    but some like ap_sdhci0, PCIe, NIC still remain unassigned which
    results in SMMU faults about unmatched StreamID (assuming
    ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT=y).

    Since there is a dependency on custom FW, let SMMU be disabled by default.
    People who are still willing to use SMMU need to enable it manually and
    use ARM_SMMU_DISABLE_BYPASS_BY_DEFAULT=n (or via kernel command line)
    with extra caution.

    Fixes: 83a3545d9c37 ("arm64: dts: marvell: add SMMU support")
    Cc: # 5.9+
    Signed-off-by: Tomasz Nowicki
    Signed-off-by: Gregory CLEMENT
    Signed-off-by: Greg Kroah-Hartman

    Tomasz Nowicki
     
  • commit 50301e8815c681bc5de8ca7050c4b426923d4e19 upstream.

    DSS is IO coherent on AM65, so we should mark it as such with
    'dma-coherent' property in the DT file.

    Fixes: fc539b90eda2 ("arm64: dts: ti: am654: Add DSS node")
    Signed-off-by: Tomi Valkeinen
    Signed-off-by: Nishanth Menon
    Acked-by: Nikhil Devshatwar
    Cc: stable@vger.kernel.org # v5.8+
    Link: https://lore.kernel.org/r/20201102134650.55321-1-tomi.valkeinen@ti.com
    Signed-off-by: Greg Kroah-Hartman

    Tomi Valkeinen
     
  • commit de043da0b9e71147ca610ed542d34858aadfc61c upstream.

    memblock_enforce_memory_limit accepts the maximum memory size not the
    maximum address that can be handled by kernel. Fix the function invocation
    accordingly.
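
    The difference between a size limit and an address limit can be
    illustrated as follows. This is not the code from the fix; the helper
    name and parameters are hypothetical, and the sketch assumes a single
    contiguous DRAM region starting at dram_base.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: turn "highest address the kernel can handle"
 * (exclusive) into the size that a size-based limit function expects.
 * Passing max_addr itself would over-allow by dram_base bytes.
 */
static uint64_t limit_size_from_max_addr(uint64_t dram_base, uint64_t max_addr)
{
    assert(max_addr >= dram_base);
    return max_addr - dram_base;
}
```

    For example, with DRAM at 0x80000000 and a highest usable address of
    0xC0000000, the size to pass is 0x40000000, not 0xC0000000.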

    Fixes: 1bd14a66ee52 ("RISC-V: Remove any memblock representing unusable memory area")
    Cc: stable@vger.kernel.org
    Reported-by: Bin Meng
    Tested-by: Bin Meng
    Acked-by: Mike Rapoport
    Signed-off-by: Atish Patra
    Signed-off-by: Palmer Dabbelt
    Signed-off-by: Greg Kroah-Hartman

    Atish Patra
     
  • commit b08070eca9e247f60ab39d79b2c25d274750441f upstream.

    ext4_handle_error() with errors=continue mount option can accidentally
    remount the filesystem read-only when the system is rebooting. Fix that.

    Fixes: 1dc1097ff60e ("ext4: avoid panic during forced reboot")
    Signed-off-by: Jan Kara
    Reviewed-by: Andreas Dilger
    Cc: stable@kernel.org
    Link: https://lore.kernel.org/r/20201127113405.26867-2-jack@suse.cz
    Signed-off-by: Theodore Ts'o
    Signed-off-by: Greg Kroah-Hartman

    Jan Kara
     
  • commit 46e294efc355c48d1dd4d58501aa56dac461792a upstream.

    Xattr code using inodes with large xattr data can end up dropping last
    inode reference (and thus deleting the inode) from places like
    ext4_xattr_set_entry(). That function is called with transaction started
    and so ext4_evict_inode() can deadlock against fs freezing like:

    CPU1                                    CPU2

    removexattr()                           freeze_super()
      vfs_removexattr()
        ext4_xattr_set()
          handle = ext4_journal_start()
          ...
          ext4_xattr_set_entry()
            iput(old_ea_inode)
              ext4_evict_inode(old_ea_inode)
                                              sb->s_writers.frozen = SB_FREEZE_FS;
                                              sb_wait_write(sb, SB_FREEZE_FS);
                                              ext4_freeze()
                                                jbd2_journal_lock_updates()
                                                  -> blocks waiting for all
                                                     handles to stop
                sb_start_intwrite()
                  -> blocks as sb is already in SB_FREEZE_FS state

    Generally it is advisable to delete inodes from a separate transaction,
    as it can consume quite a few credits. However, in this case it would be
    quite clumsy, and furthermore the credits for inode deletion are quite
    limited and already accounted for. So just tweak ext4_evict_inode() to
    avoid freeze protection if we have a transaction already started and thus
    it is not really needed anyway.

    Cc: stable@vger.kernel.org
    Fixes: dec214d00e0d ("ext4: xattr inode deduplication")
    Signed-off-by: Jan Kara
    Reviewed-by: Andreas Dilger
    Link: https://lore.kernel.org/r/20201127110649.24730-1-jack@suse.cz
    Signed-off-by: Theodore Ts'o
    Signed-off-by: Greg Kroah-Hartman

    Jan Kara
     
  • commit cca415537244f6102cbb09b5b90db6ae2c953bdd upstream.

    When freeing metadata, we will create an ext4_free_data and
    insert it into the pending free list. After the current
    transaction is committed, the object will be freed.

    ext4_mb_free_metadata() will check whether the area to be freed
    overlaps with the pending free list. If true, return directly. At this
    time, ext4_free_data is leaked. Fortunately, the probability of this
    problem is small, since it only occurs if the file system is corrupted
    such that a block is claimed by more than one inode and those inodes are
    deleted within a single jbd2 transaction.

    Signed-off-by: Chunguang Xu
    Link: https://lore.kernel.org/r/1604764698-4269-8-git-send-email-brookxu@tencent.com
    Signed-off-by: Theodore Ts'o
    Cc: stable@kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Chunguang Xu
     
  • commit bc18546bf68e47996a359d2533168d5770a22024 upstream.

    The ext4_find_extent() function never returns NULL, it returns error
    pointers.
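
    For readers unfamiliar with the error-pointer convention, here is a
    userspace imitation of it. The helper names are lower-cased stand-ins
    for the kernel's ERR_PTR()/PTR_ERR()/IS_ERR(), and find_thing() is a
    hypothetical function that, like the one described above, reports
    failure through an error pointer rather than NULL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *err_ptr(long error)
{
    assert(error < 0 && error >= -MAX_ERRNO);
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True only for pointers in the top MAX_ERRNO bytes of the address space. */
static inline int is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical lookup: fails with an error pointer, never with NULL. */
static void *find_thing(int key)
{
    static int thing = 42;

    if (key < 0)
        return err_ptr(-EINVAL);
    return &thing;
}

/* Correct caller: test with is_err(); a NULL test would miss the error. */
static long use_thing(int key)
{
    void *p = find_thing(key);

    if (is_err(p))
        return ptr_err(p);
    return *(int *)p;
}
```

    Note that is_err(NULL) is false, which is why a `!ptr` check on such a
    function's result never fires.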

    Fixes: 44059e503b03 ("ext4: fast commit recovery path")
    Signed-off-by: Dan Carpenter
    Reviewed-by: Jan Kara
    Link: https://lore.kernel.org/r/20201023112232.GB282278@mwanda
    Signed-off-by: Theodore Ts'o
    Cc: stable@kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Dan Carpenter
     
  • commit 7f458a3873ae94efe1f37c8b96c97e7298769e98 upstream.

    When defragmenting we skip ranges that have holes or inline extents, so that
    we don't do unnecessary IO and waste space. We do this check when calling
    should_defrag_range() at btrfs_defrag_file(). However we do it without
    holding the inode's lock. The reason we do it like this is to avoid
    blocking other tasks for too long, that possibly want to operate on other
    file ranges, since after the call to should_defrag_range() and before
    locking the inode, we trigger a synchronous page cache readahead. However
    before we were able to lock the inode, some other task might have punched
    a hole in our range, or we may now have an inline extent there, in which
    case we should not set the range for defrag anymore since that would cause
    unnecessary IO and make us waste space (i.e. allocating extents to contain
    zeros for a hole).

    So after we locked the inode and the range in the iotree, check again if
    we have holes or an inline extent, and if we do, just skip the range.

    I hit this while testing my next patch that fixes races when updating an
    inode's number of bytes (subject "btrfs: update the number of bytes used
    by an inode atomically"), and it depends on this change in order to work
    correctly. Alternatively I could rework that other patch to detect holes
    and flag their range with the 'new delalloc' bit, but this itself fixes
    an efficiency problem due to a race that from a functional point of view is
    not harmful (it could be triggered with btrfs/062 from fstests).

    CC: stable@vger.kernel.org # 5.4+
    Reviewed-by: Josef Bacik
    Signed-off-by: Filipe Manana
    Signed-off-by: David Sterba
    Signed-off-by: Greg Kroah-Hartman

    Filipe Manana
     
  • commit 27d56e62e4748c2135650c260024e9904b8c1a0a upstream.

    While writing an explanation for the need of the commit_root_sem for
    btrfs_prepare_extent_commit, I realized we have a slight hole that could
    result in leaked space if we have to do the old style caching. Consider
    the following scenario

    commit root
    +----+----+----+----+----+----+----+
    |\\\\|    |\\\\|\\\\|    |\\\\|\\\\|
    +----+----+----+----+----+----+----+
    0    1    2    3    4    5    6    7

    new commit root
    +----+----+----+----+----+----+----+
    |    |    |    |\\\\|    |    |\\\\|
    +----+----+----+----+----+----+----+
    0    1    2    3    4    5    6    7

    Prior to this patch, we run btrfs_prepare_extent_commit, which updates
    the last_byte_to_unpin, and then we subsequently run
    switch_commit_roots. In this example let's assume that
    caching_ctl->progress == 1 at btrfs_prepare_extent_commit() time, which
    means that cache->last_byte_to_unpin == 1. Then we go and do the
    switch_commit_roots(), but in the meantime the caching thread has made
    some more progress, because we dropped the commit_root_sem and re-acquired
    it. Now caching_ctl->progress == 3. We swap out the commit root and
    carry on to unpin.

    The race can happen like:

    1) The caching thread was running using the old commit root when it
    found the extent for [2, 3);

    2) Then it released the commit_root_sem because it was in the last
    item of a leaf and the semaphore was contended, and set ->progress
    to 3 (value of 'last'), as the last extent item in the current leaf
    was for the extent for range [2, 3);

    3) Next time it gets the commit_root_sem, it will start using the new
    commit root and search for a key with offset 3, so it never finds
    the hole for [2, 3).

    So the caching thread never saw [2, 3) as free space in any of the
    commit roots, and by the time finish_extent_commit() was called for
    the range [0, 3), ->last_byte_to_unpin was 1, so it only returned the
    subrange [0, 1) to the free space cache, skipping [2, 3).

    In the unpin code we have last_byte_to_unpin == 1, so we unpin [0,1),
    but do not unpin [2,3). However because caching_ctl->progress == 3 we
    do not see the newly freed section of [2,3), and thus do not add it to
    our free space cache. This results in us missing a chunk of free space
    in memory (on disk too, unless we have a power failure before writing
    the free space cache to disk).

    Fix this by making sure the ->last_byte_to_unpin is set at the same time
    that we swap the commit roots; this ensures that we will always be
    consistent.

    CC: stable@vger.kernel.org # 5.8+
    Reviewed-by: Filipe Manana
    Signed-off-by: Josef Bacik
    [ update changelog with Filipe's review comments ]
    Signed-off-by: David Sterba
    Signed-off-by: Greg Kroah-Hartman

    Josef Bacik
     
  • commit 9076dbd5ee837c3882fc42891c14cecd0354a849 upstream.

    While fixing up our ->last_byte_to_unpin locking I noticed that we will
    shorten len based on ->last_byte_to_unpin if we're caching when we're
    adding back the free space. This is correct for the free space, as we
    cannot unpin more than ->last_byte_to_unpin, however we use len to
    adjust the ->bytes_pinned counters and such, which need to track the
    actual pinned usage. This could result in
    WARN_ON(space_info->bytes_pinned) triggering at unmount time.

    Fix this by using a local variable for the amount to add to free space
    cache, and leave len untouched in this case.
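
    The accounting split described above can be sketched like this. The
    struct and function are simplified stand-ins, not the btrfs code: the
    amount returned to free space is clamped by last_byte_to_unpin, while
    the pinned counter always drops by the full len.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified counters; illustrative only. */
struct space {
    uint64_t bytes_pinned;
    uint64_t free_space;
};

/*
 * Unpin [start, start + len). Only the part below last_byte_to_unpin may
 * be added back to free space, but bytes_pinned must track the real
 * pinned usage, so it is reduced by the untouched len.
 */
static void unpin_range(struct space *s, uint64_t start, uint64_t len,
                        uint64_t last_byte_to_unpin)
{
    uint64_t to_add = 0;   /* local amount for the free space cache */

    assert(s->bytes_pinned >= len);
    if (start < last_byte_to_unpin) {
        to_add = last_byte_to_unpin - start;
        if (to_add > len)
            to_add = len;
    }
    s->free_space += to_add;
    s->bytes_pinned -= len;
}
```

    Had len itself been shortened, bytes_pinned would be left non-zero
    after the range was fully unpinned.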

    CC: stable@vger.kernel.org # 5.4+
    Reviewed-by: Filipe Manana
    Signed-off-by: Josef Bacik
    Signed-off-by: David Sterba
    Signed-off-by: Greg Kroah-Hartman

    Josef Bacik
     
  • commit 320f9028c7873c3c7710e8e93e5c979f4c857490 upstream.

    The driver did not update its view of the available device buffer space
    until write() was called in task context. This meant that write_room()
    would return 0 even after the device had sent a write-unthrottle
    notification, something which could lead to blocked writers not being
    woken up (e.g. when using OPOST).

    Note that we must also request an unthrottle notification in case a
    write() request fills the device buffer exactly.

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Cc: stable
    Acked-by: Sebastian Andrzej Siewior
    Reviewed-by: Greg Kroah-Hartman
    Signed-off-by: Johan Hovold
    Signed-off-by: Greg Kroah-Hartman

    Johan Hovold
     
  • commit 49fbb8e37a961396a5b6c82937c70df91de45e9d upstream.

    The driver's transmit-unthrottle work was never flushed on disconnect,
    something which could lead to the driver port data being freed while the
    unthrottle work is still scheduled.

    Fix this by cancelling the unthrottle work when shutting down the port.

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Cc: stable@vger.kernel.org
    Acked-by: Sebastian Andrzej Siewior
    Reviewed-by: Greg Kroah-Hartman
    Signed-off-by: Johan Hovold
    Signed-off-by: Greg Kroah-Hartman

    Johan Hovold
     
  • commit 37faf50615412947868c49aee62f68233307f4e4 upstream.

    The driver's deferred write wakeup was never flushed on disconnect,
    something which could lead to the driver port data being freed while the
    wakeup work is still scheduled.

    Fix this by using the usb-serial write wakeup which gets cancelled
    properly on disconnect.

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Cc: stable@vger.kernel.org
    Acked-by: Sebastian Andrzej Siewior
    Reviewed-by: Greg Kroah-Hartman
    Signed-off-by: Johan Hovold
    Signed-off-by: Greg Kroah-Hartman

    Johan Hovold
     
  • commit c01d2c58698f710c9e13ba3e2d296328606f74fd upstream.

    Make sure to clear the write-busy flag also in case no new data was
    submitted due to lack of device buffer space so that writing is
    resumed once space again becomes available.

    Fixes: 507ca9bc0476 ("[PATCH] USB: add ability for usb-serial drivers to determine if their write urb is currently being used.")
    Cc: stable # 2.6.13
    Acked-by: Sebastian Andrzej Siewior
    Reviewed-by: Greg Kroah-Hartman
    Signed-off-by: Johan Hovold
    Signed-off-by: Greg Kroah-Hartman

    Johan Hovold
     
  • commit 7353cad7ee4deaefc16e94727e69285563e219f6 upstream.

    The write() callback can be called in interrupt context (e.g. when used
    as a console) so interrupts must be disabled while holding the port lock
    to prevent a possible deadlock.

    Fixes: e81ee637e4ae ("usb-serial: possible irq lock inversion (PPP vs. usb/serial)")
    Fixes: 507ca9bc0476 ("[PATCH] USB: add ability for usb-serial drivers to determine if their write urb is currently being used.")
    Cc: stable # 2.6.19
    Acked-by: Sebastian Andrzej Siewior
    Reviewed-by: Greg Kroah-Hartman
    Signed-off-by: Johan Hovold
    Signed-off-by: Greg Kroah-Hartman

    Johan Hovold
     
  • commit 696c541c8c6cfa05d65aa24ae2b9e720fc01766e upstream.

    Commit c528fcb116e6 ("USB: serial: keyspan_pda: fix receive sanity
    checks") broke write-unthrottle handling by dropping well-formed
    unthrottle-interrupt packets which are precisely two bytes long. This
    could lead to blocked writers not being woken up when buffer space again
    becomes available.

    Instead, stop unconditionally printing the third byte which is
    (presumably) only valid on modem-line changes.

    Fixes: c528fcb116e6 ("USB: serial: keyspan_pda: fix receive sanity checks")
    Cc: stable # 4.11
    Acked-by: Sebastian Andrzej Siewior
    Reviewed-by: Greg Kroah-Hartman
    Signed-off-by: Johan Hovold
    Signed-off-by: Greg Kroah-Hartman

    Johan Hovold
     
  • commit 5098e77962e7c8947f87bd8c5869c83e000a522a upstream.

    The driver must not call tty_wakeup() while holding its private lock as
    line disciplines are allowed to call back into write() from
    write_wakeup(), leading to a deadlock.

    Also remove the unneeded work struct that was used to defer wakeup in
    order to work around a possible race in ancient times (see comment about
    n_tty write_chan() in commit 14b54e39b412 ("USB: serial: remove
    changelogs and old todo entries")).

    Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
    Cc: stable@vger.kernel.org
    Reviewed-by: Greg Kroah-Hartman
    Signed-off-by: Johan Hovold
    Signed-off-by: Greg Kroah-Hartman

    Johan Hovold
     
  • commit 975323ab8f116667676c30ca3502a6757bd89e8d upstream.

    The parallel-port restore operation is called when a driver claims the
    port and is supposed to restore the provided state (e.g. saved when
    releasing the port).

    Fixes: b69578df7e98 ("USB: usbserial: mos7720: add support for parallel port on moschip 7715")
    Cc: stable # 2.6.35
    Reviewed-by: Greg Kroah-Hartman
    Signed-off-by: Johan Hovold
    Signed-off-by: Greg Kroah-Hartman

    Johan Hovold
     
  • commit 3577afb0052fca65e67efdfc8e0859bb7bac87a6 upstream.

    In commit a2d375eda771 ("dyndbg: refine export, rename to
    dynamic_debug_exec_queries()"), a string is copied before checking it
    isn't NULL. Fix this, report a usage/interface error, and return the
    proper error code.
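
    The ordering matters: validate first, copy second. Below is a userspace
    sketch of that pattern. The function name and its return convention
    (the string length as a stand-in for a result count) are assumptions
    made for illustration and do not mirror the dyndbg code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry point: reject a NULL query before duplicating it. */
static int exec_queries_sketch(const char *query)
{
    char *copy;
    size_t len;

    if (!query)
        return -EINVAL;          /* usage error, reported before any copy */

    len = strlen(query);
    copy = malloc(len + 1);
    if (!copy)
        return -ENOMEM;
    memcpy(copy, query, len + 1);
    assert(copy[len] == '\0');   /* the copy includes the terminator */

    /* ... a real implementation would parse and apply `copy` here ... */
    free(copy);
    return (int)len;
}
```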

    Fixes: a2d375eda771 ("dyndbg: refine export, rename to dynamic_debug_exec_queries()")
    Cc: stable@vger.kernel.org
    Signed-off-by: Jim Cromie
    Link: https://lore.kernel.org/r/20201209183625.2432329-1-jim.cromie@gmail.com
    Signed-off-by: Greg Kroah-Hartman

    Jim Cromie
     
  • commit 406100f3da08066c00105165db8520bbc7694a36 upstream.

    One of our machines keeled over trying to rebuild the scheduler domains.
    Mainline produces the same splat:

    BUG: unable to handle page fault for address: 0000607f820054db
    CPU: 2 PID: 149 Comm: kworker/1:1 Not tainted 5.10.0-rc1-master+ #6
    Workqueue: events cpuset_hotplug_workfn
    RIP: build_sched_domains
    Call Trace:
    partition_sched_domains_locked
    rebuild_sched_domains_locked
    cpuset_hotplug_workfn

    It happens with cgroup2 and exclusive cpusets only. This reproducer
    triggers it on an 8-cpu vm and works most effectively with no
    preexisting child cgroups:

    cd $UNIFIED_ROOT
    mkdir cg1
    echo 4-7 > cg1/cpuset.cpus
    echo root > cg1/cpuset.cpus.partition

    # with smt/control reading 'on',
    echo off > /sys/devices/system/cpu/smt/control

    RIP maps to

    sd->shared = *per_cpu_ptr(sdd->sds, sd_id);

    from sd_init(). sd_id is calculated earlier in the same function:

    cpumask_and(sched_domain_span(sd), cpu_map, tl->mask(cpu));
    sd_id = cpumask_first(sched_domain_span(sd));

    tl->mask(cpu), which reads cpu_sibling_map on x86, returns an empty mask
    and so cpumask_first() returns >= nr_cpu_ids, which leads to the bogus
    value from per_cpu_ptr() above.
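
    The failure mode can be reproduced in miniature: a "first set bit"
    helper returns the CPU count for an empty mask, and using that value as
    an index reads out of bounds. The sketch below uses a plain 32-bit mask
    and invented names; it illustrates the hazard, not the fix applied by
    this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 8u

/* Index of the lowest set bit, or NR_CPUS if the mask is empty. */
static unsigned int mask_first(uint32_t mask)
{
    unsigned int cpu;

    for (cpu = 0; cpu < NR_CPUS; cpu++)
        if (mask & (1u << cpu))
            return cpu;
    return NR_CPUS;
}

/*
 * Guarded per-CPU lookup: returns -1 instead of indexing past the array
 * when the intersection of the two masks is empty.
 */
static int first_cpu_value(const int *per_cpu, uint32_t cpu_map,
                           uint32_t tl_mask)
{
    unsigned int id = mask_first(cpu_map & tl_mask);

    assert(per_cpu != NULL);
    if (id >= NR_CPUS)
        return -1;
    return per_cpu[id];
}
```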

    The problem is a race between cpuset_hotplug_workfn() and a later
    offline of CPU N. cpuset_hotplug_workfn() updates the effective masks
    when N is still online, the offline clears N from cpu_sibling_map, and
    then the worker uses the stale effective masks that still have N to
    generate the scheduling domains, leading the worker to read
    N's empty cpu_sibling_map in sd_init().

    rebuild_sched_domains_locked() prevented the race during the cgroup2
    cpuset series up until the Fixes commit changed its check. Make the
    check more robust so that it can detect an offline CPU in any exclusive
    cpuset's effective mask, not just the top one.

    Fixes: 0ccea8feb980 ("cpuset: Make generate_sched_domains() work with partition")
    Signed-off-by: Daniel Jordan
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Tejun Heo
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20201112171711.639541-1-daniel.m.jordan@oracle.com
    Signed-off-by: Greg Kroah-Hartman

    Daniel Jordan
     
  • commit 706657b1febf446a9ba37dc51b89f46604f57ee9 upstream.

    In order to set up its PCI component, the driver needs any node private
    instance in order to get a reference to the PCI device and hand that
    into edac_pci_create_generic_ctl(). For convenience, it uses the 0th
    memory controller descriptor under the assumption that if any, the 0th
    will be always present.

    However, this assumption goes wrong when the 0th node doesn't have
    memory and the driver doesn't initialize an instance for it:

    EDAC amd64: F17h detected (node 0).
    ...
    EDAC amd64: Node 0: No DIMMs detected.

    But looking up node instances is not really needed - all one needs is
    the pointer to the proper device which gets discovered during instance
    init.

    So stash that pointer into a variable and use it when setting up the
    EDAC PCI component.

    Clear that variable when the driver needs to unwind due to some
    instances failing init to avoid any registration imbalance.

    Cc:
    Signed-off-by: Borislav Petkov
    Link: https://lkml.kernel.org/r/20201122150815.13808-1-bp@alien8.de
    Signed-off-by: Greg Kroah-Hartman

    Borislav Petkov
     
  • commit 83ff51c4e3fecf6b8587ce4d46f6eac59f5d7c5a upstream.

    Instead of raw access, use readl() to access MMIO registers of
    memory controller to avoid possible compiler re-ordering.
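
    As a rough userspace analogy, a register read can be forced through a
    volatile-qualified access so the compiler may not cache, merge or
    reorder it relative to other volatile accesses. The helper name is
    made up; the kernel's readl() is the real accessor and may additionally
    handle byte order and architecture-specific ordering.

```c
#include <assert.h>
#include <stdint.h>

/* Imitation of a 32-bit register read through a volatile access. */
static inline uint32_t mmio_read32_sketch(const volatile void *addr)
{
    /* 32-bit registers are expected to be 4-byte aligned. */
    assert(((uintptr_t)addr & 3u) == 0);
    return *(const volatile uint32_t *)addr;
}
```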

    Fixes: d4dc89d069aa ("EDAC, i10nm: Add a driver for Intel 10nm server processors")
    Cc:
    Signed-off-by: Qiuxu Zhuo
    Signed-off-by: Tony Luck
    Signed-off-by: Greg Kroah-Hartman

    Qiuxu Zhuo
     
  • commit cf48647243cc28d15280600292db5777592606c5 upstream.

    Sequence counters with an associated write serialization lock are called
    seqcount_LOCKNAME_t. Fix the documentation accordingly.

    While at it, remove a paragraph that inappropriately discussed a
    seqlock.h implementation detail.

    Fixes: 6dd699b13d53 ("seqlock: seqcount_LOCKNAME_t: Standardize naming convention")
    Signed-off-by: Ahmed S. Darwish
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20201206162143.14387-2-a.darwish@linutronix.de
    Signed-off-by: Greg Kroah-Hartman

    Ahmed S. Darwish
     
  • commit a7b5458ce73b235be027cf2658c39b19b7e58cf2 upstream.

    Don't add platform resources that won't be used. This avoids a
    recently-added warning from the driver core, that can show up on a
    multi-platform kernel when !MACH_IS_MAC.

    ------------[ cut here ]------------
    WARNING: CPU: 0 PID: 0 at drivers/base/platform.c:224 platform_get_irq_optional+0x8e/0xce
    0 is an invalid IRQ number
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper Not tainted 5.9.0-multi #1
    Stack from 004b3f04:
    004b3f04 00462c2f 00462c2f 004b3f20 0002e128 004754db 004b6ad4 004b3f4c
    0002e19c 004754f7 000000e0 00285ba0 00000009 00000000 004b3f44 ffffffff
    004754db 004b3f64 004b3f74 00285ba0 004754f7 000000e0 00000009 004754db
    004fdf0c 005269e2 004fdf0c 00000000 004b3f88 00285cae 004b6964 00000000
    004fdf0c 004b3fac 0051cc68 004b6964 00000000 004b6964 00000200 00000000
    0051cc3e 0023c18a 004b3fc0 0051cd8a 004fdf0c 00000002 0052b43c 004b3fc8
    Call Trace: [] __warn+0xa6/0xd6
    [] warn_slowpath_fmt+0x44/0x76
    [] platform_get_irq_optional+0x8e/0xce
    [] platform_get_irq_optional+0x8e/0xce
    [] platform_get_irq+0x12/0x4c
    [] pmz_init_port+0x2a/0xa6
    [] pmz_init_port+0x0/0xa6
    [] strlen+0x0/0x22
    [] pmz_probe+0x34/0x88
    [] pmz_console_init+0x8/0x28
    [] console_init+0x1e/0x28
    [] printk+0x0/0x16
    [] start_kernel+0x368/0x4ce
    [] _sinittext+0x4f8/0xc48
    random: get_random_bytes called from print_oops_end_marker+0x56/0x80 with crng_init=0
    ---[ end trace 392d8e82eed68d6c ]---

    Commit a85a6c86c25b ("driver core: platform: Clarify that IRQ 0 is invalid"),
    which introduced the WARNING, suggests that testing for irq == 0 is
    undesirable. Instead of that comparison, just test for resource existence.

    Cc: Michael Ellerman
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Joshua Thompson
    Cc: Greg Kroah-Hartman
    Cc: Jiri Slaby
    Cc: stable@vger.kernel.org # v5.8+
    Reported-by: Laurent Vivier
    Signed-off-by: Finn Thain
    Link: https://lore.kernel.org/r/0c0fe1e4f11ccec202d4df09ea7d9d98155d101a.1606001297.git.fthain@telegraphics.com.au
    Signed-off-by: Geert Uytterhoeven
    Signed-off-by: Greg Kroah-Hartman

    Finn Thain
     
  • commit f3456b9fd269c6d0c973b136c5449d46b2510f4b upstream.

    ARM Cortex-A57 and Cortex-A72 cores running in 32-bit mode are affected
    by silicon errata #1742098 and #1655431, respectively, where the second
    instruction of an AES instruction pair may execute twice if an interrupt
    is taken right after the first instruction consumes an input register of
    which a single 32-bit lane has been updated the last time it was modified.

    This is not such a rare occurrence as it may seem: in counter mode, only
    the least significant 32-bit word is incremented in the absence of a
    carry, which makes our counter mode implementation susceptible to these
    errata.

    So let's shuffle the counter assignments around a bit so that the most
    recent updates when the AES instruction pair executes are 128-bit wide.

    [0] ARM-EPM-049219 v23 Cortex-A57 MPCore Software Developers Errata Notice
    [1] ARM-EPM-012079 v11.0 Cortex-A72 MPCore Software Developers Errata Notice

    Cc: # v5.4+
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Ard Biesheuvel
     
  • commit 17858b140bf49961b71d4e73f1c3ea9bc8e7dda0 upstream.

    ecdh_set_secret() casts a void* pointer to a const u64* in order to
    feed it into ecc_is_key_valid(). This is not generally permitted by
    the C standard, and leads to actual misalignment faults on ARMv6
    cores. In some cases, these are fixed up in software, but this still
    leads to performance hits that are entirely avoidable.

    So let's copy the key into the ctx buffer first, which we will do
    anyway in the common case, and which guarantees correct alignment.
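
    The copy-then-validate approach can be sketched as follows. The context
    struct, key size and validity rule (non-zero key) are simplified
    assumptions for illustration; only the ordering, copying into an
    aligned buffer before any 64-bit access, reflects the text above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define KEY_WORDS 4   /* illustrative: four 64-bit words */

struct ecdh_ctx_sketch {
    uint64_t private_key[KEY_WORDS];   /* naturally aligned for 64-bit loads */
};

/* Stand-in validity check: the key must not be all zero. */
static int key_is_valid(const uint64_t *key, unsigned int nwords)
{
    uint64_t acc = 0;
    unsigned int i;

    for (i = 0; i < nwords; i++)
        acc |= key[i];
    return acc != 0;
}

/* Copy first, then validate the aligned copy; `buf` is never cast. */
static int set_secret_sketch(struct ecdh_ctx_sketch *ctx, const void *buf,
                             size_t len)
{
    assert(ctx != NULL && buf != NULL);
    if (len != sizeof(ctx->private_key))
        return -1;
    memcpy(ctx->private_key, buf, len);
    if (!key_is_valid(ctx->private_key, KEY_WORDS)) {
        memset(ctx->private_key, 0, sizeof(ctx->private_key));
        return -1;
    }
    return 0;
}
```

    Because memcpy() has no alignment requirement on its source, the input
    may start at any byte offset.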

    Cc:
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Herbert Xu
    Signed-off-by: Greg Kroah-Hartman

    Ard Biesheuvel
     
  • commit e40ad84c26b4deeee46666492ec66b9a534b8e59 upstream.

    When turbo has been disabled by the BIOS, but HWP_CAP.GUARANTEED is
    changed later, user space may want to take advantage of this increased
    guaranteed performance.

    HWP_CAP.GUARANTEED is not a static value. It can be adjusted by an
    out-of-band agent or during an Intel Speed Select performance level
    change. The HWP_CAP.MAX is still the maximum achievable performance
    with turbo disabled by the BIOS, so HWP_CAP.GUARANTEED can still
    change as long as it remains less than or equal to HWP_CAP.MAX.

    When HWP_CAP.GUARANTEED is changed, the sysfs base_frequency
    attribute shows the most recent guaranteed frequency value. This
    attribute can be used by user space software to update the scaling
    min/max limits of the CPU.

    Currently, the ->setpolicy() callback already uses the latest
    HWP_CAP values when setting HWP_REQ, but the ->verify() callback will
    restrict the user settings to the old guaranteed performance value
    which prevents user space from making use of the extra CPU capacity
    theoretically available to it after increasing HWP_CAP.GUARANTEED.

    To address this, read HWP_CAP in intel_pstate_verify_cpu_policy()
    to obtain the maximum P-state that can be used and use that to
    confine the policy max limit instead of using the cached and
    possibly stale pstate.max_freq value for this purpose.
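
    The difference between clamping against a cached limit and a freshly
    read one can be sketched as follows. This is a simplified model, not
    the driver code: struct demo_cpu, its fields and both helper functions
    are hypothetical, and a plain struct field stands in for reading
    HWP_CAP from hardware.

```c
/* Hypothetical per-CPU state for this sketch. */
struct demo_cpu {
	unsigned int cached_max_khz; /* value captured at init, may be stale */
	unsigned int hw_max_khz;     /* stands in for a fresh hardware read */
};

static unsigned int demo_clamp_max(unsigned int requested, unsigned int limit)
{
	return requested < limit ? requested : limit;
}

/* Old behaviour in this model: confine the request to the cached value. */
static unsigned int demo_verify_cached(const struct demo_cpu *cpu,
				       unsigned int requested_khz)
{
	return demo_clamp_max(requested_khz, cpu->cached_max_khz);
}

/* New behaviour in this model: confine the request to the current value. */
static unsigned int demo_verify_fresh(const struct demo_cpu *cpu,
				      unsigned int requested_khz)
{
	return demo_clamp_max(requested_khz, cpu->hw_max_khz);
}
```

    With a cached limit of 2000000 kHz and a current limit of 2400000 kHz,
    a request for 2400000 kHz is cut down to 2000000 kHz by the first
    helper and passes unchanged through the second.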

    For consistency, update intel_pstate_update_perf_limits() to use the
    maximum available P-state returned by intel_pstate_get_hwp_max() to
    compute the maximum frequency instead of using the return value of
    intel_pstate_get_max_freq() which, again, may be stale.

    This issue is a side-effect of fixing the scaling frequency limits in
    commit eacc9c5a927e ("cpufreq: intel_pstate: Fix intel_pstate_get_hwp_max()
    for turbo disabled") which corrected the setting of the reduced scaling
    frequency values, but caused stale HWP_CAP.GUARANTEED to be used in
    the case at hand.

    Fixes: eacc9c5a927e ("cpufreq: intel_pstate: Fix intel_pstate_get_hwp_max() for turbo disabled")
    Reported-by: Srinivas Pandruvada
    Tested-by: Srinivas Pandruvada
    Cc: 5.8+ # 5.8+
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Greg Kroah-Hartman

    Rafael J. Wysocki
     
  • commit aa8e21c053d72b6639ea5a7f1d3a1d0209534c94 upstream.

    The perf event attribute supports the exclude_kernel flag to avoid
    sampling/profiling in supervisor state (kernel). Based on this event
    attr flag, the Monitor Mode Control Register bit is set to freeze on
    supervisor state. But sometimes (due to a hardware limitation), the
    Sampled Instruction Address Register (SIAR) locks on to a kernel
    address even when freeze on supervisor is set. This patch adds a check
    to drop those samples.
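
    The kind of check described above could look roughly like the sketch
    below. It is an assumption-laden model rather than the powerpc perf
    code: the helper names are hypothetical and DEMO_KERNEL_START is an
    illustrative boundary chosen for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative boundary: addresses at or above it count as kernel. */
#define DEMO_KERNEL_START 0xc000000000000000ULL

static bool demo_is_kernel_addr(uint64_t addr)
{
	return addr >= DEMO_KERNEL_START;
}

/*
 * Drop a sample when the event asked to exclude the kernel but the
 * sampled instruction address still points into kernel space.
 */
static bool demo_should_drop_sample(bool exclude_kernel, uint64_t sampled_addr)
{
	return exclude_kernel && demo_is_kernel_addr(sampled_addr);
}
```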

    Cc: stable@vger.kernel.org
    Signed-off-by: Athira Rajeev
    Signed-off-by: Michael Ellerman
    Link: https://lore.kernel.org/r/1606289215-1433-1-git-send-email-atrajeev@linux.vnet.ibm.com
    Signed-off-by: Greg Kroah-Hartman

    Athira Rajeev
     
  • commit f8129cd958b395575e5543ce25a8434874b04d3a upstream.

    The cycle count of a timed LBR is always 1 in perf record -D.

    The cycle count is stored in the first 16 bits of the IA32_LBR_x_INFO
    register, but get_lbr_cycles() returns a Boolean type, so any nonzero
    count is reported as 1.

    Use u16 to replace the Boolean type.
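
    A minimal sketch of why the return type matters, assuming only what
    the message states (the count lives in the low 16 bits). The function
    names and DEMO_CYCLES_MASK are hypothetical, and the other checks done
    by the real function are left out.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_CYCLES_MASK 0xffffULL /* low 16 bits hold the cycle count */

/* Buggy shape: conversion to bool turns every nonzero count into 1. */
static bool demo_cycles_as_bool(uint64_t info)
{
	return info & DEMO_CYCLES_MASK;
}

/* Fixed shape: a 16-bit return type preserves the full count. */
static uint16_t demo_cycles_as_u16(uint64_t info)
{
	return (uint16_t)(info & DEMO_CYCLES_MASK);
}
```

    For an info value whose low 16 bits are 0x1234, the first helper
    yields 1 while the second yields 0x1234.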

    Fixes: 47125db27e47 ("perf/x86/intel/lbr: Support Architectural LBR")
    Reported-by: Stephane Eranian
    Signed-off-by: Kan Liang
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20201125213720.15692-2-kan.liang@linux.intel.com
    Signed-off-by: Greg Kroah-Hartman

    Kan Liang
     
  • commit 46b72e1bf4fc571da0c29c6fb3e5b2a2107a4c26 upstream.

    According to the event list from icelake_core_v1.09.json, the encoding
    of the RTM_RETIRED.ABORTED event on Ice Lake should be:
    "EventCode": "0xc9",
    "UMask": "0x04",
    "EventName": "RTM_RETIRED.ABORTED",

    Correct the wrong encoding.
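
    For reference, an EventCode/UMask pair such as the one listed above
    is conventionally packed into a single raw value with the event code
    in bits 0-7 and the unit mask in bits 8-15. The helper names below
    are hypothetical and only illustrate that packing.

```c
#include <stdint.h>

/* Pack an event code and unit mask into one raw config value. */
static uint64_t demo_raw_event(uint8_t event_code, uint8_t umask)
{
	return ((uint64_t)umask << 8) | event_code;
}

/* Extract the event code (bits 0-7). */
static uint8_t demo_event_code(uint64_t config)
{
	return (uint8_t)(config & 0xff);
}

/* Extract the unit mask (bits 8-15). */
static uint8_t demo_umask(uint64_t config)
{
	return (uint8_t)((config >> 8) & 0xff);
}
```

    With EventCode 0xc9 and UMask 0x04, demo_raw_event() returns 0x04c9.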

    Fixes: 6017608936c1 ("perf/x86/intel: Add Icelake support")
    Signed-off-by: Kan Liang
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20201125213720.15692-1-kan.liang@linux.intel.com
    Signed-off-by: Greg Kroah-Hartman

    Kan Liang
     
  • commit 306e3e91edf1c6739a55312edd110d298ff498dd upstream.

    The event CYCLE_ACTIVITY.STALLS_MEM_ANY (0x14a3) should be available on
    all 8 GP counters on ICL, but it's only scheduled on the first four
    counters due to the current ICL constraint table.

    Add a line for the CYCLE_ACTIVITY.STALLS_MEM_ANY event in the ICL
    constraint table.
    Correct the comments for the CYCLE_ACTIVITY.CYCLES_MEM_ANY event.
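
    One way to picture why a missing table line restricts an event is a
    first-match lookup, sketched below. This is a simplified model and
    not the kernel's constraint machinery: struct demo_constraint, the
    table contents other than the 0x14a3 entry, and the fallback mask are
    assumptions for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constraint entry for this sketch. */
struct demo_constraint {
	uint64_t code;     /* value to match against the event config */
	uint64_t match;    /* which config bits take part in the match */
	uint64_t counters; /* bitmask of counters the event may use */
};

static const struct demo_constraint demo_table[] = {
	{ 0x14a3, 0xffff, 0xff }, /* event + umask entry: all 8 counters */
	{ 0x00a3, 0x00ff, 0x0f }, /* event-only entry: first four counters */
};

/* Return the counter mask of the first matching entry. */
static uint64_t demo_allowed_counters(uint64_t config)
{
	size_t n = sizeof(demo_table) / sizeof(demo_table[0]);

	for (size_t i = 0; i < n; i++) {
		if ((config & demo_table[i].match) == demo_table[i].code)
			return demo_table[i].counters;
	}
	return 0xff; /* no entry: unconstrained in this model */
}
```

    In this model, removing the first table entry would make config
    0x14a3 fall through to the event-only entry and be limited to the
    first four counters.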

    Fixes: 6017608936c1 ("perf/x86/intel: Add Icelake support")
    Reported-by: Andi Kleen
    Signed-off-by: Kan Liang
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20201019164529.32154-1-kan.liang@linux.intel.com
    Signed-off-by: Greg Kroah-Hartman

    Kan Liang
     
  • commit dcf5aedb24f899d537e21c18ea552c780598d352 upstream.

    Use temporary slots in the reclaim function to avoid a possible race
    when freeing them.

    While at it, make sure we check the CLAIMED flag under the page lock in
    the reclaim function so that we are not racing with z3fold_alloc().

    Link: https://lkml.kernel.org/r/20201209145151.18994-4-vitaly.wool@konsulko.com
    Signed-off-by: Vitaly Wool
    Cc:
    Cc: Mike Galbraith
    Cc: Sebastian Andrzej Siewior
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Vitaly Wool