20 Jan, 2021

1 commit

  • This is the 5.10.8 stable release

    * tag 'v5.10.8': (104 commits)
    Linux 5.10.8
    tools headers UAPI: Sync linux/fscrypt.h with the kernel sources
    drm/panfrost: Remove unused variables in panfrost_job_close()
    ...

    Signed-off-by: Jason Liu

    Jason Liu
     

19 Jan, 2021

1 commit

  • Changes in 5.10.8
    powerpc/32s: Fix RTAS machine check with VMAP stack
    io_uring: synchronise IOPOLL on task_submit fail
    io_uring: limit {io|sq}poll submit locking scope
    io_uring: patch up IOPOLL overflow_flush sync
    RDMA/hns: Avoid filling sl in high 3 bits of vlan_id
    iommu/arm-smmu-qcom: Initialize SCTLR of the bypass context
    drm/panfrost: Don't corrupt the queue mutex on open/close
    io_uring: Fix return value from alloc_fixed_file_ref_node
    scsi: ufs: Fix -Wsometimes-uninitialized warning
    btrfs: skip unnecessary searches for xattrs when logging an inode
    btrfs: fix deadlock when cloning inline extent and low on free metadata space
    btrfs: shrink delalloc pages instead of full inodes
    net: cdc_ncm: correct overhead in delayed_ndp_size
    net: hns3: fix incorrect handling of sctp6 rss tuple
    net: hns3: fix the number of queues actually used by ARQ
    net: hns3: fix a phy loopback fail issue
    net: stmmac: dwmac-sun8i: Fix probe error handling
    net: stmmac: dwmac-sun8i: Balance internal PHY resource references
    net: stmmac: dwmac-sun8i: Balance internal PHY power
    net: stmmac: dwmac-sun8i: Balance syscon (de)initialization
    net: vlan: avoid leaks on register_vlan_dev() failures
    net/sonic: Fix some resource leaks in error handling paths
    net: bareudp: add missing error handling for bareudp_link_config()
    ptp: ptp_ines: prevent build when HAS_IOMEM is not set
    net: ipv6: fib: flush exceptions when purging route
    tools: selftests: add test for changing routes with PTMU exceptions
    net: fix pmtu check in nopmtudisc mode
    net: ip: always refragment ip defragmented packets
    chtls: Fix hardware tid leak
    chtls: Remove invalid set_tcb call
    chtls: Fix panic when route to peer not configured
    chtls: Avoid unnecessary freeing of oreq pointer
    chtls: Replace skb_dequeue with skb_peek
    chtls: Added a check to avoid NULL pointer dereference
    chtls: Fix chtls resources release sequence
    octeontx2-af: fix memory leak of lmac and lmac->name
    nexthop: Fix off-by-one error in error path
    nexthop: Unlink nexthop group entry in error path
    nexthop: Bounce NHA_GATEWAY in FDB nexthop groups
    s390/qeth: fix deadlock during recovery
    s390/qeth: fix locking for discipline setup / removal
    s390/qeth: fix L2 header access in qeth_l3_osa_features_check()
    net: dsa: lantiq_gswip: Exclude RMII from modes that report 1 GbE
    net/mlx5: Use port_num 1 instead of 0 when delete a RoCE address
    net/mlx5e: ethtool, Fix restriction of autoneg with 56G
    net/mlx5e: In skb build skip setting mark in switchdev mode
    net/mlx5: Check if lag is supported before creating one
    scsi: lpfc: Fix variable 'vport' set but not used in lpfc_sli4_abts_err_handler()
    ionic: start queues before announcing link up
    HID: wacom: Fix memory leakage caused by kfifo_alloc
    fanotify: Fix sys_fanotify_mark() on native x86-32
    ARM: OMAP2+: omap_device: fix idling of devices during probe
    i2c: sprd: use a specific timeout to avoid system hang up issue
    dmaengine: dw-edma: Fix use after free in dw_edma_alloc_chunk()
    selftests/bpf: Clarify build error if no vmlinux
    can: tcan4x5x: fix bittiming const, use common bittiming from m_can driver
    can: m_can: m_can_class_unregister(): remove erroneous m_can_clk_stop()
    can: kvaser_pciefd: select CONFIG_CRC32
    spi: spi-geni-qcom: Fail new xfers if xfer/cancel/abort pending
    cpufreq: powernow-k8: pass policy rather than use cpufreq_cpu_get()
    spi: spi-geni-qcom: Fix geni_spi_isr() NULL dereference in timeout case
    spi: stm32: FIFO threshold level - fix align packet size
    i2c: i801: Fix the i2c-mux gpiod_lookup_table not being properly terminated
    i2c: mediatek: Fix apdma and i2c hand-shake timeout
    bcache: set bcache device into read-only mode for BCH_FEATURE_INCOMPAT_OBSO_LARGE_BUCKET
    interconnect: imx: Add a missing of_node_put after of_device_is_available
    interconnect: qcom: fix rpmh link failures
    dmaengine: mediatek: mtk-hsdma: Fix a resource leak in the error handling path of the probe function
    dmaengine: milbeaut-xdmac: Fix a resource leak in the error handling path of the probe function
    dmaengine: xilinx_dma: check dma_async_device_register return value
    dmaengine: xilinx_dma: fix incompatible param warning in _child_probe()
    dmaengine: xilinx_dma: fix mixed_enum_type coverity warning
    arm64: mm: Fix ARCH_LOW_ADDRESS_LIMIT when !CONFIG_ZONE_DMA
    qed: select CONFIG_CRC32
    phy: dp83640: select CONFIG_CRC32
    wil6210: select CONFIG_CRC32
    block: rsxx: select CONFIG_CRC32
    lightnvm: select CONFIG_CRC32
    zonefs: select CONFIG_CRC32
    iommu/vt-d: Fix misuse of ALIGN in qi_flush_piotlb()
    iommu/intel: Fix memleak in intel_irq_remapping_alloc
    bpftool: Fix compilation failure for net.o with older glibc
    nvme-tcp: Fix possible race of io_work and direct send
    net/mlx5e: Fix memleak in mlx5e_create_l2_table_groups
    net/mlx5e: Fix two double free cases
    regmap: debugfs: Fix a memory leak when calling regmap_attach_dev
    wan: ds26522: select CONFIG_BITREVERSE
    arm64: cpufeature: remove non-exist CONFIG_KVM_ARM_HOST
    regulator: qcom-rpmh-regulator: correct hfsmps515 definition
    net: mvpp2: disable force link UP during port init procedure
    drm/i915/dp: Track pm_qos per connector
    net: mvneta: fix error message when MTU too large for XDP
    selftests: fib_nexthops: Fix wrong mausezahn invocation
    KVM: arm64: Don't access PMCR_EL0 when no PMU is available
    xsk: Fix race in SKB mode transmit with shared cq
    xsk: Rollback reservation at NETDEV_TX_BUSY
    block/rnbd-clt: avoid module unload race with close confirmation
    can: isotp: isotp_getname(): fix kernel information leak
    block: fix use-after-free in disk_part_iter_next
    net: drop bogus skb with CHECKSUM_PARTIAL and offset beyond end of trimmed packet
    regmap: debugfs: Fix a reversed if statement in regmap_debugfs_init()
    drm/panfrost: Remove unused variables in panfrost_job_close()
    tools headers UAPI: Sync linux/fscrypt.h with the kernel sources
    Linux 5.10.8

    Signed-off-by: Greg Kroah-Hartman
    Change-Id: Ib8272ec9f47a3c3813509bcacece3b16137332e1

    Greg Kroah-Hartman
     

17 Jan, 2021

1 commit

  • commit 095507dc1350b3a2b8b39fdc05edba0c10859eca upstream.

    Systems configured with CONFIG_ZONE_DMA32, CONFIG_ZONE_NORMAL and
    !CONFIG_ZONE_DMA will fail to properly set up ARCH_LOW_ADDRESS_LIMIT.
    The limit will default to ~0ULL, effectively spanning the whole memory,
    which is too high for a configuration that expects low memory to be
    capped at 4GB.

    Fix ARCH_LOW_ADDRESS_LIMIT by falling back to arm64_dma32_phys_limit
    when arm64_dma_phys_limit isn't set. arm64_dma32_phys_limit will honour
    CONFIG_ZONE_DMA32, or span the entire memory when not enabled.
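
    In sketch form, the fallback amounts to something like the following
    (a simplified reading of the patch, not the verbatim definition):

        /* arm64_dma_phys_limit is 0 when ZONE_DMA is not in use, so
         * fall back to the ZONE_DMA32 limit. */
        #define ARCH_LOW_ADDRESS_LIMIT \
                ((arm64_dma_phys_limit) ? : (arm64_dma32_phys_limit))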

    Fixes: 1a8e1cef7603 ("arm64: use both ZONE_DMA and ZONE_DMA32")
    Signed-off-by: Nicolas Saenz Julienne
    Link: https://lore.kernel.org/r/20201218163307.10150-1-nsaenzjulienne@suse.de
    Signed-off-by: Catalin Marinas
    Signed-off-by: Greg Kroah-Hartman

    Nicolas Saenz Julienne
     

14 Jan, 2021

1 commit

  • Non-coherent devices on systems that support a system or
    last level cache may want to request that allocations be
    cached in the system cache. For memory that is allocated
    by the kernel, and used for DMA with devices, the memory
    attributes used for CPU access should match the memory
    attributes that will be used for device access.

    The memory attributes that need to be programmed into
    the MAIR for system cache usage are:

    0xf4 - Normal memory, outer write back read/write allocate,
    inner non-cacheable.

    There is currently no support for this memory attribute for
    CPU mappings, so add it.
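
    A hedged sketch of what the addition looks like; only the 0xf4 encoding
    comes from the description above, while the macro names and the free
    MAIR index are assumptions:

        /* Normal memory, outer write-back read/write allocate,
         * inner non-cacheable. */
        #define MAIR_ATTR_NORMAL_iNC_oWB  UL(0xf4)
        #define MT_NORMAL_iNC_oWB         7   /* assumed free MAIR_EL1 slot */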

    Bug: 176778547
    Change-Id: I3abc7becd408f20ac5499cbbe3c6c6f53f784107
    Signed-off-by: Isaac J. Manjarres

    Isaac J. Manjarres
     

13 Jan, 2021

1 commit

  • After certain memory blocks are offlined, some use cases require that
    the page-table mappings be removed while still keeping the memblock
    device nodes, memory resources and memmap entries intact. This avoids
    the overhead of using 'remove_memory' when the offlined blocks will be
    added/onlined back into the system at a later point.
    {populate/depopulate}_range_driver_managed give drivers the ability to
    tear down and create page-table mappings for memory blocks they manage,
    without having to use 'remove_memory' and friends for the purpose.
    These functions do not interfere with mappings of boot memory or with
    resources that aren't owned by the driver.
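
    A hypothetical usage sketch; the function names come from the
    description above, but the exact signatures are assumptions:

        /* Drop the linear-map entries for an offlined block the driver
         * owns, and re-create them before onlining it again. */
        depopulate_range_driver_managed(start, size, "example_driver");
        /* ... later, when the block is to be onlined again ... */
        populate_range_driver_managed(start, size, "example_driver");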

    Bug: 171907330
    Change-Id: Ie11201334bd7438bf87f933ccc81814fa99162c4
    Signed-off-by: Sudarshan Rajagopalan

    Sudarshan Rajagopalan
     

18 Dec, 2020

1 commit

  • * arch/next: (106 commits)
    soc: fsl: qbman: Ensure device cleanup is run for kexec
    drivers/soc/fsl: add EPU FSM configuration for deep sleep
    fsl_pmc: update device bindings
    powerpc/pm: Fix suspend=n in menuconfig for e500mc platforms.
    powerpc/pm: add sleep and deep sleep on QorIQ SoCs
    ...

    BJ DevOps Team
     

16 Dec, 2020

7 commits

  • When section mappings are enabled, we allocate vmemmap pages from
    physically contiguous memory of size PMD_SIZE using
    vmemmap_alloc_block_buf(). Section mappings are good for reducing TLB
    pressure. But when the system is highly fragmented and memory blocks
    are being hot-added at runtime, it's possible that such physically
    contiguous allocations can fail. Rather than failing the memory hot-add
    procedure, add a fallback option to allocate vmemmap pages from
    discontiguous pages using vmemmap_populate_basepages().
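
    The fallback boils down to a sketch like this inside the mapping loop
    of arm64's vmemmap_populate() (simplified):

        p = vmemmap_alloc_block_buf(PMD_SIZE, node, altmap);
        if (!p) {
                /* PMD_SIZE contiguous allocation failed: map this range
                 * with base pages instead of failing the hot-add. */
                if (vmemmap_populate_basepages(addr, next, node, altmap))
                        return -ENOMEM;
                continue;
        }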

    Signed-off-by: Sudarshan Rajagopalan
    Reviewed-by: Gavin Shan
    Reviewed-by: Anshuman Khandual
    Acked-by: Will Deacon
    Cc: Will Deacon
    Cc: Anshuman Khandual
    Cc: Mark Rutland
    Cc: Logan Gunthorpe
    Cc: David Hildenbrand
    Cc: Andrew Morton
    Cc: Steven Price
    Link: https://lore.kernel.org/r/d6c06f2ef39bbe6c715b2f6db76eb16155fdcee6.1602722808.git.sudaraja@codeaurora.org
    Signed-off-by: Catalin Marinas

    (cherry picked from commit 9f84f39f5515fd412398a1019e3f50ac3ab51a80)

    Bug: 170202780
    Signed-off-by: Suren Baghdasaryan
    Change-Id: I449b6d2dfab6b480e6d9487aa5e25f4281d0eabd

    Sudarshan Rajagopalan
     
  • We recently introduced a 1 GB sized ZONE_DMA to cater for platforms
    incorporating masters that can address less than 32 bits of DMA, in
    particular the Raspberry Pi 4, which has 4 or 8 GB of DRAM, but has
    peripherals that can only address up to 1 GB (and its PCIe host
    bridge can only access the bottom 3 GB).

    Instructing the DMA layer about these limitations is straightforward,
    even though we had to fix some issues regarding memory limits set in
    the IORT for named components, and regarding the handling of ACPI _DMA
    methods. However, the DMA layer also needs to be able to allocate
    memory that is guaranteed to meet those DMA constraints, for bounce
    buffering as well as allocating the backing for consistent mappings.

    This is why the 1 GB ZONE_DMA was introduced recently. Unfortunately,
    it turns out that having a 1 GB ZONE_DMA as well as a ZONE_DMA32 causes
    problems with kdump, and potentially in other places where allocations
    cannot cross zone boundaries. Therefore, we should avoid having two
    separate DMA zones when possible.

    So let's do an early scan of the IORT, and only create the ZONE_DMA
    if we encounter any devices that need it. This puts the burden on
    the firmware to describe such limitations in the IORT, which may be
    redundant (and less precise) if _DMA methods are also being provided.
    However, it should be noted that this situation is highly unusual for
    arm64 ACPI machines. Also, the DMA subsystem still gives precedence to
    the _DMA method if implemented, and so we will not lose the ability to
    perform streaming DMA outside the ZONE_DMA if the _DMA method permits
    it.

    [nsaenz: unified implementation with DT's counterpart]

    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Nicolas Saenz Julienne
    Tested-by: Jeremy Linton
    Acked-by: Lorenzo Pieralisi
    Acked-by: Hanjun Guo
    Cc: Jeremy Linton
    Cc: Lorenzo Pieralisi
    Cc: Nicolas Saenz Julienne
    Cc: Rob Herring
    Cc: Christoph Hellwig
    Cc: Robin Murphy
    Cc: Hanjun Guo
    Cc: Sudeep Holla
    Cc: Anshuman Khandual
    Link: https://lore.kernel.org/r/20201119175400.9995-7-nsaenzjulienne@suse.de
    Signed-off-by: Catalin Marinas

    Ard Biesheuvel
     
  • We recently introduced a 1 GB sized ZONE_DMA to cater for platforms
    incorporating masters that can address less than 32 bits of DMA, in
    particular the Raspberry Pi 4, which has 4 or 8 GB of DRAM, but has
    peripherals that can only address up to 1 GB (and its PCIe host
    bridge can only access the bottom 3 GB).

    The DMA layer also needs to be able to allocate memory that is
    guaranteed to meet those DMA constraints, for bounce buffering as well
    as allocating the backing for consistent mappings. This is why the 1 GB
    ZONE_DMA was introduced recently. Unfortunately, it turns out that
    having a 1 GB ZONE_DMA as well as a ZONE_DMA32 causes problems with
    kdump, and
    potentially in other places where allocations cannot cross zone
    boundaries. Therefore, we should avoid having two separate DMA zones
    when possible.

    So, with the help of of_dma_get_max_cpu_address(), get the topmost
    physical address accessible to all DMA masters in the system and use
    that information to fine-tune ZONE_DMA's size. In the absence of
    addressing-limited masters ZONE_DMA will span the whole 32-bit address
    space,
    otherwise, in the case of the Raspberry Pi 4 it'll only span the 30-bit
    address space, and have ZONE_DMA32 cover the rest of the 32-bit address
    space.
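
    In sketch form (simplified from the patch):

        unsigned int dt_zone_dma_bits;

        /* Lowest DMA ceiling among all masters described in the DT,
         * capped at 32 bits when nothing more restrictive is found. */
        dt_zone_dma_bits = fls64(of_dma_get_max_cpu_address(NULL));
        zone_dma_bits = min(32U, dt_zone_dma_bits);
        arm64_dma_phys_limit = max_zone_phys(zone_dma_bits);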

    Signed-off-by: Nicolas Saenz Julienne
    Link: https://lore.kernel.org/r/20201119175400.9995-6-nsaenzjulienne@suse.de
    Signed-off-by: Catalin Marinas

    Nicolas Saenz Julienne
     
  • zone_dma_bits's initialization happens earlier than it's actually
    needed, in arm64_memblock_init(). So move it into the more suitable
    zone_sizes_init().

    Signed-off-by: Nicolas Saenz Julienne
    Tested-by: Jeremy Linton
    Link: https://lore.kernel.org/r/20201119175400.9995-3-nsaenzjulienne@suse.de
    Signed-off-by: Catalin Marinas

    Nicolas Saenz Julienne
     
  • crashkernel might reserve memory located in ZONE_DMA. We plan to delay
    ZONE_DMA's initialization after unflattening the devicetree and ACPI's
    boot table initialization, so move it later in the boot process.
    Specifically into bootmem_init() since request_standard_resources()
    depends on it.

    Signed-off-by: Nicolas Saenz Julienne
    Tested-by: Jeremy Linton
    Link: https://lore.kernel.org/r/20201119175400.9995-2-nsaenzjulienne@suse.de
    Signed-off-by: Catalin Marinas

    Nicolas Saenz Julienne
     
  • mem_init() currently relies on knowing the boundaries of the crashkernel
    reservation to map such region with page granularity for later
    unmapping via set_memory_valid(..., 0). If the crashkernel reservation
    is deferred, such boundaries are not known when the linear mapping is
    created. Simply parse the command line for "crashkernel" and, if found,
    create the linear map with NO_BLOCK_MAPPINGS.
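
    A sketch of the command-line check (close in shape to the patch,
    details assumed):

        static bool crash_mem_map __initdata;

        static int __init enable_crash_mem_map(char *arg)
        {
                /* Only the presence of "crashkernel" matters here; the
                 * region itself is reserved later in boot. */
                crash_mem_map = true;
                return 0;
        }
        early_param("crashkernel", enable_crash_mem_map);

        /* ... and in map_mem(): */
        if (crash_mem_map)
                flags |= NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS;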

    Signed-off-by: Catalin Marinas
    Tested-by: Nicolas Saenz Julienne
    Reviewed-by: Nicolas Saenz Julienne
    Acked-by: James Morse
    Cc: James Morse
    Cc: Nicolas Saenz Julienne
    Link: https://lore.kernel.org/r/20201119175556.18681-1-catalin.marinas@arm.com
    Signed-off-by: Catalin Marinas

    Catalin Marinas
     
  • Currently, the kernel assumes that if RAM starts above 32-bit (or
    zone_bits), there is still a ZONE_DMA/DMA32 at the bottom of the RAM
    and such constrained devices have a hardwired DMA offset. In practice,
    we haven't noticed any such hardware, so let's assume that we can
    expand ZONE_DMA32 to the available memory if there is no RAM below
    4GB. Similarly, ZONE_DMA is expanded to the 4GB limit if no RAM is
    addressable by zone_bits.

    Signed-off-by: Catalin Marinas
    Tested-by: Nicolas Saenz Julienne
    Reviewed-by: Nicolas Saenz Julienne
    Cc: Nicolas Saenz Julienne
    Cc: Robin Murphy
    Link: https://lore.kernel.org/r/20201118185809.1078362-1-catalin.marinas@arm.com
    Signed-off-by: Catalin Marinas

    Catalin Marinas
     

02 Dec, 2020

1 commit

  • As a hardening measure, we currently randomize the placement of
    physical memory inside the linear region when KASLR is in effect.
    Since the random offset at which to place the available physical
    memory inside the linear region is chosen early at boot, it is
    based on the memblock description of memory, which does not cover
    hotplug memory. The consequence of this is that the randomization
    offset may be chosen such that any hotplugged memory located above
    memblock_end_of_DRAM() that appears later is pushed off the end of
    the linear region, where it cannot be accessed.

    So let's limit this randomization of the linear region to ensure
    that this can no longer happen, by using the CPU's addressable PA
    range instead. As it is guaranteed that no hotpluggable memory will
    appear that falls outside of that range, we can safely put this PA
    range sized window anywhere in the linear region.
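
    In sketch form, the randomization window is now sized by the CPU's PA
    range rather than by the DRAM currently present (simplified):

        /* The window must hold any PA the CPU can address, so memory
         * hotplugged above memblock_end_of_DRAM() still fits. */
        u64 range = linear_region_size - BIT(PHYS_MASK_SHIFT);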

    Signed-off-by: Ard Biesheuvel
    Cc: Anshuman Khandual
    Cc: Will Deacon
    Cc: Steven Price
    Cc: Robin Murphy
    Link: https://lore.kernel.org/r/20201014081857.3288-1-ardb@kernel.org
    Signed-off-by: Catalin Marinas

    Bug: 173725282
    (cherry picked from commit 97d6786e0669daa5c2f2d07a057f574e849dfd3e
    git: https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git)
    Link: https://lore.kernel.org/linux-arm-kernel/20201014081857.3288-1-ardb@kernel.org/
    Signed-off-by: Suren Baghdasaryan
    Change-Id: Ia7ef47090fe334b29aab711d608f424e23e7fb92

    Ard Biesheuvel
     

30 Nov, 2020

2 commits

  • In debug_exception_enter() and debug_exception_exit() we trace hardirqs
    on/off while RCU isn't guaranteed to be watching, and we don't save and
    restore the hardirq state, and so may return with this having changed.

    Handle this appropriately with new entry/exit helpers which do the bare
    minimum to ensure this is appropriately maintained, without marking
    debug exceptions as NMIs. These are placed in entry-common.c with the
    other entry/exit helpers.

    In future we'll want to reconsider whether some debug exceptions should
    be NMIs, but this will require a significant refactoring, and for now
    this should prevent issues with lockdep and RCU.

    Signed-off-by: Mark Rutland
    Cc: Catalin Marinas
    Cc: James Morse
    Cc: Will Deacon
    Link: https://lore.kernel.org/r/20201130115950.22492-12-mark.rutland@arm.com
    Signed-off-by: Will Deacon

    Mark Rutland
     
  • When built with PROVE_LOCKING, NO_HZ_FULL, and CONTEXT_TRACKING_FORCE,
    the kernel will WARN() at boot time that interrupts are enabled when we
    call context_tracking_user_enter(), despite the DAIF flags indicating
    that IRQs are masked.

    The problem is that we're not tracking IRQ flag changes accurately, and
    so lockdep believes interrupts are enabled when they are not (and
    vice-versa). We can shuffle things around to make this more accurate.
    For kernel->user transitions there are a number of constraints we need
    to consider:

    1) When we call __context_tracking_user_enter() HW IRQs must be disabled
    and lockdep must be up-to-date with this.

    2) Userspace should be treated as having IRQs enabled from the PoV of
    both lockdep and tracing.

    3) As context_tracking_user_enter() stops RCU from watching, we cannot
    use RCU after calling it.

    4) IRQ flag tracing and lockdep have state that must be manipulated
    before RCU is disabled.

    ... with similar constraints applying for user->kernel transitions, with
    the ordering reversed.

    The generic entry code has enter_from_user_mode() and
    exit_to_user_mode() helpers to handle this. We can't use those directly,
    so we add arm64 copies for now (without the instrumentation markers
    which aren't used on arm64). These replace the existing user_exit() and
    user_exit_irqoff() calls spread throughout handlers, and the exception
    unmasking is left as-is.
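
    The arm64 copies look roughly like this sketch, honouring the
    constraints listed above (simplified):

        static __always_inline void enter_from_user_mode(void)
        {
                /* Update lockdep state, then wake RCU; the tracing hook
                 * runs last as it may itself rely on RCU. */
                lockdep_hardirqs_off(CALLER_ADDR0);
                CT_WARN_ON(ct_state() != CONTEXT_USER);
                user_exit_irqoff();
                trace_hardirqs_off_finish();
        }

        static __always_inline void exit_to_user_mode(void)
        {
                /* Mirror image: tracing/lockdep prep while RCU is still
                 * watching, then stop RCU watching for userspace. */
                trace_hardirqs_on_prepare();
                lockdep_hardirqs_on_prepare(CALLER_ADDR0);
                user_enter_irqoff();
                lockdep_hardirqs_on(CALLER_ADDR0);
        }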

    Note that:

    * The accounting for debug exceptions from userspace now happens in
    el0_dbg() and ret_to_user(), so this is removed from
    debug_exception_enter() and debug_exception_exit(). As
    user_exit_irqoff() wakes RCU, the userspace-specific check is removed.

    * The accounting for syscalls now happens in el0_svc(),
    el0_svc_compat(), and ret_to_user(), so this is removed from
    el0_svc_common(). This does not adversely affect the workaround for
    erratum 1463225, as this does not depend on any of the state tracking.

    * In ret_to_user() we mask interrupts with local_daif_mask(), and so we
    need to inform lockdep and tracing. Here a trace_hardirqs_off() is
    sufficient and safe as we have not yet exited kernel context and RCU
    is usable.

    * As PROVE_LOCKING selects TRACE_IRQFLAGS, the ifdeferry in entry.S only
    needs to check for the latter.

    * EL0 SError handling will be dealt with in a subsequent patch, as this
    needs to be treated as an NMI.

    Prior to this patch, booting an appropriately-configured kernel would
    result in splats as below:

    | DEBUG_LOCKS_WARN_ON(lockdep_hardirqs_enabled())
    | WARNING: CPU: 2 PID: 1 at kernel/locking/lockdep.c:5280 check_flags.part.54+0x1dc/0x1f0
    | Modules linked in:
    | CPU: 2 PID: 1 Comm: init Not tainted 5.10.0-rc3 #3
    | Hardware name: linux,dummy-virt (DT)
    | pstate: 804003c5 (Nzcv DAIF +PAN -UAO -TCO BTYPE=--)
    | pc : check_flags.part.54+0x1dc/0x1f0
    | lr : check_flags.part.54+0x1dc/0x1f0
    | sp : ffff80001003bd80
    | x29: ffff80001003bd80 x28: ffff66ce801e0000
    | x27: 00000000ffffffff x26: 00000000000003c0
    | x25: 0000000000000000 x24: ffffc31842527258
    | x23: ffffc31842491368 x22: ffffc3184282d000
    | x21: 0000000000000000 x20: 0000000000000001
    | x19: ffffc318432ce000 x18: 0080000000000000
    | x17: 0000000000000000 x16: ffffc31840f18a78
    | x15: 0000000000000001 x14: ffffc3184285c810
    | x13: 0000000000000001 x12: 0000000000000000
    | x11: ffffc318415857a0 x10: ffffc318406614c0
    | x9 : ffffc318415857a0 x8 : ffffc31841f1d000
    | x7 : 647261685f706564 x6 : ffffc3183ff7c66c
    | x5 : ffff66ce801e0000 x4 : 0000000000000000
    | x3 : ffffc3183fe00000 x2 : ffffc31841500000
    | x1 : e956dc24146b3500 x0 : 0000000000000000
    | Call trace:
    | check_flags.part.54+0x1dc/0x1f0
    | lock_is_held_type+0x10c/0x188
    | rcu_read_lock_sched_held+0x70/0x98
    | __context_tracking_enter+0x310/0x350
    | context_tracking_enter.part.3+0x5c/0xc8
    | context_tracking_user_enter+0x6c/0x80
    | finish_ret_to_user+0x2c/0x13c

    Signed-off-by: Mark Rutland
    Cc: Catalin Marinas
    Cc: James Morse
    Cc: Will Deacon
    Link: https://lore.kernel.org/r/20201130115950.22492-8-mark.rutland@arm.com
    Signed-off-by: Will Deacon

    Mark Rutland
     

13 Nov, 2020

1 commit

  • During memory hotplug process, the linear mapping should not be created for
    a given memory range if that would fall outside the maximum allowed linear
    range. Else it might cause memory corruption in the kernel virtual space.

    The maximum linear mapping region is [PAGE_OFFSET..(PAGE_END - 1)],
    accommodating both its ends but excluding PAGE_END. The maximum
    physical range that can be mapped inside this linear mapping range
    must also be derived from its end points.

    This ensures that arch_add_memory() validates memory hot add range for its
    potential linear mapping requirements, before creating it with
    __create_pgd_mapping().
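
    The validation amounts to a bounds check like this sketch (simplified
    from the patch):

        static bool inside_linear_region(u64 start, u64 size)
        {
                /* [start, start + size) must fit within the PA range
                 * that [PAGE_OFFSET..(PAGE_END - 1)] can map. */
                return start >= __pa(_PAGE_OFFSET(vabits_actual)) &&
                       (start + size - 1) <= __pa(PAGE_END - 1);
        }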

    Fixes: 4ab215061554 ("arm64: Add memory hotplug support")
    Signed-off-by: Anshuman Khandual
    Reviewed-by: Ard Biesheuvel
    Cc: Catalin Marinas
    Cc: Will Deacon
    Cc: Mark Rutland
    Cc: Ard Biesheuvel
    Cc: Steven Price
    Cc: Robin Murphy
    Cc: David Hildenbrand
    Cc: Andrew Morton
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: linux-kernel@vger.kernel.org
    Link: https://lore.kernel.org/r/1605252614-761-1-git-send-email-anshuman.khandual@arm.com
    Signed-off-by: Will Deacon

    Anshuman Khandual
     

29 Oct, 2020

1 commit

  • On Cortex-A77 r0p0 and r1p0, a sequence of a non-cacheable or device load
    and a store exclusive or PAR_EL1 read can cause a deadlock.

    The workaround requires a DMB SY before and after a PAR_EL1 register
    read. In addition, it's possible an interrupt (doing a device read) or
    KVM guest exit could be taken between the DMB and PAR read, so we
    also need a DMB before returning from interrupt and before returning to
    a guest.

    A deadlock is still possible with the workaround as KVM guests must
    also have the workaround. IOW, a malicious guest can deadlock an
    affected system.

    This workaround also depends on a firmware counterpart to enable the h/w
    to insert DMB SY after load and store exclusive instructions. See the
    errata document SDEN-1152370 v10 [1] for more information.

    [1] https://static.docs.arm.com/101992/0010/Arm_Cortex_A77_MP074_Software_Developer_Errata_Notice_v10.pdf
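
    On affected systems every PAR_EL1 read ends up bracketed by barriers,
    roughly like this sketch (the upstream helper gates the barriers on
    the erratum capability):

        dmb(sy);                        /* DMB SY before the read */
        par = read_sysreg(par_el1);
        dmb(sy);                        /* ... and after it */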

    Signed-off-by: Rob Herring
    Reviewed-by: Catalin Marinas
    Acked-by: Marc Zyngier
    Cc: Catalin Marinas
    Cc: James Morse
    Cc: Suzuki K Poulose
    Cc: Will Deacon
    Cc: Julien Thierry
    Cc: kvmarm@lists.cs.columbia.edu
    Link: https://lore.kernel.org/r/20201028182839.166037-2-robh@kernel.org
    Signed-off-by: Will Deacon

    Rob Herring
     

26 Oct, 2020

1 commit

  • Use a more generic form for __section that requires quotes to avoid
    complications with clang and gcc differences.

    Remove the quote operator # from compiler_attributes.h __section macro.

    Convert all unquoted __section(foo) uses to quoted __section("foo").
    Also convert __attribute__((section("foo"))) uses to __section("foo")
    even if the __attribute__ has multiple list entry forms.
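
    For example, a (hypothetical) declaration is converted like so:

        static int setup_done __section(.init.data);    /* before: unquoted */
        static int setup_done __section(".init.data");  /* after: quoted */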

    Conversion done using the script at:

    https://lore.kernel.org/lkml/75393e5ddc272dc7403de74d645e6c6e0f4e70eb.camel@perches.com/2-convert_section.pl

    Signed-off-by: Joe Perches
    Reviewed-by: Nick Desaulniers
    Reviewed-by: Miguel Ojeda
    Signed-off-by: Linus Torvalds

    Joe Perches
     

24 Oct, 2020

1 commit

  • Pull more arm64 updates from Will Deacon:
    "A small selection of further arm64 fixes and updates. Most of these
    are fixes that came in during the merge window, with the exception of
    the HAVE_MOVE_PMD mremap() speed-up which we discussed back in 2018
    and somehow forgot to enable upstream.

    - Improve performance of Spectre-v2 mitigation on Falkor CPUs (if
    you're lucky enough to have one)

    - Select HAVE_MOVE_PMD. This has been shown to improve mremap()
    performance, which is used heavily by the Android runtime GC, and
    it seems we forgot to enable this upstream back in 2018.

    - Ensure linker flags are consistent between LLVM and BFD

    - Fix stale comment in Spectre mitigation rework

    - Fix broken copyright header

    - Fix KASLR randomisation of the linear map

    - Prevent arm64-specific prctl()s from compat tasks (return -EINVAL)"

    Link: https://lore.kernel.org/kvmarm/20181108181201.88826-3-joelaf@google.com/

    * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
    arm64: proton-pack: Update comment to reflect new function name
    arm64: spectre-v2: Favour CPU-specific mitigation at EL2
    arm64: link with -z norelro regardless of CONFIG_RELOCATABLE
    arm64: Fix a broken copyright header in gen_vdso_offsets.sh
    arm64: mremap speedup - Enable HAVE_MOVE_PMD
    arm64: mm: use single quantity to represent the PA to VA translation
    arm64: reject prctl(PR_PAC_RESET_KEYS) on compat tasks

    Linus Torvalds
     

16 Oct, 2020

1 commit

  • Pull dma-mapping updates from Christoph Hellwig:

    - rework the non-coherent DMA allocator

    - move private definitions out of <linux/dma-mapping.h>

    - lower CMA_ALIGNMENT (Paul Cercueil)

    - remove the omap1 dma address translation in favor of the common code

    - make dma-direct aware of multiple dma offset ranges (Jim Quinlan)

    - support per-node DMA CMA areas (Barry Song)

    - increase the default seg boundary limit (Nicolin Chen)

    - misc fixes (Robin Murphy, Thomas Tai, Xu Wang)

    - various cleanups

    * tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping: (63 commits)
    ARM/ixp4xx: add a missing include of dma-map-ops.h
    dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling
    dma-direct: factor out a dma_direct_alloc_from_pool helper
    dma-direct: check for highmem pages in dma_direct_alloc_pages
    dma-mapping: merge <linux/dma-noncoherent.h> into <linux/dma-map-ops.h>
    dma-mapping: move large parts of <linux/dma-direct.h> to kernel/dma
    dma-mapping: move dma-debug.h to kernel/dma/
    dma-mapping: remove <asm/dma-contiguous.h>
    dma-mapping: merge <linux/dma-contiguous.h> into <linux/dma-map-ops.h>
    dma-contiguous: remove dma_contiguous_set_default
    dma-contiguous: remove dev_set_cma_area
    dma-contiguous: remove dma_declare_contiguous
    dma-mapping: split <linux/dma-mapping.h>
    cma: decrease CMA_ALIGNMENT lower limit to 2
    firewire-ohci: use dma_alloc_pages
    dma-iommu: implement ->alloc_noncoherent
    dma-mapping: add new {alloc,free}_noncoherent dma_map_ops methods
    dma-mapping: add a new dma_alloc_pages API
    dma-mapping: remove dma_cache_sync
    53c700: convert to dma_alloc_noncoherent
    ...

    Linus Torvalds
     

15 Oct, 2020

1 commit

  • On arm64, the global variable memstart_addr represents the physical
    address of PAGE_OFFSET, and so physical to virtual translations or
    vice versa used to come down to simple additions or subtractions
    involving the values of PAGE_OFFSET and memstart_addr.

    When support for 52-bit virtual addressing was introduced, we had to
    deal with PAGE_OFFSET potentially being outside of the region that
    can be covered by the virtual range (as the 52-bit VA capable build
    needs to be able to run on systems that are only 48-bit VA capable),
    and for this reason, another translation was introduced, and recorded
    in the global variable physvirt_offset.

    However, if we go back to the original definition of memstart_addr,
    i.e., the physical address of PAGE_OFFSET, it turns out that there is
    no need for two separate translations: instead, we can simply subtract
    the size of the unaddressable VA space from memstart_addr to make the
    available physical memory appear in the 48-bit addressable VA region.

    This simplifies things, but also fixes a bug on KASLR builds, which
    may update memstart_addr later on in arm64_memblock_init(), but fails
    to update vmemmap and physvirt_offset accordingly.

    Fixes: 5383cc6efed1 ("arm64: mm: Introduce vabits_actual")
    Signed-off-by: Ard Biesheuvel
    Reviewed-by: Steve Capper
    Link: https://lore.kernel.org/r/20201008153602.9467-2-ardb@kernel.org
    Signed-off-by: Will Deacon

    Ard Biesheuvel
     

14 Oct, 2020

4 commits

  • for_each_memblock() is used to iterate over memblock.memory in a few
    places that use data from memblock_region rather than the memory ranges.

    Introduce separate for_each_mem_region() and
    for_each_reserved_mem_region() to improve encapsulation of memblock
    internals from its users.
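
    A minimal usage sketch of the new iterator:

        struct memblock_region *reg;

        /* Walk memblock.memory without open-coding memblock internals. */
        for_each_mem_region(reg)
                pr_debug("memory region: %pa + %pa\n", &reg->base, &reg->size);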

    Signed-off-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Reviewed-by: Baoquan He
    Acked-by: Ingo Molnar [x86]
    Acked-by: Thomas Bogendoerfer [MIPS]
    Acked-by: Miguel Ojeda [.clang-format]
    Cc: Andy Lutomirski
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Catalin Marinas
    Cc: Christoph Hellwig
    Cc: Daniel Axtens
    Cc: Dave Hansen
    Cc: Emil Renner Berthing
    Cc: Hari Bathini
    Cc: Ingo Molnar
    Cc: Jonathan Cameron
    Cc: Marek Szyprowski
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Palmer Dabbelt
    Cc: Paul Mackerras
    Cc: Paul Walmsley
    Cc: Peter Zijlstra
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Link: https://lkml.kernel.org/r/20200818151634.14343-18-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • There are several occurrences of the following pattern:

    for_each_memblock(memory, reg) {
            start = __pfn_to_phys(memblock_region_memory_base_pfn(reg));
            end = __pfn_to_phys(memblock_region_memory_end_pfn(reg));

            /* do something with start and end */
    }

    Using for_each_mem_range() iterator is more appropriate in such cases and
    allows simpler and cleaner code.
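
    With the iterator, the same loop becomes (sketch):

        u64 i;
        phys_addr_t start, end;

        for_each_mem_range(i, &start, &end) {
                /* do something with start and end */
        }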

    [akpm@linux-foundation.org: fix arch/arm/mm/pmsa-v7.c build]
    [rppt@linux.ibm.com: mips: fix cavium-octeon build caused by memblock refactoring]
    Link: http://lkml.kernel.org/r/20200827124549.GD167163@linux.ibm.com

    Signed-off-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Baoquan He
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Catalin Marinas
    Cc: Christoph Hellwig
    Cc: Daniel Axtens
    Cc: Dave Hansen
    Cc: Emil Renner Berthing
    Cc: Hari Bathini
    Cc: Ingo Molnar
    Cc: Ingo Molnar
    Cc: Jonathan Cameron
    Cc: Marek Szyprowski
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Miguel Ojeda
    Cc: Palmer Dabbelt
    Cc: Paul Mackerras
    Cc: Paul Walmsley
    Cc: Peter Zijlstra
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Link: https://lkml.kernel.org/r/20200818151634.14343-13-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • There are several occurrences of the following pattern:

    for_each_memblock(memory, reg) {
            start_pfn = memblock_region_memory_base_pfn(reg);
            end_pfn = memblock_region_memory_end_pfn(reg);

            /* do something with start_pfn and end_pfn */
    }

    Rather than iterate over all memblock.memory regions and each time query
    for their start and end PFNs, use for_each_mem_pfn_range() iterator to get
    simpler and clearer code.
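
    Rewritten with the iterator, the pattern becomes (sketch):

        unsigned long start_pfn, end_pfn;
        int i, nid;

        for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
                /* do something with start_pfn and end_pfn */
        }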

    Signed-off-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Reviewed-by: Baoquan He
    Acked-by: Miguel Ojeda [.clang-format]
    Cc: Andy Lutomirski
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Catalin Marinas
    Cc: Christoph Hellwig
    Cc: Daniel Axtens
    Cc: Dave Hansen
    Cc: Emil Renner Berthing
    Cc: Hari Bathini
    Cc: Ingo Molnar
    Cc: Ingo Molnar
    Cc: Jonathan Cameron
    Cc: Marek Szyprowski
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Palmer Dabbelt
    Cc: Paul Mackerras
    Cc: Paul Walmsley
    Cc: Peter Zijlstra
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Link: https://lkml.kernel.org/r/20200818151634.14343-12-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     
  • dummy_numa_init() loops over memblock.memory and passes nid=0 to
    numa_add_memblk() which essentially wraps memblock_set_node(). However,
    memblock_set_node() can cope with the entire memory span itself, so the loop
    over memblock.memory regions is redundant.

    Using a single call to memblock_set_node() rather than a loop also fixes
    an issue with a buggy ACPI firmware in which the SRAT table covers some
    but not all of the memory in the EFI memory map.

    Jonathan Cameron says:

    This issue can be easily triggered by having an SRAT table which fails
    to cover all elements of the EFI memory map.

    This firmware error is detected and a warning printed, e.g.:
    "NUMA: Warning: invalid memblk node 64 [mem 0x240000000-0x27fffffff]"
    At that point we fall back to dummy_numa_init().

    However, the failed ACPI init has left us with our memblocks all broken
    up as we split them when trying to assign them to NUMA nodes.

    We then iterate over the memblocks and add them to node 0.

    numa_add_memblk() calls memblock_set_node() which merges regions that
    were previously split up during the earlier attempt to add them to
    different nodes during parsing of SRAT.

    This means elements are moved in the memblock array and we can end up
    in a different memblock after the call to numa_add_memblk().
    Result is:

    Unable to handle kernel paging request at virtual address 0000000000003a40
    Mem abort info:
    ESR = 0x96000004
    EC = 0x25: DABT (current EL), IL = 32 bits
    SET = 0, FnV = 0
    EA = 0, S1PTW = 0
    Data abort info:
    ISV = 0, ISS = 0x00000004
    CM = 0, WnR = 0
    [0000000000003a40] user address but active_mm is swapper
    Internal error: Oops: 96000004 [#1] PREEMPT SMP

    ...

    Call trace:
    sparse_init_nid+0x5c/0x2b0
    sparse_init+0x138/0x170
    bootmem_init+0x80/0xe0
    setup_arch+0x2a0/0x5fc
    start_kernel+0x8c/0x648

    Replace the loop with a single call to memblock_set_node() covering
    the entire memory.
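
    In sketch form (the exact bounds are assumed from the description):

        /* One call covers all of memory; memblock_set_node() copes with
         * the entire span, so no per-region loop and no iteration over
         * a memblock array that may be merged under our feet. */
        ret = numa_add_memblk(0, memblock_start_of_DRAM(),
                              memblock_end_of_DRAM());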

    Signed-off-by: Mike Rapoport
    Signed-off-by: Andrew Morton
    Acked-by: Jonathan Cameron
    Acked-by: Catalin Marinas
    Cc: Andy Lutomirski
    Cc: Baoquan He
    Cc: Benjamin Herrenschmidt
    Cc: Borislav Petkov
    Cc: Christoph Hellwig
    Cc: Daniel Axtens
    Cc: Dave Hansen
    Cc: Emil Renner Berthing
    Cc: Hari Bathini
    Cc: Ingo Molnar
    Cc: Ingo Molnar
    Cc: Marek Szyprowski
    Cc: Max Filippov
    Cc: Michael Ellerman
    Cc: Michal Simek
    Cc: Miguel Ojeda
    Cc: Palmer Dabbelt
    Cc: Paul Mackerras
    Cc: Paul Walmsley
    Cc: Peter Zijlstra
    Cc: Russell King
    Cc: Stafford Horne
    Cc: Thomas Bogendoerfer
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Cc: Yoshinori Sato
    Link: https://lkml.kernel.org/r/20200818151634.14343-5-rppt@kernel.org
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

13 Oct, 2020

1 commit

  • Pull orphan section checking from Ingo Molnar:
    "Orphan link sections were a long-standing source of obscure bugs,
    because the heuristics that various linkers & compilers use to handle
    them (include these bits into the output image vs discarding them
    silently) are both highly idiosyncratic and also version dependent.

    Instead of this historically problematic mess, this tree by Kees Cook
    (et al) adds build time asserts and build time warnings if there's any
    orphan section in the kernel or if a section is not sized as expected.

    And because we relied on so many silent assumptions in this area, fix
    a metric ton of dependencies and some outright bugs related to this,
    before we can finally enable the checks on the x86, ARM and ARM64
    platforms"

    * tag 'core-build-2020-10-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
    x86/boot/compressed: Warn on orphan section placement
    x86/build: Warn on orphan section placement
    arm/boot: Warn on orphan section placement
    arm/build: Warn on orphan section placement
    arm64/build: Warn on orphan section placement
    x86/boot/compressed: Add missing debugging sections to output
    x86/boot/compressed: Remove, discard, or assert for unwanted sections
    x86/boot/compressed: Reorganize zero-size section asserts
    x86/build: Add asserts for unwanted sections
    x86/build: Enforce an empty .got.plt section
    x86/asm: Avoid generating unused kprobe sections
    arm/boot: Handle all sections explicitly
    arm/build: Assert for unwanted sections
    arm/build: Add missing sections
    arm/build: Explicitly keep .ARM.attributes sections
    arm/build: Refactor linker script headers
    arm64/build: Assert for unwanted sections
    arm64/build: Add missing DWARF sections
    arm64/build: Use common DISCARDS in linker script
    arm64/build: Remove .eh_frame* sections due to unwind tables
    ...

    Linus Torvalds
     

02 Oct, 2020

2 commits

  • Add userspace support for the Memory Tagging Extension introduced by
    Armv8.5.

    (Catalin Marinas and others)
    * for-next/mte: (30 commits)
    arm64: mte: Fix typo in memory tagging ABI documentation
    arm64: mte: Add Memory Tagging Extension documentation
    arm64: mte: Kconfig entry
    arm64: mte: Save tags when hibernating
    arm64: mte: Enable swap of tagged pages
    mm: Add arch hooks for saving/restoring tags
    fs: Handle intra-page faults in copy_mount_options()
    arm64: mte: ptrace: Add NT_ARM_TAGGED_ADDR_CTRL regset
    arm64: mte: ptrace: Add PTRACE_{PEEK,POKE}MTETAGS support
    arm64: mte: Allow {set,get}_tagged_addr_ctrl() on non-current tasks
    arm64: mte: Restore the GCR_EL1 register after a suspend
    arm64: mte: Allow user control of the generated random tags via prctl()
    arm64: mte: Allow user control of the tag check mode via prctl()
    mm: Allow arm64 mmap(PROT_MTE) on RAM-based files
    arm64: mte: Validate the PROT_MTE request via arch_validate_flags()
    mm: Introduce arch_validate_flags()
    arm64: mte: Add PROT_MTE support to mmap() and mprotect()
    mm: Introduce arch_calc_vm_flag_bits()
    arm64: mte: Tags-aware memcmp_pages() implementation
    arm64: Avoid unnecessary clear_user_page() indirection
    ...

    Will Deacon
     
  • Merge branches 'for-next/acpi', 'for-next/boot', 'for-next/bpf', 'for-next/cpuinfo', 'for-next/fpsimd', 'for-next/misc', 'for-next/mm', 'for-next/pci', 'for-next/perf', 'for-next/ptrauth', 'for-next/sdei', 'for-next/selftests', 'for-next/stacktrace', 'for-next/svm', 'for-next/topology', 'for-next/tpyos' and 'for-next/vdso' into for-next/core

    Remove unused functions and parameters from ACPI IORT code.
    (Zenghui Yu via Lorenzo Pieralisi)
    * for-next/acpi:
    ACPI/IORT: Remove the unused inline functions
    ACPI/IORT: Drop the unused @ops of iort_add_device_replay()

    Remove redundant code and fix documentation of caching behaviour for the
    HVC_SOFT_RESTART hypercall.
    (Pingfan Liu)
    * for-next/boot:
    Documentation/kvm/arm: improve description of HVC_SOFT_RESTART
    arm64/relocate_kernel: remove redundant code

    Improve reporting of unexpected kernel traps due to BPF JIT failure.
    (Will Deacon)
    * for-next/bpf:
    arm64: Improve diagnostics when trapping BRK with FAULT_BRK_IMM

    Improve robustness of user-visible HWCAP strings and their corresponding
    numerical constants.
    (Anshuman Khandual)
    * for-next/cpuinfo:
    arm64/cpuinfo: Define HWCAP name arrays per their actual bit definitions

    Cleanups to handling of SVE and FPSIMD register state in preparation
    for potential future optimisation of handling across syscalls.
    (Julien Grall)
    * for-next/fpsimd:
    arm64/sve: Implement a helper to load SVE registers from FPSIMD state
    arm64/sve: Implement a helper to flush SVE registers
    arm64/fpsimdmacros: Allow the macro "for" to be used in more cases
    arm64/fpsimdmacros: Introduce a macro to update ZCR_EL1.LEN
    arm64/signal: Update the comment in preserve_sve_context
    arm64/fpsimd: Update documentation of do_sve_acc

    Miscellaneous changes.
    (Tian Tao and others)
    * for-next/misc:
    arm64/mm: return cpu_all_mask when node is NUMA_NO_NODE
    arm64: mm: Fix missing-prototypes in pageattr.c
    arm64/fpsimd: Fix missing-prototypes in fpsimd.c
    arm64: hibernate: Remove unused including <linux/version.h>
    arm64/mm: Refactor {pgd, pud, pmd, pte}_ERROR()
    arm64: Remove the unused include statements
    arm64: get rid of TEXT_OFFSET
    arm64: traps: Add str of description to panic() in die()

    Memory management updates and cleanups.
    (Anshuman Khandual and others)
    * for-next/mm:
    arm64: dbm: Invalidate local TLB when setting TCR_EL1.HD
    arm64: mm: Make flush_tlb_fix_spurious_fault() a no-op
    arm64/mm: Unify CONT_PMD_SHIFT
    arm64/mm: Unify CONT_PTE_SHIFT
    arm64/mm: Remove CONT_RANGE_OFFSET
    arm64/mm: Enable THP migration
    arm64/mm: Change THP helpers to comply with generic MM semantics
    arm64/mm/ptdump: Add address markers for BPF regions

    Allow prefetchable PCI BARs to be exposed to userspace using normal
    non-cacheable mappings.
    (Clint Sbisa)
    * for-next/pci:
    arm64: Enable PCI write-combine resources under sysfs

    Perf/PMU driver updates.
    (Julien Thierry and others)
    * for-next/perf:
    perf: arm-cmn: Fix conversion specifiers for node type
    perf: arm-cmn: Fix unsigned comparison to less than zero
    arm_pmu: arm64: Use NMIs for PMU
    arm_pmu: Introduce pmu_irq_ops
    KVM: arm64: pmu: Make overflow handler NMI safe
    arm64: perf: Defer irq_work to IPI_IRQ_WORK
    arm64: perf: Remove PMU locking
    arm64: perf: Avoid PMXEV* indirection
    arm64: perf: Add missing ISB in armv8pmu_enable_counter()
    perf: Add Arm CMN-600 PMU driver
    perf: Add Arm CMN-600 DT binding
    arm64: perf: Add support caps under sysfs
    drivers/perf: thunderx2_pmu: Fix memory resource error handling
    drivers/perf: xgene_pmu: Fix uninitialized resource struct
    perf: arm_dsu: Support DSU ACPI devices
    arm64: perf: Remove unnecessary event_idx check
    drivers/perf: hisi: Add missing include of linux/module.h
    arm64: perf: Add general hardware LLC events for PMUv3

    Support for the Armv8.3 Pointer Authentication enhancements.
    (By Amit Daniel Kachhap)
    * for-next/ptrauth:
    arm64: kprobe: clarify the comment of steppable hint instructions
    arm64: kprobe: disable probe of fault prone ptrauth instruction
    arm64: cpufeature: Modify address authentication cpufeature to exact
    arm64: ptrauth: Introduce Armv8.3 pointer authentication enhancements
    arm64: traps: Allow force_signal_inject to pass esr error code
    arm64: kprobe: add checks for ARMv8.3-PAuth combined instructions

    Tonnes of cleanup to the SDEI driver.
    (Gavin Shan)
    * for-next/sdei:
    firmware: arm_sdei: Remove _sdei_event_unregister()
    firmware: arm_sdei: Remove _sdei_event_register()
    firmware: arm_sdei: Introduce sdei_do_local_call()
    firmware: arm_sdei: Cleanup on cross call function
    firmware: arm_sdei: Remove while loop in sdei_event_unregister()
    firmware: arm_sdei: Remove while loop in sdei_event_register()
    firmware: arm_sdei: Remove redundant error message in sdei_probe()
    firmware: arm_sdei: Remove duplicate check in sdei_get_conduit()
    firmware: arm_sdei: Unregister driver on error in sdei_init()
    firmware: arm_sdei: Avoid nested statements in sdei_init()
    firmware: arm_sdei: Retrieve event number from event instance
    firmware: arm_sdei: Common block for failing path in sdei_event_create()
    firmware: arm_sdei: Remove sdei_is_err()

    Selftests for Pointer Authentication and FPSIMD/SVE context-switching.
    (Mark Brown and Boyan Karatotev)
    * for-next/selftests:
    selftests: arm64: Add build and documentation for FP tests
    selftests: arm64: Add wrapper scripts for stress tests
    selftests: arm64: Add utility to set SVE vector lengths
    selftests: arm64: Add stress tests for FPSMID and SVE context switching
    selftests: arm64: Add test for the SVE ptrace interface
    selftests: arm64: Test case for enumeration of SVE vector lengths
    kselftests/arm64: add PAuth tests for single threaded consistency and differently initialized keys
    kselftests/arm64: add PAuth test for whether exec() changes keys
    kselftests/arm64: add nop checks for PAuth tests
    kselftests/arm64: add a basic Pointer Authentication test

    Implementation of ARCH_STACKWALK for unwinding.
    (Mark Brown)
    * for-next/stacktrace:
    arm64: Move console stack display code to stacktrace.c
    arm64: stacktrace: Convert to ARCH_STACKWALK
    arm64: stacktrace: Make stack walk callback consistent with generic code
    stacktrace: Remove reliable argument from arch_stack_walk() callback

    Support for ASID pinning, which is required when sharing page-tables with
    the SMMU.
    (Jean-Philippe Brucker)
    * for-next/svm:
    arm64: cpufeature: Export symbol read_sanitised_ftr_reg()
    arm64: mm: Pin down ASIDs for sharing mm with devices

    Rely on firmware tables for establishing CPU topology.
    (Valentin Schneider)
    * for-next/topology:
    arm64: topology: Stop using MPIDR for topology information

    Spelling fixes.
    (Xiaoming Ni and Yanfei Xu)
    * for-next/tpyos:
    arm64/numa: Fix a typo in comment of arm64_numa_init
    arm64: fix some spelling mistakes in the comments by codespell

    vDSO cleanups.
    (Will Deacon)
    * for-next/vdso:
    arm64: vdso: Fix unusual formatting in *setup_additional_pages()
    arm64: vdso32: Remove a bunch of #ifdef CONFIG_COMPAT_VDSO guards

    Will Deacon
     

01 Oct, 2020

1 commit

  • Our use of broadcast TLB maintenance means that spurious page-faults
    that have been handled already by another CPU do not require additional
    TLB maintenance.

    Make flush_tlb_fix_spurious_fault() a no-op and rely on the existing TLB
    invalidation instead. Add an explicit flush_tlb_page() when making a page
    dirty, as the TLB is permitted to cache the old read-only entry.
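
    The two pieces of the change, in sketch form:

        /* 1) Spurious faults need no extra maintenance on arm64. */
        #define flush_tlb_fix_spurious_fault(vma, address) do { } while (0)

        /* 2) When marking a page dirty, flush the stale read-only entry
         *    the TLB may still be caching. */
        flush_tlb_page(vma, address);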

    Reviewed-by: Catalin Marinas
    Link: https://lore.kernel.org/r/20200728092220.GA21800@willie-the-truck
    Signed-off-by: Will Deacon

    Will Deacon
     

29 Sep, 2020

1 commit

  • To enable address space sharing with the IOMMU, introduce
    arm64_mm_context_get() and arm64_mm_context_put(), that pin down a
    context and ensure that it will keep its ASID after a rollover. Export
    the symbols to let the modular SMMUv3 driver use them.

    Pinning is necessary because a device constantly needs a valid ASID,
    unlike tasks that only require one when running. Without pinning, we would
    need to notify the IOMMU when we're about to use a new ASID for a task,
    and it would get complicated when a new task is assigned a shared ASID.
    Consider the following scenario with no ASID pinned:

    1. Task t1 is running on CPUx with shared ASID (gen=1, asid=1)
    2. Task t2 is scheduled on CPUx, gets ASID (1, 2)
    3. Task tn is scheduled on CPUy, a rollover occurs, tn gets ASID (2, 1)
    We would now have to immediately generate a new ASID for t1, notify
    the IOMMU, and finally enable task tn. We are holding the lock during
    all that time, since we can't afford having another CPU trigger a
    rollover. The IOMMU issues invalidation commands that can take tens of
    milliseconds.

    It gets needlessly complicated. All we wanted to do was schedule task tn,
    that has no business with the IOMMU. By letting the IOMMU pin tasks when
    needed, we avoid stalling the slow path, and let the pinning fail when
    we're out of shareable ASIDs.

    After a rollover, the allocator expects at least one ASID to be available
    in addition to the reserved ones (one per CPU). So (NR_ASIDS - NR_CPUS -
    1) is the maximum number of ASIDs that can be shared with the IOMMU.
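
    A usage sketch from a device driver's point of view:

        /* Pin the ASID while the SMMU shares this mm's page tables. */
        unsigned long asid = arm64_mm_context_get(mm);
        if (!asid)
                return -ENOSPC;         /* out of shareable ASIDs */

        /* ... program the ASID into the device ... */

        arm64_mm_context_put(mm);       /* unpin once the device is done */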

    Signed-off-by: Jean-Philippe Brucker
    Reviewed-by: Jonathan Cameron
    Link: https://lore.kernel.org/r/20200918101852.582559-5-jean-philippe@linaro.org
    Signed-off-by: Will Deacon

    Jean-Philippe Brucker
     

22 Sep, 2020

1 commit

  • The @node passed to cpumask_of_node() can be NUMA_NO_NODE; in that
    case it will trigger WARN_ON(node >= nr_node_ids) due to mismatched
    data types of @node and @nr_node_ids. Actually we should return
    cpu_all_mask, just like most other architectures do, if passed
    NUMA_NO_NODE.

    Also add a similar check to the inline cpumask_of_node() in numa.h.
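
    The fix, in sketch form:

        const struct cpumask *cpumask_of_node(int node)
        {
                if (node == NUMA_NO_NODE)
                        return cpu_all_mask;    /* match other architectures */

                if (WARN_ON(node < 0 || node >= nr_node_ids))
                        return cpu_none_mask;

                if (WARN_ON(node_to_cpumask_map[node] == NULL))
                        return cpu_online_mask;

                return node_to_cpumask_map[node];
        }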

    Signed-off-by: Zhengyuan Liu
    Reviewed-by: Gavin Shan
    Link: https://lore.kernel.org/r/20200921023936.21846-1-liuzhengyuan@tj.kylinos.cn
    Signed-off-by: Will Deacon

    Zhengyuan Liu