18 Apr, 2019

1 commit

  • On i.MX8QM TO1.0, there is an issue: the bus width between A53-CCI-A72
    is limited to 36 bits. For TLB maintenance through DVM messages over the
    AR channel, some bits are forced (truncated) to zero as follows:

    ASID[15:12] is forced to 0
    VA[48:45] is forced to 0
    VA[44:41] is forced to 0
    VA[39:36] is forced to 0

    This issue results in TLB maintenance across the clusters not working
    as expected, because some VA and ASID bits get truncated and forced to zero.

    The SW workaround is: use vmalle1is if the VA is larger than 36 bits or
    ASID[15:12] is not zero; otherwise, use the original TLB maintenance path.
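
    A rough sketch of that decision (illustrative only; the helper name and
    the exact bit checks are not taken from the patch):

    /* Fall back to the broadcast vmalle1is when bits that the interconnect
     * truncates would otherwise matter. */
    static inline bool needs_vmalle1is(unsigned long va, unsigned long asid)
    {
            return (va >> 36) ||        /* VA uses bits above bit 35 */
                   (asid >> 12);        /* ASID[15:12] is non-zero   */
    }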

    Signed-off-by: Jason Liu
    Reviewed-by: Anson Huang

    (cherry picked from commit c9eb1788558f07dfda0c15b684f79aedb4bfa623)
    This is still required for current imx8qm B0 chips.
    Signed-off-by: Leonard Crestez

    Jason Liu
     

17 Apr, 2019

1 commit

  • commit 045afc24124d80c6998d9c770844c67912083506 upstream.

    Rather embarrassingly, our futex() FUTEX_WAKE_OP implementation doesn't
    explicitly set the return value on the non-faulting path and instead
    leaves it holding the result of the underlying atomic operation. This
    means that any FUTEX_WAKE_OP atomic operation which computes a non-zero
    value will be reported as having failed. Regrettably, I wrote the buggy
    code back in 2011 and it was upstreamed as part of the initial arm64
    support in 2012.

    The reasons we appear to get away with this are:

    1. FUTEX_WAKE_OP is rarely used and therefore doesn't appear to get
    exercised by futex() test applications

    2. If the result of the atomic operation is zero, the system call
    behaves correctly

    3. Prior to version 2.25, the only operation used by GLIBC set the
    futex to zero, and therefore worked as expected. From 2.25 onwards,
    FUTEX_WAKE_OP is not used by GLIBC at all.

    Fix the implementation by ensuring that the return value is either 0
    to indicate that the atomic operation completed successfully, or -EFAULT
    if we encountered a fault when accessing the user mapping.
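
    In pseudo-C, the fixed contract looks like this (a sketch only; the real
    change is in the inline asm of arch/arm64/include/asm/futex.h, and
    arch_futex_op() here is a hypothetical stand-in for that asm):

    static int futex_atomic_op_sketch(int op, u32 oparg, u32 __user *uaddr,
                                      int *oval)
    {
            u32 oldval;

            if (arch_futex_op(op, oparg, uaddr, &oldval))  /* hypothetical */
                    return -EFAULT;  /* fault on the user mapping */

            *oval = oldval;          /* the computed value goes here ... */
            return 0;                /* ... never into the return code */
    }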

    Cc:
    Fixes: 6170a97460db ("arm64: Atomic operations")
    Signed-off-by: Will Deacon
    Signed-off-by: Greg Kroah-Hartman

    Will Deacon
     

24 Mar, 2019

2 commits

  • commit 5870970b9a828d8693aa6d15742573289d7dbcd0 upstream.

    When using VHE, the host needs to clear HCR_EL2.TGE bit in order
    to interact with guest TLBs, switching from EL2&0 translation regime
    to EL1&0.

    However, a non-maskable asynchronous event, such as SDEI, can happen
    while TGE is cleared. Because of this, address translation operations
    relying on the EL2&0 translation regime could fail (TLB invalidation,
    userspace access, ...).

    Fix this by properly setting HCR_EL2.TGE when entering NMI context and
    clearing it if necessary when returning to the interrupted context.
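
    The idea, roughly (a hedged sketch using the usual arm64 sysreg
    accessors; the actual patch wires this into the NMI entry/exit hooks
    rather than open-coding it like this):

    u64 hcr = read_sysreg(hcr_el2);

    if (is_kernel_in_hyp_mode() && !(hcr & HCR_TGE)) {
            write_sysreg(hcr | HCR_TGE, hcr_el2);   /* back to EL2&0 for the NMI */
            isb();
            /* ... handle the non-maskable event ... */
            write_sysreg(hcr, hcr_el2);             /* restore on the way out */
            isb();
    }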

    Signed-off-by: Julien Thierry
    Suggested-by: Marc Zyngier
    Reviewed-by: Marc Zyngier
    Reviewed-by: James Morse
    Cc: Arnd Bergmann
    Cc: Will Deacon
    Cc: Marc Zyngier
    Cc: James Morse
    Cc: linux-arch@vger.kernel.org
    Cc: stable@vger.kernel.org
    Signed-off-by: Catalin Marinas
    Signed-off-by: Greg Kroah-Hartman

    Julien Thierry
     
  • [ Upstream commit 358b28f09f0ab074d781df72b8a671edb1547789 ]

    The current kvm_psci_vcpu_on implementation will directly try to
    manipulate the state of the VCPU to reset it. However, since this is
    not done on the thread that runs the VCPU, we can end up in a strangely
    corrupted state when the source and target VCPUs are running at the same
    time.

    Fix this by factoring out all reset logic from the PSCI implementation
    and forwarding the required information along with a request to the
    target VCPU.
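
    The shape of the fix (a sketch; the reset-state fields shown are
    illustrative, while kvm_make_request()/kvm_vcpu_kick() are the standard
    KVM request machinery):

    /* PSCI CPU_ON path: record the requested entry state ... */
    vcpu->arch.reset_state.pc = entry_point;
    vcpu->arch.reset_state.r0 = context_id;
    vcpu->arch.reset_state.reset = true;

    /* ... and let the target VCPU's own thread perform the reset */
    kvm_make_request(KVM_REQ_VCPU_RESET, vcpu);
    kvm_vcpu_kick(vcpu);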

    Reviewed-by: Andrew Jones
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall
    Signed-off-by: Sasha Levin

    Marc Zyngier
     

13 Feb, 2019

3 commits

  • [ Upstream commit ee1b465b303591d3a04d403122bbc0d7026520fb ]

    SVE_PT_REGS_OFFSET is supposed to indicate the offset for skipping
    over the ptrace NT_ARM_SVE header (struct user_sve_header) to the
    start of the SVE register data proper.

    However, currently SVE_PT_REGS_OFFSET is defined in terms of struct
    sve_context, which is wrong: that structure describes the SVE
    header in the signal frame, not in the ptrace regset.

    This patch fixes the definition to use the ptrace header structure
    struct user_sve_header instead.

    By good fortune, the two structures are the same size anyway, so
    there is no functional or ABI change.
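
    Conceptually the change is just which header structure the offset is
    derived from (a sketch; the real definition spells out the round-up to
    SVE_VQ_BYTES rather than using ALIGN()):

    /* before: offset derived from the signal-frame header */
    #define SVE_PT_REGS_OFFSET  ALIGN(sizeof(struct sve_context), SVE_VQ_BYTES)

    /* after: offset derived from the ptrace regset header */
    #define SVE_PT_REGS_OFFSET  ALIGN(sizeof(struct user_sve_header), SVE_VQ_BYTES)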

    Signed-off-by: Dave Martin
    Signed-off-by: Will Deacon
    Signed-off-by: Sasha Levin

    Dave Martin
     
  • [ Upstream commit 1b57ec8c75279b873639eb44a215479236f93481 ]

    As of commit 6460d3201471 ("arm64: io: Ensure calls to delay routines
    are ordered against prior readX()"), MMIO reads smaller than 64 bits
    fail to compile under clang because we end up mixing 32-bit and 64-bit
    register operands for the same data processing instruction:

    ./include/asm-generic/io.h:695:9: warning: value size does not match register size specified by the constraint and modifier [-Wasm-operand-widths]
    return readb(addr);
    ^
    ./arch/arm64/include/asm/io.h:147:58: note: expanded from macro 'readb'
    ^
    ./include/asm-generic/io.h:695:9: note: use constraint modifier "w"
    ./arch/arm64/include/asm/io.h:147:50: note: expanded from macro 'readb'
    ^
    ./arch/arm64/include/asm/io.h:118:24: note: expanded from macro '__iormb'
    asm volatile("eor %0, %1, %1\n" \
    ^

    Fix the build by casting the macro argument to 'unsigned long' when used
    as an input to the inline asm.

    Reported-by: Nick Desaulniers
    Reported-by: Nathan Chancellor
    Signed-off-by: Will Deacon
    Signed-off-by: Sasha Levin

    Will Deacon
     
  • [ Upstream commit 6460d32014717686d3b7963595950ba2c6d1bb5e ]

    A relatively standard idiom for ensuring that a pair of MMIO writes to a
    device arrive at that device with a specified minimum delay between them
    is as follows:

    writel_relaxed(42, dev_base + CTL1);
    readl(dev_base + CTL1);
    udelay(10);
    writel_relaxed(42, dev_base + CTL2);

    the intention being that the read-back from the device will push the
    prior write to CTL1, and the udelay will hold up the write to CTL2 until
    at least 10us have elapsed.

    Unfortunately, on arm64 where the underlying delay loop is implemented
    as a read of the architected counter, the CPU does not guarantee
    ordering from the readl() to the delay loop and therefore the delay loop
    could in theory be speculated and not provide the desired interval
    between the two writes.

    Fix this in a similar manner to PowerPC by introducing a dummy control
    dependency on the output of readX() which, combined with the ISB in the
    read of the architected counter, guarantees that a subsequent delay loop
    can not be executed until the readX() has returned its result.
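
    The mechanism is roughly this (a sketch in the spirit of the upstream
    __iormb() macro in arch/arm64/include/asm/io.h; details may differ):

    /* Dummy control dependency on the value returned by the read: the
     * branch cannot resolve until the load completes, and the ISB in the
     * counter-based delay loop then keeps udelay() from starting early. */
    #define __iormb(v)                                              \
    ({                                                              \
            unsigned long tmp;                                      \
            rmb();                                                  \
            asm volatile("eor %0, %1, %1\n"                         \
                         "cbnz %0, ."                               \
                         : "=r" (tmp) : "r" ((unsigned long)(v))    \
                         : "memory");                               \
    })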

    Cc: Benjamin Herrenschmidt
    Cc: Arnd Bergmann
    Signed-off-by: Will Deacon
    Signed-off-by: Sasha Levin

    Will Deacon
     

26 Jan, 2019

2 commits

  • [ Upstream commit 33309ecda0070506c49182530abe7728850ebe78 ]

    The dcache_by_line_op macro suffers from a couple of small problems:

    First, the GAS directives that are currently being used rely on
    assembler behavior that is not documented, and probably not guaranteed
    to produce the correct behavior going forward. As a result, we end up
    with some undefined symbols in cache.o:

    $ nm arch/arm64/mm/cache.o
    ...
    U civac
    ...
    U cvac
    U cvap
    U cvau

    This is due to the fact that the comparisons used to select the
    operation type in the dcache_by_line_op macro are comparing symbols
    not strings, and even though it seems that GAS is doing the right
    thing here (undefined symbols by the same name are equal to each
    other), it seems unwise to rely on this.

    Second, when patching in a DC CVAP instruction on CPUs that support it,
    the fallback path consists of a DC CVAU instruction which may be
    affected by CPU errata that require ARM64_WORKAROUND_CLEAN_CACHE.

    Solve these issues by unrolling the various maintenance routines and
    using the conditional directives that are documented as operating on
    strings. To avoid the complexity of nested alternatives, we move the
    DC CVAP patching to __clean_dcache_area_pop, falling back to a branch
    to __clean_dcache_area_poc if DCPOP is not supported by the CPU.

    Reported-by: Ard Biesheuvel
    Suggested-by: Robin Murphy
    Signed-off-by: Will Deacon
    Signed-off-by: Sasha Levin

    Will Deacon
     
  • [ Upstream commit 6e8830674ea77f57d57a33cca09083b117a71f41 ]

    If the kernel is configured with KASAN_EXTRA, the stack size is
    increased significantly due to setting the GCC -fstack-reuse option to
    "none" [1]. As a result, it can trigger a stack overrun quite often with
    32k stack size compiled using GCC 8. For example, this reproducer

    https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/madvise/madvise06.c

    can trigger a "corrupted stack end detected inside scheduler" very
    reliably with CONFIG_SCHED_STACK_END_CHECK enabled. There are other
    reports at:

    https://lore.kernel.org/lkml/1542144497.12945.29.camel@gmx.us/
    https://lore.kernel.org/lkml/721E7B42-2D55-4866-9C1A-3E8D64F33F9C@gmx.us/

    With KASAN_EXTRA there are simply too many functions with large stack
    frames, due to large local variables, that get called over and over
    again without being able to reuse the stacks. Some noticeable ones
    are:

    size
    7536 shrink_inactive_list
    7440 shrink_page_list
    6560 fscache_stats_show
    3920 jbd2_journal_commit_transaction
    3216 try_to_unmap_one
    3072 migrate_page_move_mapping
    3584 migrate_misplaced_transhuge_page
    3920 ip_vs_lblcr_schedule
    4304 lpfc_nvme_info_show
    3888 lpfc_debugfs_nvmestat_data.constprop

    There are another 49 functions over 2k in size when compiling the kernel
    with "-Wframe-larger-than=" on this machine. Hence, it is too much work
    to change the Makefile for each object individually so that it compiles
    without -fsanitize-address-use-after-scope.

    [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715#c23

    Signed-off-by: Qian Cai
    Signed-off-by: Will Deacon
    Signed-off-by: Sasha Levin

    Qian Cai
     

23 Jan, 2019

2 commits

  • [ Upstream commit b3669b1e1c09890d61109a1a8ece2c5b66804714 ]

    To allow EL0 (and/or EL1) to use pointer authentication functionality,
    we must ensure that pointer authentication instructions and accesses to
    pointer authentication keys are not trapped to EL2.

    This patch ensures that HCR_EL2 is configured appropriately when the
    kernel is booted at EL2. For non-VHE kernels we set HCR_EL2.{API,APK},
    ensuring that EL1 can access keys and permit EL0 use of instructions.
    For VHE kernels host EL0 (TGE && E2H) is unaffected by these settings,
    and it doesn't matter how we configure HCR_EL2.{API,APK}, so we don't
    bother setting them.

    This does not enable support for KVM guests, since KVM manages HCR_EL2
    itself when running VMs.

    Reviewed-by: Richard Henderson
    Signed-off-by: Mark Rutland
    Signed-off-by: Kristina Martsenko
    Acked-by: Christoffer Dall
    Cc: Catalin Marinas
    Cc: Marc Zyngier
    Cc: Will Deacon
    Cc: kvmarm@lists.cs.columbia.edu
    Signed-off-by: Will Deacon
    Signed-off-by: Sasha Levin

    Mark Rutland
     
  • [ Upstream commit 4eaed6aa2c628101246bcabc91b203bfac1193f8 ]

    In KVM we define the configuration of HCR_EL2 for a VHE HOST in
    HCR_HOST_VHE_FLAGS, but we don't have a similar definition for the
    non-VHE host flags, and open-code HCR_RW. Further, in head.S we
    open-code the flags for VHE and non-VHE configurations.

    In future, we're going to want to configure more flags for the host, so
    let's add a HCR_HOST_NVHE_FLAGS definition, and consistently use both
    HCR_HOST_VHE_FLAGS and HCR_HOST_NVHE_FLAGS in the kvm code and head.S.

    We now use mov_q to generate the HCR_EL2 value, as we do when
    configuring other registers in head.S.
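
    For reference, the two host configurations end up along these lines (a
    sketch; the exact bit lists vary between kernel versions):

    /* VHE host: HCR_E2H + HCR_TGE select the EL2&0 regime for the host */
    #define HCR_HOST_VHE_FLAGS      (HCR_RW | HCR_TGE | HCR_E2H)

    /* non-VHE host: just run the 64-bit EL1 host, no extra traps */
    #define HCR_HOST_NVHE_FLAGS     (HCR_RW)

    /* head.S can then load either value with: mov_q x0, HCR_HOST_NVHE_FLAGS */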

    Reviewed-by: Marc Zyngier
    Reviewed-by: Richard Henderson
    Signed-off-by: Mark Rutland
    Signed-off-by: Kristina Martsenko
    Reviewed-by: Christoffer Dall
    Cc: Catalin Marinas
    Cc: Marc Zyngier
    Cc: Will Deacon
    Cc: kvmarm@lists.cs.columbia.edu
    Signed-off-by: Will Deacon
    Signed-off-by: Sasha Levin

    Mark Rutland
     

10 Jan, 2019

2 commits

  • commit 169113ece0f29ebe884a6cfcf57c1ace04d8a36a upstream.

    The ARM Linux kernel handles the EABI syscall numbers as follows:

    0 - NR_SYSCALLS-1 : Invoke syscall via syscall table
    NR_SYSCALLS - 0xeffff : -ENOSYS (to be allocated in future)
    0xf0000 - 0xf07ff : Private syscall or -ENOSYS if not allocated
    > 0xf07ff : SIGILL

    Our compat code gets this wrong and ends up sending SIGILL in response
    to all syscalls greater than NR_SYSCALLS which have a value greater
    than 0x7ff in the bottom 16 bits.

    Fix this by defining the end of the ARM private syscall region and
    checking the syscall number against that directly. Update the comment
    while we're at it.
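
    A rough sketch of the corrected dispatch (illustrative; the constant
    name and the surrounding code are not taken from the patch):

    #define ARM_PRIVATE_SYSCALL_END  0x000f07ff  /* illustrative name */

    if (scno < NR_syscalls) {
            /* dispatch via the syscall table */
    } else if (scno <= ARM_PRIVATE_SYSCALL_END) {
            /* future or private range: -ENOSYS unless a private call is handled */
    } else {
            /* only numbers beyond the private region get SIGILL */
    }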

    Cc:
    Cc: Dave Martin
    Reported-by: Pi-Hsun Shih
    Signed-off-by: Will Deacon
    Signed-off-by: Greg Kroah-Hartman

    Will Deacon
     
  • commit df655b75c43fba0f2621680ab261083297fd6d16 upstream.

    Although bit 31 of VTCR_EL2 is RES1, we inadvertently end up setting all
    of the upper 32 bits to 1 as well because we define VTCR_EL2_RES1 as
    signed, which is sign-extended when assigning to kvm->arch.vtcr.

    Lucky for us, the architecture currently treats these upper bits as RES0
    so, whilst we've been naughty, we haven't set fire to anything yet.
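
    The problem in miniature (a sketch; the fix simply makes the constant
    unsigned):

    /* before: (1 << 31) is a negative int, so assigning it to the 64-bit
     * kvm->arch.vtcr sign-extends and sets bits [63:32] as well */
    #define VTCR_EL2_RES1   (1 << 31)

    /* after: only bit 31 is set */
    #define VTCR_EL2_RES1   (1U << 31)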

    Cc:
    Cc: Marc Zyngier
    Cc: Christoffer Dall
    Signed-off-by: Will Deacon
    Signed-off-by: Marc Zyngier
    Signed-off-by: Greg Kroah-Hartman

    Will Deacon
     

08 Dec, 2018

1 commit

  • commit 874bfc6e5422d2421f7e4d5ea318d30e91679dfe upstream.

    Since commit 4378a7d4be30 ("arm64: implement syscall wrappers")
    introduced the "__arm64_" prefix for all syscall wrapper symbols in
    sys_call_table, the syscall tracer cannot find the corresponding
    metadata from the syscall name. As a result, we have no syscall
    ftrace events on arm64 kernels, and some bpf testcases fail
    on arm64.

    To fix this issue, introduce a custom
    arch_syscall_match_sym_name() which skips the first 8 bytes when
    comparing the syscall and symbol names.
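
    The matcher ends up essentially like this (a sketch of the helper that
    lands in the arm64 ftrace header):

    static inline bool arch_syscall_match_sym_name(const char *sym,
                                                   const char *name)
    {
            /* skip the 8-byte "__arm64_" prefix on the symbol name */
            return !strcmp(sym + 8, name);
    }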

    Fixes: 4378a7d4be30 ("arm64: implement syscall wrappers")
    Reported-by: Naresh Kamboju
    Signed-off-by: Masami Hiramatsu
    Acked-by: Will Deacon
    Tested-by: Naresh Kamboju
    Cc: stable@vger.kernel.org
    Signed-off-by: Catalin Marinas
    Signed-off-by: Greg Kroah-Hartman

    Masami Hiramatsu
     

27 Nov, 2018

1 commit

  • [ Upstream commit b5bb425871186303e6936fa2581521bdd1964a58 ]

    Clang warns that if the default case is taken, ret will be
    uninitialized.

    ./arch/arm64/include/asm/percpu.h:196:2: warning: variable 'ret' is used
    uninitialized whenever switch default is taken
    [-Wsometimes-uninitialized]
    default:
    ^~~~~~~
    ./arch/arm64/include/asm/percpu.h:200:9: note: uninitialized use occurs
    here
    return ret;
    ^~~
    ./arch/arm64/include/asm/percpu.h:157:19: note: initialize the variable
    'ret' to silence this warning
    unsigned long ret, loop;
    ^
    = 0

    This warning appears several times while building the erofs filesystem.
    While it's not strictly wrong, the BUILD_BUG will prevent this from
    becoming a true problem. Initialize ret to 0 in the default case right
    before the BUILD_BUG to silence all of these warnings.
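
    In other words, something along these lines in the affected helpers (a
    sketch of the pattern, not the full percpu read/write code):

    unsigned long ret, loop;

    switch (size) {
    case 1: /* ... per-size implementation ... */ break;
    /* ... */
    default:
            ret = 0;        /* silences -Wsometimes-uninitialized */
            BUILD_BUG();    /* size is a compile-time constant, so an
                             * unsupported size still fails the build */
    }

    return ret;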

    Reported-by: Prasad Sodagudi
    Signed-off-by: Nathan Chancellor
    Reviewed-by: Nick Desaulniers
    Signed-off-by: Dennis Zhou
    Signed-off-by: Sasha Levin

    Nathan Chancellor
     

07 Sep, 2018

2 commits

  • The lock has never been used and the page tables are protected by
    mmu_lock in struct kvm.

    Reviewed-by: Suzuki K Poulose
    Signed-off-by: Steven Price
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Steven Price
     
  • kvm_unmap_hva is long gone, and we only have kvm_unmap_hva_range to
    deal with. Drop the now obsolete code.

    Fixes: fb1522e099f0 ("KVM: update to new mmu_notifier semantic v2")
    Cc: James Hogan
    Reviewed-by: Paolo Bonzini
    Signed-off-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Marc Zyngier
     

15 Aug, 2018

1 commit

  • Pull arm64 updates from Will Deacon:
    "A bunch of good stuff in here. Worth noting is that we've pulled in
    the x86/mm branch from -tip so that we can make use of the core
    ioremap changes which allow us to put down huge mappings in the
    vmalloc area without screwing up the TLB. Much of the positive
    diffstat is because of the rseq selftest for arm64.

    Summary:

    - Wire up support for qspinlock, replacing our trusty ticket lock
    code

    - Add an IPI to flush_icache_range() to ensure that stale
    instructions fetched into the pipeline are discarded along with the
    I-cache lines

    - Support for the GCC "stackleak" plugin

    - Support for restartable sequences, plus an arm64 port for the
    selftest

    - Kexec/kdump support on systems booting with ACPI

    - Rewrite of our syscall entry code in C, which allows us to zero the
    GPRs on entry from userspace

    - Support for chained PMU counters, allowing 64-bit event counters to
    be constructed on current CPUs

    - Ensure scheduler topology information is kept up-to-date with CPU
    hotplug events

    - Re-enable support for huge vmalloc/IO mappings now that the core
    code has the correct hooks to use break-before-make sequences

    - Miscellaneous, non-critical fixes and cleanups"

    * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (90 commits)
    arm64: alternative: Use true and false for boolean values
    arm64: kexec: Add comment to explain use of __flush_icache_range()
    arm64: sdei: Mark sdei stack helper functions as static
    arm64, kaslr: export offset in VMCOREINFO ELF notes
    arm64: perf: Add cap_user_time aarch64
    efi/libstub: Only disable stackleak plugin for arm64
    arm64: drop unused kernel_neon_begin_partial() macro
    arm64: kexec: machine_kexec should call __flush_icache_range
    arm64: svc: Ensure hardirq tracing is updated before return
    arm64: mm: Export __sync_icache_dcache() for xen-privcmd
    drivers/perf: arm-ccn: Use devm_ioremap_resource() to map memory
    arm64: Add support for STACKLEAK gcc plugin
    arm64: Add stack information to on_accessible_stack
    drivers/perf: hisi: update the sccl_id/ccl_id when MT is supported
    arm64: fix ACPI dependencies
    rseq/selftests: Add support for arm64
    arm64: acpi: fix alignment fault in accessing ACPI
    efi/arm: map UEFI memory map even w/o runtime services enabled
    efi/arm: preserve early mapping of UEFI memory map longer for BGRT
    drivers: acpi: add dependency of EFI for arm64
    ...

    Linus Torvalds
     

14 Aug, 2018

4 commits

  • Pull perf update from Thomas Gleixner:
    "The perf crowd presents:

    Kernel updates:

    - Removal of jprobes

    - Cleanup and consolidation of the handling of kprobes

    - Cleanup and consolidation of hardware breakpoints

    - The usual pile of fixes and updates to PMUs and event descriptors

    Tooling updates:

    - Updates and improvements all over the place. Nothing outstanding,
    just the (good) boring incremental grump work"

    * 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (103 commits)
    perf trace: Do not require --no-syscalls to suppress strace like output
    perf bpf: Include uapi/linux/bpf.h from the 'perf trace' script's bpf.h
    perf tools: Allow overriding MAX_NR_CPUS at compile time
    perf bpf: Show better message when failing to load an object
    perf list: Unify metric group description format with PMU event description
    perf vendor events arm64: Update ThunderX2 implementation defined pmu core events
    perf cs-etm: Generate branch sample for CS_ETM_TRACE_ON packet
    perf cs-etm: Generate branch sample when receiving a CS_ETM_TRACE_ON packet
    perf cs-etm: Support dummy address value for CS_ETM_TRACE_ON packet
    perf cs-etm: Fix start tracing packet handling
    perf build: Fix installation directory for eBPF
    perf c2c report: Fix crash for empty browser
    perf tests: Fix indexing when invoking subtests
    perf trace: Beautify the AF_INET & AF_INET6 'socket' syscall 'protocol' args
    perf trace beauty: Add beautifiers for 'socket''s 'protocol' arg
    perf trace beauty: Do not print NULL strarray entries
    perf beauty: Add a generator for IPPROTO_ socket's protocol constants
    tools include uapi: Grab a copy of linux/in.h
    perf tests: Fix complex event name parsing
    perf evlist: Fix error out while applying initial delay and LBR
    ...

    Linus Torvalds
     
  • Pull locking/atomics update from Thomas Gleixner:
    "The locking, atomics and memory model brains delivered:

    - A larger update to the atomics code which reworks the ordering
    barriers, consolidates the atomic primitives, provides the new
    atomic64_fetch_add_unless() primitive and cleans up the include
    hell.

    - Simplify cmpxchg() instrumentation and add instrumentation for
    xchg() and cmpxchg_double().

    - Updates to the memory model and documentation"

    * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (48 commits)
    locking/atomics: Rework ordering barriers
    locking/atomics: Instrument cmpxchg_double*()
    locking/atomics: Instrument xchg()
    locking/atomics: Simplify cmpxchg() instrumentation
    locking/atomics/x86: Reduce arch_cmpxchg64*() instrumentation
    tools/memory-model: Rename litmus tests to comply to norm7
    tools/memory-model/Documentation: Fix typo, smb->smp
    sched/Documentation: Update wake_up() & co. memory-barrier guarantees
    locking/spinlock, sched/core: Clarify requirements for smp_mb__after_spinlock()
    sched/core: Use smp_mb() in wake_woken_function()
    tools/memory-model: Add informal LKMM documentation to MAINTAINERS
    locking/atomics/Documentation: Describe atomic_set() as a write operation
    tools/memory-model: Make scripts executable
    tools/memory-model: Remove ACCESS_ONCE() from model
    tools/memory-model: Remove ACCESS_ONCE() from recipes
    locking/memory-barriers.txt/kokr: Update Korean translation to fix broken DMA vs. MMIO ordering example
    MAINTAINERS: Add Daniel Lustig as an LKMM reviewer
    tools/memory-model: Fix ISA2+pooncelock+pooncelock+pombonce name
    tools/memory-model: Add litmus test for full multicopy atomicity
    locking/refcount: Always allow checked forms
    ...

    Linus Torvalds
     
  • Pull genirq updates from Thomas Gleixner:
    "The irq departement provides:

    - A synchronization fix for free_irq() to synchronize just the
    removed interrupt thread on shared interrupt lines.

    - Consolidate the multi low level interrupt entry handling and move
    it to the generic code instead of adding yet another copy for
    RISC-V

    - Refactoring of the ARM LPI allocator and LPI exposure to the
    hypervisor

    - Yet another interrupt chip driver for the JZ4725B SoC

    - Speed up for /proc/interrupts as people seem to love reading this
    file with high frequency

    - Miscellaneous fixes and updates"

    * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
    irqchip/gic-v3-its: Make its_lock a raw_spin_lock_t
    genirq/irqchip: Remove MULTI_IRQ_HANDLER as it's now obselete
    openrisc: Use the new GENERIC_IRQ_MULTI_HANDLER
    arm64: Use the new GENERIC_IRQ_MULTI_HANDLER
    ARM: Convert to GENERIC_IRQ_MULTI_HANDLER
    irqchip: Port the ARM IRQ drivers to GENERIC_IRQ_MULTI_HANDLER
    irqchip/gic-v3-its: Reduce minimum LPI allocation to 1 for PCI devices
    dt-bindings: irqchip: renesas-irqc: Document r8a77980 support
    dt-bindings: irqchip: renesas-irqc: Document r8a77470 support
    irqchip/ingenic: Add support for the JZ4725B SoC
    irqchip/stm32: Add exti0 translation for stm32mp1
    genirq: Remove redundant NULL pointer check in __free_irq()
    irqchip/gic-v3-its: Honor hypervisor enforced LPI range
    irqchip/gic-v3: Expose GICD_TYPER in the rdist structure
    irqchip/gic-v3-its: Drop chunk allocation compatibility
    irqchip/gic-v3-its: Move minimum LPI requirements to individual busses
    irqchip/gic-v3-its: Use full range of LPIs
    irqchip/gic-v3-its: Refactor LPI allocator
    genirq: Synchronize only with single thread on free_irq()
    genirq: Update code comments wrt recycled thread_mask
    ...

    Linus Torvalds
     
  • Pull EFI updates from Thomas Gleixner:
    "The EFI pile:

    - Make mixed mode UEFI runtime service invocations mutually
    exclusive, as mandated by the UEFI spec

    - Perform UEFI runtime services calls from a work queue so the calls
    into the firmware occur from a kernel thread

    - Honor the UEFI memory map attributes for live memory regions
    configured by UEFI as a framebuffer. This works around a coherency
    problem with KVM guests running on ARM.

    - Cleanups, improvements and fixes all over the place"

    * 'efi-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    efivars: Call guid_parse() against guid_t type of variable
    efi/cper: Use consistent types for UUIDs
    efi/x86: Replace references to efi_early->is64 with efi_is_64bit()
    efi: Deduplicate efi_open_volume()
    efi/x86: Add missing NULL initialization in UGA draw protocol discovery
    efi/x86: Merge 32-bit and 64-bit UGA draw protocol setup routines
    efi/x86: Align efi_uga_draw_protocol typedef names to convention
    efi/x86: Merge the setup_efi_pci32() and setup_efi_pci64() routines
    efi/x86: Prevent reentrant firmware calls in mixed mode
    efi/esrt: Only call efi_mem_reserve() for boot services memory
    fbdev/efifb: Honour UEFI memory map attributes when mapping the FB
    efi: Drop type and attribute checks in efi_mem_desc_lookup()
    efi/libstub/arm: Add opt-in Kconfig option for the DTB loader
    efi: Remove the declaration of efi_late_init() as the function is unused
    efi/cper: Avoid using get_seconds()
    efi: Use a work queue to invoke EFI Runtime Services
    efi/x86: Use non-blocking SetVariable() for efi_delete_dummy_variable()
    efi/x86: Clean up the eboot code

    Linus Torvalds
     

03 Aug, 2018

1 commit

  • It appears arm64 copied arm's GENERIC_IRQ_MULTI_HANDLER code, but made
    it unconditional.

    Converts the arm64 code to use the new generic code, which simply consists
    of deleting the arm64 code and setting MULTI_IRQ_HANDLER instead.

    Signed-off-by: Palmer Dabbelt
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Christoph Hellwig
    Cc: linux@armlinux.org.uk
    Cc: catalin.marinas@arm.com
    Cc: Will Deacon
    Cc: jonas@southpole.se
    Cc: stefan.kristiansson@saunalahti.fi
    Cc: shorne@gmail.com
    Cc: jason@lakedaemon.net
    Cc: marc.zyngier@arm.com
    Cc: Arnd Bergmann
    Cc: nicolas.pitre@linaro.org
    Cc: vladimir.murzin@arm.com
    Cc: keescook@chromium.org
    Cc: jinb.park7@gmail.com
    Cc: yamada.masahiro@socionext.com
    Cc: alexandre.belloni@bootlin.com
    Cc: pombredanne@nexb.com
    Cc: Greg KH
    Cc: kstewart@linuxfoundation.org
    Cc: jhogan@kernel.org
    Cc: mark.rutland@arm.com
    Cc: ard.biesheuvel@linaro.org
    Cc: james.morse@arm.com
    Cc: linux-arm-kernel@lists.infradead.org
    Cc: openrisc@lists.librecores.org
    Link: https://lkml.kernel.org/r/20180622170126.6308-4-palmer@sifive.com

    Palmer Dabbelt
     

02 Aug, 2018

2 commits

  • Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Commit 2c4541e24c55 ("mm: use vma_init() to initialize VMAs on stack and
    data segments") tried to initialize various left-over ad-hoc vma's
    "properly", but actually made things worse for the temporary vma's used
    for TLB flushing.

    vma_init() doesn't actually initialize all of the vma, just a few
    fields, so doing something like

    - struct vm_area_struct vma = { .vm_mm = tlb->mm, };
    + struct vm_area_struct vma;
    +
    + vma_init(&vma, tlb->mm);

    was actually very bad: instead of having a nicely initialized vma with
    every field but "vm_mm" zeroed, you'd have an entirely uninitialized vma
    with only a couple of fields initialized. And they weren't even fields
    that the code in question mostly cared about.

    The flush_tlb_range() function takes a "struct vma" rather than a
    "struct mm_struct", because a few architectures actually care about what
    kind of range it is - being able to only do an ITLB flush if it's a
    range that doesn't have data accesses enabled, for example. And all the
    normal users already have the vma for doing the range invalidation.

    But a few people want to call flush_tlb_range() with a range they just
    made up, so they also end up using a made-up vma. x86 just has a
    special "flush_tlb_mm_range()" function for this, but other
    architectures (arm and ia64) do the "use fake vma" thing instead, and
    thus got caught up in the vma_init() changes.

    At the same time, the TLB flushing code really doesn't care about most
    other fields in the vma, so vma_init() is just unnecessary and
    pointless.

    This fixes things by having an explicit "this is just an initializer for
    the TLB flush" initializer macro, which is used by the arm/arm64/ia64
    people who mis-use this interface with just a dummy vma.
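
    The initializer in question looks roughly like this (a sketch of the
    macro the patch adds and how the arm/arm64/ia64 callers use it):

    /* only the fields the TLB flush code looks at; everything else in the
     * on-stack vma stays zero-initialized, as before */
    #define TLB_FLUSH_VMA(mm, flags)   { .vm_mm = (mm), .vm_flags = (flags) }

    struct vm_area_struct vma = TLB_FLUSH_VMA(mm, 0);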

    Fixes: 2c4541e24c55 ("mm: use vma_init() to initialize VMAs on stack and data segments")
    Cc: Dmitry Vyukov
    Cc: Oleg Nesterov
    Cc: Andrea Arcangeli
    Cc: Kirill Shutemov
    Cc: Andrew Morton
    Cc: John Stultz
    Cc: Hugh Dickins
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

31 Jul, 2018

1 commit

  • When kernel mode NEON was first introduced to the arm64 kernel,
    every call to kernel_neon_begin()/_end() stacked resp. unstacked
    the entire NEON register file, making it worthwhile to reduce the
    number of used NEON registers to a bare minimum, and only stack
    those. kernel_neon_begin_partial() was introduced for this purpose,
    but after the refactoring for SVE and other changes, it no longer
    exists and was simply #define'd to kernel_neon_begin() directly.

    In the meantime, all users have been updated, so let's remove
    the fallback macro.

    Reviewed-by: Dave Martin
    Signed-off-by: Ard Biesheuvel
    Signed-off-by: Will Deacon

    Ard Biesheuvel
     

27 Jul, 2018

1 commit

  • Make sure to initialize all VMAs properly, not only those which come
    from vm_area_cachep.

    Link: http://lkml.kernel.org/r/20180724121139.62570-3-kirill.shutemov@linux.intel.com
    Signed-off-by: Kirill A. Shutemov
    Acked-by: Linus Torvalds
    Reviewed-by: Andrew Morton
    Cc: Dmitry Vyukov
    Cc: Oleg Nesterov
    Cc: Andrea Arcangeli
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

26 Jul, 2018

2 commits

  • This adds support for the STACKLEAK gcc plugin to arm64 by implementing
    stackleak_check_alloca(), based heavily on the x86 version, and adding the
    two helpers used by the stackleak common code: current_top_of_stack() and
    on_thread_stack(). The stack erasure calls are made at syscall returns.
    Additionally, this disables the plugin in hypervisor and EFI stub code,
    which are out of scope for the protection.

    Acked-by: Alexander Popov
    Reviewed-by: Mark Rutland
    Reviewed-by: Kees Cook
    Signed-off-by: Laura Abbott
    Signed-off-by: Will Deacon

    Laura Abbott
     
  • In preparation for enabling the stackleak plugin on arm64,
    we need a way to get the bounds of the current stack. Extend
    on_accessible_stack to get this information.

    Acked-by: Alexander Popov
    Reviewed-by: Mark Rutland
    Signed-off-by: Laura Abbott
    [will: folded in fix for allmodconfig build breakage w/ sdei]
    Signed-off-by: Will Deacon

    Laura Abbott
     

23 Jul, 2018

1 commit

  • This is a fix against the issue that crash dump kernel may hang up
    during booting, which can happen on any ACPI-based system with "ACPI
    Reclaim Memory."

    (kernel messages after panic kicked off kdump)
    (snip...)
    Bye!
    (snip...)
    ACPI: Core revision 20170728
    pud=000000002e7d0003, *pmd=000000002e7c0003, *pte=00e8000039710707
    Internal error: Oops: 96000021 [#1] SMP
    Modules linked in:
    CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.14.0-rc6 #1
    task: ffff000008d05180 task.stack: ffff000008cc0000
    PC is at acpi_ns_lookup+0x25c/0x3c0
    LR is at acpi_ds_load1_begin_op+0xa4/0x294
    (snip...)
    Process swapper/0 (pid: 0, stack limit = 0xffff000008cc0000)
    Call trace:
    (snip...)
    [] acpi_ns_lookup+0x25c/0x3c0
    [] acpi_ds_load1_begin_op+0xa4/0x294
    [] acpi_ps_build_named_op+0xc4/0x198
    [] acpi_ps_create_op+0x14c/0x270
    [] acpi_ps_parse_loop+0x188/0x5c8
    [] acpi_ps_parse_aml+0xb0/0x2b8
    [] acpi_ns_one_complete_parse+0x144/0x184
    [] acpi_ns_parse_table+0x48/0x68
    [] acpi_ns_load_table+0x4c/0xdc
    [] acpi_tb_load_namespace+0xe4/0x264
    [] acpi_load_tables+0x48/0xc0
    [] acpi_early_init+0x9c/0xd0
    [] start_kernel+0x3b4/0x43c
    Code: b9008fb9 2a000318 36380054 32190318 (b94002c0)
    ---[ end trace c46ed37f9651c58e ]---
    Kernel panic - not syncing: Fatal exception
    Rebooting in 10 seconds..

    (diagnosis)
    * This fault is a data abort, alignment fault (ESR=0x96000021)
    during reading out ACPI table.
    * Initial ACPI tables are normally stored in system ram and marked as
    "ACPI Reclaim memory" by the firmware.
    * After the commit f56ab9a5b73c ("efi/arm: Don't mark ACPI reclaim
    memory as MEMBLOCK_NOMAP"), those regions are differently handled
    as they are "memblock-reserved", without NOMAP bit.
    * So they are now excluded from device tree's "usable-memory-range"
    which kexec-tools determines based on a current view of /proc/iomem.
    * When crash dump kernel boots up, it tries to access ACPI tables by
    mapping them with ioremap(), not ioremap_cache(), in acpi_os_ioremap()
    since they are no longer part of mapped system ram.
    * Given that ACPI accessor/helper functions are compiled in without
    unaligned access support (ACPI_MISALIGNMENT_NOT_SUPPORTED),
    any unaligned access to ACPI tables can cause a fatal panic.

    With this patch, acpi_os_ioremap() always honors the memory attribute
    information provided by the firmware (EFI); retaining cacheability
    allows the kernel to access ACPI tables safely.

    Signed-off-by: AKASHI Takahiro
    Reviewed-by: James Morse
    Reviewed-by: Ard Biesheuvel
    Reported-by and Tested-by: Bhupesh Sharma
    Signed-off-by: Will Deacon

    AKASHI Takahiro
     

22 Jul, 2018

1 commit

  • There's one ARM, one x86_32 and one x86_64 version of efi_open_volume()
    which can be folded into a single shared version by masking their
    differences with the efi_call_proto() macro introduced by commit:

    3552fdf29f01 ("efi: Allow bitness-agnostic protocol calls").

    To be able to dereference the device_handle attribute from the
    efi_loaded_image_t table in an arch- and bitness-agnostic manner,
    introduce the efi_table_attr() macro (which already exists for x86)
    to arm and arm64.

    No functional change intended.

    Signed-off-by: Lukas Wunner
    Signed-off-by: Ard Biesheuvel
    Cc: Andy Shevchenko
    Cc: Hans de Goede
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-efi@vger.kernel.org
    Link: http://lkml.kernel.org/r/20180720014726.24031-7-ard.biesheuvel@linaro.org
    Signed-off-by: Ingo Molnar

    Lukas Wunner
     

21 Jul, 2018

3 commits

  • The get/set events helpers do some work to check that reserved
    and padding fields are zero. This is useful on 32bit too.

    Move this code into virt/kvm/arm/arm.c, and give the arch
    code some underscores.

    This is temporarily hidden behind __KVM_HAVE_VCPU_EVENTS until
    32bit is wired up.

    Signed-off-by: James Morse
    Reviewed-by: Dongjiu Geng
    Signed-off-by: Marc Zyngier

    James Morse
     
  • For migrating VMs, user space may need to know the exception
    state. For example, if on machine A KVM makes an SError pending,
    then when migrating to B, KVM also needs to pend an SError.

    This new IOCTL exports user-invisible states related to SError.
    Together with appropriate user space changes, user space can get/set
    the SError exception state to do migrate/snapshot/suspend.

    Signed-off-by: Dongjiu Geng
    Reviewed-by: James Morse
    [expanded documentation wording]
    Signed-off-by: James Morse
    Signed-off-by: Marc Zyngier

    Dongjiu Geng
     
  • When running on a non-VHE system, we initialize tpidr_el2 to
    contain the per-CPU offset required to reach per-cpu variables.

    Actually, we initialize it twice: the first time as part of the
    EL2 initialization, by copying tpidr_el1 into its el2 counterpart,
    and another time by calling into __kvm_set_tpidr_el2.

    It turns out that the first part is wrong, as it includes the
    distance between the kernel mapping and the linear mapping, while
    EL2 only cares about the linear mapping. This was the last vestige
    of the first per-cpu use of tpidr_el2 that came in with SDEI.
    The only caller then was hyp_panic(), and it's now using the
    pc-relative get_host_ctxt() stuff, instead of kimage addresses
    from the literal pool.

    It is not a big deal, as we override it straight away, but it is
    slightly confusing. In order to clear said confusion, let's
    set this directly as part of the hyp-init code, and drop the
    ad-hoc HYP helper.

    Reviewed-by: James Morse
    Acked-by: Christoffer Dall
    Signed-off-by: Marc Zyngier

    Marc Zyngier