29 Aug, 2016

1 commit


28 Aug, 2016

1 commit

  • Pull KVM fixes from Paolo Bonzini:
    "ARM:
    - fixes for ITS init issues, error handling, IRQ leakage, race
    conditions
    - an erratum workaround for timers
    - some removal of misleading use of errors and comments
    - a fix for GICv3 on 32-bit guests

    MIPS:
    - fix for where the guest could wrongly map the first page of
    physical memory

    x86:
    - nested virtualization fixes"

    * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
    MIPS: KVM: Check for pfn noslot case
    kvm: nVMX: fix nested tsc scaling
    KVM: nVMX: postpone VMCS changes on MSR_IA32_APICBASE write
    KVM: nVMX: fix msr bitmaps to prevent L2 from accessing L0 x2APIC
    arm64: KVM: report configured SRE value to 32-bit world
    arm64: KVM: remove misleading comment on pmu status
    KVM: arm/arm64: timer: Workaround misconfigured timer interrupt
    arm64: Document workaround for Cortex-A72 erratum #853709
    KVM: arm/arm64: Change misleading use of is_error_pfn
    KVM: arm64: ITS: avoid re-mapping LPIs
    KVM: arm64: check for ITS device on MSI injection
    KVM: arm64: ITS: move ITS registration into first VCPU run
    KVM: arm64: vgic-its: Make updates to propbaser/pendbaser atomic
    KVM: arm64: vgic-its: Plug race in vgic_put_irq
    KVM: arm64: vgic-its: Handle errors from vgic_add_lpi
    KVM: arm64: ITS: return 1 on successful MSI injection

    Linus Torvalds
     

27 Aug, 2016

4 commits

  • Merge fixes from Andrew Morton:
    "11 fixes"

    * emailed patches from Andrew Morton :
    mm: silently skip readahead for DAX inodes
    dax: fix device-dax region base
    fs/seq_file: fix out-of-bounds read
    mm: memcontrol: avoid unused function warning
    mm: clarify COMPACTION Kconfig text
    treewide: replace config_enabled() with IS_ENABLED() (2nd round)
    printk: fix parsing of "brl=" option
    soft_dirty: fix soft_dirty during THP split
    sysctl: handle error writing UINT_MAX to u32 fields
    get_maintainer: quiet noisy implicit -f vcs_file_exists checking
    byteswap: don't use __builtin_bswap*() with sparse

    Linus Torvalds
     
  • Pull ARM64 fix from Catalin Marinas:
    "ARM64 fix to avoid potential TLB conflict when CONFIG_RANDOMIZE_BASE
    is enabled"

    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
    arm64: avoid TLB conflict with CONFIG_RANDOMIZE_BASE

    Linus Torvalds
     
  • Pull PCI fixes from Bjorn Helgaas:
    "Resource management:
    - Update "pci=resource_alignment" documentation (Mathias Koehrer)

    MSI:
    - Use positive flags in pci_alloc_irq_vectors() (Christoph Hellwig)
    - Call pci_intx() when using legacy interrupts in pci_alloc_irq_vectors() (Christoph Hellwig)

    Intel VMD host bridge driver:
    - Fix infinite loop executing irq's (Keith Busch)"

    * tag 'pci-v4.8-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
    x86/PCI: VMD: Fix infinite loop executing irq's
    PCI: Call pci_intx() when using legacy interrupts in pci_alloc_irq_vectors()
    PCI: Use positive flags in pci_alloc_irq_vectors()
    PCI: Update "pci=resource_alignment" documentation

    Linus Torvalds
     
  • Commit 97f2645f358b ("tree-wide: replace config_enabled() with
    IS_ENABLED()") mostly killed config_enabled(), but some new users have
    appeared for v4.8-rc1. They are all used for a boolean option, so can
    be replaced with IS_ENABLED() safely.
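
    As a rough illustration of why IS_ENABLED() is a safe replacement for a
    boolean option, here is a userspace sketch of the preprocessor trick,
    modelled on include/linux/kconfig.h. The helper macro names are renamed
    to avoid reserved identifiers, and CONFIG_FOO, CONFIG_BAR and CONFIG_BAZ
    are hypothetical options invented for this example.

```c
/* Sketch only: macro names differ from the kernel's to avoid reserved
 * identifiers, and the CONFIG_* options below are made up. */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val

/* Expands to 1 if the option is defined as 1, otherwise to 0. */
#define IS_DEFINED(x) IS_DEFINED_1(x)
#define IS_DEFINED_1(val) IS_DEFINED_2(ARG_PLACEHOLDER_##val)
#define IS_DEFINED_2(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0)

#define IS_BUILTIN(option) IS_DEFINED(option)
#define IS_MODULE(option) IS_DEFINED(option##_MODULE)
#define IS_ENABLED(option) (IS_BUILTIN(option) || IS_MODULE(option))

#define CONFIG_FOO 1        /* hypothetical option built in (=y) */
#define CONFIG_BAZ_MODULE 1 /* hypothetical option built as a module (=m) */
/* CONFIG_BAR is deliberately left undefined (=n). */

int foo_enabled(void) { return IS_ENABLED(CONFIG_FOO); }
int bar_enabled(void) { return IS_ENABLED(CONFIG_BAR); }
int baz_enabled(void) { return IS_ENABLED(CONFIG_BAZ); }
```

    For a boolean option the module case never applies, so IS_ENABLED() and
    IS_BUILTIN() agree, which is the situation the commit describes.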

    Link: http://lkml.kernel.org/r/1471970749-24867-1-git-send-email-yamada.masahiro@socionext.com
    Signed-off-by: Masahiro Yamada
    Acked-by: Kees Cook
    Acked-by: Peter Oberparleiter
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Cc: Ralf Baechle
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Masahiro Yamada
     

25 Aug, 2016

4 commits

  • When CONFIG_RANDOMIZE_BASE is selected, we modify the page tables to remap the
    kernel at a newly-chosen VA range. We do this with the MMU disabled, but do not
    invalidate TLBs prior to re-enabling the MMU with the new tables. Thus the old
    mapping entries may still live in TLBs, and we risk violating
    Break-Before-Make requirements, leading to TLB conflicts and/or other issues.

    We invalidate TLBs when we uninstall the idmap in early setup code, but prior to
    this we are subject to issues relating to the Break-Before-Make violation.

    Avoid these issues by invalidating the TLBs before the new mappings can be
    used by the hardware.

    Fixes: f80fb3a3d508 ("arm64: add support for kernel ASLR")
    Cc: # 4.6+
    Acked-by: Ard Biesheuvel
    Acked-by: Will Deacon
    Signed-off-by: Mark Rutland
    Signed-off-by: Catalin Marinas

    Mark Rutland
     
  • Pull UML fix from Richard Weinberger:
    "This contains a fix for a build regression introduced during the merge
    window"

    * 'for-linus-4.8-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml:
    um: Don't discard .text.exit section

    Linus Torvalds
     
  • Pull xen regression fix from David Vrabel:
    "Fix a regression in the xenbus device preventing userspace tools from
    working"

    * tag 'for-linus-4.8b-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    xen: change the type of xen_vcpu_id to uint32_t
    xenbus: don't look up transaction IDs for ordinary writes

    Linus Torvalds
     
  • We pass xen_vcpu_id mapping information to hypercalls which require
    uint32_t type so it would be cleaner to have it as uint32_t. The
    initializer to -1 can be dropped as we always do the mapping before using
    it and we never check the 'not set' value anyway.

    Signed-off-by: Vitaly Kuznetsov
    Signed-off-by: David Vrabel

    Vitaly Kuznetsov
     

24 Aug, 2016

4 commits

  • native_smp_prepare_cpus
    -> default_setup_apic_routing
    -> enable_IR_x2apic
    -> irq_remapping_prepare
    -> intel_prepare_irq_remapping
    -> intel_setup_irq_remapping

    So the IR table is set up even if the "noapic" boot parameter is added. As a result we
    crash later when the interrupt affinity is set due to a half initialized
    remapping infrastructure.

    Prevent remap initialization when IOAPIC is disabled.

    Signed-off-by: Wanpeng Li
    Cc: Peter Zijlstra
    Cc: Joerg Roedel
    Link: http://lkml.kernel.org/r/1471954039-3942-1-git-send-email-wanpeng.li@hotmail.com
    Cc: stable@vger.kernel.org
    Signed-off-by: Thomas Gleixner

    Wanpeng Li
     
  • We can't initialize the list head on deletion as this causes the node to
    point to itself, which causes an infinite loop if vmd_irq() happens to be
    servicing that node.

    The list initialization was trying to fix a bug from multiple calls to
    disable the same IRQ. Fix this instead by having the VMD driver track if
    the interrupt is enabled.
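
    The failure mode can be sketched with a minimal circular list in plain
    C. This is not the VMD driver code; the helper names are hypothetical
    and only model the point that re-initializing a node on deletion makes
    it point to itself, so a walker standing on that node never reaches the
    list head again.

```c
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

void init_list_head(struct list_head *h)
{
    h->next = h;
    h->prev = h;
}

void list_add_tail(struct list_head *n, struct list_head *head)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

/* Unlink n but keep n->next intact, so a concurrent walker can move on. */
void list_del_keep_next(struct list_head *n)
{
    n->prev->next = n->next;
    n->next->prev = n->prev;
}

/* Unlink n and re-initialize it: n->next now points back at n. */
void list_del_init(struct list_head *n)
{
    list_del_keep_next(n);
    init_list_head(n);
}

/* Follow ->next from cur until head is reached. Returns the number of
 * steps, or -1 if head was not reached within max_steps (a stand-in for
 * the infinite loop). */
int steps_to_head(const struct list_head *cur, const struct list_head *head,
                  int max_steps)
{
    int steps = 0;

    while (cur != head) {
        if (steps == max_steps)
            return -1;
        cur = cur->next;
        steps++;
    }
    return steps;
}
```

    With a walker positioned on a node removed via list_del_init(),
    steps_to_head() gives up, whereas removal that leaves ->next untouched
    still lets the walker reach the head.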

    [bhelgaas: changelog, add "Fixes"]
    Fixes: 97e923063575 ("x86/PCI: VMD: Initialize list item in IRQ disable")
    Reported-by: Grzegorz Koczot
    Tested-by: Miroslaw Drost
    Signed-off-by: Keith Busch
    Signed-off-by: Bjorn Helgaas
    Acked-by: Jon Derrick

    Keith Busch
     
  • Commit e41f501d3912 ("vmlinux.lds: account for destructor sections")
    added '.text.exit' to EXIT_TEXT which is discarded at link time by default.
    This breaks compilation of UML:
    `.text.exit' referenced in section `.fini_array' of
    /usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/libc.a(sdlerror.o):
    defined in discarded section `.text.exit' of
    /usr/lib/gcc/x86_64-linux-gnu/6/../../../x86_64-linux-gnu/libc.a(sdlerror.o)

    Apparently UML doesn't want to discard exit text, so let's place all EXIT_TEXT
    sections in .exit.text.

    Fixes: e41f501d3912 ("vmlinux.lds: account for destructor sections")
    Reported-by: Stefan Traby
    Signed-off-by: Andrey Ryabinin
    Cc:
    Acked-by: Dmitry Vyukov
    Signed-off-by: Richard Weinberger

    Andrey Ryabinin
     
  • Pull crypto fixes from Herbert Xu:
    "This fixes a number of memory corruption bugs in the newly added
    sha256-mb/sha512-mb code"

    * 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
    crypto: sha512-mb - fix ctx pointer
    crypto: sha256-mb - fix ctx pointer and digest copy

    Linus Torvalds
     

23 Aug, 2016

1 commit

  • Pull ARC fixes from Vineet Gupta:

    - support for Syscall ABI v4 with upstream gcc 6.x

    - lockdep fix (Daniel Mentz)

    - gdb register clobber (Liav Rehana)

    - couple of missing exports for modules

    - other fixes here and there

    * tag 'arc-4.8-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
    ARC: export __udivdi3 for modules
    ARC: mm: fix build breakage with STRICT_MM_TYPECHECKS
    ARC: export kmap
    ARC: Support syscall ABI v4
    ARC: use correct offset in pt_regs for saving/restoring user mode r25
    ARC: Elide redundant setup of DMA callbacks
    ARC: Call trace_hardirqs_on() before enabling irqs

    Linus Torvalds
     

20 Aug, 2016

7 commits

  • When building gccgo in userspace, errno.h gets parsed and the go include file
    sysinfo.go is generated.

    Since EREFUSED is defined to the same value as ECONNREFUSED, and ECONNREFUSED
    is defined later on in errno.h, this leads to go complaining that EREFUSED
    isn't defined yet.

    Fix this trivial problem by moving the define of EREFUSED down after
    ECONNREFUSED in errno.h (and clean up the indenting while touching this line).

    Signed-off-by: Helge Deller
    Cc: stable@vger.kernel.org

    Helge Deller
     
  • Commit 54b66800907 (parisc: Add native high-resolution sched_clock()
    implementation) added support to use the CPU-internal cr16 counters as reliable
    clocksource with the help of HAVE_UNSTABLE_SCHED_CLOCK.

    Sadly, the commit failed to remove the hack which prevented cr16 from becoming the
    default clocksource even on SMP systems.

    Signed-off-by: Helge Deller
    Cc: stable@vger.kernel.org # 4.7+

    Helge Deller
     
  • Some module using div_u64() was failing to link because the libgcc 64-bit
    divide assist routine was not being exported for modules.

    Reported-by: avinashp@quantenna.com
    Cc: stable@vger.kernel.org
    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • | CC mm/memory.o
    | In file included from ../mm/memory.c:53:0:
    | ../include/linux/pfn_t.h: In function ‘pfn_t_pte’:
    | ../include/linux/pfn_t.h:78:2: error: conversion to non-scalar type requested
    | return pfn_pte(pfn_t_to_pfn(pfn), pgprot);

    With STRICT_MM_TYPECHECKS pte_t is a struct and the offending code
    forces a cast which ends up shifting a struct and hence the gcc error.
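
    A minimal sketch of the struct-wrapped pte_t flavour follows. It is not
    the ARC header; PAGE_SHIFT and the helper bodies are illustrative
    assumptions. The point is that the scalar value is computed first and
    wrapped only at the end, because a cast such as (pte_t)(pfn) does not
    compile once pte_t is a struct.

```c
/* Sketch only: a miniature of the strict type-checking flavour. */
#define PAGE_SHIFT 13 /* assumed page shift for the example */

typedef struct {
    unsigned long pte;
} pte_t;

#define pte_val(x) ((x).pte)
#define __pte(x) ((pte_t){ (x) })

/* Build the scalar value first, then wrap it in the struct. Writing
 * ((pte_t)(pfn) << PAGE_SHIFT) instead would request a conversion to a
 * non-scalar type and be rejected by the compiler. */
pte_t pfn_pte(unsigned long pfn, unsigned long prot)
{
    return __pte((pfn << PAGE_SHIFT) | prot);
}
```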

    Note that in recent past some of the arches (aarch64, s390) made
    STRICT_MM_TYPECHECKS default, but we don't for ARC as this leads to slightly
    worse generated code, given ARC ABI definition of returning structs
    (which pte_t would become)

    Quoting from ARC ABI...

    "Results of type struct are returned in a caller-supplied temporary
    variable whose address is passed in r0.
    For such functions, the arguments are shifted so that they are
    passed in r1 and up."

    So
    - struct to be returned would be allocated on stack requiring extra
    code at call sites
    - callee updates stack memory to facilitate the return (vs. simple
    MOV into return reg r0)

    Hence STRICT_MM_TYPECHECKS is not enabled by default for ARC

    Cc: #4.4+
    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • | MODPOST 7 modules
    | ERROR: "kmap" [fs/ext2/ext2.ko] undefined!
    | ../scripts/Makefile.modpost:91: recipe for target '__modpost' failed

    Cc:
    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • The syscall ABI includes the gcc functional calling ABI since a syscall
    implies userland caller and kernel callee.

    The current gcc ABI (v3) for ARCv2 ISA required 64-bit data be passed in
    even-odd register pairs, (potentially punching reg holes when passing such
    values as args). This was partly driven by the fact that the double-word
    LDD/STD instructions in ARCv2 expect the register alignment and thus gcc
    forcing this avoids an extra MOV at the cost of a few unused registers (of which we
    have plenty anyway).

    This however was rejected as part of upstreaming gcc port to HS. So the new
    ABI v4 doesn't enforce the even-odd reg restriction.
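
    The difference can be sketched as a tiny register-assignment helper.
    This is a hypothetical model, not compiler code: it only answers which
    register a 64-bit argument starts in, given the next free argument
    register and whether the even-odd pairing rule applies.

```c
/* Sketch only: returns the first register number used by a 64-bit
 * argument. With require_even_pair set (the v3 rule described above), an
 * odd register is skipped, leaving a hole; without it (v4), the argument
 * starts at the next free register. */
int first_reg_for_64bit_arg(int next_free_reg, int require_even_pair)
{
    if (require_even_pair && (next_free_reg & 1))
        next_free_reg++;
    return next_free_reg;
}
```

    For a call like f(int a, long long b), a takes r0; under the v3 rule b
    lands in r2:r3 with r1 left unused, while under v4 it starts at r1.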

    Do note that for ARCompact ISA builds v3 and v4 are practically the same in
    terms of gcc code generation.

    In terms of change management, we infer the new ABI if gcc 6.x onwards
    is used for building the kernel.

    This also needs a stable backport to enable older kernels to work with
    new tools/user-space

    Cc:
    Signed-off-by: Vineet Gupta

    Vineet Gupta
     
  • User mode callee regs are explicitly collected before signal delivery or
    breakpoint trap. r25 is special for kernel as it serves as task pointer,
    so user mode value is clobbered very early. It is saved in pt_regs where
    generally only scratch (aka caller saved) regs are saved.

    The code to access the corresponding pt_regs location had a subtle bug as
    it was using load/store with scaling of offset, whereas the offset was already
    byte wise correct. So fix this by replacing LD.AS with a standard LD
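
    The effect of scaled addressing on an offset that is already in bytes
    can be sketched in C. The register layout below is hypothetical and much
    smaller than the real pt_regs; the model assumes a 32-bit load whose
    scaled form shifts the offset left by 2.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, shortened stand-in for a saved-register frame. */
struct regs_sketch {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t user_r25;
};

/* Plain load: the offset is used as a byte offset. */
size_t ea_plain(size_t offset) { return offset; }

/* Scaled load of a 32-bit word: the offset is multiplied by 4. */
size_t ea_scaled(size_t offset) { return offset << 2; }

/* Load a word at the given effective byte offset, reporting whether the
 * access stays inside the frame. */
uint32_t load_word(const struct regs_sketch *regs, size_t effective_offset,
                   int *in_bounds)
{
    uint32_t value = 0;

    *in_bounds = effective_offset + sizeof(value) <= sizeof(*regs);
    if (*in_bounds)
        memcpy(&value, (const unsigned char *)regs + effective_offset,
               sizeof(value));
    return value;
}
```

    With offsetof(struct regs_sketch, user_r25) equal to 12, the plain form
    reads the intended slot, while the scaled form targets byte 48, which
    is a different location altogether.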

    Cc:
    Signed-off-by: Liav Rehana
    Reviewed-by: Alexey Brodkin
    [vgupta: rewrote title and commit log]
    Signed-off-by: Vineet Gupta

    Liav Rehana
     

19 Aug, 2016

6 commits

  • When mapping a page into the guest we error check using is_error_pfn(),
    however this doesn't detect a value of KVM_PFN_NOSLOT, indicating an
    error HVA for the page. This can only happen on MIPS right now due to
    unusual memslot management (e.g. being moved / removed / resized), or
    with an Enhanced Virtual Memory (EVA) configuration where the default
    KVM_HVA_ERR_* and kvm_is_error_hva() definitions are unsuitable (fixed
    in a later patch). This case will be treated as a pfn of zero, mapping
    the first page of physical memory into the guest.

    It would appear the MIPS KVM port wasn't updated prior to being merged
    (in v3.10) to take commit 81c52c56e2b4 ("KVM: do not treat noslot pfn as
    a error pfn") into account (merged v3.8), which converted a bunch of
    is_error_pfn() calls to is_error_noslot_pfn(). Switch to using
    is_error_noslot_pfn() instead to catch this case properly.
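
    The gap between the two predicates can be sketched as follows. The
    mask values are written from memory of include/linux/kvm_host.h and
    should be treated as assumptions; what matters is that the no-slot
    marker sits in a bit that only the wider mask covers.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t kvm_pfn_t;

/* Assumed values, modelled on include/linux/kvm_host.h. */
#define KVM_PFN_ERR_MASK        (0x7ffULL << 52) /* bits 52..62 */
#define KVM_PFN_ERR_NOSLOT_MASK (0xfffULL << 52) /* bits 52..63 */
#define KVM_PFN_NOSLOT          (0x1ULL << 63)

/* True for error pfns, but not for the no-slot marker. */
bool is_error_pfn(kvm_pfn_t pfn)
{
    return (pfn & KVM_PFN_ERR_MASK) != 0;
}

/* True for error pfns and for the no-slot marker. */
bool is_error_noslot_pfn(kvm_pfn_t pfn)
{
    return (pfn & KVM_PFN_ERR_NOSLOT_MASK) != 0;
}
```

    Under these assumptions is_error_pfn(KVM_PFN_NOSLOT) is false, so a
    caller checking only is_error_pfn() would carry on with the marker as
    if it were a usable pfn.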

    Fixes: 858dd5d45733 ("KVM/MIPS32: MMU/TLB operations for the Guest.")
    Signed-off-by: James Hogan
    Cc: Paolo Bonzini
    Cc: Radim Krčmář
    Cc: Ralf Baechle
    Cc: linux-mips@linux-mips.org
    Cc: kvm@vger.kernel.org
    Cc: # 3.10.y-
    Signed-off-by: Paolo Bonzini

    James Hogan
     
  • Pull DeviceTree fixes from Rob Herring:

    - a couple of DT node ref counting fixes

    - fix __unflatten_device_tree for PPC PCI hotplug case

    - rework marking irq controllers as OF_POPULATED in cases where real
    driver is used.

    - disable of_platform_default_populate_init on PPC. The change in
    initcall order causes problems which need to be sorted out later.

    * tag 'devicetree-fixes-for-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
    of: fix reference counting in of_graph_get_endpoint_by_regs
    of/platform: disable the of_platform_default_populate_init() for all the ppc boards
    ARM: imx6: mark GPC node as not populated after irq init to probe pm domain driver
    of/irq: Mark interrupt controllers as populated before initialisation
    drivers/of: Validate device node in __unflatten_device_tree()
    of: Delete an unnecessary check before the function call "of_node_put"

    Linus Torvalds
     
  • Pull x86 fixes from Ingo Molnar:
    "An initrd microcode loading fix, and an SMP bootup topology setup fix
    to resolve crashes on SGI/UV systems if the BIOS is configured in a
    certain way"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/smp: Fix __max_logical_packages value setup
    x86/microcode/AMD: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y

    Linus Torvalds
     
  • Pull arm64 fixes from Catalin Marinas:

    - Avoid a literal load with the MMU off on the CPU resume path
    (potential inconsistency between cache and RAM)

    - Build error with CONFIG_ACPI=n fixed

    - Compiler warning in the arch/arm64/mm/dump.c code fixed

    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
    arm64: Fix shift warning in arch/arm64/mm/dump.c
    arm64: kernel: avoid literal load of virtual address with MMU off
    arm64: Fix NUMA build error when !CONFIG_ACPI

    Linus Torvalds
     
  • Pull ARM fixes from Russell King:
    "Only three fixes this time:

    - Emil found an overflow problem with the memory layout sanity check.

    - Ard Biesheuvel noticed that late-allocated page tables (for EFI)
    weren't being properly constructed.

    - Guenter Roeck reported a problem found on qemu caused by the recent
    addr_limit changes"

    * 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
    ARM: fix address limit restoration for undefined instructions
    ARM: 8591/1: mm: use fully constructed struct pages for EFI pgd allocations
    ARM: 8590/1: sanity_check_meminfo(): avoid overflow on vmalloc_limit

    Linus Torvalds
     
  • Pull power management fixes from Rafael Wysocki:
    "More hibernation-related material: one fix for a recent regression in
    the core, one small cleanup of the x86-64 resume code and a
    documentation update.

    Specifics:

    - Fix a hibernate core regression resulting from uncovering a latent
    bug in its implementation of memory bitmaps by a recent commit
    (James Morse).

    - Use __pa() to compute a physical address in the x86-64 code
    finalizing resume from hibernation (Rafael Wysocki).

    - Update power management documentation related to system sleep
    states to remove outdated information from it and to add a
    description of a recently introduced hibernation debug feature to
    it (Rafael Wysocki)"

    * tag 'pm-4.8-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
    PM / hibernate: Fix rtree_next_node() to avoid walking off list ends
    x86/power/64: Use __pa() for physical address computation
    PM / sleep: Update some system sleep documentation

    Linus Torvalds
     

18 Aug, 2016

10 commits

  • When building with 48-bit VAs and 16K page configuration, it's possible
    to get the following warning when building the arm64 page table dumping
    code:

    arch/arm64/mm/dump.c: In function ‘walk_pud’:
    arch/arm64/mm/dump.c:274:102: warning: right shift count >= width of type [-Wshift-count-overflow]

    This is because pud_offset(pgd, 0) performs a shift to the right by 36
    while the value 0 has the type 'int' by default, therefore 32-bit.

    This patch modifies all the p*_offset() uses in arch/arm64/mm/dump.c to
    use 0UL for the address argument.
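
    A reduced sketch of the problem, assuming an LP64 target where
    unsigned long is 64 bits wide. The macro below is a simplified stand-in
    for the index computation inside pud_offset(), and PTRS_PER_PUD is an
    assumed value for the 16K page configuration.

```c
#define PUD_SHIFT 36      /* shift quoted in the commit message */
#define PTRS_PER_PUD 2048 /* assumed: 16K pages, 8-byte entries */

/* Because this is a macro, the shift is performed in the type of the
 * argument. pud_index(0) would shift a 32-bit int by 36, which triggers
 * -Wshift-count-overflow; pud_index(0UL) shifts a 64-bit value. */
#define pud_index(addr) (((addr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1))

unsigned long first_pud_index(void)
{
    return pud_index(0UL);
}

unsigned long second_pud_index(void)
{
    return pud_index(1UL << PUD_SHIFT);
}
```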

    Acked-by: Mark Rutland
    Signed-off-by: Catalin Marinas

    Catalin Marinas
     
  • …t/kvmarm/kvmarm into HEAD

    KVM/ARM Fixes for v4.8-rc3

    This tag contains the following fixes on top of v4.8-rc1:
    - ITS init issues
    - ITS error handling issues
    - ITS IRQ leakage fix
    - Plug a couple of ITS race conditions
    - An erratum workaround for timers
    - Some removal of misleading use of errors and comments
    - A fix for GICv3 on 32-bit guests

    Paolo Bonzini
     
  • When the host supported TSC scaling, L2 would use a TSC multiplier of
    0, which causes a VM entry failure. Now L2's TSC uses the same
    multiplier as L1.

    Signed-off-by: Peter Feiner
    Signed-off-by: Paolo Bonzini

    Peter Feiner
     
  • If vmcs12 does not intercept APIC_BASE writes, then KVM will handle the
    write with vmcs02 as the current VMCS.
    This will incorrectly apply modifications intended for vmcs01 to vmcs02
    and L2 can use it to gain access to L0's x2APIC registers by disabling
    virtualized x2APIC while using an msr bitmap that assumes it is enabled.

    Postpone execution of vmx_set_virtual_x2apic_mode until vmcs01 is the
    current VMCS. An alternative solution would temporarily make vmcs01 the
    current VMCS, but it requires more care.

    Fixes: 8d14695f9542 ("x86, apicv: add virtual x2apic support")
    Reported-by: Jim Mattson
    Reviewed-by: Wanpeng Li
    Signed-off-by: Radim Krčmář

    Radim Krčmář
     
  • msr bitmap can be used to avoid a VM exit (interception) on guest MSR
    accesses. In some configurations of VMX controls, the guest can even
    directly access host's x2APIC MSRs. See SDM 29.5 VIRTUALIZING MSR-BASED
    APIC ACCESSES.

    L2 could read all L0's x2APIC MSRs and write TPR, EOI, and SELF_IPI.
    To do so, L1 would first trick KVM to disable all possible interceptions
    by enabling APICv features and then would turn those features off;
    nested_vmx_merge_msr_bitmap() only disabled interceptions, so VMX would
    not intercept previously enabled MSRs even though they were not safe
    with the new configuration.

    Correctly re-enabling interceptions is not enough as a second bug would
    still allow L1+L2 to access host's MSRs: msr bitmap was shared for all
    VMCSs, so L1 could trigger a race to get the desired combination of msr
    bitmap and VMX controls.

    This fix allocates a msr bitmap for every L1 VCPU, allows only safe
    x2APIC MSRs from L1's msr bitmap, and disables msr bitmaps if they would
    have to intercept everything anyway.
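
    The first bug can be modelled with a toy 8-entry bitmap in which a set
    bit means "intercept". This is a hypothetical sketch, not the KVM code:
    it only contrasts a merge that can clear bits but never set them with
    one that rebuilds the bitmap from "intercept everything" each time. The
    shared-bitmap race is not modelled.

```c
#include <stdint.h>

/* Toy model: bit i set means access to MSR i is intercepted.
 * l1_intercept: MSRs that L1 wants intercepted.
 * safe: MSRs that may be passed through under the current controls. */

/* Clear-only merge: bits cleared by an earlier configuration stay
 * cleared, even if that MSR is no longer safe to pass through. */
uint8_t merge_clear_only(uint8_t merged, uint8_t l1_intercept, uint8_t safe)
{
    uint8_t passthrough = (uint8_t)(~l1_intercept & safe);

    return (uint8_t)(merged & ~passthrough);
}

/* Rebuilding merge: start from "intercept everything" on every call. */
uint8_t merge_rebuild(uint8_t l1_intercept, uint8_t safe)
{
    uint8_t passthrough = (uint8_t)(~l1_intercept & safe);

    return (uint8_t)(0xFF & ~passthrough);
}
```

    Starting from 0xFF, one merge with safe = 0x0F followed by one with
    safe = 0x00 leaves the clear-only result at 0xF0, while the rebuilt
    result returns to 0xFF.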

    Fixes: 3af18d9c5fe9 ("KVM: nVMX: Prepare for using hardware MSR bitmap")
    Reported-by: Jim Mattson
    Suggested-by: Wincy Van
    Reviewed-by: Wanpeng Li
    Signed-off-by: Radim Krčmář

    Radim Krčmář
     
  • Frank reported a kernel panic when he disabled several cores in the BIOS
    via the following option:

    Core Disable Bitmap(Hex) [0]

    with number 0xFFE, which leaves 16 CPUs in system (out of 48).

    The kernel panic below goes along with following messages:

    smpboot: Max logical packages: 2
    smpboot: APIC(0) Converting physical 0 to logical package 0
    smpboot: APIC(20) Converting physical 1 to logical package 1
    smpboot: APIC(40) Package 2 exceeds logical package map
    smpboot: CPU 8 APICId 40 disabled
    smpboot: APIC(60) Package 3 exceeds logical package map
    smpboot: CPU 12 APICId 60 disabled
    ...
    general protection fault: 0000 [#1] SMP
    Modules linked in:
    CPU: 15 PID: 1 Comm: swapper/0 Not tainted 4.7.0-rc5+ #1
    Hardware name: SGI UV300/UV300, BIOS SGI UV 300 series BIOS 05/25/2016
    task: ffff8801673e0000 ti: ffff8801673ac000 task.ti: ffff8801673ac000
    RIP: 0010:[] [] uncore_change_context+0xd4/0x180
    ...
    [] uncore_event_init_cpu+0x6c/0x70
    [] intel_uncore_init+0x1c2/0x2dd
    [] ? uncore_cpu_setup+0x17/0x17
    [] do_one_initcall+0x50/0x190
    [] ? parse_args+0x293/0x480
    [] kernel_init_freeable+0x1a5/0x249
    [] ? set_debug_rodata+0x12/0x12
    [] kernel_init+0xe/0x110
    [] ret_from_fork+0x1f/0x40
    [] ? rest_init+0x80/0x80

    The reason for the panic is a wrong value of __max_logical_packages,
    which leaves logical_package_map uninitialized while the uncore code
    relies on this map being properly initialized (maybe we should
    add some safety checks there as well).

    The __max_logical_packages is computed as:

    DIV_ROUND_UP(total_cpus, ncpus);
    - ncpus being number of cores

    With the above BIOS setup we get total_cpus == 16, which sets
    __max_logical_packages to 2 (ncpus is 12).
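
    The arithmetic can be checked with a small sketch. DIV_ROUND_UP is
    written out in its usual form; the wrapper function name is
    hypothetical.

```c
/* Integer division rounding up, in the usual kernel form. */
#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Hypothetical wrapper mirroring the computation quoted above. */
unsigned int max_logical_packages(unsigned int total_cpus, unsigned int ncpus)
{
    return DIV_ROUND_UP(total_cpus, ncpus);
}
```

    Here max_logical_packages(16, 12) evaluates to (16 + 11) / 12 = 2,
    although the boot log above shows physical packages 0 to 3 being
    enumerated.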

    Once topology_update_package_map processes CPU with logical
    pkg over 2 we display above messages and fail to initialize
    the physical_to_logical_pkg map, which makes the uncore code
    crash.

    The fix is to remove logical_package_map bitmap completely
    and keep and update the logical_packages number instead.

    After we enumerate all the present CPUs, we check if the
    enumerated logical packages count is within its computed
    maximum from BIOS data.

    If it's not the case, we set this maximum to the new enumerated
    value and freeze any new addition of logical packages.

    The freeze is because a lot of init code like uncore/rapl/cqm
    depends on having maximum logical package value set to allocate
    their data, so we can't change it later on.

    Prarit Bhargava tested the patch and confirms that it solves
    the problem:

    From dmidecode:
    Core Count: 24
    Core Enabled: 24
    Thread Count: 48

    Orig kernel boot log:

    [ 0.464981] smpboot: Max logical packages: 19
    [ 0.469861] smpboot: APIC(0) Converting physical 0 to logical package 0
    [ 0.477261] smpboot: APIC(40) Converting physical 1 to logical package 1
    [ 0.484760] smpboot: APIC(80) Converting physical 2 to logical package 2
    [ 0.492258] smpboot: APIC(c0) Converting physical 3 to logical package 3

    1. nr_cpus=8, should stop enumerating in package 0:

    [ 0.533664] smpboot: APIC(0) Converting physical 0 to logical package 0
    [ 0.539596] smpboot: Max logical packages: 19

    2. maxcpus=8, should still enumerate all packages:

    [ 0.526494] smpboot: APIC(0) Converting physical 0 to logical package 0
    [ 0.532428] smpboot: APIC(40) Converting physical 1 to logical package 1
    [ 0.538456] smpboot: APIC(80) Converting physical 2 to logical package 2
    [ 0.544486] smpboot: APIC(c0) Converting physical 3 to logical package 3
    [ 0.550524] smpboot: Max logical packages: 19

    3. nr_cpus=49 ( 2 socket + 1 core on 3rd socket), should stop enumerating in
    package 2:

    [ 0.521378] smpboot: APIC(0) Converting physical 0 to logical package 0
    [ 0.527314] smpboot: APIC(40) Converting physical 1 to logical package 1
    [ 0.533345] smpboot: APIC(80) Converting physical 2 to logical package 2
    [ 0.539368] smpboot: Max logical packages: 19

    4. maxcpus=49, should still enumerate all packages:

    [ 0.525591] smpboot: APIC(0) Converting physical 0 to logical package 0
    [ 0.531525] smpboot: APIC(40) Converting physical 1 to logical package 1
    [ 0.537547] smpboot: APIC(80) Converting physical 2 to logical package 2
    [ 0.543579] smpboot: APIC(c0) Converting physical 3 to logical package 3
    [ 0.549624] smpboot: Max logical packages: 19

    5. kdump (nr_cpus=1) works as well.

    Reported-by: Frank Ramsay
    Tested-by: Prarit Bhargava
    Signed-off-by: Jiri Olsa
    Reviewed-by: Prarit Bhargava
    Acked-by: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20160815101700.GA30090@krava
    Signed-off-by: Ingo Molnar

    Jiri Olsa
     
  • Similar to:

    efaad554b4ff ("x86/microcode/intel: Fix initrd loading with CONFIG_RANDOMIZE_MEMORY=y")

    ... fix microcode loading from the initrd on AMD by adding the
    randomization offset to the microcode patch container within the initrd.

    Reported-and-tested-by: Brian Gerst
    Signed-off-by: Borislav Petkov
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Kees Cook
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-tip-commits@vger.kernel.org
    Link: http://lkml.kernel.org/r/20160817113314.GA19221@nazgul.tnic
    Signed-off-by: Ingo Molnar

    Borislav Petkov
     
  • * pm-sleep:
    PM / hibernate: Fix rtree_next_node() to avoid walking off list ends
    x86/power/64: Use __pa() for physical address computation
    PM / sleep: Update some system sleep documentation

    Rafael J. Wysocki
     
  • Literal loads of virtual addresses are subject to runtime relocation when
    CONFIG_RELOCATABLE=y, and given that the relocation routines run with the
    MMU and caches enabled, literal loads of relocated values performed with
    the MMU off are not guaranteed to return the latest value unless the
    memory covering the literal is cleaned to the PoC explicitly.

    So defer the literal load until after the MMU has been enabled, just like
    we do for primary_switch() and secondary_switch() in head.S.

    Fixes: 1e48ef7fcc37 ("arm64: add support for building vmlinux as a relocatable PIE binary")
    Cc: # 4.6+
    Signed-off-by: Ard Biesheuvel
    Acked-by: Mark Rutland
    Signed-off-by: Catalin Marinas

    Ard Biesheuvel
     
  • Since asm/acpi.h is only included by linux/acpi.h when CONFIG_ACPI is
    enabled, disabling the latter leads to the following build error on
    arm64:

    arch/arm64/mm/numa.c: In function ‘arm64_numa_init’:
    arch/arm64/mm/numa.c:395:24: error: ‘arm64_acpi_numa_init’ undeclared (first use in this function)
    if (!acpi_disabled && !numa_init(arm64_acpi_numa_init))

    This patch includes asm/acpi.h explicitly in arch/arm64/mm/numa.c for
    the arm64_acpi_numa_init() definition.

    Fixes: d8b47fca8c23 ("arm64, ACPI, NUMA: NUMA support based on SRAT and SLIT")
    Reviewed-by: Hanjun Guo
    Signed-off-by: Catalin Marinas

    Catalin Marinas
     

17 Aug, 2016

2 commits

  • After commit b34f2bc ("arm64: KVM: Make ICC_SRE_EL1 access return the
    configured SRE value") we report the SRE value to 64-bit guests, but for 32-bit
    guests it is still handled as RAZ/WI, which leads to a funny promise we do not keep:

    "GICv3: GIC: unable to set SRE (disabled at EL2), panic ahead"

    Instead, return the actual value of the ICC_SRE_EL1 register that the
    guest should see.

    [ Tweaked commit message - Christoffer ]

    Signed-off-by: Vladimir Murzin
    Acked-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Vladimir Murzin
     
  • The comment about how PMU access is handled has not been relevant since v4.6,
    where proper PMU support was added.

    Signed-off-by: Vladimir Murzin
    Acked-by: Marc Zyngier
    Signed-off-by: Christoffer Dall

    Vladimir Murzin