19 Feb, 2018

1 commit

  • Pull x86 Kconfig fixes from Thomas Gleixner:
    "Three patchlets to correct HIGHMEM64G and CMPXCHG64 dependencies in
    Kconfig when CPU selections are explicitly set to M586 or M686"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/Kconfig: Explicitly enumerate i686-class CPUs in Kconfig
    x86/Kconfig: Exclude i586-class CPUs lacking PAE support from the HIGHMEM64G Kconfig group
    x86/Kconfig: Add missing i586-class CPUs to the X86_CMPXCHG64 Kconfig group

    Linus Torvalds
     

18 Feb, 2018

2 commits

  • Pull powerpc fixes from Michael Ellerman:
    "The main attraction is a fix for a bug in the new drmem code, which
    was causing an oops on boot on some versions of Qemu.

    There's also a fix for XIVE (Power9 interrupt controller) on KVM, as
    well as a few other minor fixes.

    Thanks to: Corentin Labbe, Cyril Bur, Cédric Le Goater, Daniel Black,
    Nathan Fontenot, Nicholas Piggin"

    * tag 'powerpc-4.16-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
    powerpc/pseries: Check for zero filled ibm,dynamic-memory property
    powerpc/pseries: Add empty update_numa_cpu_lookup_table() for NUMA=n
    powerpc/powernv: IMC fix out of bounds memory access at shutdown
    powerpc/xive: Use hw CPU ids when configuring the CPU queues
    powerpc: Expose TSCR via sysfs only on powernv

    Linus Torvalds
     
  • Pull arm64 fixes from Catalin Marinas:
    "The bulk of this is the pte accessors annotation to READ/WRITE_ONCE
    (we tried to avoid pushing this during the merge window to avoid
    conflicts)

    - Updated the page table accessors to use READ/WRITE_ONCE and prevent
    compiler transformation that could lead to an apparent loss of
    coherency

    - Enabled branch predictor hardening for the Falkor CPU

    - Fix interaction between kpti enabling and KASan causing the
    recursive page table walking to take a significant time

    - Fix some sparse warnings"

    * tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
    arm64: cputype: Silence Sparse warnings
    arm64: mm: Use READ_ONCE/WRITE_ONCE when accessing page tables
    arm64: proc: Set PTE_NG for table entries to avoid traversing them twice
    arm64: Add missing Falkor part number for branch predictor hardening

    Linus Torvalds
     

17 Feb, 2018

5 commits

  • The kernel panics on PV domains because native_smp_cpus_done() is
    only called for HVM domains.

    Calculate __max_logical_packages for PV domains.

    Fixes: b4c0a7326f5d ("x86/smpboot: Fix __max_logical_packages estimate")
    Signed-off-by: Prarit Bhargava
    Tested-and-reported-by: Simon Gaiser
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: x86@kernel.org
    Cc: Boris Ostrovsky
    Cc: Juergen Gross
    Cc: Dou Liyang
    Cc: Prarit Bhargava
    Cc: Kate Stewart
    Cc: Greg Kroah-Hartman
    Cc: Andy Lutomirski
    Cc: Andi Kleen
    Cc: Vitaly Kuznetsov
    Cc: xen-devel@lists.xenproject.org
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Juergen Gross

    Prarit Bhargava
     
  • Sparse makes a fair bit of noise about our MPIDR mask being implicitly
    long - let's explicitly describe it as such rather than just relying on
    the value forcing automatic promotion.

    Signed-off-by: Robin Murphy
    Signed-off-by: Catalin Marinas

    Robin Murphy
     
  • Pull dma-mapping fixes from Christoph Hellwig:
    "A few dma-mapping fixes for the fallout from the changes in rc1"

    * tag 'dma-mapping-4.16-2' of git://git.infradead.org/users/hch/dma-mapping:
    powerpc/macio: set a proper dma_coherent_mask
    dma-mapping: fix a comment typo
    dma-direct: comment the dma_direct_free calling convention
    dma-direct: mark as is_phys
    ia64: fix build failure with CONFIG_SWIOTLB

    Linus Torvalds
     
  • In many cases, page tables can be accessed concurrently by either another
    CPU (due to things like fast gup) or by the hardware page table walker
    itself, which may set access/dirty bits. In such cases, it is important
    to use READ_ONCE/WRITE_ONCE when accessing page table entries so that
    entries cannot be torn, merged or subject to apparent loss of coherence
    due to compiler transformations.

    Whilst there are some scenarios where this cannot happen (e.g. pinned
    kernel mappings for the linear region), the overhead of using READ_ONCE
    /WRITE_ONCE everywhere is minimal and makes the code an awful lot easier
    to reason about. This patch consistently uses these macros in the arch
    code, as well as explicitly namespacing pointers to page table entries
    from the entries themselves by adopting a 'p' suffix for the former
    (as is sometimes used elsewhere in the kernel source).

    Tested-by: Yury Norov
    Tested-by: Richard Ruigrok
    Reviewed-by: Marc Zyngier
    Signed-off-by: Will Deacon
    Signed-off-by: Catalin Marinas

    Will Deacon
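
    The access pattern described above can be sketched in user-space C. This
    is a minimal illustration, not the arm64 implementation: the macros are
    simplified stand-ins for the kernel's READ_ONCE/WRITE_ONCE, and the flag
    bit positions and helper names are assumptions chosen for the example.

```c
#include <stdint.h>

typedef uint64_t pte_t;	/* stand-in for a page table entry */

/*
 * Simplified stand-ins for the kernel macros: the volatile access asks the
 * compiler for a single load or store that it may not tear, merge or repeat.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

#define PTE_VALID ((pte_t)1 << 0)	/* illustrative bit positions */
#define PTE_AF    ((pte_t)1 << 10)

/* 'ptep' is the pointer, 'pte' the entry value ('p' suffix convention). */
static int pte_is_young(pte_t *ptep)
{
	pte_t pte = READ_ONCE(*ptep);	/* snapshot once, then test the copy */

	return (pte & PTE_VALID) && (pte & PTE_AF);
}

static void set_pte(pte_t *ptep, pte_t pte)
{
	WRITE_ONCE(*ptep, pte);
}
```

    Testing a local snapshot rather than re-reading *ptep means both flag
    checks see the same value even if another CPU or the hardware walker
    updates the entry in between.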
     
  • Pull MIPS fixes from James Hogan:
    "A few fixes for outstanding MIPS issues:

    - an __init section mismatch warning when brcmstb_pm is enabled

    - a regression handling multiple mem=X@Y arguments (4.11)

    - a USB Kconfig select warning, and related sparc cleanup (4.16)"

    * tag 'mips_fixes_4.16_2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips:
    sparc,leon: Select USB_UHCI_BIG_ENDIAN_{MMIO,DESC}
    usb: Move USB_UHCI_BIG_ENDIAN_* out of USB_SUPPORT
    MIPS: Fix incorrect mem=X@Y handling
    MIPS: BMIPS: Fix section mismatch warning

    Linus Torvalds
     

16 Feb, 2018

5 commits

  • Some versions of QEMU will produce an ibm,dynamic-reconfiguration-memory
    node with an ibm,dynamic-memory property that is zero-filled. This
    causes the drmem code to oops trying to parse this property.

    The fix for this is to validate that the property does contain LMB
    entries before trying to parse it and bail if the count is zero.

    Oops: Kernel access of bad area, sig: 11 [#1]
    DAR: 0000000000000010
    NIP read_drconf_v1_cell+0x54/0x9c
    LR read_drconf_v1_cell+0x48/0x9c
    Call Trace:
    __param_initcall_debug+0x0/0x28 (unreliable)
    drmem_init+0x144/0x2f8
    do_one_initcall+0x64/0x1d0
    kernel_init_freeable+0x298/0x38c
    kernel_init+0x24/0x160
    ret_from_kernel_thread+0x5c/0xb4

    The ibm,dynamic-reconfiguration-memory device tree property generated
    that causes this:

    ibm,dynamic-reconfiguration-memory {
    ibm,lmb-size = ;
    ibm,memory-flags-mask = ;
    ibm,dynamic-memory = ;
    linux,phandle = ;
    ibm,associativity-lookup-arrays = ;
    ibm,memory-preservation-time = ;
    };

    Signed-off-by: Nathan Fontenot
    Reviewed-by: Cyril Bur
    Tested-by: Daniel Black
    [mpe: Trim oops report]
    Signed-off-by: Michael Ellerman

    Nathan Fontenot
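
    The validation step can be sketched as follows. This is a hedged
    user-space illustration, not the drmem code: it assumes the first
    big-endian cell of the property holds the LMB count, and the helper
    names lmb_count() and check_dynamic_memory() are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Read a big-endian 32-bit cell, as device tree properties store them. */
static uint32_t read_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Hypothetical helper: number of LMB entries announced by the property,
 * or 0 when the property is absent, too short, or zero-filled.
 */
static uint32_t lmb_count(const uint8_t *prop, size_t len)
{
	if (prop == NULL || len < 4)
		return 0;
	return read_be32(prop);
}

/* Returns 0 when entries may be parsed, -1 when parsing must be skipped. */
static int check_dynamic_memory(const uint8_t *prop, size_t len)
{
	return lmb_count(prop, len) == 0 ? -1 : 0;
}
```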
     
  • The X86_P6_NOP config class leaves out many i686-class CPUs. Instead,
    explicitly enumerate all these CPUs.

    Using a configuration with M686 currently sets X86_MINIMUM_CPU_FAMILY=5
    instead of the correct value of 6.

    Booting on an i586 it will fail to generate the "This kernel
    requires an i686 CPU, but only detected an i586 CPU" message and
    intentional halt as expected. It will instead just silently hang
    when it hits i686-specific instructions.

    Signed-off-by: Matthew Whitehead
    Cc: Andy Lutomirski
    Cc: Arjan van de Ven
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1518713696-11360-3-git-send-email-tedheadster@gmail.com
    Signed-off-by: Ingo Molnar

    Matthew Whitehead
     
  • i586-class machines also lack support for Physical Address Extension (PAE),
    so add them to the exclusion list.

    Signed-off-by: Matthew Whitehead
    Cc: Andy Lutomirski
    Cc: Arjan van de Ven
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1518713696-11360-2-git-send-email-tedheadster@gmail.com
    Signed-off-by: Ingo Molnar

    Matthew Whitehead
     
  • Several i586-class CPUs supporting this instruction are missing from
    the X86_CMPXCHG64 config group.

    Using a configuration with either M586TSC or M586MMX currently sets
    X86_MINIMUM_CPU_FAMILY=4 instead of the correct value of 5.

    Booting on an i486 it will fail to generate the "This kernel
    requires an i586 CPU, but only detected an i486 CPU" message and
    intentional halt as expected. It will instead just silently hang
    when it hits i586-specific instructions.

    The M586 CPU is not in this list because at least the Cyrix 5x86
    lacks this instruction, and perhaps others.

    Signed-off-by: Matthew Whitehead
    Cc: Andy Lutomirski
    Cc: Arjan van de Ven
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/1518713696-11360-1-git-send-email-tedheadster@gmail.com
    Signed-off-by: Ingo Molnar

    Matthew Whitehead
     
  • Now that USB_UHCI_BIG_ENDIAN_MMIO and USB_UHCI_BIG_ENDIAN_DESC are moved
    outside of the USB_SUPPORT conditional, simply select them from
    SPARC_LEON rather than by the symbol's defaults in drivers/usb/Kconfig,
    similar to how it is done for USB_EHCI_BIG_ENDIAN_MMIO and
    USB_EHCI_BIG_ENDIAN_DESC.

    Signed-off-by: James Hogan
    Cc: "David S. Miller"
    Cc: Greg Kroah-Hartman
    Cc: Corentin Labbe
    Cc: sparclinux@vger.kernel.org
    Cc: linux-usb@vger.kernel.org
    Acked-by: David S. Miller
    Patchwork: https://patchwork.linux-mips.org/patch/18560/

    James Hogan
     

15 Feb, 2018

17 commits

  • Pull x86 fixes from Ingo Molnar:
    "Misc fixes all across the map:

    - /proc/kcore vsyscall related fixes
    - LTO fix
    - build warning fix
    - CPU hotplug fix
    - Kconfig NR_CPUS cleanups
    - cpu_has() cleanups/robustification
    - .gitignore fix
    - memory-failure unmapping fix
    - UV platform fix"

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages
    x86/error_inject: Make just_return_func() globally visible
    x86/platform/UV: Fix GAM Range Table entries less than 1GB
    x86/build: Add arch/x86/tools/insn_decoder_test to .gitignore
    x86/smpboot: Fix uncore_pci_remove() indexing bug when hot-removing a physical CPU
    x86/mm/kcore: Add vsyscall page to /proc/kcore conditionally
    vfs/proc/kcore, x86/mm/kcore: Fix SMAP fault when dumping vsyscall user page
    x86/Kconfig: Further simplify the NR_CPUS config
    x86/Kconfig: Simplify NR_CPUS config
    x86/MCE: Fix build warning introduced by "x86: do not use print_symbol()"
    x86/cpufeature: Update _static_cpu_has() to use all named variables
    x86/cpufeature: Reindent _static_cpu_has()

    Linus Torvalds
     
  • Pull x86 PTI and Spectre related fixes and updates from Ingo Molnar:
    "Here's the latest set of Spectre and PTI related fixes and updates:

    Spectre:
    - Add entry code register clearing to reduce the Spectre attack
    surface
    - Update the Spectre microcode blacklist
    - Inline the KVM Spectre helpers to get close to v4.14 performance
    again.
    - Fix indirect_branch_prediction_barrier()
    - Fix/improve Spectre related kernel messages
    - Fix array_index_nospec_mask() asm constraint
    - KVM: fix two MSR handling bugs

    PTI:
    - Fix a paranoid entry PTI CR3 handling bug
    - Fix comments

    objtool:
    - Fix paranoid_entry() frame pointer warning
    - Annotate WARN()-related UD2 as reachable
    - Various fixes
    - Add Peter Zijlstra as objtool co-maintainer

    Misc:
    - Various x86 entry code self-test fixes
    - Improve/simplify entry code stack frame generation and handling
    after recent heavy-handed PTI and Spectre changes. (There's two
    more WIP improvements expected here.)
    - Type fix for cache entries

    There's also some low risk non-fix changes I've included in this
    branch to reduce backporting conflicts:

    - rename a confusing x86_cpu field name
    - de-obfuscate the naming of single-TLB flushing primitives"

    * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (41 commits)
    x86/entry/64: Fix CR3 restore in paranoid_exit()
    x86/cpu: Change type of x86_cache_size variable to unsigned int
    x86/spectre: Fix an error message
    x86/cpu: Rename cpu_data.x86_mask to cpu_data.x86_stepping
    selftests/x86/mpx: Fix incorrect bounds with old _sigfault
    x86/mm: Rename flush_tlb_single() and flush_tlb_one() to __flush_tlb_one_[user|kernel]()
    x86/speculation: Add dependency
    nospec: Move array_index_nospec() parameter checking into separate macro
    x86/speculation: Fix up array_index_nospec_mask() asm constraint
    x86/debug: Use UD2 for WARN()
    x86/debug, objtool: Annotate WARN()-related UD2 as reachable
    objtool: Fix segfault in ignore_unreachable_insn()
    selftests/x86: Disable tests requiring 32-bit support on pure 64-bit systems
    selftests/x86: Do not rely on "int $0x80" in single_step_syscall.c
    selftests/x86: Do not rely on "int $0x80" in test_mremap_vdso.c
    selftests/x86: Fix build bug caused by the 5lvl test which has been moved to the VM directory
    selftests/x86/pkeys: Remove unused functions
    selftests/x86: Clean up and document sscanf() usage
    selftests/x86: Fix vDSO selftest segfault for vsyscall=none
    x86/entry/64: Remove the unused 'icebp' macro
    ...

    Linus Torvalds
     
  • Josh Poimboeuf noticed the following bug:

    "The paranoid exit code only restores the saved CR3 when it switches back
    to the user GS. However, even in the kernel GS case, it's possible that
    it needs to restore a user CR3, if for example, the paranoid exception
    occurred in the syscall exit path between SWITCH_TO_USER_CR3_STACK and
    SWAPGS."

    Josh also confirmed via targeted testing that it's possible to hit this bug.

    Fix the bug by also restoring CR3 in the paranoid_exit_no_swapgs branch.

    The reason we haven't seen this bug reported by users yet is probably because
    "paranoid" entry points are limited to the following cases:

    idtentry double_fault do_double_fault has_error_code=1 paranoid=2
    idtentry debug do_debug has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK
    idtentry int3 do_int3 has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK
    idtentry machine_check do_mce has_error_code=0 paranoid=1

    Amongst those entry points only machine_check is one that will interrupt an
    IRQS-off critical section asynchronously - and machine check events are rare.

    The other main asynchronous entries are NMI entries, which can be very high-freq
    with perf profiling, but they are special: they don't use the 'idtentry' macro but
    are open coded and restore user CR3 unconditionally so don't have this bug.

    Reported-and-tested-by: Josh Poimboeuf
    Reviewed-by: Andy Lutomirski
    Acked-by: Thomas Gleixner
    Cc: Arjan van de Ven
    Cc: Borislav Petkov
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: David Woodhouse
    Cc: Greg Kroah-Hartman
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20180214073910.boevmg65upbk3vqb@gmail.com
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Currently, x86_cache_size is of type int, which makes no sense as we
    will never have a valid cache size equal to or less than 0. So instead of
    initializing this variable to -1, initialize it to 0 and treat it as an
    unsigned variable.

    Suggested-by: Thomas Gleixner
    Signed-off-by: Gustavo A. R. Silva
    Cc: Borislav Petkov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Addresses-Coverity-ID: 1464429
    Link: http://lkml.kernel.org/r/20180213192208.GA26414@embeddedor.com
    Signed-off-by: Ingo Molnar

    Gustavo A. R. Silva
     
  • If i == ARRAY_SIZE(mitigation_options) then we accidentally print
    garbage from one space beyond the end of the mitigation_options[] array.

    Signed-off-by: Dan Carpenter
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: David Woodhouse
    Cc: Greg Kroah-Hartman
    Cc: KarimAllah Ahmed
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: kernel-janitors@vger.kernel.org
    Fixes: 9005c6834c0f ("x86/spectre: Simplify spectre_v2 command line parsing")
    Link: http://lkml.kernel.org/r/20180214071416.GA26677@mwanda
    Signed-off-by: Ingo Molnar

    Dan Carpenter
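
    The class of bug is easy to reproduce outside the kernel. The sketch
    below is an illustration only: the table contents and find_option() are
    hypothetical, and it shows the bounds check that has to happen before
    the loop index is used again after a failed search.

```c
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

struct option {
	const char *name;
	int secure;
};

/* Hypothetical option table standing in for mitigation_options[]. */
static const struct option options[] = {
	{ "off", 0 }, { "on", 1 }, { "auto", 0 },
};

/* Return the matching entry, or NULL when arg is not in the table. */
static const struct option *find_option(const char *arg)
{
	size_t i;

	for (i = 0; i < ARRAY_SIZE(options); i++)
		if (strcmp(arg, options[i].name) == 0)
			break;

	/*
	 * After a failed search i == ARRAY_SIZE(options), so options[i] is
	 * one element past the end and must not be read or printed.
	 */
	if (i >= ARRAY_SIZE(options))
		return NULL;
	return &options[i];
}
```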
     
  • x86_mask is a confusing name which is hard to associate with the
    processor's stepping.

    Additionally, correct an indent issue in lib/cpu.c.

    Signed-off-by: Jia Zhang
    [ Updated it to more recent kernels. ]
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: bp@alien8.de
    Cc: tony.luck@intel.com
    Link: http://lkml.kernel.org/r/1514771530-70829-1-git-send-email-qianyue.zj@alibaba-inc.com
    Signed-off-by: Ingo Molnar

    Jia Zhang
     
  • flush_tlb_single() and flush_tlb_one() sound almost identical, but
    they really mean "flush one user translation" and "flush one kernel
    translation". Rename them to flush_tlb_one_user() and
    flush_tlb_one_kernel() to make the semantics more obvious.

    [ I was looking at some PTI-related code, and the flush-one-address code
    is unnecessarily hard to understand because the names of the helpers are
    uninformative. This came up during PTI review, but no one got around to
    doing it. ]

    Signed-off-by: Andy Lutomirski
    Acked-by: Peter Zijlstra (Intel)
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: Eduardo Valentin
    Cc: Hugh Dickins
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Kees Cook
    Cc: Linus Torvalds
    Cc: Linux-MM
    Cc: Rik van Riel
    Cc: Thomas Gleixner
    Cc: Will Deacon
    Link: http://lkml.kernel.org/r/3303b02e3c3d049dc5235d5651e0ae6d29a34354.1517414378.git.luto@kernel.org
    Signed-off-by: Ingo Molnar

    Andy Lutomirski
     
  • Joe Konno reported a compile failure resulting from using an MSR
    without inclusion of the header that defines it, and while the current
    code builds fine (by accident) this needs fixing for future patches.

    Reported-by: Joe Konno
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: arjan@linux.intel.com
    Cc: bp@alien8.de
    Cc: dan.j.williams@intel.com
    Cc: dave.hansen@linux.intel.com
    Cc: dwmw2@infradead.org
    Cc: dwmw@amazon.co.uk
    Cc: gregkh@linuxfoundation.org
    Cc: hpa@zytor.com
    Cc: jpoimboe@redhat.com
    Cc: linux-tip-commits@vger.kernel.org
    Cc: luto@kernel.org
    Fixes: 20ffa1caecca ("x86/speculation: Add basic IBPB (Indirect Branch Prediction Barrier) support")
    Link: http://lkml.kernel.org/r/20180213132819.GJ25201@hirez.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • Allow the compiler to handle @size as an immediate value or memory
    directly rather than allocating a register.

    Reported-by: Linus Torvalds
    Signed-off-by: Dan Williams
    Cc: Andy Lutomirski
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/151797010204.1289.1510000292250184993.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Ingo Molnar

    Dan Williams
     
  • Since the Intel SDM added a ModR/M byte to UD0 and binutils followed
    that specification, we now cannot disassemble our kernel anymore.

    This now means Intel and AMD disagree on the encoding of UD0. And instead
    of playing games with additional bytes that are valid ModR/M and single
    byte instructions (0xd6 for instance), simply use UD2 for both WARN() and
    BUG().

    Requested-by: Linus Torvalds
    Signed-off-by: Peter Zijlstra (Intel)
    Acked-by: Linus Torvalds
    Cc: Andy Lutomirski
    Cc: Arjan van de Ven
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20180208194406.GD25181@hirez.programming.kicks-ass.net
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • By default, objtool assumes that a UD2 is a dead end. This is mainly
    because GCC 7+ sometimes inserts a UD2 when it detects a divide-by-zero
    condition.

    Now that WARN() is moving back to UD2, annotate the code after it as
    reachable so objtool can follow the code flow.

    Reported-by: Borislav Petkov
    Signed-off-by: Josh Poimboeuf
    Signed-off-by: Peter Zijlstra (Intel)
    Cc: Andy Lutomirski
    Cc: Arjan van de Ven
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: kbuild test robot
    Link: http://lkml.kernel.org/r/0e483379275a42626ba8898117f918e1bf661e40.1518130694.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     
  • When CONFIG_NUMA is not set, the build fails with:

    arch/powerpc/platforms/pseries/hotplug-cpu.c:335:4:
    error: implicit declaration of function 'update_numa_cpu_lookup_table'

    So we have to add update_numa_cpu_lookup_table() as an empty function
    when CONFIG_NUMA is not set.

    Fixes: 1d9a090783be ("powerpc/numa: Invalidate numa_cpu_lookup_table on cpu remove")
    Signed-off-by: Corentin Labbe
    Signed-off-by: Michael Ellerman

    Corentin Labbe
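
    The usual shape of such a stub is sketched below. It is an illustration
    under assumptions: the (cpu, node) signature is assumed rather than
    taken from this log, and remove_cpu() is a hypothetical caller standing
    in for the hot-remove path.

```c
/*
 * CONFIG_NUMA is normally provided by the build system; it is left
 * undefined here so that the stub branch is the one being compiled.
 */
#ifdef CONFIG_NUMA
void update_numa_cpu_lookup_table(unsigned int cpu, int node);
#else
/* Empty stub so callers compile unchanged when NUMA support is off. */
static inline void update_numa_cpu_lookup_table(unsigned int cpu, int node)
{
	(void)cpu;
	(void)node;
}
#endif

/* Hypothetical caller standing in for the CPU hot-remove path. */
static int remove_cpu(unsigned int cpu)
{
	update_numa_cpu_lookup_table(cpu, -1);
	return 0;
}
```

    Keeping the stub in the header avoids sprinkling #ifdef CONFIG_NUMA
    around every call site.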
     
  • The OPAL IMC driver's shutdown handler disables nest PMU counters by
    walking nodes and taking the first CPU out of their cpumask, which is
    used to index into the paca (get_hard_smp_processor_id()). This does
    not always do the right thing, and in particular for CPU-less nodes it
    returns NR_CPUS and that overruns the paca and dereferences random
    memory.

    Fix it by being more careful about checking returned CPU, and only
    using online CPUs. It's not clear this shutdown code makes sense after
    commit 885dcd709b ("powerpc/perf: Add nest IMC PMU support"), but this
    should not make things worse.

    Currently the bug causes us to call OPAL with a junk CPU number. A
    separate patch in development to change the way pacas are allocated
    escalates this bug into a crash:

    Unable to handle kernel paging request for data at address 0x2a21af1eeb000076
    Faulting instruction address: 0xc0000000000a5468
    Oops: Kernel access of bad area, sig: 11 [#1]
    ...
    NIP opal_imc_counters_shutdown+0x148/0x1d0
    LR opal_imc_counters_shutdown+0x134/0x1d0
    Call Trace:
    opal_imc_counters_shutdown+0x134/0x1d0 (unreliable)
    platform_drv_shutdown+0x44/0x60
    device_shutdown+0x1f8/0x350
    kernel_restart_prepare+0x54/0x70
    kernel_restart+0x28/0xc0
    SyS_reboot+0x1d0/0x2c0
    system_call+0x58/0x6c

    Signed-off-by: Nicholas Piggin
    Signed-off-by: Michael Ellerman

    Nicholas Piggin
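
    The failure mode and the check can be sketched in plain C. This is not
    the OPAL IMC driver code: the 8-CPU mask, the hw_id[] table and
    node_first_hw_id() are hypothetical, and only the "empty mask yields
    NR_CPUS" convention is taken from the description above.

```c
#include <stdint.h>

#define NR_CPUS 8

/* Return the lowest CPU set in mask, or NR_CPUS when the mask is empty. */
static unsigned int first_cpu(uint32_t mask)
{
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return NR_CPUS;
}

/* Hypothetical per-CPU table standing in for the paca array. */
static const int hw_id[NR_CPUS] = { 0, 4, 8, 12, 16, 20, 24, 28 };

/* Hardware id of the first online CPU in the node, or -1 if there is none. */
static int node_first_hw_id(uint32_t node_mask, uint32_t online_mask)
{
	unsigned int cpu = first_cpu(node_mask & online_mask);

	if (cpu >= NR_CPUS)	/* CPU-less node: do not index the table */
		return -1;
	return hw_id[cpu];
}
```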
     
  • The CPU event notification queues on sPAPR should be configured using
    a hardware CPU identifier.

    The problem did not show up on the Power Hypervisor because pHyp
    supports 8 threads per core, which keeps CPU numbers contiguous. This is
    not the case on all sPAPR virtual machines; some use SMT=1.

    Also improve error logging by adding the CPU number.

    Fixes: eac1e731b59e ("powerpc/xive: guest exploitation of the XIVE interrupt controller")
    Cc: stable@vger.kernel.org # v4.14+
    Signed-off-by: Cédric Le Goater
    Signed-off-by: Michael Ellerman

    Cédric Le Goater
     
  • The TSCR can only be accessed in hypervisor mode.

    Fixes: 88b5e12eeb11 ("powerpc: Expose TSCR via sysfs")
    Signed-off-by: Cyril Bur
    Signed-off-by: Michael Ellerman

    Cyril Bur
     
  • When KASAN is enabled, the swapper page table contains many identical
    mappings of the zero page, which can lead to a stall during boot whilst
    the G -> nG code continually walks the same page table entries looking
    for global mappings.

    This patch sets the nG bit (bit 11, which is IGNORED) in table entries
    after processing the subtree so we can easily skip them if we see them
    a second time.

    Tested-by: Mark Rutland
    Signed-off-by: Will Deacon
    Signed-off-by: Catalin Marinas

    Will Deacon
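
    The effect of marking entries can be seen in a small model. The sketch
    below is an assumption-laden illustration, not the arm64 assembly: the
    table layout, the DONE_BIT marker and walk() are hypothetical, and the
    shared tables mimic the way many entries point at the same lower-level
    table.

```c
#include <stddef.h>
#include <stdint.h>

#define ENTRIES  4
#define DONE_BIT (1u << 11)	/* hypothetical "already processed" marker */

struct table {
	uint32_t flags[ENTRIES];
	struct table *next[ENTRIES];	/* NULL for leaf entries */
};

/* Walk the tree and return how many entries were actually processed. */
static unsigned long walk(struct table *t)
{
	unsigned long processed = 0;

	for (size_t i = 0; i < ENTRIES; i++) {
		if (t->flags[i] & DONE_BIT)
			continue;	/* reached before via another path */
		processed++;
		if (t->next[i])
			processed += walk(t->next[i]);
		t->flags[i] |= DONE_BIT;	/* mark once the subtree is done */
	}
	return processed;
}
```

    With three levels of four entries that all share one table per level,
    the marked walk processes 12 entries, where an unmarked walk would
    process 4 + 16 + 64 = 84.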
     
  • Pull powerpc fixes from Michael Ellerman:
    "A larger batch of fixes than we'd like. Roughly 1/3 fixes for new
    code, 1/3 fixes for stable and 1/3 minor things.

    There's four commits fixing bugs when using 16GB huge pages on hash,
    caused by some of the preparatory changes for pkeys.

    Two fixes for bugs in the enhanced IRQ soft masking for local_t, one
    of which broke KVM in some circumstances.

    Four fixes for Power9. The most bizarre being a bug where futexes
    stopped working because a NULL pointer dereference didn't trap during
    early boot (it aliased the kernel mapping). A fix for memory hotplug
    when using the Radix MMU, and a fix for live migration of guests using
    the Radix MMU.

    Two fixes for hotplug on pseries machines. One where we weren't
    correctly updating NUMA info when CPUs are added and removed. And the
    other fixes crashes/hangs seen when doing memory hot remove during
    boot, which is apparently a thing people do.

    Finally a handful of build fixes for obscure configs and other minor
    fixes.

    Thanks to: Alexey Kardashevskiy, Aneesh Kumar K.V, Balbir Singh, Colin
    Ian King, Daniel Henrique Barboza, Florian Weimer, Guenter Roeck,
    Harish, Laurent Vivier, Madhavan Srinivasan, Mauricio Faria de
    Oliveira, Nathan Fontenot, Nicholas Piggin, Sam Bobroff"

    * tag 'powerpc-4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
    selftests/powerpc: Fix to use ucontext_t instead of struct ucontext
    powerpc/kdump: Fix powernv build break when KEXEC_CORE=n
    powerpc/pseries: Fix build break for SPLPAR=n and CPU hotplug
    powerpc/mm/hash64: Zero PGD pages on allocation
    powerpc/mm/hash64: Store the slot information at the right offset for hugetlb
    powerpc/mm/hash64: Allocate larger PMD table if hugetlb config is enabled
    powerpc/mm: Fix crashes with 16G huge pages
    powerpc/mm: Flush radix process translations when setting MMU type
    powerpc/vas: Don't set uses_vas for kernel windows
    powerpc/pseries: Enable RAS hotplug events later
    powerpc/mm/radix: Split linear mapping on hot-unplug
    powerpc/64s/radix: Boot-time NULL pointer protection using a guard-PID
    ocxl: fix signed comparison with less than zero
    powerpc/64s: Fix may_hard_irq_enable() for PMI soft masking
    powerpc/64s: Fix MASKABLE_RELON_EXCEPTION_HV_OOL macro
    powerpc/numa: Invalidate numa_cpu_lookup_table on cpu remove

    Linus Torvalds
     

14 Feb, 2018

1 commit

  • Pull MIPS fix from James Hogan:
    "A single change (and associated DT binding update) to allow the
    address of the MIPS Cluster Power Controller (CPC) to be chosen by DT,
    which allows SMP to work on generic MIPS kernels where the bootloader
    hasn't configured the CPC address (i.e. the new Ranchu platform)"

    * tag 'mips_4.16_2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips:
    MIPS: CPC: Map registers using DT in mips_cpc_default_phys_base()
    dt-bindings: Document mti,mips-cpc binding

    Linus Torvalds
     

13 Feb, 2018

9 commits

  • In the following commit:

    ce0fa3e56ad2 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages")

    ... we added code to memory_failure() to unmap the page from the
    kernel 1:1 virtual address space to avoid speculative access to the
    page logging additional errors.

    But memory_failure() may not always succeed in taking the page offline,
    especially if the page belongs to the kernel. This can happen if
    there are too many corrected errors on a page and either mcelog(8)
    or drivers/ras/cec.c asks to take a page offline.

    Since we remove the 1:1 mapping early in memory_failure(), we can
    end up with the page unmapped, but still in use. On the next access
    the kernel crashes :-(

    There are also various debug paths that call memory_failure() to simulate
    occurrence of an error. Since there is no actual error in memory, we
    don't need to map out the page for those cases.

    Revert most of the previous attempt and keep the solution local to
    arch/x86/kernel/cpu/mcheck/mce.c. Unmap the page only when:

    1) there is a real error
    2) memory_failure() succeeds.

    All of this only applies to 64-bit systems. 32-bit kernel doesn't map
    all of memory into kernel space. It isn't worth adding the code to unmap
    the piece that is mapped because nobody would run a 32-bit kernel on a
    machine that has recoverable machine checks.

    Signed-off-by: Tony Luck
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave
    Cc: Denys Vlasenko
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Naoya Horiguchi
    Cc: Peter Zijlstra
    Cc: Robert (Persistent Memory)
    Cc: Thomas Gleixner
    Cc: linux-mm@kvack.org
    Cc: stable@vger.kernel.org #v4.14
    Fixes: ce0fa3e56ad2 ("x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages")
    Signed-off-by: Ingo Molnar

    Tony Luck
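
    The two conditions listed above amount to a small piece of decision
    logic. The sketch below models it in user-space C; offline_page(),
    unmap_from_direct_map() and handle_page_error() are hypothetical
    stand-ins, not the functions in mce.c.

```c
#include <stdbool.h>

#define NO_PFN (~0UL)

static unsigned long last_unmapped = NO_PFN;	/* records the stub's effect */

/* Hypothetical stand-in: offlining fails for pages the kernel still uses. */
static int offline_page(unsigned long pfn, bool kernel_in_use)
{
	(void)pfn;
	return kernel_in_use ? -1 : 0;
}

/* Hypothetical stand-in for clearing the 1:1 mapping of a page. */
static void unmap_from_direct_map(unsigned long pfn)
{
	last_unmapped = pfn;
}

/* Unmap only for a real error, and only after offlining succeeded. */
static bool handle_page_error(unsigned long pfn, bool real_error,
			      bool kernel_in_use)
{
	if (offline_page(pfn, kernel_in_use) != 0)
		return false;	/* page still in use: keep it mapped */
	if (real_error)
		unmap_from_direct_map(pfn);
	return true;
}
```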
     
  • With link time optimizations enabled, I get a link failure:

    ./ccLbOEHX.ltrans19.ltrans.o: In function `override_function_with_return':
    :(.text+0x7f3): undefined reference to `just_return_func'

    Marking the symbol .globl makes it work as expected.

    Signed-off-by: Arnd Bergmann
    Acked-by: Masami Hiramatsu
    Acked-by: Thomas Gleixner
    Cc: Alexei Starovoitov
    Cc: Josef Bacik
    Cc: Linus Torvalds
    Cc: Nicolas Pitre
    Cc: Peter Zijlstra
    Fixes: 540adea3809f ("error-injection: Separate error-injection from kprobe")
    Link: http://lkml.kernel.org/r/20180202145634.200291-3-arnd@arndb.de
    Signed-off-by: Ingo Molnar

    Arnd Bergmann
     
  • The latest UV platforms include the new ApachePass NVDIMMs into the
    UV address space. This has introduced address ranges in the Global
    Address Map Table that are less than the previous lowest range, which
    was 2GB. Fix the address calculation so it accommodates address ranges
    from bytes to exabytes.

    Signed-off-by: Mike Travis
    Reviewed-by: Andrew Banman
    Reviewed-by: Dimitri Sivanich
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Russ Anderson
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20180205221503.190219903@stormcage.americas.sgi.com
    Signed-off-by: Ingo Molnar

    mike.travis@hpe.com
     
  • Commit 73fbc1eba7ff ("MIPS: fix mem=X@Y commandline processing") added a
    fix to ensure that the memory range between PHYS_OFFSET and low memory
    address specified by mem= cmdline argument is not later processed by
    free_all_bootmem. This change was incorrect for systems where the
    commandline specifies more than 1 mem argument, as it will cause all
    memory between PHYS_OFFSET and each of the memory offsets to be marked
    as reserved, which results in parts of the RAM marked as reserved
    (Creator CI20's u-boot has a default commandline argument 'mem=256M@0x0
    mem=768M@0x30000000').

    Change the behaviour to ensure that only the range between PHYS_OFFSET
    and the lowest start address of the memories is marked as protected.

    This change also ensures that the range is marked protected even if it's
    only defined through the devicetree and not only via commandline
    arguments.

    Reported-by: Mathieu Malaterre
    Signed-off-by: Marcin Nowakowski
    Fixes: 73fbc1eba7ff ("MIPS: fix mem=X@Y commandline processing")
    Cc: Ralf Baechle
    Cc: linux-mips@linux-mips.org
    Cc: # v4.11+
    Tested-by: Mathieu Malaterre
    Patchwork: https://patchwork.linux-mips.org/patch/18562/
    Signed-off-by: James Hogan

    Marcin Nowakowski
     
  • The file was generated by the make command and should not be in the source tree.

    Signed-off-by: Progyan Bhattacharya
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Progyan Bhattacharya
     
  • Remove the __init annotation from bmips_cpu_setup() to avoid the
    following warning.

    WARNING: vmlinux.o(.text+0x35c950): Section mismatch in reference from the function brcmstb_pm_s3() to the function .init.text:bmips_cpu_setup()
    The function brcmstb_pm_s3() references
    the function __init bmips_cpu_setup().
    This is often because brcmstb_pm_s3 lacks a __init
    annotation or the annotation of bmips_cpu_setup is wrong.

    Signed-off-by: Jaedon Shin
    Cc: Ralf Baechle
    Cc: Florian Fainelli
    Cc: Kevin Cernekee
    Cc: linux-mips@linux-mips.org
    Reviewed-by: James Hogan
    Reviewed-by: Florian Fainelli
    Patchwork: https://patchwork.linux-mips.org/patch/18589/
    Signed-off-by: James Hogan

    Jaedon Shin
     
  • When a physical CPU is hot-removed, the following warning messages
    are shown while the uncore device is removed in uncore_pci_remove():

    WARNING: CPU: 120 PID: 5 at arch/x86/events/intel/uncore.c:988
    uncore_pci_remove+0xf1/0x110
    ...
    CPU: 120 PID: 5 Comm: kworker/u1024:0 Not tainted 4.15.0-rc8 #1
    Workqueue: kacpi_hotplug acpi_hotplug_work_fn
    ...
    Call Trace:
    pci_device_remove+0x36/0xb0
    device_release_driver_internal+0x145/0x210
    pci_stop_bus_device+0x76/0xa0
    pci_stop_root_bus+0x44/0x60
    acpi_pci_root_remove+0x1f/0x80
    acpi_bus_trim+0x54/0x90
    acpi_bus_trim+0x2e/0x90
    acpi_device_hotplug+0x2bc/0x4b0
    acpi_hotplug_work_fn+0x1a/0x30
    process_one_work+0x141/0x340
    worker_thread+0x47/0x3e0
    kthread+0xf5/0x130

    When uncore_pci_remove() runs, it tries to get the package ID to
    clear the value of uncore_extra_pci_dev[].dev[] by using
    topology_phys_to_logical_pkg(). The warning messages are
    shown because topology_phys_to_logical_pkg() returns -1.

    arch/x86/events/intel/uncore.c:
    static void uncore_pci_remove(struct pci_dev *pdev)
    {
    ...
            phys_id = uncore_pcibus_to_physid(pdev->bus);
    ...
            pkg = topology_phys_to_logical_pkg(phys_id); // returns -1
            for (i = 0; i < UNCORE_EXTRA_PCI_DEV_MAX; i++) {
                    if (uncore_extra_pci_dev[pkg].dev[i] == pdev) {
                            uncore_extra_pci_dev[pkg].dev[i] = NULL;
                            break;
                    }
            }
            WARN_ON_ONCE(i >= UNCORE_EXTRA_PCI_DEV_MAX); // the warning fires here

    topology_phys_to_logical_pkg() looks for a CPU whose phys_proc_id
    matches the phys_pkg argument.

    arch/x86/kernel/smpboot.c:
    int topology_phys_to_logical_pkg(unsigned int phys_pkg)
    {
            int cpu;

            for_each_possible_cpu(cpu) {
                    struct cpuinfo_x86 *c = &cpu_data(cpu);

                    if (c->initialized && c->phys_proc_id == phys_pkg)
                            return c->logical_proc_id;
            }
            return -1;
    }

    However, the phys_proc_id was already set to 0 by remove_siblinginfo()
    when the CPU was offlined.

    So, topology_phys_to_logical_pkg() cannot find the correct
    logical_proc_id and always returns -1.

    As a result, uncore_pci_remove() calls WARN_ON_ONCE() and the warning
    messages are shown.

    What is worse is that the bogus 'pkg' index results in two bugs:

    - We dereference uncore_extra_pci_dev[] with a negative index
    - We fail to clean up a stale pointer in uncore_extra_pci_dev[][]

    To fix these bugs, remove the clearing of ->phys_proc_id from remove_siblinginfo().

    This should not cause any problems, because ->phys_proc_id is not
    used after the CPU is hot-removed and is set again when it is hot-added.
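
    The failure mode can be reproduced with a small userspace model. This is
    a sketch under stated assumptions, not kernel code: struct cpu_model,
    phys_to_logical_pkg(), offline_clearing_id() and offline_keeping_id() are
    hypothetical names, and the model assumes the CPU entry stays marked
    initialized after it goes offline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the struct cpuinfo_x86 fields used in the lookup. */
struct cpu_model {
	bool initialized;
	unsigned int phys_proc_id;
	int logical_proc_id;
};

/* Same search as topology_phys_to_logical_pkg(), over a plain array. */
static int phys_to_logical_pkg(const struct cpu_model *cpus, size_t n,
			       unsigned int phys_pkg)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (cpus[i].initialized && cpus[i].phys_proc_id == phys_pkg)
			return cpus[i].logical_proc_id;
	return -1;
}

/* Models the old offline path: the physical package id is wiped. */
static void offline_clearing_id(struct cpu_model *c)
{
	c->phys_proc_id = 0;
}

/* Models the fixed offline path: phys_proc_id is left untouched. */
static void offline_keeping_id(struct cpu_model *c)
{
	(void)c;
}
```

    In this model, once the only CPU of physical package 3 has gone through
    offline_clearing_id(), a lookup for package 3 returns -1, which is the
    value that ends up being used as an array index in uncore_pci_remove().
    With offline_keeping_id() the same lookup still finds the logical id.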

    Signed-off-by: Masayoshi Mizuma
    Acked-by: Thomas Gleixner
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: yasu.isimatu@gmail.com
    Cc:
    Fixes: 30bb9811856f ("x86/topology: Avoid wasting 128k for package id array")
    Link: http://lkml.kernel.org/r/ed738d54-0f01-b38b-b794-c31dc118c207@gmail.com
    Signed-off-by: Ingo Molnar

    Masayoshi Mizuma
     
  • If KEXEC_CORE is not enabled, powernv builds fail as follows.

    arch/powerpc/platforms/powernv/smp.c: In function 'pnv_smp_cpu_kill_self':
    arch/powerpc/platforms/powernv/smp.c:236:4: error:
    implicit declaration of function 'crash_ipi_callback'

    Add dummy function calls, similar to kdump_in_progress(), to solve the
    problem.
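
    The dummy-function pattern looks roughly like the sketch below. The
    names EXAMPLE_KEXEC_CORE, example_crash_ipi_callback() and
    example_cpu_kill_self() are hypothetical, and the signature is
    simplified for illustration; it does not reproduce the actual powerpc
    header.

```c
#include <assert.h>

/*
 * EXAMPLE_KEXEC_CORE is a hypothetical stand-in for CONFIG_KEXEC_CORE.
 * Leave it undefined to model a build without kexec support.
 */
#ifdef EXAMPLE_KEXEC_CORE
/* The real implementation would be provided by the crash code. */
int example_crash_ipi_callback(void);
#else
/* Dummy version: callers compile and link without their own #ifdefs. */
static inline int example_crash_ipi_callback(void)
{
	return 0;
}
#endif

/* Caller modelled loosely on pnv_smp_cpu_kill_self(). */
static int example_cpu_kill_self(void)
{
	if (example_crash_ipi_callback())
		return 1;	/* crash path taken */
	return 0;		/* normal offline path */
}
```

    Keeping the stub next to the real declaration means the caller always
    sees a declaration, so the implicit-declaration error cannot occur
    whichever way the option is set.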

    Fixes: 4145f358644b ("powernv/kdump: Fix cases where the kdump kernel can get HMI's")
    Signed-off-by: Guenter Roeck
    Acked-by: Balbir Singh
    Signed-off-by: Michael Ellerman

    Guenter Roeck
     
  • Commit e67e02a544e9 ("powerpc/pseries: Fix cpu hotplug crash with
    memoryless nodes") adds an unconditional call to
    find_and_online_cpu_nid(), which is only declared if CONFIG_PPC_SPLPAR
    is enabled. This results in the following build error if this is not
    the case.

    arch/powerpc/platforms/pseries/hotplug-cpu.o: In function `dlpar_online_cpu':
    arch/powerpc/platforms/pseries/hotplug-cpu.c:369:
    undefined reference to `.find_and_online_cpu_nid'

    Follow the guideline provided by similar functions and provide a dummy
    function if CONFIG_PPC_SPLPAR is not enabled. This also moves the
    external function declaration into an include file where it should be.

    Fixes: e67e02a544e9 ("powerpc/pseries: Fix cpu hotplug crash with memoryless nodes")
    Signed-off-by: Guenter Roeck
    [mpe: Change subject to emphasise the build fix]
    Signed-off-by: Michael Ellerman

    Guenter Roeck