16 Apr, 2018

1 commit

  • Pull x86 fixes from Thomas Gleixner:
    "A set of fixes and updates for x86:

    - Address a swiotlb regression which was caused by the recent DMA
    rework and made drivers fail because dma_direct_supported() returned
    false

    - Fix a signedness bug in the APIC ID validation which caused invalid
    APIC IDs to be detected as valid thereby bloating the CPU possible
    space.

    - Fix inconsistent config dependency/select magic for the MFD_CS5535
    driver.

    - Fix a corruption of the physical address space bits when encryption
    has reduced the address space and late cpuinfo updates overwrite
    the reduced bit information with the original value.

    - Dominik's syscall rework which consolidates the architecture
    specific syscall functions so all syscalls can be wrapped with the
    same macros. This allows switching x86/64 to struct pt_regs based
    syscalls. Extend the clearing of user space controlled registers in
    the entry path to the lower registers"
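
    The APIC ID signedness fix listed above can be illustrated with a small
    sketch. It is not the kernel code: it assumes, purely for illustration, a
    validity check of the form "id < 255" and an invalid ID of 0xFFFFFFFF,
    and the function names are hypothetical.

```python
import ctypes

INVALID_ID = 0xFFFFFFFF  # assumed "invalid" marker, for illustration only
LIMIT = 255              # assumed upper bound used by the validity check


def id_valid_signed(raw: int) -> bool:
    """Buggy variant: the 32-bit ID is interpreted as a signed int."""
    # 0xFFFFFFFF becomes -1 here, which passes the "< LIMIT" check.
    return ctypes.c_int32(raw).value < LIMIT


def id_valid_unsigned(raw: int) -> bool:
    """Fixed variant: the 32-bit ID is interpreted as an unsigned int."""
    return ctypes.c_uint32(raw).value < LIMIT
```

    With the signed variant the invalid ID is accepted, which is how invalid
    IDs could end up counted as possible CPUs; the unsigned variant rejects it.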

    * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/apic: Fix signedness bug in APIC ID validity checks
    x86/cpu: Prevent cpuinfo_x86::x86_phys_bits adjustment corruption
    x86/olpc: Fix inconsistent MFD_CS5535 configuration
    swiotlb: Use dma_direct_supported() for swiotlb_ops
    syscalls/x86: Adapt syscall_wrapper.h to the new syscall stub naming convention
    syscalls/core, syscalls/x86: Rename struct pt_regs-based sys_*() to __x64_sys_*()
    syscalls/core, syscalls/x86: Clean up compat syscall stub naming convention
    syscalls/core, syscalls/x86: Clean up syscall stub naming convention
    syscalls/x86: Extend register clearing on syscall entry to lower registers
    syscalls/x86: Unconditionally enable 'struct pt_regs' based syscalls on x86_64
    syscalls/x86: Use 'struct pt_regs' based syscall calling for IA32_EMULATION and x32
    syscalls/core: Prepare CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y for compat syscalls
    syscalls/x86: Use 'struct pt_regs' based syscall calling convention for 64-bit syscalls
    syscalls/core: Introduce CONFIG_ARCH_HAS_SYSCALL_WRAPPER=y
    x86/syscalls: Don't pointlessly reload the system call number
    x86/mm: Fix documentation of module mapping range with 4-level paging
    x86/cpuid: Switch to 'static const' specifier

    Linus Torvalds
     

12 Apr, 2018

1 commit


06 Apr, 2018

1 commit

  • Pull trivial tree updates from Jiri Kosina.

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
    kfifo: fix inaccurate comment
    tools/thermal: tmon: fix for segfault
    net: Spelling s/stucture/structure/
    edd: don't spam log if no EDD information is present
    Documentation: Fix early-microcode.txt references after file rename
    tracing: Block comments should align the * on each line
    treewide: Fix typos in printk
    GenWQE: Fix a typo in two comments
    treewide: Align function definition open/close braces

    Linus Torvalds
     

03 Apr, 2018

1 commit

  • Commit:

    f5a40711fa58 ("x86/mm: Set MODULES_END to 0xffffffffff000000")

    changed MODULES_END back to a fixed value, but didn't update the documentation
    of memory layout for 4-level paging.

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Andrey Ryabinin
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Fixes: f5a40711fa58 ("x86/mm: Set MODULES_END to 0xffffffffff000000")
    Link: http://lkml.kernel.org/r/20180402121025.10244-1-kirill.shutemov@linux.intel.com
    Signed-off-by: Ingo Molnar

    Kirill A. Shutemov
     

27 Mar, 2018

1 commit

  • The file Documentation/x86/early-microcode.txt was renamed to
    Documentation/x86/microcode.txt in 0e3258753f81, but it was still
    referenced by its old name in three places:

    * Documentation/x86/00-INDEX
    * arch/x86/Kconfig
    * arch/x86/kernel/cpu/microcode/amd.c

    This commit updates these references accordingly.

    Fixes: 0e3258753f81 ("x86/microcode: Document the three loading methods")
    Signed-off-by: Jaak Ristioja
    Signed-off-by: Jiri Kosina

    Jaak Ristioja
     

15 Mar, 2018

1 commit


01 Mar, 2018

1 commit


26 Feb, 2018

1 commit


23 Feb, 2018

1 commit

  • topology_sibling_cpumask() is the correct thread-related topology
    function in the kernel:

    s/topology_sibling_mask/topology_sibling_cpumask

    Signed-off-by: Dou Liyang
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: corbet@lwn.net
    Cc: linux-doc@vger.kernel.org
    Link: http://lkml.kernel.org/r/20180222084812.14497-1-douly.fnst@cn.fujitsu.com
    Signed-off-by: Ingo Molnar

    Dou Liyang
     

16 Feb, 2018

1 commit

  • All pieces of the puzzle are in place and we can now allow booting with
    CONFIG_X86_5LEVEL=y on a machine without LA57 support.

    The kernel will detect that LA57 is missing and fold p4d at runtime.

    Update the documentation and the Kconfig option description to reflect the
    change.

    Signed-off-by: Kirill A. Shutemov
    Cc: Andy Lutomirski
    Cc: Arjan van de Ven
    Cc: Borislav Petkov
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: David Woodhouse
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/20180214182542.69302-10-kirill.shutemov@linux.intel.com
    Signed-off-by: Ingo Molnar

    Kirill A. Shutemov
     

02 Feb, 2018

1 commit

  • Pull driver core updates from Greg KH:
    "Here is the set of "big" driver core patches for 4.16-rc1.

    The majority of the work here is in the firmware subsystem, with
    reworks that attempt to make the code easier to handle in the
    long run, but no functional change. There's also some tree-wide sysfs
    attribute fixups with lots of acks from the various subsystem
    maintainers, as well as a handful of other normal fixes and changes.

    And finally, some license cleanups for the driver core and sysfs code.

    All have been in linux-next for a while with no reported issues"

    * tag 'driver-core-4.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (48 commits)
    device property: Define type of PROPERTY_ENRTY_*() macros
    device property: Reuse property_entry_free_data()
    device property: Move property_entry_free_data() upper
    firmware: Fix up docs referring to FIRMWARE_IN_KERNEL
    firmware: Drop FIRMWARE_IN_KERNEL Kconfig option
    USB: serial: keyspan: Drop firmware Kconfig options
    sysfs: remove DEBUG defines
    sysfs: use SPDX identifiers
    drivers: base: add coredump driver ops
    sysfs: add attribute specification for /sysfs/devices/.../coredump
    test_firmware: fix missing unlock on error in config_num_requests_store()
    test_firmware: make local symbol test_fw_config static
    sysfs: turn WARN() into pr_warn()
    firmware: Fix a typo in fallback-mechanisms.rst
    treewide: Use DEVICE_ATTR_WO
    treewide: Use DEVICE_ATTR_RO
    treewide: Use DEVICE_ATTR_RW
    sysfs.h: Use octal permissions
    component: add debugfs support
    bus: simple-pm-bus: convert bool SIMPLE_PM_BUS to tristate
    ...

    Linus Torvalds
     

30 Jan, 2018

1 commit

  • Pull x86/cache updates from Thomas Gleixner:
    "A set of patches which add support for L2 cache partitioning to the
    Intel RDT facility"

    * 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/intel_rdt: Add command line parameter to control L2_CDP
    x86/intel_rdt: Enable L2 CDP in MSR IA32_L2_QOS_CFG
    x86/intel_rdt: Add two new resources for L2 Code and Data Prioritization (CDP)
    x86/intel_rdt: Enumerate L2 Code and Data Prioritization (CDP) feature
    x86/intel_rdt: Add L2CDP support in documentation
    x86/intel_rdt: Update documentation

    Linus Torvalds
     

25 Jan, 2018

1 commit


22 Jan, 2018

1 commit

  • Pull x86 pti fixes from Thomas Gleixner:
    "A small set of fixes for the meltdown/spectre mitigations:

    - Make kprobes aware of retpolines to prevent probes in the retpoline
    thunks.

    - Make the machine check exception speculation protected. MCE used to
    issue an indirect call directly from the ASM entry code. Convert
    that to a direct call into a C-function and issue the indirect call
    from there so the compiler can add the retpoline protection.

    - Make the vmexit_fill_RSB() assembly less stupid

    - Fix a typo in the PTI documentation"

    * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/retpoline: Optimize inline assembler for vmexit_fill_RSB
    x86/pti: Document fix wrong index
    kprobes/x86: Disable optimizing on the function jumps to indirect thunk
    kprobes/x86: Blacklist indirect thunk functions for kprobes
    retpoline: Introduce start/end markers of indirect thunk
    x86/mce: Make machine check speculation protected

    Linus Torvalds
     

19 Jan, 2018

1 commit

  • In section , fix wrong index.

    Signed-off-by: zhenwei.pi
    Signed-off-by: Thomas Gleixner
    Cc: dave.hansen@linux.intel.com
    Link: https://lkml.kernel.org/r/1516237492-27739-1-git-send-email-zhenwei.pi@youruncloud.com

    zhenwei.pi
     

18 Jan, 2018

2 commits

  • L2 and L3 Code and Data Prioritization (CDP) can be enabled separately.
    The existing mount parameter "cdp" is only for enabling L3 CDP and will be
    kept for backwards compatibility.

    Add a new mount parameter 'cdpl2' for L2 CDP.

    [ tglx: Made changelog readable ]

    Signed-off-by: Fenghua Yu
    Signed-off-by: Thomas Gleixner
    Cc: "Ravi V Shankar"
    Cc: "Tony Luck"
    Cc: Vikas
    Cc: Sai Praneeth
    Cc: Reinette
    Link: https://lkml.kernel.org/r/1513810644-78015-3-git-send-email-fenghua.yu@intel.com

    Fenghua Yu
     
  • With more flag bits in /proc/cpuinfo for RDT, it's better to classify the
    bits for readability.

    Some previously missing bits are added as well.

    Signed-off-by: Fenghua Yu
    Signed-off-by: Thomas Gleixner
    Cc: "Ravi V Shankar"
    Cc: "Tony Luck"
    Cc: Vikas
    Cc: Sai Praneeth
    Cc: Reinette
    Link: https://lkml.kernel.org/r/1513810644-78015-2-git-send-email-fenghua.yu@intel.com

    Fenghua Yu
     

15 Jan, 2018

1 commit

  • Pull x86 pti updates from Thomas Gleixner:
    "This contains:

    - a PTI bugfix to avoid setting reserved CR3 bits when PCID is
    disabled. This seems to cause issues on a virtual machine at least
    and is incorrect according to the AMD manual.

    - a PTI bugfix which disables the perf BTS facility if PTI is
    enabled. The BTS AUX buffer is not globally visible and causes the
    CPU to fault when the mapping disappears on switching CR3 to user
    space. A full fix which restores BTS on PTI is non trivial and will
    be worked on.

    - PTI bugfixes for EFI and trusted boot which make sure that the user
    space visible page table entries have the NX bit cleared

    - removal of dead code in the PTI pagetable setup functions

    - add PTI documentation

    - add a selftest for vsyscall to verify that the kernel actually
    implements what it advertises.

    - a sysfs interface to expose vulnerability and mitigation
    information so there is a coherent way for users to retrieve the
    status.

    - the initial spectre_v2 mitigations, aka retpoline:

    + The necessary ASM thunk and compiler support

    + The ASM variants of retpoline and the conversion of affected ASM
    code

    + Make LFENCE serializing on AMD so it can be used as speculation
    trap

    + The RSB fill after vmexit

    - initial objtool support for retpoline

    As I said in the status mail this is most of the set of patches
    which should go into 4.15 except two straightforward patches still on
    hold:

    - the retpoline add on of LFENCE which waits for ACKs

    - the RSB fill after context switch

    Both should be ready to go early next week and with that we'll have
    covered the major holes of spectre_v2 and go back to normality"
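
    The sysfs interface mentioned above can be queried with a short script.
    This is a minimal sketch that assumes the status files live under
    /sys/devices/system/cpu/vulnerabilities, one file per issue; the function
    name is hypothetical and the directory may be absent on older kernels.

```python
from pathlib import Path

# Assumed location of the vulnerability status files.
VULN_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def read_vulnerabilities(directory=VULN_DIR):
    """Return {file name: status text} for every file in the directory.

    An empty dict is returned if the directory does not exist.
    """
    directory = Path(directory)
    if not directory.is_dir():
        return {}
    status = {}
    for entry in sorted(directory.iterdir()):
        if entry.is_file():
            try:
                status[entry.name] = entry.read_text().strip()
            except OSError:
                status[entry.name] = "<unreadable>"
    return status


if __name__ == "__main__":
    for name, state in read_vulnerabilities().items():
        print(f"{name}: {state}")
```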

    * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (28 commits)
    x86,perf: Disable intel_bts when PTI
    security/Kconfig: Correct the Documentation reference for PTI
    x86/pti: Fix !PCID and sanitize defines
    selftests/x86: Add test_vsyscall
    x86/retpoline: Fill return stack buffer on vmexit
    x86/retpoline/irq32: Convert assembler indirect jumps
    x86/retpoline/checksum32: Convert assembler indirect jumps
    x86/retpoline/xen: Convert Xen hypercall indirect jumps
    x86/retpoline/hyperv: Convert assembler indirect jumps
    x86/retpoline/ftrace: Convert ftrace assembler indirect jumps
    x86/retpoline/entry: Convert entry assembler indirect jumps
    x86/retpoline/crypto: Convert crypto assembler indirect jumps
    x86/spectre: Add boot time option to select Spectre v2 mitigation
    x86/retpoline: Add initial retpoline support
    objtool: Allow alternatives to be ignored
    objtool: Detect jumps to retpoline thunks
    x86/pti: Make unpoison of pgd for trusted boot work for real
    x86/alternatives: Fix optimize_nops() checking
    sysfs/cpu: Fix typos in vulnerability documentation
    x86/cpu/AMD: Use LFENCE_RDTSC in preference to MFENCE_RDTSC
    ...

    Linus Torvalds
     

07 Jan, 2018

1 commit

  • Add some details about how PTI works, what some of the downsides
    are, and how to debug it when things go wrong.

    Also document the kernel parameter: 'pti/nopti'.

    Signed-off-by: Dave Hansen
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Randy Dunlap
    Reviewed-by: Kees Cook
    Cc: Moritz Lipp
    Cc: Daniel Gruss
    Cc: Michael Schwarz
    Cc: Richard Fellner
    Cc: Andy Lutomirski
    Cc: Linus Torvalds
    Cc: Hugh Dickins
    Cc: Andi Lutomirsky
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20180105174436.1BC6FA2B@viggo.jf.intel.com

    Dave Hansen
     

06 Jan, 2018

1 commit

  • Pull more x86 pti fixes from Thomas Gleixner:
    "Another small stash of fixes for fallout from the PTI work:

    - Fix the modules vs. KASAN breakage which was caused by making
    MODULES_END depend on the fixmap size. That was done when the cpu
    entry area moved into the fixmap, but now that we have a separate
    map space for that this is causing more issues than it solves.

    - Use the proper cache flush methods for the debugstore buffers as
    they are mapped/unmapped during runtime and not statically mapped
    at boot time like the rest of the cpu entry area.

    - Make the map layout of the cpu_entry_area consistent for 4 and 5
    level paging and fix the KASLR vaddr_end wreckage.

    - Use PER_CPU_EXPORT for per cpu variable and while at it unbreak
    nvidia gfx drivers by dropping the GPL export. The subject line of
    the commit tells it the other way around, but I noticed that too
    late.

    - Fix the ASM alternative macros so they can be used in the middle of
    an inline asm block.

    - Rename the BUG_CPU_INSECURE flag to BUG_CPU_MELTDOWN so the attack
    vector is properly identified. The Spectre mitigations will come
    with their own bug bits later"

    * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/pti: Rename BUG_CPU_INSECURE to BUG_CPU_MELTDOWN
    x86/alternatives: Add missing '\n' at end of ALTERNATIVE inline asm
    x86/tlb: Drop the _GPL from the cpu_tlbstate export
    x86/events/intel/ds: Use the proper cache flush method for mapping ds buffers
    x86/kaslr: Fix the vaddr_end mess
    x86/mm: Map cpu_entry_area at the same place on 4/5 level
    x86/mm: Set MODULES_END to 0xffffffffff000000

    Linus Torvalds
     

05 Jan, 2018

3 commits

  • vaddr_end for KASLR is only documented in the KASLR code itself and is
    adjusted depending on config options. So it's not surprising that a change
    of the memory layout causes KASLR to have the wrong vaddr_end. This can map
    arbitrary stuff into other areas causing hard to understand problems.

    Remove the whole ifdef magic and define the start of the cpu_entry_area to
    be the end of the KASLR vaddr range.

    Add documentation to that effect.

    Fixes: 92a0f81d8957 ("x86/cpu_entry_area: Move it out of the fixmap")
    Reported-by: Benjamin Gilbert
    Signed-off-by: Thomas Gleixner
    Tested-by: Benjamin Gilbert
    Cc: Andy Lutomirski
    Cc: Greg Kroah-Hartman
    Cc: stable
    Cc: Dave Hansen
    Cc: Peter Zijlstra
    Cc: Thomas Garnier
    Cc: Alexander Kuleshov
    Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801041320360.1771@nanos

    Thomas Gleixner
     
  • There is no reason for 4 and 5 level pagetables to have a different
    layout. It just makes determining vaddr_end for KASLR harder than
    necessary.

    Fixes: 92a0f81d8957 ("x86/cpu_entry_area: Move it out of the fixmap")
    Signed-off-by: Thomas Gleixner
    Cc: Andy Lutomirski
    Cc: Benjamin Gilbert
    Cc: Greg Kroah-Hartman
    Cc: stable
    Cc: Dave Hansen
    Cc: Peter Zijlstra
    Cc: Thomas Garnier
    Cc: Alexander Kuleshov
    Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801041320360.1771@nanos

    Thomas Gleixner
     
  • Since f06bdd4001c2 ("x86/mm: Adapt MODULES_END based on fixmap section size")
    kasan_mem_to_shadow(MODULES_END) could be not aligned to a page boundary.

    So passing a page-unaligned address to kasan_populate_zero_shadow() has two
    possible effects:

    1) It may leave a one-page hole in an area that is supposed to be populated.
    After commit 21506525fb8d ("x86/kasan/64: Teach KASAN about the cpu_entry_area")
    that hole happens to be in the shadow covering the fixmap area and leads to a crash:

    BUG: unable to handle kernel paging request at fffffbffffe8ee04
    RIP: 0010:check_memory_region+0x5c/0x190

    Call Trace:

    memcpy+0x1f/0x50
    ghes_copy_tofrom_phys+0xab/0x180
    ghes_read_estatus+0xfb/0x280
    ghes_notify_nmi+0x2b2/0x410
    nmi_handle+0x115/0x2c0
    default_do_nmi+0x57/0x110
    do_nmi+0xf8/0x150
    end_repeat_nmi+0x1a/0x1e

    Note, the crash likely disappeared after commit 92a0f81d8957, which
    changed the kasan_populate_zero_shadow() call back to the way it was
    before commit 21506525fb8d.

    2) An attempt to load a module near MODULES_END will fail, because
    __vmalloc_node_range() called from kasan_module_alloc() will hit the
    WARN_ON(!pte_none(*pte)) in vmap_pte_range() and bail out with an error.

    To fix this we need to make kasan_mem_to_shadow(MODULES_END) page aligned
    which means that MODULES_END should be 8*PAGE_SIZE aligned.

    The whole point of commit f06bdd4001c2 was to move MODULES_END down if
    NR_CPUS is big, so the cpu_entry_area takes a lot of space.
    But since 92a0f81d8957 ("x86/cpu_entry_area: Move it out of the fixmap")
    the cpu_entry_area is no longer in fixmap, so we could just set
    MODULES_END to a fixed 8*PAGE_SIZE aligned address.
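
    The alignment argument can be checked with a few lines of arithmetic.
    This is a sketch, not kernel code: it assumes the usual mapping
    shadow = (addr >> 3) + KASAN_SHADOW_OFFSET, a 4 KiB page size, and the
    shadow offset 0xdffffc0000000000.

```python
PAGE_SIZE = 4096
KASAN_SHADOW_SCALE_SHIFT = 3              # one shadow byte covers 8 bytes
KASAN_SHADOW_OFFSET = 0xDFFFFC0000000000  # assumed x86-64 shadow offset
U64 = (1 << 64) - 1                       # emulate 64-bit wraparound

MODULES_END = 0xFFFFFFFFFF000000          # the fixed value set by this commit


def kasan_mem_to_shadow(addr: int) -> int:
    """Map a kernel address to its shadow address (assumed formula)."""
    return ((addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET) & U64


def shadow_is_page_aligned(addr: int) -> bool:
    """True if the shadow address of addr falls on a page boundary."""
    return kasan_mem_to_shadow(addr) % PAGE_SIZE == 0
```

    Because the shadow mapping divides addresses by 8, an address that is
    only PAGE_SIZE aligned maps to a shadow address with a 512-byte
    granularity; only 8*PAGE_SIZE alignment keeps the shadow page aligned.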

    Fixes: f06bdd4001c2 ("x86/mm: Adapt MODULES_END based on fixmap section size")
    Reported-by: Jakub Kicinski
    Signed-off-by: Andrey Ryabinin
    Signed-off-by: Thomas Gleixner
    Cc: stable@vger.kernel.org
    Cc: Andy Lutomirski
    Cc: Thomas Garnier
    Link: https://lkml.kernel.org/r/20171228160620.23818-1-aryabinin@virtuozzo.com

    Andrey Ryabinin
     

30 Dec, 2017

1 commit

  • Pull x86 page table isolation updates from Thomas Gleixner:
    "This is the final set of enabling page table isolation on x86:

    - Infrastructure patches for handling the extra page tables.

    - Patches which map the various bits and pieces which are required to
    get in and out of user space into the user space visible page
    tables.

    - The required changes to have CR3 switching in the entry/exit code.

    - Optimizations for the CR3 switching along with documentation of how
    the ASID/PCID mechanism works.

    - Updates to dump pagetables to cover the user space page tables for
    W+X scans and extra debugfs files to analyze both the kernel and
    the user space visible page tables

    The whole functionality is compile time controlled via a config switch
    and can be turned on/off on the command line as well"

    * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
    x86/ldt: Make the LDT mapping RO
    x86/mm/dump_pagetables: Allow dumping current pagetables
    x86/mm/dump_pagetables: Check user space page table for WX pages
    x86/mm/dump_pagetables: Add page table directory to the debugfs VFS hierarchy
    x86/mm/pti: Add Kconfig
    x86/dumpstack: Indicate in Oops whether PTI is configured and enabled
    x86/mm: Clarify the whole ASID/kernel PCID/user PCID naming
    x86/mm: Use INVPCID for __native_flush_tlb_single()
    x86/mm: Optimize RESTORE_CR3
    x86/mm: Use/Fix PCID to optimize user/kernel switches
    x86/mm: Abstract switching CR3
    x86/mm: Allow flushing for future ASID switches
    x86/pti: Map the vsyscall page if needed
    x86/pti: Put the LDT in its own PGD if PTI is on
    x86/mm/64: Make a full PGD-entry size hole in the memory map
    x86/events/intel/ds: Map debug buffers in cpu_entry_area
    x86/cpu_entry_area: Add debugstore entries to cpu_entry_area
    x86/mm/pti: Map ESPFIX into user space
    x86/mm/pti: Share entry text PMD
    x86/entry: Align entry text section to PMD boundary
    ...

    Linus Torvalds
     

24 Dec, 2017

3 commits

  • With PTI enabled, the LDT must be mapped in the usermode tables somewhere.
    The LDT is per process, i.e. per mm.

    An earlier approach mapped the LDT on context switch into a fixmap area,
    but that's a big overhead and exhausted the fixmap space when NR_CPUS got
    big.

    Take advantage of the fact that there is an address space hole which
    provides a completely unused pgd. Use this pgd to manage per-mm LDT
    mappings.

    This has a down side: the LDT isn't (currently) randomized, and an attack
    that can write the LDT is instant root due to call gates (thanks, AMD, for
    leaving call gates in AMD64 but designing them wrong so they're only useful
    for exploits). This can be mitigated by making the LDT read-only or
    randomizing the mapping, either of which is straightforward on top of this
    patch.

    This will significantly slow down LDT users, but that shouldn't matter for
    important workloads -- the LDT is only used by DOSEMU(2), Wine, and very
    old libc implementations.

    [ tglx: Cleaned it up. ]

    Signed-off-by: Andy Lutomirski
    Signed-off-by: Thomas Gleixner
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: Dave Hansen
    Cc: David Laight
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Kees Cook
    Cc: Kirill A. Shutemov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Andy Lutomirski
     
  • Shrink vmalloc space from 16384TiB to 12800TiB to enlarge the hole starting
    at 0xff90000000000000 to be a full PGD entry.

    A subsequent patch will use this hole for the pagetable isolation LDT
    alias.

    Signed-off-by: Andy Lutomirski
    Signed-off-by: Thomas Gleixner
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: Dave Hansen
    Cc: David Laight
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Kees Cook
    Cc: Kirill A. Shutemov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Andy Lutomirski
     
  • Pull x86 PTI preparatory patches from Thomas Gleixner:
    "Today's Advent calendar window contains twenty-four easy to digest
    patches. The original plan was to have twenty-three matching the date,
    but a late fixup made that moot.

    - Move the cpu_entry_area mapping out of the fixmap into a separate
    address space. That's necessary because the fixmap becomes too big
    with NR_CPUS=8192 and this already caused subtle and hard to
    diagnose failures.

    The top most patch is fresh from today and cures a brain slip of
    that tall grumpy german greybeard, who ignored the intricacies of
    32bit wraparounds.

    - Limit the number of CPUs on 32bit to 64. That's insanely big already,
    but at least it's small enough to prevent address space issues with
    the cpu_entry_area map, which have been observed and debugged with
    the fixmap code

    - A few TLB flush fixes in various places plus documentation of which
    TLB functions should be used for what.

    - Rename the SYSENTER stack to CPU_ENTRY_AREA stack as it is used for
    more than sysenter now and keeping the name makes backtraces
    confusing.

    - Prevent LDT inheritance on exec() by moving it to arch_dup_mmap(),
    which is only invoked on fork().

    - Make vsyscall more robust.

    - A few fixes and cleanups of the debug_pagetables code. Check
    PAGE_PRESENT instead of checking the PTE for 0 and a cleanup of the
    C89 initialization of the address hint array which already was out
    of sync with the index enums.

    - Move the ESPFIX init to a different place to prepare for PTI.

    - Several code moves with no functional change to make PTI
    integration simpler and header files less convoluted.

    - Documentation fixes and clarifications"

    * 'x86-pti-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
    x86/cpu_entry_area: Prevent wraparound in setup_cpu_entry_area_ptes() on 32bit
    init: Invoke init_espfix_bsp() from mm_init()
    x86/cpu_entry_area: Move it out of the fixmap
    x86/cpu_entry_area: Move it to a separate unit
    x86/mm: Create asm/invpcid.h
    x86/mm: Put MMU to hardware ASID translation in one place
    x86/mm: Remove hard-coded ASID limit checks
    x86/mm: Move the CR3 construction functions to tlbflush.h
    x86/mm: Add comments to clarify which TLB-flush functions are supposed to flush what
    x86/mm: Remove superfluous barriers
    x86/mm: Use __flush_tlb_one() for kernel memory
    x86/microcode: Dont abuse the TLB-flush interface
    x86/uv: Use the right TLB-flush API
    x86/entry: Rename SYSENTER_stack to CPU_ENTRY_AREA_entry_stack
    x86/doc: Remove obvious weirdnesses from the x86 MM layout documentation
    x86/mm/64: Improve the memory map documentation
    x86/ldt: Prevent LDT inheritance on exec
    x86/ldt: Rework locking
    arch, mm: Allow arch_dup_mmap() to fail
    x86/vsyscall/64: Warn and fail vsyscall emulation in NATIVE mode
    ...

    Linus Torvalds
     

23 Dec, 2017

3 commits

  • Put the cpu_entry_area into a separate P4D entry. The fixmap gets too big
    and 0-day already hit a case where the fixmap PTEs were cleared by
    cleanup_highmap().

    Aside of that the fixmap API is a pain as it's all backwards.

    Signed-off-by: Thomas Gleixner
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Thomas Gleixner
     
  • Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Cc: Andy Lutomirski
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: David Laight
    Cc: Denys Vlasenko
    Cc: Eduardo Valentin
    Cc: Greg KH
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Will Deacon
    Cc: aliguori@amazon.com
    Cc: daniel.gruss@iaik.tugraz.at
    Cc: hughd@google.com
    Cc: keescook@google.com
    Cc: linux-mm@kvack.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
  • The old docs had the vsyscall range wrong and were missing the fixmap.
    Fix both.

    There used to be 8 MB reserved for future vsyscalls, but that's long gone.

    Signed-off-by: Andy Lutomirski
    Signed-off-by: Thomas Gleixner
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: Dave Hansen
    Cc: David Laight
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Kees Cook
    Cc: Kirill A. Shutemov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Andy Lutomirski
     

21 Nov, 2017

1 commit

  • Now that CPUs that implement Memory Protection Keys are publicly
    available we can be a bit less oblique about where it is available.

    Signed-off-by: Dave Hansen
    Acked-by: Thomas Gleixner
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20171111001228.DC748A10@viggo.jf.intel.com
    Signed-off-by: Ingo Molnar

    Dave Hansen
     

14 Nov, 2017

1 commit

  • Pull x86 cache resource updates from Thomas Gleixner:
    "This update provides updates to RDT:

    - A diagnostic framework for the Resource Director Technology (RDT)
    user interface (sysfs). The failure modes of the user interface are
    hard to diagnose from the error codes. An extra last command status
    file provides now sensible textual information about the failure so
    it's simpler to use.

    - A few minor cleanups and updates in the RDT code"

    * 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/intel_rdt: Fix a silent failure when writing zero value schemata
    x86/intel_rdt: Fix potential deadlock during resctrl mount
    x86/intel_rdt: Fix potential deadlock during resctrl unmount
    x86/intel_rdt: Initialize bitmask of shareable resource if CDP enabled
    x86/intel_rdt: Remove redundant assignment
    x86/intel_rdt/cqm: Make integer rmid_limbo_count static
    x86/intel_rdt: Add documentation for "info/last_cmd_status"
    x86/intel_rdt: Add diagnostics when making directories
    x86/intel_rdt: Add diagnostics when writing the cpus file
    x86/intel_rdt: Add diagnostics when writing the tasks file
    x86/intel_rdt: Add diagnostics when writing the schemata file
    x86/intel_rdt: Add framework for better RDT UI diagnostics

    Linus Torvalds
     

07 Nov, 2017

1 commit


06 Nov, 2017

1 commit


20 Oct, 2017

1 commit

  • We are going to support boot-time switching between 4- and 5-level
    paging. For KASAN it means we cannot have different KASAN_SHADOW_OFFSET
    for different paging modes: the constant is passed to gcc to generate
    code and cannot be changed at runtime.

    This patch changes KASAN code to use 0xdffffc0000000000 as shadow offset
    for both 4- and 5-level paging.

    For 5-level paging it means that shadow memory region is not aligned to
    PGD boundary anymore and we have to handle unaligned parts of the region
    properly.

    In addition, we have to exclude paravirt code from KASAN instrumentation
    as we now use set_pgd() before KASAN is fully ready.
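
    The PGD alignment remark can be reproduced numerically. The sketch below
    is an illustration under stated assumptions, not the kernel code: it
    assumes the shadow region starts at the shadow address of the lowest
    kernel-half address, that the virtual address width is 47 bits (4-level)
    or 56 bits (5-level), and that a PGD entry covers 2^39 or 2^48 bytes
    respectively. The function names are hypothetical.

```python
KASAN_SHADOW_OFFSET = 0xDFFFFC0000000000  # shared by both paging modes
KASAN_SHADOW_SCALE_SHIFT = 3              # one shadow byte covers 8 bytes
U64 = (1 << 64) - 1                       # emulate 64-bit wraparound


def shadow_start(virtual_mask_shift: int) -> int:
    """Shadow address of the first kernel-half address (assumed formula)."""
    kernel_start = (U64 << virtual_mask_shift) & U64
    return (KASAN_SHADOW_OFFSET
            + (kernel_start >> KASAN_SHADOW_SCALE_SHIFT)) & U64


def pgd_aligned(addr: int, pgdir_shift: int) -> bool:
    """True if addr sits on a PGD entry boundary."""
    return addr % (1 << pgdir_shift) == 0


# Assumed (virtual_mask_shift, pgdir_shift) pairs for each paging mode.
FOUR_LEVEL = (47, 39)
FIVE_LEVEL = (56, 48)
```

    Under these assumptions the 4-level shadow start is PGD aligned while the
    5-level one is not, which is why the unaligned parts need special
    handling.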

    [kirill.shutemov@linux.intel.com: cleanup, changelog message]
    Signed-off-by: Andrey Ryabinin
    Signed-off-by: Kirill A. Shutemov
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Cyrill Gorcunov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/20170929140821.37654-4-kirill.shutemov@linux.intel.com
    Signed-off-by: Ingo Molnar

    Andrey Ryabinin
     

14 Oct, 2017

1 commit

  • Rename the unwinder config options from:

    CONFIG_ORC_UNWINDER
    CONFIG_FRAME_POINTER_UNWINDER
    CONFIG_GUESS_UNWINDER

    to:

    CONFIG_UNWINDER_ORC
    CONFIG_UNWINDER_FRAME_POINTER
    CONFIG_UNWINDER_GUESS

    ... in order to give them a more logical config namespace.
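
    An existing config fragment that still uses the old names could be
    migrated mechanically. The helper below is a hypothetical sketch for that
    purpose; it is not part of the kernel tree and only rewrites the three
    option names listed above.

```python
import re

# Old option name -> new option name, as listed in the commit message.
RENAMES = {
    "CONFIG_ORC_UNWINDER": "CONFIG_UNWINDER_ORC",
    "CONFIG_FRAME_POINTER_UNWINDER": "CONFIG_UNWINDER_FRAME_POINTER",
    "CONFIG_GUESS_UNWINDER": "CONFIG_UNWINDER_GUESS",
}

# Match whole option names only, so CONFIG_FRAME_POINTER stays untouched.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def rename_unwinder_options(config_text: str) -> str:
    """Return config_text with the old unwinder option names replaced."""
    return _PATTERN.sub(lambda m: RENAMES[m.group(1)], config_text)
```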

    Suggested-by: Ingo Molnar
    Signed-off-by: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/73972fc7e2762e91912c6b9584582703d6f1b8cc.1507924831.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     

27 Sep, 2017

1 commit

  • A new file in the "info" directory helps diagnose what went wrong
    when using the /sys/fs/resctrl file system.

    Signed-off-by: Tony Luck
    Signed-off-by: Thomas Gleixner
    Cc: Fenghua Yu
    Cc: Steven Rostedt
    Cc: Vikas Shivappa
    Cc: Boris Petkov
    Cc: Reinette Chatre
    Link: https://lkml.kernel.org/r/387e78e444582403c2454479e576caf5721a363f.1506382469.git.tony.luck@intel.com

    Tony Luck
     

05 Sep, 2017

3 commits

  • Pull x86 cache quality monitoring update from Thomas Gleixner:
    "This update provides a complete rewrite of the Cache Quality
    Monitoring (CQM) facility.

    The existing CQM support was duct taped into perf with a lot of issues
    and the attempts to fix those turned out to be incomplete and
    horrible.

    After lengthy discussions it was decided to integrate the CQM support
    into the Resource Director Technology (RDT) facility, which is the
    obvious choice as in hardware CQM is part of RDT. This made it
    possible to add Memory Bandwidth Monitoring support on top.

    As a result the mechanisms for allocating cache/memory bandwidth and
    the corresponding monitoring mechanisms are integrated into a single
    management facility with a consistent user interface"

    * 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
    x86/intel_rdt: Turn off most RDT features on Skylake
    x86/intel_rdt: Add command line options for resource director technology
    x86/intel_rdt: Move special case code for Haswell to a quirk function
    x86/intel_rdt: Remove redundant ternary operator on return
    x86/intel_rdt/cqm: Improve limbo list processing
    x86/intel_rdt/mbm: Fix MBM overflow handler during CPU hotplug
    x86/intel_rdt: Modify the intel_pqr_state for better performance
    x86/intel_rdt/cqm: Clear the default RMID during hotcpu
    x86/intel_rdt: Show bitmask of shareable resource with other executing units
    x86/intel_rdt/mbm: Handle counter overflow
    x86/intel_rdt/mbm: Add mbm counter initialization
    x86/intel_rdt/mbm: Basic counting of MBM events (total and local)
    x86/intel_rdt/cqm: Add CPU hotplug support
    x86/intel_rdt/cqm: Add sched_in support
    x86/intel_rdt: Introduce rdt_enable_key for scheduling
    x86/intel_rdt/cqm: Add mount,umount support
    x86/intel_rdt/cqm: Add rmdir support
    x86/intel_rdt: Separate the ctrl bits from rmdir
    x86/intel_rdt/cqm: Add mon_data
    x86/intel_rdt: Prepare for RDT monitor data support
    ...

    Linus Torvalds
     
  • Pull x86 mm changes from Ingo Molnar:
    "PCID support, 5-level paging support, Secure Memory Encryption support

    The main changes in this cycle are support for three new, complex
    hardware features of x86 CPUs:

    - Add 5-level paging support, which is a new hardware feature on
    upcoming Intel CPUs allowing up to 128 PB of virtual address space
    and 4 PB of physical RAM space - a 512-fold increase over the old
    limits. (Supercomputers of the future forecasting hurricanes on an
    ever warming planet can certainly make good use of more RAM.)

    Many of the necessary changes went upstream in previous cycles,
    v4.14 is the first kernel that can enable 5-level paging.

    This feature is activated via CONFIG_X86_5LEVEL=y - disabled by
    default.

    (By Kirill A. Shutemov)

    - Add 'encrypted memory' support, which is a new hardware feature on
    upcoming AMD CPUs ('Secure Memory Encryption', SME) allowing system
    RAM to be encrypted and decrypted (mostly) transparently by the
    CPU, with a little help from the kernel to transition to/from
    encrypted RAM. Such RAM should be more secure against various
    attacks like RAM access via the memory bus and should make the
    radio signature of memory bus traffic harder to intercept (and
    decrypt) as well.

    This feature is activated via CONFIG_AMD_MEM_ENCRYPT=y - disabled
    by default.

    (By Tom Lendacky)

    - Enable PCID optimized TLB flushing on newer Intel CPUs: PCID is a
    hardware feature that attaches an address space tag to TLB entries
    and thus allows skipping TLB flushes in many cases, even if we
    switch mm's.

    (By Andy Lutomirski)

    All three of these features were in the works for a long time, and
    it is a coincidence of the three independent development paths that
    they are all enabled in v4.14 at once"

    * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (65 commits)
    x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)
    x86/mm: Use pr_cont() in dump_pagetable()
    x86/mm: Fix SME encryption stack ptr handling
    kvm/x86: Avoid clearing the C-bit in rsvd_bits()
    x86/CPU: Align CR3 defines
    x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages
    acpi, x86/mm: Remove encryption mask from ACPI page protection type
    x86/mm, kexec: Fix memory corruption with SME on successive kexecs
    x86/mm/pkeys: Fix typo in Documentation/x86/protection-keys.txt
    x86/mm/dump_pagetables: Speed up page tables dump for CONFIG_KASAN=y
    x86/mm: Implement PCID based optimization: try to preserve old TLB entries using PCID
    x86: Enable 5-level paging support via CONFIG_X86_5LEVEL=y
    x86/mm: Allow userspace have mappings above 47-bit
    x86/mm: Prepare to expose larger address space to userspace
    x86/mpx: Do not allow MPX if we have mappings above 47-bit
    x86/mm: Rename tasksize_32bit/64bit to task_size_32bit/64bit()
    x86/xen: Redefine XEN_ELFNOTE_INIT_P2M using PUD_SIZE * PTRS_PER_PUD
    x86/mm/dump_pagetables: Fix printout of p4d level
    x86/mm/dump_pagetables: Generalize address normalization
    x86/boot: Fix memremap() related build failure
    ...

    Linus Torvalds
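
The address-space figures quoted in the x86 mm pull message above follow from the address widths. This is a small sketch under stated assumptions: 5-level paging uses 57-bit virtual and 52-bit physical addresses, and the older 4-level limits are 48-bit virtual and 46-bit physical. The helper name is hypothetical, and "PB" is read as a binary unit (2^50 bytes).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: size in TiB (2^40 bytes) of an address space
 * that is `bits` wide. Only widths from 40 to 63 bits are accepted.
 */
static uint64_t span_in_tib(unsigned int bits)
{
    assert(bits >= 40 && bits < 64);
    return 1ULL << (bits - 40);
}

/* Assumed address widths; they are not stated in the pull message. */
#define VIRT_BITS_4LEVEL 48
#define VIRT_BITS_5LEVEL 57
#define PHYS_BITS_4LEVEL 46
#define PHYS_BITS_5LEVEL 52
```

With these widths, span_in_tib(VIRT_BITS_5LEVEL) is 131072 TiB (128 PiB) against 256 TiB before, which is the 512-fold increase mentioned in the message. The physical side goes from 64 TiB to 4096 TiB (4 PiB), a 64-fold increase under the same assumptions.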
     
  • Pull x86 microcode loading updates from Ingo Molnar:
    "Update documentation, improve robustness and fix a memory leak"

    * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/microcode/intel: Improve microcode patches saving flow
    x86/microcode: Document the three loading methods
    x86/microcode/AMD: Free unneeded patch before exit from update_cache()

    Linus Torvalds