27 Nov, 2018

1 commit

  • commit d52888aa2753e3063a9d3a0c9f72f94aa9809c15 upstream

    On 5-level paging the LDT remap area is placed in the middle of the KASLR
    randomization region, and it can overlap with the direct mapping, the
    vmalloc area, or the vmap area.

    The LDT mapping is per mm, so it cannot be moved into the P4D page table
    next to the CPU_ENTRY_AREA without complicating PGD table allocation for
    5-level paging.

    The 4 PGD slot gap just before the direct mapping is reserved for
    hypervisors, so it cannot be used.

    Move the direct mapping one slot deeper and use the resulting gap for the
    LDT remap area. The resulting layout is the same for 4 and 5 level paging.

    [ tglx: Massaged changelog ]

    Fixes: f55f0501cbf6 ("x86/pti: Put the LDT in its own PGD if PTI is on")
    Signed-off-by: Kirill A. Shutemov
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Andy Lutomirski
    Cc: bp@alien8.de
    Cc: hpa@zytor.com
    Cc: dave.hansen@linux.intel.com
    Cc: peterz@infradead.org
    Cc: boris.ostrovsky@oracle.com
    Cc: jgross@suse.com
    Cc: bhe@redhat.com
    Cc: willy@infradead.org
    Cc: linux-mm@kvack.org
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20181026122856.66224-2-kirill.shutemov@linux.intel.com
    Signed-off-by: Sasha Levin

    Kirill A. Shutemov
     

24 Jan, 2018

1 commit

  • commit 98f0fceec7f84d80bc053e49e596088573086421 upstream.

    In section , fix wrong index.

    Signed-off-by: zhenwei.pi
    Signed-off-by: Thomas Gleixner
    Cc: dave.hansen@linux.intel.com
    Link: https://lkml.kernel.org/r/1516237492-27739-1-git-send-email-zhenwei.pi@youruncloud.com
    Signed-off-by: Greg Kroah-Hartman

    zhenwei.pi
     

17 Jan, 2018

1 commit

  • commit 01c9b17bf673b05bb401b76ec763e9730ccf1376 upstream.

    Add some details about how PTI works, what some of the downsides
    are, and how to debug it when things go wrong.

    Also document the kernel parameter: 'pti/nopti'.

    Signed-off-by: Dave Hansen
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Randy Dunlap
    Reviewed-by: Kees Cook
    Cc: Moritz Lipp
    Cc: Daniel Gruss
    Cc: Michael Schwarz
    Cc: Richard Fellner
    Cc: Andy Lutomirski
    Cc: Linus Torvalds
    Cc: Hugh Dickins
    Cc: Andi Lutomirsky
    Link: https://lkml.kernel.org/r/20180105174436.1BC6FA2B@viggo.jf.intel.com
    Signed-off-by: Greg Kroah-Hartman

    Dave Hansen
     

10 Jan, 2018

3 commits

  • commit 1dddd25125112ba49706518ac9077a1026a18f37 upstream.

    vaddr_end for KASLR is only documented in the KASLR code itself and is
    adjusted depending on config options. So it's not surprising that a change
    of the memory layout causes KASLR to have the wrong vaddr_end. This can map
    arbitrary stuff into other areas, causing hard-to-understand problems.

    Remove the whole ifdef magic and define the start of the cpu_entry_area to
    be the end of the KASLR vaddr range.

    Add documentation to that effect.

    Fixes: 92a0f81d8957 ("x86/cpu_entry_area: Move it out of the fixmap")
    Reported-by: Benjamin Gilbert
    Signed-off-by: Thomas Gleixner
    Tested-by: Benjamin Gilbert
    Cc: Andy Lutomirski
    Cc: Greg Kroah-Hartman
    Cc: Dave Hansen
    Cc: Peter Zijlstra
    Cc: Thomas Garnier
    Cc: Alexander Kuleshov
    Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801041320360.1771@nanos
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • commit f2078904810373211fb15f91888fba14c01a4acc upstream.

    There is no reason for 4 and 5 level pagetables to have a different
    layout. It just makes determining vaddr_end for KASLR harder than
    necessary.

    Fixes: 92a0f81d8957 ("x86/cpu_entry_area: Move it out of the fixmap")
    Signed-off-by: Thomas Gleixner
    Cc: Andy Lutomirski
    Cc: Benjamin Gilbert
    Cc: Greg Kroah-Hartman
    Cc: Dave Hansen
    Cc: Peter Zijlstra
    Cc: Thomas Garnier
    Cc: Alexander Kuleshov
    Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801041320360.1771@nanos
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • commit f5a40711fa58f1c109165a4fec6078bf2dfd2bdc upstream.

    Since f06bdd4001c2 ("x86/mm: Adapt MODULES_END based on fixmap section size")
    kasan_mem_to_shadow(MODULES_END) could be not aligned to a page boundary.

    So passing a page-unaligned address to kasan_populate_zero_shadow() has two
    possible effects:

    1) It may leave a one-page hole in an area that is supposed to be populated.
    After commit 21506525fb8d ("x86/kasan/64: Teach KASAN about the cpu_entry_area")
    that hole happens to be in the shadow covering the fixmap area and leads to a crash:

    BUG: unable to handle kernel paging request at fffffbffffe8ee04
    RIP: 0010:check_memory_region+0x5c/0x190

    Call Trace:

    memcpy+0x1f/0x50
    ghes_copy_tofrom_phys+0xab/0x180
    ghes_read_estatus+0xfb/0x280
    ghes_notify_nmi+0x2b2/0x410
    nmi_handle+0x115/0x2c0
    default_do_nmi+0x57/0x110
    do_nmi+0xf8/0x150
    end_repeat_nmi+0x1a/0x1e

    Note, the crash likely disappeared after commit 92a0f81d8957, which
    changed the kasan_populate_zero_shadow() call back to the way it was
    before commit 21506525fb8d.

    2) An attempt to load a module near MODULES_END will fail, because
    __vmalloc_node_range(), called from kasan_module_alloc(), will hit the
    WARN_ON(!pte_none(*pte)) in vmap_pte_range() and bail out with an error.

    To fix this we need to make kasan_mem_to_shadow(MODULES_END) page-aligned,
    which means that MODULES_END should be 8*PAGE_SIZE aligned.

    The whole point of commit f06bdd4001c2 was to move MODULES_END down if
    NR_CPUS is big, so the cpu_entry_area takes a lot of space.
    But since 92a0f81d8957 ("x86/cpu_entry_area: Move it out of the fixmap")
    the cpu_entry_area is no longer in fixmap, so we could just set
    MODULES_END to a fixed 8*PAGE_SIZE aligned address.

    Fixes: f06bdd4001c2 ("x86/mm: Adapt MODULES_END based on fixmap section size")
    Reported-by: Jakub Kicinski
    Signed-off-by: Andrey Ryabinin
    Signed-off-by: Thomas Gleixner
    Cc: Andy Lutomirski
    Cc: Thomas Garnier
    Link: https://lkml.kernel.org/r/20171228160620.23818-1-aryabinin@virtuozzo.com
    Signed-off-by: Greg Kroah-Hartman

    Andrey Ryabinin
     

03 Jan, 2018

2 commits

  • commit f55f0501cbf65ec41cca5058513031b711730b1d upstream.

    With PTI enabled, the LDT must be mapped in the usermode tables somewhere.
    The LDT is per process, i.e. per mm.

    An earlier approach mapped the LDT on context switch into a fixmap area,
    but that's a big overhead and exhausted the fixmap space when NR_CPUS got
    big.

    Take advantage of the fact that there is an address space hole which
    provides a completely unused pgd. Use this pgd to manage per-mm LDT
    mappings.

    This has a down side: the LDT isn't (currently) randomized, and an attack
    that can write the LDT is instant root due to call gates (thanks, AMD, for
    leaving call gates in AMD64 but designing them wrong so they're only useful
    for exploits). This can be mitigated by making the LDT read-only or
    randomizing the mapping, either of which is straightforward on top of this
    patch.

    This will significantly slow down LDT users, but that shouldn't matter for
    important workloads -- the LDT is only used by DOSEMU(2), Wine, and very
    old libc implementations.

    [ tglx: Cleaned it up. ]

    Signed-off-by: Andy Lutomirski
    Signed-off-by: Thomas Gleixner
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: Dave Hansen
    Cc: David Laight
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Kees Cook
    Cc: Kirill A. Shutemov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
     
  • commit 9f449772a3106bcdd4eb8fdeb281147b0e99fb30 upstream.

    Shrink the vmalloc space from 16384 TiB to 12800 TiB to enlarge the hole
    starting at 0xff90000000000000 to be a full PGD entry.

    A subsequent patch will use this hole for the pagetable isolation LDT
    alias.

    Signed-off-by: Andy Lutomirski
    Signed-off-by: Thomas Gleixner
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: Dave Hansen
    Cc: David Laight
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Kees Cook
    Cc: Kirill A. Shutemov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
     

30 Dec, 2017

3 commits

  • commit 92a0f81d89571e3e8759366e050ee05cc545ef99 upstream.

    Put the cpu_entry_area into a separate P4D entry. The fixmap gets too big
    and 0-day already hit a case where the fixmap PTEs were cleared by
    cleanup_highmap().

    Aside from that, the fixmap API is a pain as it's all backwards.

    Signed-off-by: Thomas Gleixner
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Dave Hansen
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     
  • commit e8ffe96e5933d417195268478479933d56213a3f upstream.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Cc: Andy Lutomirski
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: David Laight
    Cc: Denys Vlasenko
    Cc: Eduardo Valentin
    Cc: Greg KH
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Will Deacon
    Cc: aliguori@amazon.com
    Cc: daniel.gruss@iaik.tugraz.at
    Cc: hughd@google.com
    Cc: keescook@google.com
    Cc: linux-mm@kvack.org
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Peter Zijlstra
     
  • commit 5a7ccf4754fb3660569a6de52ba7f7fc3dfaf280 upstream.

    The old docs had the vsyscall range wrong and were missing the fixmap.
    Fix both.

    There used to be 8 MB reserved for future vsyscalls, but that's long gone.

    Signed-off-by: Andy Lutomirski
    Signed-off-by: Thomas Gleixner
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: Dave Hansen
    Cc: David Laight
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Kees Cook
    Cc: Kirill A. Shutemov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
     

25 Dec, 2017

2 commits

  • commit 12a8cc7fcf54a8575f094be1e99032ec38aa045c upstream.

    We are going to support boot-time switching between 4- and 5-level
    paging. For KASAN it means we cannot have different KASAN_SHADOW_OFFSET
    for different paging modes: the constant is passed to gcc to generate
    code and cannot be changed at runtime.

    This patch changes KASAN code to use 0xdffffc0000000000 as shadow offset
    for both 4- and 5-level paging.

    For 5-level paging it means that the shadow memory region is no longer
    aligned to a PGD boundary and we have to handle the unaligned parts of the
    region properly.

    In addition, we have to exclude paravirt code from KASAN instrumentation
    as we now use set_pgd() before KASAN is fully ready.

    [kirill.shutemov@linux.intel.com: clenaup, changelog message]
    Signed-off-by: Andrey Ryabinin
    Signed-off-by: Kirill A. Shutemov
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Cyrill Gorcunov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/20170929140821.37654-4-kirill.shutemov@linux.intel.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Andrey Ryabinin
     
  • commit 11af847446ed0d131cf24d16a7ef3d5ea7a49554 upstream.

    Rename the unwinder config options from:

    CONFIG_ORC_UNWINDER
    CONFIG_FRAME_POINTER_UNWINDER
    CONFIG_GUESS_UNWINDER

    to:

    CONFIG_UNWINDER_ORC
    CONFIG_UNWINDER_FRAME_POINTER
    CONFIG_UNWINDER_GUESS

    ... in order to give them a more logical config namespace.

    Suggested-by: Ingo Molnar
    Signed-off-by: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/73972fc7e2762e91912c6b9584582703d6f1b8cc.1507924831.git.jpoimboe@redhat.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Josh Poimboeuf
     

05 Sep, 2017

3 commits

  • Pull x86 cache quality monitoring update from Thomas Gleixner:
    "This update provides a complete rewrite of the Cache Quality
    Monitoring (CQM) facility.

    The existing CQM support was duct-taped into perf with a lot of issues,
    and the attempts to fix those turned out to be incomplete and
    horrible.

    After lengthy discussions it was decided to integrate the CQM support
    into the Resource Director Technology (RDT) facility, which is the
    obvious choice as in hardware CQM is part of RDT. This allowed adding
    Memory Bandwidth Monitoring support on top.

    As a result the mechanisms for allocating cache/memory bandwidth and
    the corresponding monitoring mechanisms are integrated into a single
    management facility with a consistent user interface"

    * 'x86-cache-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (37 commits)
    x86/intel_rdt: Turn off most RDT features on Skylake
    x86/intel_rdt: Add command line options for resource director technology
    x86/intel_rdt: Move special case code for Haswell to a quirk function
    x86/intel_rdt: Remove redundant ternary operator on return
    x86/intel_rdt/cqm: Improve limbo list processing
    x86/intel_rdt/mbm: Fix MBM overflow handler during CPU hotplug
    x86/intel_rdt: Modify the intel_pqr_state for better performance
    x86/intel_rdt/cqm: Clear the default RMID during hotcpu
    x86/intel_rdt: Show bitmask of shareable resource with other executing units
    x86/intel_rdt/mbm: Handle counter overflow
    x86/intel_rdt/mbm: Add mbm counter initialization
    x86/intel_rdt/mbm: Basic counting of MBM events (total and local)
    x86/intel_rdt/cqm: Add CPU hotplug support
    x86/intel_rdt/cqm: Add sched_in support
    x86/intel_rdt: Introduce rdt_enable_key for scheduling
    x86/intel_rdt/cqm: Add mount,umount support
    x86/intel_rdt/cqm: Add rmdir support
    x86/intel_rdt: Separate the ctrl bits from rmdir
    x86/intel_rdt/cqm: Add mon_data
    x86/intel_rdt: Prepare for RDT monitor data support
    ...

    Linus Torvalds
     
  • Pull x86 mm changes from Ingo Molnar:
    "PCID support, 5-level paging support, Secure Memory Encryption support

    The main changes in this cycle are support for three new, complex
    hardware features of x86 CPUs:

    - Add 5-level paging support, which is a new hardware feature on
    upcoming Intel CPUs allowing up to 128 PB of virtual address space
    and 4 PB of physical RAM space - a 512-fold increase over the old
    limits. (Supercomputers of the future forecasting hurricanes on an
    ever warming planet can certainly make good use of more RAM.)

    Many of the necessary changes went upstream in previous cycles,
    v4.14 is the first kernel that can enable 5-level paging.

    This feature is activated via CONFIG_X86_5LEVEL=y - disabled by
    default.

    (By Kirill A. Shutemov)

    - Add 'encrypted memory' support, which is a new hardware feature on
    upcoming AMD CPUs ('Secure Memory Encryption', SME) allowing system
    RAM to be encrypted and decrypted (mostly) transparently by the
    CPU, with a little help from the kernel to transition to/from
    encrypted RAM. Such RAM should be more secure against various
    attacks like RAM access via the memory bus and should make the
    radio signature of memory bus traffic harder to intercept (and
    decrypt) as well.

    This feature is activated via CONFIG_AMD_MEM_ENCRYPT=y - disabled
    by default.

    (By Tom Lendacky)

    - Enable PCID optimized TLB flushing on newer Intel CPUs: PCID is a
    hardware feature that attaches an address space tag to TLB entries
    and thus allows skipping TLB flushing in many cases, even when
    switching mm's.

    (By Andy Lutomirski)

    All three of these features were in the works for a long time, and
    it's a coincidence of the three independent development paths that they
    are all enabled in v4.14 at once"

    * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (65 commits)
    x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)
    x86/mm: Use pr_cont() in dump_pagetable()
    x86/mm: Fix SME encryption stack ptr handling
    kvm/x86: Avoid clearing the C-bit in rsvd_bits()
    x86/CPU: Align CR3 defines
    x86/mm, mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages
    acpi, x86/mm: Remove encryption mask from ACPI page protection type
    x86/mm, kexec: Fix memory corruption with SME on successive kexecs
    x86/mm/pkeys: Fix typo in Documentation/x86/protection-keys.txt
    x86/mm/dump_pagetables: Speed up page tables dump for CONFIG_KASAN=y
    x86/mm: Implement PCID based optimization: try to preserve old TLB entries using PCID
    x86: Enable 5-level paging support via CONFIG_X86_5LEVEL=y
    x86/mm: Allow userspace have mappings above 47-bit
    x86/mm: Prepare to expose larger address space to userspace
    x86/mpx: Do not allow MPX if we have mappings above 47-bit
    x86/mm: Rename tasksize_32bit/64bit to task_size_32bit/64bit()
    x86/xen: Redefine XEN_ELFNOTE_INIT_P2M using PUD_SIZE * PTRS_PER_PUD
    x86/mm/dump_pagetables: Fix printout of p4d level
    x86/mm/dump_pagetables: Generalize address normalization
    x86/boot: Fix memremap() related build failure
    ...

    Linus Torvalds
     
  • Pull x86 microcode loading updates from Ingo Molnar:
    "Update documentation, improve robustness and fix a memory leak"

    * 'x86-microcode-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
    x86/microcode/intel: Improve microcode patches saving flow
    x86/microcode: Document the three loading methods
    x86/microcode/AMD: Free unneeded patch before exit from update_cache()

    Linus Torvalds
     

02 Aug, 2017

2 commits

    CPUID.(EAX=0x10, ECX=res#):EBX[31:0] reports a bit mask for a resource.
    Each set bit within the length of the CBM indicates that the corresponding
    unit of the resource allocation may be used by other entities in the
    platform (e.g. an integrated graphics engine or hardware units outside
    the processor core that have direct access to the resource). Each
    cleared bit within the length of the CBM indicates that the corresponding
    allocation unit can be configured to implement a priority-based
    allocation scheme without interference from other hardware agents in
    the system. Bits outside the length of the CBM are reserved.

    More details on the bit mask are described in the x86 Software Developer's
    Manual.

    The bitmask is shown in the "info" directory for each resource. It's
    up to the user to decide how to use the bitmask within a CBM in a partition
    to share or isolate a resource with other executing units.

    Suggested-by: Reinette Chatre
    Signed-off-by: Fenghua Yu
    Signed-off-by: Tony Luck
    Signed-off-by: Thomas Gleixner
    Cc: ravi.v.shankar@intel.com
    Cc: peterz@infradead.org
    Cc: eranian@google.com
    Cc: ak@linux.intel.com
    Cc: davidcc@google.com
    Cc: vikas.shivappa@linux.intel.com
    Link: http://lkml.kernel.org/r/20170725223904.12996-1-tony.luck@intel.com

    Fenghua Yu
     
    Add a description of the resctrl-based RDT (Resource Director Technology)
    monitoring extension and its usage.

    [Tony: Added descriptions for how monitoring and allocation are measured
    and some cleanups]

    Signed-off-by: Vikas Shivappa
    Signed-off-by: Tony Luck
    Signed-off-by: Thomas Gleixner
    Cc: ravi.v.shankar@intel.com
    Cc: fenghua.yu@intel.com
    Cc: peterz@infradead.org
    Cc: eranian@google.com
    Cc: vikas.shivappa@intel.com
    Cc: ak@linux.intel.com
    Cc: davidcc@google.com
    Cc: reinette.chatre@intel.com
    Link: http://lkml.kernel.org/r/1501017287-28083-3-git-send-email-vikas.shivappa@linux.intel.com

    Vikas Shivappa
     

26 Jul, 2017

1 commit

  • Add the new ORC unwinder which is enabled by CONFIG_ORC_UNWINDER=y.
    It plugs into the existing x86 unwinder framework.

    It relies on objtool to generate the needed .orc_unwind and
    .orc_unwind_ip sections.

    For more details on why ORC is used instead of DWARF, see
    Documentation/x86/orc-unwinder.txt - but the short version is
    that it's a simplified, fundamentally more robust debuginfo
    data structure, which also allows up to two orders of magnitude
    faster lookups than the DWARF unwinder - which matters to
    profiling workloads like perf.

    Thanks to Andy Lutomirski for the performance improvement ideas:
    splitting the ORC unwind table into two parallel arrays and creating a
    fast lookup table to search a subset of the unwind table.

    Signed-off-by: Josh Poimboeuf
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Jiri Slaby
    Cc: Linus Torvalds
    Cc: Mike Galbraith
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: live-patching@vger.kernel.org
    Link: http://lkml.kernel.org/r/0a6cbfb40f8da99b7a45a1a8302dc6aef16ec812.1500938583.git.jpoimboe@redhat.com
    [ Extended the changelog. ]
    Signed-off-by: Ingo Molnar

    Josh Poimboeuf
     

25 Jul, 2017

2 commits

  • Replace PKEY_DENY_WRITE with PKEY_DISABLE_WRITE,
    to match the source code.

    Signed-off-by: Wang Kai
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: corbet@lwn.net
    Cc: dave.hansen@intel.com
    Cc: dave.hansen@linux.intel.com
    Cc: linux-doc@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Wang Kai
     
  • Paul Menzel recently asked how to load microcode on a system and I realized
    that we don't really have all the methods written down somewhere.

    Do that, so people can go and look them up.

    Reported-by: Paul Menzel
    Signed-off-by: Borislav Petkov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20170724101228.17326-3-bp@alien8.de
    [ Fix whitespace noise in the new description. ]
    Signed-off-by: Ingo Molnar

    Borislav Petkov
     

21 Jul, 2017

1 commit

    Most things are in place, so we can enable support for 5-level paging.

    The patch makes XEN_PV and XEN_PVH dependent on !X86_5LEVEL, as neither
    is ready to work with 5-level paging.

    Signed-off-by: Kirill A. Shutemov
    Reviewed-by: Juergen Gross
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Dave Hansen
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-arch@vger.kernel.org
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/20170716225954.74185-9-kirill.shutemov@linux.intel.com
    [ Minor readability edits. ]
    Signed-off-by: Ingo Molnar

    Kirill A. Shutemov
     

18 Jul, 2017

1 commit

  • Create a Documentation entry to describe the AMD Secure Memory
    Encryption (SME) feature and add documentation for the mem_encrypt=
    kernel parameter.

    Signed-off-by: Tom Lendacky
    Reviewed-by: Thomas Gleixner
    Reviewed-by: Borislav Petkov
    Cc: Alexander Potapenko
    Cc: Andrey Ryabinin
    Cc: Andy Lutomirski
    Cc: Arnd Bergmann
    Cc: Borislav Petkov
    Cc: Brijesh Singh
    Cc: Dave Young
    Cc: Dmitry Vyukov
    Cc: Jonathan Corbet
    Cc: Konrad Rzeszutek Wilk
    Cc: Larry Woodman
    Cc: Linus Torvalds
    Cc: Matt Fleming
    Cc: Michael S. Tsirkin
    Cc: Paolo Bonzini
    Cc: Peter Zijlstra
    Cc: Radim Krčmář
    Cc: Rik van Riel
    Cc: Toshimitsu Kani
    Cc: kasan-dev@googlegroups.com
    Cc: kvm@vger.kernel.org
    Cc: linux-arch@vger.kernel.org
    Cc: linux-doc@vger.kernel.org
    Cc: linux-efi@vger.kernel.org
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/ca0a0c13b055fd804cfc92cbaca8acd68057eed0.1500319216.git.thomas.lendacky@amd.com
    Signed-off-by: Ingo Molnar

    Tom Lendacky
     

14 Jun, 2017

1 commit

  • The bootlog option is only disabled by default on AMD Fam10h and older
    systems.

    Update bootlog description to say this. Change the family value to hex
    to avoid confusion.

    Signed-off-by: Yazen Ghannam
    Signed-off-by: Borislav Petkov
    Cc: Borislav Petkov
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: Tony Luck
    Cc: linux-edac
    Link: http://lkml.kernel.org/r/20170613162835.30750-9-bp@alien8.de
    Signed-off-by: Ingo Molnar

    Yazen Ghannam
     

09 May, 2017

1 commit

  • Example 3 contains a typo:

    "C0" in "# echo C0 > p0/cpus" is wrong because it specifies cores
    6-7 instead of the intended cores 4-7.

    Correct this typo to avoid confusion.

    Signed-off-by: Xiaochen Shen
    Acked-by: Fenghua Yu
    Cc: vikas.shivappa@linux.intel.com
    Cc: tony.luck@intel.com
    Link: http://lkml.kernel.org/r/1493781356-24229-1-git-send-email-xiaochen.shen@intel.com
    Signed-off-by: Thomas Gleixner

    Xiaochen Shen
     

02 May, 2017

2 commits

  • Pull x86 mm updates from Ingo Molnar:
    "The main x86 MM changes in this cycle were:

    - continued native kernel PCID support preparation patches to the TLB
    flushing code (Andy Lutomirski)

    - various fixes related to 32-bit compat syscall returning address
    over 4Gb in applications, launched from 64-bit binaries - motivated
    by C/R frameworks such as Virtuozzo. (Dmitry Safonov)

    - continued Intel 5-level paging enablement: in particular the
    conversion of x86 GUP to the generic GUP code. (Kirill A. Shutemov)

    - x86/mpx ABI corner case fixes/enhancements (Joerg Roedel)

    - ... plus misc updates, fixes and cleanups"

    * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (62 commits)
    mm, zone_device: Replace {get, put}_zone_device_page() with a single reference to fix pmem crash
    x86/mm: Fix flush_tlb_page() on Xen
    x86/mm: Make flush_tlb_mm_range() more predictable
    x86/mm: Remove flush_tlb() and flush_tlb_current_task()
    x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()
    x86/mm/64: Fix crash in remove_pagetable()
    Revert "x86/mm/gup: Switch GUP to the generic get_user_page_fast() implementation"
    x86/boot/e820: Remove a redundant self assignment
    x86/mm: Fix dump pagetables for 4 levels of page tables
    x86/mpx, selftests: Only check bounds-vs-shadow when we keep shadow
    x86/mpx: Correctly report do_mpx_bt_fault() failures to user-space
    Revert "x86/mm/numa: Remove numa_nodemask_from_meminfo()"
    x86/espfix: Add support for 5-level paging
    x86/kasan: Extend KASAN to support 5-level paging
    x86/mm: Add basic defines/helpers for CONFIG_X86_5LEVEL=y
    x86/paravirt: Add 5-level support to the paravirt code
    x86/mm: Define virtual memory map for 5-level paging
    x86/asm: Remove __VIRTUAL_MASK_SHIFT==47 assert
    x86/boot: Detect 5-level paging support
    x86/mm/numa: Remove numa_nodemask_from_meminfo()
    ...

    Linus Torvalds
     
  • Pull x86 cpu updates from Ingo Molnar:
    "The biggest changes are an extension of the Intel RDT code to extend
    it with Intel Memory Bandwidth Allocation CPU support: MBA allows
    bandwidth allocation between cores, while CBM (already upstream)
    allows CPU cache partitioning.

    There's also misc smaller fixes and updates"

    * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits)
    x86/intel_rdt: Return error for incorrect resource names in schemata
    x86/intel_rdt: Trim whitespace while parsing schemata input
    x86/intel_rdt: Fix padding when resource is enabled via mount
    x86/intel_rdt: Get rid of anon union
    x86/cpu: Keep model defines sorted by model number
    x86/intel_rdt/mba: Add schemata file support for MBA
    x86/intel_rdt: Make schemata file parsers resource specific
    x86/intel_rdt/mba: Add info directory files for Memory Bandwidth Allocation
    x86/intel_rdt: Make information files resource specific
    x86/intel_rdt/mba: Add primary support for Memory Bandwidth Allocation (MBA)
    x86/intel_rdt/mba: Memory bandwith allocation feature detect
    x86/intel_rdt: Add resource specific msr update function
    x86/intel_rdt: Move CBM specific data into a struct
    x86/intel_rdt: Cleanup namespace to support multiple resource types
    Documentation, x86: Intel Memory bandwidth allocation
    x86/intel_rdt: Organize code properly
    x86/intel_rdt: Init padding only if a device exists
    x86/intel_rdt: Add cpus_list rdtgroup file
    x86/intel_rdt: Cleanup kernel-doc
    x86/intel_rdt: Update schemata read to show data in tabular format
    ...

    Linus Torvalds
     

14 Apr, 2017

1 commit

    Update the 'intel_rdt_ui' documentation to include the Memory bandwidth
    (b/w) allocation interface usage.

    Signed-off-by: Vikas Shivappa
    Cc: ravi.v.shankar@intel.com
    Cc: tony.luck@intel.com
    Cc: fenghua.yu@intel.com
    Cc: vikas.shivappa@intel.com
    Link: http://lkml.kernel.org/r/1491611637-20417-2-git-send-email-vikas.shivappa@linux.intel.com
    Signed-off-by: Thomas Gleixner

    Vikas Shivappa
     

11 Apr, 2017

2 commits

    There's a conflict between ongoing 5-level paging support and
    the E820 rewrite. Since the E820 rewrite is essentially ready,
    merge it into x86/mm to reduce tree conflicts.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • The resource control filesystem provides only a bitmask based cpus file for
    assigning CPUs to a resource group. That's cumbersome with large cpumasks
    and non-intuitive when modifying the file from the command line.

    Range based cpu lists are commonly used along with bitmask based cpu files
    in various subsystems throughout the kernel.

    Add 'cpus_list' file which is CPU range based.

    # cd /sys/fs/resctrl/
    # echo 1-10 > krava/cpus_list
    # cat krava/cpus_list
    1-10
    # cat krava/cpus
    0007fe
    # cat cpus
    fffff9
    # cat cpus_list
    0,3-23

    [ tglx: Massaged changelog and replaced "bitmask lists" by "CPU ranges" ]

    Signed-off-by: Jiri Olsa
    Cc: Fenghua Yu
    Cc: Peter Zijlstra
    Cc: Peter Zijlstra
    Cc: Mike Galbraith
    Cc: Shaohua Li
    Link: http://lkml.kernel.org/r/20170410145232.GF25354@krava
    Signed-off-by: Thomas Gleixner

    Jiri Olsa
     

05 Apr, 2017

1 commit

  • The schemata file can have multiple lines and it is cumbersome to update
    all lines.

    Remove code that requires that the user provides values for every resource
    (in the right order). If the user provides values for just a few
    resources, update them and leave the rest unchanged.

    Side benefit: we now check which values were updated and only send IPIs to
    cpus that actually have updates.

    Signed-off-by: Tony Luck
    Signed-off-by: Vikas Shivappa
    Tested-by: Sai Praneeth Prakhya
    Cc: ravi.v.shankar@intel.com
    Cc: fenghua.yu@intel.com
    Cc: peterz@infradead.org
    Cc: vikas.shivappa@intel.com
    Cc: h.peter.anvin@intel.com
    Link: http://lkml.kernel.org/r/1491255857-17213-3-git-send-email-vikas.shivappa@linux.intel.com
    Signed-off-by: Thomas Gleixner

    Tony Luck
     

04 Apr, 2017

1 commit

    The first part of the memory map (up to the %esp fixup) simply scales the
    existing map for 4-level paging by 9 bits -- the number of bits addressed
    by the additional page table level.

    The rest of the map is unchanged.

    Signed-off-by: Kirill A. Shutemov
    Cc: Andrew Morton
    Cc: Andy Lutomirski
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-arch@vger.kernel.org
    Cc: linux-mm@kvack.org
    Link: http://lkml.kernel.org/r/20170330080731.65421-4-kirill.shutemov@linux.intel.com
    Signed-off-by: Ingo Molnar

    Kirill A. Shutemov
     

16 Mar, 2017

1 commit

  • This patch aligns MODULES_END to the beginning of the fixmap section.
    It optimizes the space available for both sections. The address is
    pre-computed based on the number of pages required by the fixmap
    section.

    It will allow GDT remapping in the fixmap section. The current
    MODULES_END static address does not provide enough space for the kernel
    to support a large number of processors.
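
    The idea of deriving the boundary rather than hard-coding it can be
    sketched as follows (all constants and names here are illustrative
    placeholders, not the kernel's actual FIXADDR_TOP or alignment choice):

```python
PAGE_SIZE = 1 << 12    # 4 KiB
PMD_SIZE = 1 << 21     # 2 MiB large-page boundary (assumed alignment)

def modules_end(fixmap_top, nr_fixmap_pages, align=PMD_SIZE):
    """Compute where the modules area would end if the fixmap
    occupies nr_fixmap_pages pages ending at fixmap_top: the fixmap
    base, rounded down to a large-page boundary.  This mirrors the
    idea of pre-computing the address from the fixmap's page count
    instead of using a static constant."""
    fixmap_base = fixmap_top - nr_fixmap_pages * PAGE_SIZE
    return fixmap_base & ~(align - 1)
```

    The benefit is that growing the fixmap (e.g. for more per-CPU GDT
    pages) automatically shrinks the modules area instead of overflowing
    a fixed limit.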

    Signed-off-by: Thomas Garnier
    Cc: Alexander Potapenko
    Cc: Andrew Morton
    Cc: Andrey Ryabinin
    Cc: Andy Lutomirski
    Cc: Ard Biesheuvel
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Chris Wilson
    Cc: Christian Borntraeger
    Cc: Dmitry Vyukov
    Cc: Frederic Weisbecker
    Cc: Jiri Kosina
    Cc: Joerg Roedel
    Cc: Jonathan Corbet
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Kees Cook
    Cc: Len Brown
    Cc: Linus Torvalds
    Cc: Lorenzo Stoakes
    Cc: Luis R. Rodriguez
    Cc: Matt Fleming
    Cc: Michal Hocko
    Cc: Paolo Bonzini
    Cc: Paul Gortmaker
    Cc: Pavel Machek
    Cc: Peter Zijlstra
    Cc: Radim Krčmář
    Cc: Rafael J. Wysocki
    Cc: Rusty Russell
    Cc: Stanislaw Gruszka
    Cc: Thomas Gleixner
    Cc: Tim Chen
    Cc: Vitaly Kuznetsov
    Cc: kasan-dev@googlegroups.com
    Cc: kernel-hardening@lists.openwall.com
    Cc: kvm@vger.kernel.org
    Cc: lguest@lists.ozlabs.org
    Cc: linux-doc@vger.kernel.org
    Cc: linux-efi@vger.kernel.org
    Cc: linux-mm@kvack.org
    Cc: linux-pm@vger.kernel.org
    Cc: xen-devel@lists.xenproject.org
    Cc: zijun_hu
    Link: http://lkml.kernel.org/r/20170314170508.100882-1-thgarnie@google.com
    [ Small build fix. ]
    Signed-off-by: Ingo Molnar

    Thomas Garnier
     

01 Mar, 2017

2 commits


20 Feb, 2017

1 commit


07 Feb, 2017

1 commit

  • Get the firmware's secure-boot status in the kernel boot wrapper and stash
    it somewhere that the main kernel image can find.

    The efi_get_secureboot() function is extracted from the ARM stub and (a)
    generalised so that it can be called from x86 and (b) made to use
    efi_call_runtime() so that it can be run in mixed-mode.

    For x86, it is stored in boot_params and can be overridden by the boot
    loader or kexec. This allows secure-boot mode to be passed on to a new
    kernel.
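
    From userspace, the same firmware status is visible through efivarfs,
    where each variable file carries a 4-byte attributes word followed by
    the payload; for 'SecureBoot' the payload is a single byte, 1 meaning
    enabled.  A hedged sketch (the helper name is mine; the path is the
    conventional efivarfs location):

```python
def secure_boot_enabled(data):
    """Interpret the raw contents of an efivarfs 'SecureBoot' file:
    skip the 4-byte attributes header and test the payload byte."""
    if len(data) < 5:
        raise ValueError("short read: no payload byte")
    return data[4] == 1

# Typical usage (requires an EFI system booted with efivarfs mounted):
# with open("/sys/firmware/efi/efivars/"
#           "SecureBoot-8be4df61-93ca-11d2-aa0d-00e098032b8c", "rb") as f:
#     print(secure_boot_enabled(f.read()))
```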

    Suggested-by: Lukas Wunner
    Signed-off-by: David Howells
    Signed-off-by: Ard Biesheuvel
    Cc: Linus Torvalds
    Cc: Matt Fleming
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux-efi@vger.kernel.org
    Link: http://lkml.kernel.org/r/1486380166-31868-5-git-send-email-ard.biesheuvel@linaro.org
    [ Small readability edits. ]
    Signed-off-by: Ingo Molnar

    David Howells
     

28 Jan, 2017

3 commits

  • No change in functionality.

    Cc: Alex Thorlton
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dan Williams
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Huang, Ying
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Paul Jackson
    Cc: Peter Zijlstra
    Cc: Rafael J. Wysocki
    Cc: Tejun Heo
    Cc: Thomas Gleixner
    Cc: Wei Yang
    Cc: Yinghai Lu
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • In line with the rename to 'struct e820_array', harmonize the naming of common e820
    table variable names as well:

    e820 => e820_array
    e820_saved => e820_array_saved
    e820_map => e820_array
    initial_e820 => e820_array_init

    This makes the variable names more consistent and easier to grep for.

    No change in functionality.

    Cc: Alex Thorlton
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dan Williams
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Huang, Ying
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Paul Jackson
    Cc: Peter Zijlstra
    Cc: Rafael J. Wysocki
    Cc: Tejun Heo
    Cc: Thomas Gleixner
    Cc: Wei Yang
    Cc: Yinghai Lu
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • The 'e820entry' and 'e820map' names have various annoyances:

    - the missing underscore departs from the usual kernel style
      and makes the code look weird,

    - in the past I kept confusing the 'map' with the 'entry', because
      a 'map' is ambiguous in that regard,

    - it's not really clear from the 'e820map' that this is a regular
      C array.

    Rename them to 'struct e820_entry' and 'struct e820_array' accordingly.

    ( Leave the legacy UAPI header alone but do the rename in the bootparam.h
    and e820/types.h file - outside tools relying on these defines should
    either adjust their code, or should use the legacy header, or should
    create their private copies for the definitions. )

    No change in functionality.
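
    For illustration, the renamed layout can be modeled with ctypes --
    the "array" in the new name is literal: a count plus a plain C array
    of packed entries.  (E820_MAX_ENTRIES and the exact field widths here
    are assumptions for the sketch, not copied from the kernel headers.)

```python
import ctypes

E820_MAX_ENTRIES = 128  # illustrative; the real limit has varied

class e820_entry(ctypes.Structure):
    """One address range reported by the firmware (packed on x86)."""
    _pack_ = 1
    _fields_ = [("addr", ctypes.c_uint64),   # start of the range
                ("size", ctypes.c_uint64),   # length in bytes
                ("type", ctypes.c_uint32)]   # RAM, reserved, ACPI, ...

class e820_array(ctypes.Structure):
    """A counted, fixed-size C array of entries -- which is exactly
    what the 'e820_array' name is meant to convey."""
    _fields_ = [("nr_entries", ctypes.c_uint32),
                ("entries", e820_entry * E820_MAX_ENTRIES)]
```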

    Cc: Alex Thorlton
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dan Williams
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Huang, Ying
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Paul Jackson
    Cc: Peter Zijlstra
    Cc: Rafael J. Wysocki
    Cc: Tejun Heo
    Cc: Thomas Gleixner
    Cc: Wei Yang
    Cc: Yinghai Lu
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Ingo Molnar

    Ingo Molnar