31 Jan, 2019

1 commit

  • commit 9f08890ab906abaf9d4c1bad8111755cbd302260 upstream.

    Right now there is only a pvclock_pvti_cpu0_va() which is defined
    on kvmclock since:

    commit dac16fba6fc5
    ("x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap")

    The only user of this interface so far is kvm. This commit adds a
    setter function for the pvti page and moves pvclock_pvti_cpu0_va
    to pvclock, which is a more generic place to have it; and would
    allow other PV clocksources to use it, such as Xen.

    While moving pvclock_pvti_cpu0_va into pvclock, rename also this
    function to pvclock_get_pvti_cpu0_va (including its call sites)
    to be symmetric with the setter (pvclock_set_pvti_cpu0_va).

    Signed-off-by: Joao Martins
    Acked-by: Andy Lutomirski
    Acked-by: Paolo Bonzini
    Acked-by: Thomas Gleixner
    Signed-off-by: Boris Ostrovsky
    Signed-off-by: Juergen Gross
    Signed-off-by: Greg Kroah-Hartman

    Joao Martins
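
    As a minimal sketch of the resulting interface (assuming the
    pvclock_vsyscall_time_info structure from the pvclock headers; this is an
    illustration in the spirit of the change above, not a verbatim copy of the
    upstream code), the getter/setter pair amounts to a static pointer plus two
    trivial accessors:

    /* Pointer to vCPU0's pvti page: set by a PV clocksource (kvmclock, Xen)
     * at init time, read by the vDSO mapping code. */
    struct pvclock_vsyscall_time_info;      /* kept opaque for this sketch */

    static struct pvclock_vsyscall_time_info *pvti_cpu0_va;

    void pvclock_set_pvti_cpu0_va(struct pvclock_vsyscall_time_info *pvti)
    {
            pvti_cpu0_va = pvti;
    }

    struct pvclock_vsyscall_time_info *pvclock_get_pvti_cpu0_va(void)
    {
            return pvti_cpu0_va;
    }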
     

13 Oct, 2018

3 commits

  • commit 02e425668f5c9deb42787d10001a3b605993ad15 upstream.

    When I added the missing memory outputs, I failed to update the
    index of the first argument (ebx) on 32-bit builds, which broke the
    fallbacks. Somehow I must have screwed up my testing or gotten
    lucky.

    Add another test to cover gettimeofday() as well.

    Signed-off-by: Andy Lutomirski
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: stable@vger.kernel.org
    Fixes: 715bd9d12f84 ("x86/vdso: Fix asm constraints on vDSO syscall fallbacks")
    Link: http://lkml.kernel.org/r/21bd45ab04b6d838278fa5bebfa9163eceffa13c.1538608971.git.luto@kernel.org
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
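
    For background on how the argument index went stale: GCC numbers inline-asm
    operands positionally, outputs first and then inputs, so adding an output
    operand shifts every hard-coded %N that referred to an input. A
    hypothetical, self-contained illustration of that numbering rule (not the
    vDSO code itself):

    #include <stdio.h>

    static int add_one(int x)
    {
            int ret;

            /* %0 is the output 'ret', %1 is the input 'x'. If another output
             * were inserted ahead of the inputs, 'x' would silently become %2
             * and a leftover "%1" in the template would name the wrong
             * operand -- the class of bug fixed by the commit above. */
            asm("movl %1, %0\n\t"
                "addl $1, %0"
                : "=r" (ret)
                : "r" (x));
            return ret;
    }

    int main(void)
    {
            printf("%d\n", add_one(41));        /* prints 42 */
            return 0;
    }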
     
  • commit 4f166564014aba65ad6f15b612f6711fd0f117ee upstream.

    When I fixed the vDSO build to use inline retpolines, I messed up
    the Makefile logic and made it unconditional. It should have
    depended on CONFIG_RETPOLINE and on the availability of compiler
    support. This broke the build on some older compilers.

    Reported-by: nikola.ciprich@linuxbox.cz
    Signed-off-by: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: David Woodhouse
    Cc: Linus Torvalds
    Cc: Matt Rickard
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: jason.vas.dias@gmail.com
    Cc: stable@vger.kernel.org
    Fixes: 2e549b2ee0e3 ("x86/vdso: Fix vDSO build if a retpoline is emitted")
    Link: http://lkml.kernel.org/r/08a1f29f2c238dd1f493945e702a521f8a5aa3ae.1538540801.git.luto@kernel.org
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
     
  • commit 715bd9d12f84d8f5cc8ad21d888f9bc304a8eb0b upstream.

    The syscall fallbacks in the vDSO have incorrect asm constraints.
    They are not marked as writing to their outputs -- instead, they are
    marked as clobbering "memory", which is useless. In particular, gcc
    is smart enough to know that the timespec parameter hasn't escaped,
    so a memory clobber doesn't clobber it. And passing a pointer as an
    asm *input* does not tell gcc that the pointed-to value is changed.

    Add in the fact that the asm instructions weren't volatile, and gcc
    was free to omit them entirely unless their sole output (the return
    value) is used. Which it is (phew!), but that stops happening with
    some upcoming patches.

    As a trivial example, the following code:

    void test_fallback(struct timespec *ts)
    {
            vdso_fallback_gettime(CLOCK_MONOTONIC, ts);
    }

    compiles to:

    00000000000000c0 <test_fallback>:
      c0:   c3                      retq

    To add insult to injury, the RCX and R11 clobbers on 64-bit
    builds were missing.

    The "memory" clobber is also unnecessary -- no ordering with respect to
    other memory operations is needed, but that's going to be fixed in a
    separate not-for-stable patch.

    Fixes: 2aae950b21e4 ("x86_64: Add vDSO for x86-64 with gettimeofday/clock_gettime/getcpu")
    Signed-off-by: Andy Lutomirski
    Signed-off-by: Thomas Gleixner
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/2c0231690551989d2fafa60ed0e7b5cc8b403908.1538422295.git.luto@kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
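
    A reduced user-space sketch of the constraint style the fix moves to,
    assuming x86-64 and the clock_gettime syscall number from <sys/syscall.h>
    (the real fallback lives in the vDSO sources and may differ in detail):

    #include <stdio.h>
    #include <sys/syscall.h>
    #include <time.h>

    static long vdso_fallback_gettime_sketch(long clock, struct timespec *ts)
    {
            long ret;

            /* Both the return value and *ts are asm outputs, so GCC must emit
             * the instruction and must not assume *ts is unchanged afterwards.
             * RCX and R11 are listed because the syscall instruction clobbers
             * them -- the other half of the problem described above. */
            asm volatile("syscall"
                         : "=a" (ret), "=m" (*ts)
                         : "0" (SYS_clock_gettime), "D" (clock), "S" (ts)
                         : "rcx", "r11", "memory");
            return ret;
    }

    int main(void)
    {
            struct timespec ts;

            if (vdso_fallback_gettime_sketch(CLOCK_MONOTONIC, &ts) == 0)
                    printf("%ld.%09ld\n", (long)ts.tv_sec, ts.tv_nsec);
            return 0;
    }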
     

04 Oct, 2018

1 commit

  • [ Upstream commit 6709812f094d96543b443645c68daaa32d3d3e77 ]

    Sadly, contrary to what was claimed in:

    a368d7fd2a ("x86/entry/64: Add instruction suffix")

    ... there are two more instances which need to be adjusted.

    As said there, omitting suffixes from instructions in AT&T mode is bad
    practice when operand size cannot be determined by the assembler from
    register operands, and is likely going to be warned about by upstream
    gas in the future (mine does already).

    Add the other missing suffixes here as well.

    Signed-off-by: Jan Beulich
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/5B3A02DD02000078001CFB78@prv1-mh.provo.novell.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Sasha Levin
    Signed-off-by: Greg Kroah-Hartman

    Jan Beulich
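
    A tiny hypothetical example of the ambiguity (the actual instances changed
    by the commit above are different instructions in the entry code): with
    only a memory destination and an immediate, nothing tells the assembler
    the operand size, so it has to be spelled out with a suffix:

    static int flag;

    static void set_flag(void)
    {
            /* 'movl' makes the 32-bit operand size explicit; a bare
             * "mov $1, %0" would be ambiguous, and newer gas warns about it. */
            asm volatile("movl $1, %0" : "=m" (flag));
    }

    int main(void)
    {
            set_flag();
            return flag - 1;        /* exits with 0 once the store happened */
    }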
     

05 Sep, 2018

1 commit

  • commit 2e549b2ee0e358bc758480e716b881f9cabedb6a upstream.

    Currently, if the vDSO ends up containing an indirect branch or
    call, GCC will emit the "external thunk" style of retpoline, and it
    will fail to link.

    Fix it by building the vDSO with inline retpoline thunks.

    I haven't seen any reports of this triggering on an unpatched
    kernel.

    Fixes: commit 76b043848fd2 ("x86/retpoline: Add initial retpoline support")
    Signed-off-by: Andy Lutomirski
    Signed-off-by: Thomas Gleixner
    Acked-by: Matt Rickard
    Cc: Borislav Petkov
    Cc: Jason Vas Dias
    Cc: David Woodhouse
    Cc: Peter Zijlstra
    Cc: Andi Kleen
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/c76538cd3afbe19c6246c2d1715bc6a60bd63985.1534448381.git.luto@kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
     

06 Aug, 2018

1 commit

  • commit b3681dd548d06deb2e1573890829dff4b15abf46 upstream.

    error_entry and error_exit communicate the user vs. kernel status of
    the frame using %ebx. This is unnecessary -- the information is in
    regs->cs. Just use regs->cs.

    This makes error_entry simpler and makes error_exit more robust.

    It also fixes a nasty bug. Before all the Spectre nonsense, the
    xen_failsafe_callback entry point returned like this:

    ALLOC_PT_GPREGS_ON_STACK
    SAVE_C_REGS
    SAVE_EXTRA_REGS
    ENCODE_FRAME_POINTER
    jmp error_exit

    And it did not go through error_entry. This was bogus: RBX
    contained garbage, and error_exit expected a flag in RBX.

    Fortunately, it generally contained *nonzero* garbage, so the
    correct code path was used. As part of the Spectre fixes, code was
    added to clear RBX to mitigate certain speculation attacks. Now,
    depending on kernel configuration, RBX got zeroed and, when running
    some Wine workloads, the kernel crashes. This was introduced by:

    commit 3ac6d8c787b8 ("x86/entry/64: Clear registers for exceptions/interrupts, to reduce speculation attack surface")

    With this patch applied, RBX is no longer needed as a flag, and the
    problem goes away.

    I suspect that malicious userspace could use this bug to crash the
    kernel even without the offending patch applied, though.

    [ Historical note: I wrote this patch as a cleanup before I was aware
    of the bug it fixed. ]

    [ Note to stable maintainers: this should probably get applied to all
    kernels. If you're nervous about that, a more conservative fix to
    add xorl %ebx,%ebx; incl %ebx before the jump to error_exit should
    also fix the problem. ]

    Reported-and-tested-by: M. Vefa Bicakci
    Signed-off-by: Andy Lutomirski
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: Denys Vlasenko
    Cc: Dominik Brodowski
    Cc: Greg KH
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: stable@vger.kernel.org
    Cc: xen-devel@lists.xenproject.org
    Fixes: 3ac6d8c787b8 ("x86/entry/64: Clear registers for exceptions/interrupts, to reduce speculation attack surface")
    Link: http://lkml.kernel.org/r/b5010a090d3586b2d6e06c7ad3ec5542d1241c45.1532282627.git.luto@kernel.org
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
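
    A short user-space sketch of the underlying idea (the struct and values
    are stand-ins, not the kernel's pt_regs layout): the privilege level of
    the interrupted context can be read back from the low two bits (the RPL)
    of the saved CS selector, which is what the kernel's user_mode(regs)
    helper relies on, so no separate flag register is needed:

    #include <stdbool.h>
    #include <stdio.h>

    struct pt_regs_sketch {
            unsigned long cs;               /* saved code segment selector */
            /* ... other saved registers omitted ... */
    };

    static bool came_from_user_mode(const struct pt_regs_sketch *regs)
    {
            return (regs->cs & 3) == 3;     /* RPL/CPL 3 means user space */
    }

    int main(void)
    {
            struct pt_regs_sketch user = { .cs = 0x33 };    /* typical 64-bit user CS */
            struct pt_regs_sketch kern = { .cs = 0x10 };    /* typical kernel CS */

            printf("%d %d\n", came_from_user_mode(&user), came_from_user_mode(&kern));
            return 0;
    }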
     

29 Mar, 2018

2 commits

  • commit 31ad7f8e7dc94d3b85ccf9b6141ce6dfd35a1781 upstream.

    Writing to it directly does not work for Xen PV guests.

    Fixes: 49275fef986a ("x86/vsyscall/64: Explicitly set _PAGE_USER in the pagetable hierarchy")
    Signed-off-by: Boris Ostrovsky
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Juergen Gross
    Acked-by: Andy Lutomirski
    Cc: stable@vger.kernel.org
    Link: https://lkml.kernel.org/r/20180319143154.3742-1-boris.ostrovsky@oracle.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Boris Ostrovsky
     
  • commit d8ba61ba58c88d5207c1ba2f7d9a2280e7d03be9 upstream.

    There's nothing IST-worthy about #BP/int3. We don't allow kprobes
    in the small handful of places in the kernel that run at CPL0 with
    an invalid stack, and 32-bit kernels have used normal interrupt
    gates for #BP forever.

    Furthermore, we don't allow kprobes in places that have usergs while
    in kernel mode, so "paranoid" is also unnecessary.

    Signed-off-by: Andy Lutomirski
    Signed-off-by: Linus Torvalds
    Signed-off-by: Thomas Gleixner
    Cc: stable@vger.kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
     

15 Mar, 2018

3 commits

  • commit d1c99108af3c5992640aa2afa7d2e88c3775c06e upstream.

    This reverts commit 1dde7415e99933bb7293d6b2843752cbdb43ec11. By putting
    the RSB filling out of line and calling it, we waste one RSB slot for
    returning from the function itself, which means one fewer actual function
    call we can make if we're doing the Skylake abomination of call-depth
    counting.

    It also changed the number of RSB stuffings we do on vmexit from 32,
    which was correct, to 16. Let's just stop with the bikeshedding; it
    didn't actually *fix* anything anyway.

    Signed-off-by: David Woodhouse
    Acked-by: Thomas Gleixner
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: arjan.van.de.ven@intel.com
    Cc: bp@alien8.de
    Cc: dave.hansen@intel.com
    Cc: jmattson@google.com
    Cc: karahmed@amazon.de
    Cc: kvm@vger.kernel.org
    Cc: pbonzini@redhat.com
    Cc: rkrcmar@redhat.com
    Link: http://lkml.kernel.org/r/1519037457-7643-4-git-send-email-dwmw@amazon.co.uk
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    David Woodhouse
     
  • commit ced5d0bf603fa0baee8ea889e1d70971fd210894 upstream.

    On some x86 CPU microarchitectures using 'xorq' to clear general-purpose
    registers is slower than 'xorl'. As 'xorl' is sufficient to clear all
    64 bits of these registers due to zero-extension [*], switch the x86
    64-bit entry code to use 'xorl'.

    No change in functionality and no change in code size.

    [*] According to the Intel 64 and IA-32 Architectures Software Developer's
    Manual, section 3.4.1.1, the result of a 32-bit operation is "zero-
    extended to a 64-bit result in the destination general-purpose
    register." The AMD64 Architecture Programmer’s Manual Volume 3,
    Appendix B.1, describes the same behaviour.

    Suggested-by: Denys Vlasenko
    Signed-off-by: Dominik Brodowski
    Cc: Andy Lutomirski
    Cc: Arjan van de Ven
    Cc: Borislav Petkov
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: David Woodhouse
    Cc: Greg Kroah-Hartman
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20180214175924.23065-3-linux@dominikbrodowski.net
    [ Improved on the changelog a bit. ]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Dominik Brodowski
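
    A small self-contained demonstration of the zero-extension property the
    changelog cites (GCC inline asm, x86-64 assumed): writing the 32-bit half
    of a register clears the upper 32 bits as well, so 'xorl' fully clears the
    64-bit register without the REX.W prefix that 'xorq' would carry:

    #include <stdio.h>

    int main(void)
    {
            unsigned long val = 0xdeadbeefcafebabeUL;

            /* Load a 64-bit pattern into %rax, then clear it with a 32-bit
             * XOR; the value read back from the full %rax is 0. */
            asm("movq %1, %%rax\n\t"
                "xorl %%eax, %%eax\n\t"
                "movq %%rax, %0"
                : "=r" (val)
                : "r" (val)
                : "rax");

            printf("%#lx\n", val);          /* prints 0 */
            return 0;
    }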
     
  • commit 9e809d15d6b692fa061d74be7aaab1c79f6784b8 upstream.

    Play a little trick in the generic PUSH_AND_CLEAR_REGS macro
    to insert the GP registers "above" the original return address.

    This allows us to (re-)insert the macro in error_entry() and
    paranoid_entry() and to remove it from the idtentry macro. This
    reduces the static footprint significantly:

    text data bss dec hex filename
    24307 0 0 24307 5ef3 entry_64.o-orig
    20987 0 0 20987 51fb entry_64.o

    Co-developed-by: Linus Torvalds
    Signed-off-by: Dominik Brodowski
    Cc: Andy Lutomirski
    Cc: Arjan van de Ven
    Cc: Borislav Petkov
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: David Woodhouse
    Cc: Greg Kroah-Hartman
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/20180214175924.23065-2-linux@dominikbrodowski.net
    [ Small tweaks to comments. ]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Dominik Brodowski
     

22 Feb, 2018

14 commits

  • commit e48657573481a5dff7cfdc3d57005c80aa816500 upstream.

    Josh Poimboeuf noticed the following bug:

    "The paranoid exit code only restores the saved CR3 when it switches back
    to the user GS. However, even in the kernel GS case, it's possible that
    it needs to restore a user CR3, if for example, the paranoid exception
    occurred in the syscall exit path between SWITCH_TO_USER_CR3_STACK and
    SWAPGS."

    Josh also confirmed via targeted testing that it's possible to hit this bug.

    Fix the bug by also restoring CR3 in the paranoid_exit_no_swapgs branch.

    The reason we haven't seen this bug reported by users yet is probably because
    "paranoid" entry points are limited to the following cases:

    idtentry double_fault do_double_fault has_error_code=1 paranoid=2
    idtentry debug do_debug has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK
    idtentry int3 do_int3 has_error_code=0 paranoid=1 shift_ist=DEBUG_STACK
    idtentry machine_check do_mce has_error_code=0 paranoid=1

    Amongst those entry points only machine_check is one that will interrupt an
    IRQS-off critical section asynchronously - and machine check events are rare.

    The other main asynchronous entries are NMI entries, which can be very high-freq
    with perf profiling, but they are special: they don't use the 'idtentry' macro but
    are open coded and restore user CR3 unconditionally so don't have this bug.

    Reported-and-tested-by: Josh Poimboeuf
    Reviewed-by: Andy Lutomirski
    Acked-by: Thomas Gleixner
    Cc: Arjan van de Ven
    Cc: Borislav Petkov
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: David Woodhouse
    Cc: Greg Kroah-Hartman
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Link: http://lkml.kernel.org/r/20180214073910.boevmg65upbk3vqb@gmail.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Ingo Molnar
     
  • commit b498c261107461d5c42140dfddd05df83d8ca078 upstream.

    That macro was touched around 2.5.8 times, judging by the full history
    linux repo, but it was unused even then. Get rid of it already.

    Signed-off-by: Borislav Petkov
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: linux@dominikbrodowski.net
    Link: http://lkml.kernel.org/r/20180212201318.GD14640@pd.tnic
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Borislav Petkov
     
  • commit b3ccefaed922529e6a67de7b30af5aa38c76ace9 upstream.

    With the following commit:

    f09d160992d1 ("x86/entry/64: Get rid of the ALLOC_PT_GPREGS_ON_STACK and SAVE_AND_CLEAR_REGS macros")

    ... one of my suggested improvements triggered a frame pointer warning:

    arch/x86/entry/entry_64.o: warning: objtool: paranoid_entry()+0x11: call without frame pointer save/setup

    The warning is correct for the build-time code, but it's actually not
    relevant at runtime because of paravirt patching. The paravirt swapgs
    call gets replaced with either a SWAPGS instruction or NOPs at runtime.

    Go back to the previous behavior by removing the ELF function annotation
    for paranoid_entry() and adding an unwind hint, which effectively
    silences the warning.

    Reported-by: kbuild test robot
    Signed-off-by: Josh Poimboeuf
    Cc: Dominik Brodowski
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: kbuild-all@01.org
    Cc: tipbuild@zytor.com
    Fixes: f09d160992d1 ("x86/entry/64: Get rid of the ALLOC_PT_GPREGS_ON_STACK and SAVE_AND_CLEAR_REGS macros")
    Link: http://lkml.kernel.org/r/20180212174503.5acbymg5z6p32snu@treble
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Josh Poimboeuf
     
  • commit 92816f571af81e9a71cc6f3dc8ce1e2fcdf7b6b8 upstream.

    ... same as the other macros in arch/x86/entry/calling.h

    Signed-off-by: Dominik Brodowski
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: dan.j.williams@intel.com
    Link: http://lkml.kernel.org/r/20180211104949.12992-8-linux@dominikbrodowski.net
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Dominik Brodowski
     
  • commit dde3036d62ba3375840b10ab9ec0d568fd773b07 upstream.

    Previously, error_entry() and paranoid_entry() saved the GP registers
    onto stack space previously allocated by their callers. Combine these two
    steps in the callers, and use the generic PUSH_AND_CLEAR_REGS macro
    for that.

    This adds a significant amount of text size. However, Ingo Molnar points
    out that:

    "these numbers also _very_ significantly over-represent the
    extra footprint. The assumptions that resulted in
    us compressing the IRQ entry code have changed very
    significantly with the new x86 IRQ allocation code we
    introduced in the last year:

    - IRQ vectors are usually populated in tightly clustered
    groups.

    With our new vector allocator code the typical per CPU
    allocation percentage on x86 systems is ~3 device vectors
    and ~10 fixed vectors out of ~220 vectors - i.e. a very
    low ~6% utilization (!). [...]

    The days where we allocated a lot of vectors on every
    CPU and the compression of the IRQ entry code text
    mattered are over.

    - Another issue is that only a small minority of vectors
    is frequent enough to actually matter to cache utilization
    in practice: 3-4 key IPIs and 1-2 device IRQs at most - and
    those vectors tend to be tightly clustered as well into about
    two groups, and are probably already on 2-3 cache lines in
    practice.

    For the common case of 'cache cold' IRQs it's the depth of
    the call chain and the fragmentation of the resulting I$
    that should be the main performance limit - not the overall
    size of it.

    - The CPU side cost of IRQ delivery is still very expensive
    even in the best, most cached case, as in 'over a thousand
    cycles'. So much stuff is done that maybe contemporary x86
    IRQ entry microcode already prefetches the IDT entry and its
    expected call target address."[*]

    [*] http://lkml.kernel.org/r/20180208094710.qnjixhm6hybebdv7@gmail.com

    The "testb $3, CS(%rsp)" instruction in the idtentry macro does not need
    modification. Previously, %rsp was manually decreased by 15*8; with
    this patch, %rsp is decreased by 15 pushq instructions.

    [jpoimboe@redhat.com: unwind hint improvements]

    Suggested-by: Linus Torvalds
    Signed-off-by: Dominik Brodowski
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: dan.j.williams@intel.com
    Link: http://lkml.kernel.org/r/20180211104949.12992-7-linux@dominikbrodowski.net
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Dominik Brodowski
     
  • commit 30907fd13bb593202574bb20af58d67c70a1ee14 upstream.

    entry_SYSCALL_64_after_hwframe() and nmi() can be converted to use
    PUSH_AND_CLEAR_REGS instead of opencoded variants thereof. Due to
    the interleaving, the additional XOR-based clearing of R8 and R9
    in entry_SYSCALL_64_after_hwframe() should not have any noticeable
    negative implications.

    Suggested-by: Linus Torvalds
    Signed-off-by: Dominik Brodowski
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: dan.j.williams@intel.com
    Link: http://lkml.kernel.org/r/20180211104949.12992-6-linux@dominikbrodowski.net
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Dominik Brodowski
     
  • commit 3f01daecd545e818098d84fd1ad43e19a508d705 upstream.

    Those instances where ALLOC_PT_GPREGS_ON_STACK is called just before
    SAVE_AND_CLEAR_REGS can trivially be replaced by PUSH_AND_CLEAR_REGS.
    This macro uses PUSH instead of MOV and should therefore be faster, at
    least on newer CPUs.

    Suggested-by: Linus Torvalds
    Signed-off-by: Dominik Brodowski
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: dan.j.williams@intel.com
    Link: http://lkml.kernel.org/r/20180211104949.12992-5-linux@dominikbrodowski.net
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Dominik Brodowski
     
  • commit f7bafa2b05ef25eda1d9179fd930b0330cf2b7d1 upstream.

    Same as is done for syscalls, interleave XOR with PUSH instructions
    for exceptions/interrupts, in order to minimize the cost of the
    additional instructions required for register clearing.

    Signed-off-by: Dominik Brodowski
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: dan.j.williams@intel.com
    Link: http://lkml.kernel.org/r/20180211104949.12992-4-linux@dominikbrodowski.net
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Dominik Brodowski
     
  • commit 502af0d70843c2a9405d7ba1f79b4b0305aaf5f5 upstream.

    The two special, opencoded cases for POP_C_REGS can be handled by ASM
    macros.

    Signed-off-by: Dominik Brodowski
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: dan.j.williams@intel.com
    Link: http://lkml.kernel.org/r/20180211104949.12992-3-linux@dominikbrodowski.net
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Dominik Brodowski
     
  • commit 2e3f0098bc45f710a2f4cbcc94b80a1fae7a99a1 upstream.

    All current code paths call SAVE_C_REGS and then immediately
    SAVE_EXTRA_REGS. Therefore, merge these two macros and order the MOV
    sequences properly.

    While at it, remove the macros to save all except specific registers,
    as these macros have been unused for a long time.

    Suggested-by: Linus Torvalds
    Signed-off-by: Dominik Brodowski
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: dan.j.williams@intel.com
    Link: http://lkml.kernel.org/r/20180211104949.12992-2-linux@dominikbrodowski.net
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Dominik Brodowski
     
  • commit 3ac6d8c787b835b997eb23e43e09aa0895ef7d58 upstream.

    Clear the 'extra' registers on entering the 64-bit kernel for exceptions
    and interrupts. The common registers are not cleared since they are
    likely clobbered well before they can be exploited in a speculative
    execution attack.

    Originally-From: Andi Kleen
    Signed-off-by: Dan Williams
    Cc:
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/151787989146.7847.15749181712358213254.stgit@dwillia2-desk3.amr.corp.intel.com
    [ Made small improvements to the changelog and the code comments. ]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
     
  • commit 14b1fcc62043729d12e8ae00f8297ab2ffe9fa91 upstream.

    The comment is confusing since the path is taken when
    CONFIG_PAGE_TABLE_ISOLATION is disabled (while the comment says it is not
    taken).

    Signed-off-by: Nadav Amit
    Cc: Andy Lutomirski
    Cc: Arjan van de Ven
    Cc: Borislav Petkov
    Cc: Dan Williams
    Cc: Dave Hansen
    Cc: David Woodhouse
    Cc: Greg Kroah-Hartman
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Cc: nadav.amit@gmail.com
    Link: http://lkml.kernel.org/r/20180209170638.15161-1-namit@vmware.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • commit 6b8cf5cc9965673951f1ab3f0e3cf23d06e3e2ee upstream.

    At entry userspace may have populated registers with values that could
    otherwise be useful in a speculative execution attack. Clear them to
    minimize the kernel's attack surface.

    Originally-From: Andi Kleen
    Signed-off-by: Dan Williams
    Cc:
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/151787989697.7847.4083702787288600552.stgit@dwillia2-desk3.amr.corp.intel.com
    [ Made small improvements to the changelog. ]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
     
  • commit 8e1eb3fa009aa7c0b944b3c8b26b07de0efb3200 upstream.

    At entry userspace may have (maliciously) populated the extra registers
    outside the syscall calling convention with arbitrary values that could
    be useful in a speculative execution (Spectre style) attack.

    Clear these registers to minimize the kernel's attack surface.

    Note, this only clears the extra registers and not the unused
    registers for syscalls with fewer than six arguments, since those registers are
    likely to be clobbered well before their values could be put to use
    under speculation.

    Note, Linus found that the XOR instructions can be executed with
    minimized cost if interleaved with the PUSH instructions, and Ingo's
    analysis found that R10 and R11 should be included in the register
    clearing beyond the typical 'extra' syscall calling convention
    registers.

    Suggested-by: Linus Torvalds
    Reported-by: Andi Kleen
    Signed-off-by: Dan Williams
    Cc:
    Cc: Andy Lutomirski
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Denys Vlasenko
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Peter Zijlstra
    Cc: Thomas Gleixner
    Link: http://lkml.kernel.org/r/151787988577.7847.16733592218894189003.stgit@dwillia2-desk3.amr.corp.intel.com
    [ Made small improvements to the changelog and the code comments. ]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
     

08 Feb, 2018

5 commits

  • commit 2fbd7af5af8665d18bcefae3e9700be07e22b681

    The syscall table is indexed by a user-controlled value, yielding a
    user-controlled function pointer in kernel space. Use array_index_nospec()
    to prevent any out-of-bounds speculation.

    While retpoline prevents speculating into a userspace directed target, it
    does not stop the pointer de-reference; the concern is leaking memory
    relative to the syscall table base by observing instruction cache
    behavior.

    Reported-by: Linus Torvalds
    Signed-off-by: Dan Williams
    Signed-off-by: Thomas Gleixner
    Cc: linux-arch@vger.kernel.org
    Cc: kernel-hardening@lists.openwall.com
    Cc: gregkh@linuxfoundation.org
    Cc: Andy Lutomirski
    Cc: alan@linux.intel.com
    Link: https://lkml.kernel.org/r/151727417984.33451.1216731042505722161.stgit@dwillia2-desk3.amr.corp.intel.com
    Signed-off-by: Greg Kroah-Hartman

    Dan Williams
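
    A user-space sketch of the masking idea behind array_index_nospec(),
    modelled on the generic C fallback in include/linux/nospec.h (the x86
    version uses a CMP/SBB sequence instead); the toy table and values are
    purely illustrative:

    #include <stdio.h>

    #define BITS_PER_LONG   (8 * sizeof(unsigned long))

    /* Returns ~0UL when index < size and 0UL otherwise, computed without a
     * conditional branch the CPU could speculate past. ANDing the mask into
     * the index means even a mispredicted bounds check cannot read beyond
     * the table under speculation. */
    static unsigned long index_mask_nospec(unsigned long index, unsigned long size)
    {
            return ~(long)(index | (size - 1UL - index)) >> (BITS_PER_LONG - 1);
    }

    int main(void)
    {
            long toy_table[4] = { 10, 20, 30, 40 };
            unsigned long nr = 7;                   /* "user-controlled", out of range */

            nr &= index_mask_nospec(nr, 4);         /* clamped to 0 when out of bounds */
            printf("%ld\n", toy_table[nr]);         /* prints 10, never reads past the table */
            return 0;
    }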
     
  • commit 37a8f7c38339b22b69876d6f5a0ab851565284e3

    The TS_COMPAT bit is very hot and is accessed from code paths that mostly
    also touch thread_info::flags. Move it into struct thread_info to improve
    cache locality.

    The only reason it was in thread_struct is that there was a brief period
    during which arch-specific fields were not allowed in struct thread_info.

    Linus suggested further changing:

    ti->status &= ~(TS_COMPAT|TS_I386_REGS_POKED);

    to:

    if (unlikely(ti->status & (TS_COMPAT|TS_I386_REGS_POKED)))
            ti->status &= ~(TS_COMPAT|TS_I386_REGS_POKED);

    on the theory that frequently dirtying the cacheline even in pure 64-bit
    code that never needs to modify status hurts performance. That could be a
    reasonable followup patch, but I suspect it matters less on top of this
    patch.

    Suggested-by: Linus Torvalds
    Signed-off-by: Andy Lutomirski
    Signed-off-by: Thomas Gleixner
    Reviewed-by: Ingo Molnar
    Acked-by: Linus Torvalds
    Cc: Borislav Petkov
    Cc: Kernel Hardening
    Link: https://lkml.kernel.org/r/03148bcc1b217100e6e8ecf6a5468c45cf4304b6.1517164461.git.luto@kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
     
  • commit d1f7732009e0549eedf8ea1db948dc37be77fd46

    With the fast path removed there is no point in splitting the push of the
    normal and the extra register set. Just push the extra regs right away.

    [ tglx: Split out from 'x86/entry/64: Remove the SYSCALL64 fast path' ]

    Signed-off-by: Andy Lutomirski
    Signed-off-by: Thomas Gleixner
    Acked-by: Ingo Molnar
    Cc: Borislav Petkov
    Cc: Linus Torvalds
    Cc: Kernel Hardening
    Link: https://lkml.kernel.org/r/462dff8d4d64dfbfc851fbf3130641809d980ecd.1517164461.git.luto@kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
     
  • commit 21d375b6b34ff511a507de27bf316b3dde6938d9

    The SYSCALL64 fast path was a nice, if small, optimization back in the good
    old days when syscalls were actually reasonably fast. Now there is PTI to
    slow everything down, and indirect branches are verboten, making everything
    messier. The retpoline code in the fast path is particularly nasty.

    Just get rid of the fast path. The slow path is barely slower.

    [ tglx: Split out the 'push all extra regs' part ]

    Signed-off-by: Andy Lutomirski
    Signed-off-by: Thomas Gleixner
    Acked-by: Ingo Molnar
    Cc: Borislav Petkov
    Cc: Linus Torvalds
    Cc: Kernel Hardening
    Link: https://lkml.kernel.org/r/462dff8d4d64dfbfc851fbf3130641809d980ecd.1517164461.git.luto@kernel.org
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
     
  • commit 1dde7415e99933bb7293d6b2843752cbdb43ec11

    Simplify it to call an asm-function instead of pasting 41 insn bytes at
    every call site. Also, add alignment to the macro as suggested here:

    https://support.google.com/faqs/answer/7625886

    [dwmw2: Clean up comments, let it clobber %ebx and just tell the compiler]

    Signed-off-by: Borislav Petkov
    Signed-off-by: David Woodhouse
    Signed-off-by: Thomas Gleixner
    Cc: ak@linux.intel.com
    Cc: dave.hansen@intel.com
    Cc: karahmed@amazon.de
    Cc: arjan@linux.intel.com
    Cc: torvalds@linux-foundation.org
    Cc: peterz@infradead.org
    Cc: bp@alien8.de
    Cc: pbonzini@redhat.com
    Cc: tim.c.chen@linux.intel.com
    Cc: gregkh@linux-foundation.org
    Link: https://lkml.kernel.org/r/1517070274-12128-3-git-send-email-dwmw@amazon.co.uk
    Signed-off-by: Greg Kroah-Hartman

    Borislav Petkov
     

24 Jan, 2018

2 commits

  • commit 6f41c34d69eb005e7848716bbcafc979b35037d5 upstream.

    The machine check idtentry uses an indirect branch directly from the low
    level code. This evades the speculation protection.

    Replace it by a direct call into C code and issue the indirect call there
    so the compiler can apply the proper speculation protection.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Borislav Petkov
    Reviewed-by: David Woodhouse
    Niced-by: Peter Zijlstra
    Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801181626290.1847@nanos
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
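
    A user-space shaped sketch of the pattern (names and the regs structure
    are stand-ins, not the kernel's): the low-level entry makes a direct call
    to a C wrapper, and the indirect call through the handler pointer happens
    in C, where a CONFIG_RETPOLINE build lets the compiler emit a retpoline
    for it:

    #include <stdio.h>

    struct pt_regs_sketch { long error_code; };     /* stand-in for illustration */

    static void default_machine_check(struct pt_regs_sketch *regs)
    {
            printf("machine check, error_code=%ld\n", regs->error_code);
    }

    /* The swappable handler: an indirect call target. */
    static void (*machine_check_vector)(struct pt_regs_sketch *) = default_machine_check;

    /* The entry asm would do a direct "call do_mce_sketch"; the indirect call
     * lives here, so the compiler rather than hand-written asm decides how it
     * is emitted. */
    void do_mce_sketch(struct pt_regs_sketch *regs)
    {
            machine_check_vector(regs);
    }

    int main(void)
    {
            struct pt_regs_sketch regs = { .error_code = 0 };

            do_mce_sketch(&regs);
            return 0;
    }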
     
  • commit c995efd5a740d9cbafbf58bde4973e8b50b4d761 upstream.

    On context switch from a shallow call stack to a deeper one, as the CPU
    does 'ret' up the deeper side it may encounter RSB entries (predictions for
    where the 'ret' goes to) which were populated in userspace.

    This is problematic if neither SMEP nor KPTI (the latter of which marks
    userspace pages as NX for the kernel) are active, as malicious code in
    userspace may then be executed speculatively.

    Overwrite the CPU's return prediction stack with calls which are predicted
    to return to an infinite loop, to "capture" speculation if this
    happens. This is required both for retpoline, and also in conjunction with
    IBRS for !SMEP && !KPTI.

    On Skylake+ the problem is slightly different, and an *underflow* of the
    RSB may cause errant branch predictions to occur. So there it's not so much
    overwrite, as *filling* the RSB to attempt to prevent it getting
    empty. This is only a partial solution for Skylake+ since there are many
    other conditions which may result in the RSB becoming empty. The full
    solution on Skylake+ is to use IBRS, which will prevent the problem even
    when the RSB becomes empty. With IBRS, the RSB-stuffing will not be
    required on context switch.

    [ tglx: Added missing vendor check and slightly massaged comments and
    changelog ]

    Signed-off-by: David Woodhouse
    Signed-off-by: Thomas Gleixner
    Acked-by: Arjan van de Ven
    Cc: gnomes@lxorguk.ukuu.org.uk
    Cc: Rik van Riel
    Cc: Andi Kleen
    Cc: Josh Poimboeuf
    Cc: thomas.lendacky@amd.com
    Cc: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: Jiri Kosina
    Cc: Andy Lutomirski
    Cc: Dave Hansen
    Cc: Kees Cook
    Cc: Tim Chen
    Cc: Greg Kroah-Hartman
    Cc: Paul Turner
    Link: https://lkml.kernel.org/r/1515779365-9032-1-git-send-email-dwmw@amazon.co.uk
    Signed-off-by: Greg Kroah-Hartman

    David Woodhouse
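
    A much-simplified sketch of one RSB-stuffing unit in GCC inline asm
    (x86-64 assumed); the kernel's real __FILL_RETURN_BUFFER macro unrolls
    this pattern 16 or 32 times and fixes the stack up once at the end:

    /* The 'call' pushes a return-address prediction onto the RSB that points
     * at the pause/lfence capture loop; the architectural path immediately
     * discards the pushed return address, so only a mispredicted 'ret' ever
     * lands in the loop. The kernel builds with -mno-red-zone, so the
     * transient push below RSP is safe there; in this toy program nothing
     * lives in the red zone either. */
    static void rsb_stuff_one(void)
    {
            asm volatile("call 1f\n\t"
                         "2: pause\n\t"
                         "   lfence\n\t"
                         "   jmp 2b\n\t"
                         "1: add $8, %%rsp\n\t"
                         ::: "memory");
    }

    int main(void)
    {
            rsb_stuff_one();        /* harmless to run; stuffs one RSB entry */
            return 0;
    }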
     

17 Jan, 2018

2 commits

  • commit f10ee3dcc9f0aba92a5c4c064628be5200765dc2 upstream.

    The switch to the user space page tables in the low level ASM code
    unconditionally sets bit 12 and bit 11 of CR3. Bit 12 switches the base
    address of the page directory to the user part, bit 11 switches the
    PCID to the PCID associated with the user page tables.

    This fails on a machine which lacks PCID support because bit 11 is set in
    CR3. Bit 11 is reserved when PCID is inactive.

    While the Intel SDM claims that the reserved bits are ignored when PCID is
    disabled, the AMD APM states that they should be cleared.

    This went unnoticed as the AMD APM was not checked when the code was
    developed and reviewed, and test systems with Intel CPUs never failed to
    boot. The report is against a CentOS 6 host where the guest fails to boot,
    so it's not yet clear whether this is a virt issue or can happen on real
    hardware too, but that's irrelevant as the AMD APM clearly asks for
    clearing the reserved bits.

    Make sure that on non PCID machines bit 11 is not set by the page table
    switching code.

    Andy suggested to rename the related bits and masks so they are clearly
    describing what they should be used for, which is done as well for clarity.

    That split could have been done with alternatives but the macro hell is
    horrible and ugly. This can be done on top if someone cares to remove the
    extra orq. For now it's a straightforward fix.

    Fixes: 6fd166aae78c ("x86/mm: Use/Fix PCID to optimize user/kernel switches")
    Reported-by: Laura Abbott
    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: stable
    Cc: Borislav Petkov
    Cc: Andy Lutomirski
    Cc: Willy Tarreau
    Cc: David Woodhouse
    Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801140009150.2371@nanos
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
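
    An illustrative helper showing the rule the fix enforces (bit positions
    are as described above; the macro names are stand-ins, not copies of the
    kernel headers): when building the user CR3 value, the page-table bit is
    always set, but the PCID bit is only set when the CPU actually has PCID,
    because bit 11 is reserved while CR4.PCIDE is clear:

    #include <stdint.h>
    #include <stdio.h>

    #define USER_PGTABLE_BIT    12                  /* user half of the PGD pair */
    #define USER_PCID_BIT       11                  /* user ASID */
    #define USER_PGTABLE_MASK   (1ULL << USER_PGTABLE_BIT)
    #define USER_PCID_MASK      (1ULL << USER_PCID_BIT)

    static uint64_t user_cr3(uint64_t kernel_cr3, int cpu_has_pcid)
    {
            uint64_t cr3 = kernel_cr3 | USER_PGTABLE_MASK;

            if (cpu_has_pcid)
                    cr3 |= USER_PCID_MASK;          /* reserved bit otherwise */
            return cr3;
    }

    int main(void)
    {
            printf("no PCID: %#llx\n", (unsigned long long)user_cr3(0x100000, 0));
            printf("PCID:    %#llx\n", (unsigned long long)user_cr3(0x100000, 1));
            return 0;
    }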
     
  • commit 2641f08bb7fc63a636a2b18173221d7040a3512e upstream.

    Convert indirect jumps in core 32/64bit entry assembler code to use
    non-speculative sequences when CONFIG_RETPOLINE is enabled.

    Don't use CALL_NOSPEC in entry_SYSCALL_64_fastpath because the return
    address after the 'call' instruction must be *precisely* at the
    .Lentry_SYSCALL_64_after_fastpath label for stub_ptregs_64 to work,
    and the use of alternatives will mess that up unless we play horrid
    games to prepend with NOPs and make the variants the same length. It's
    not worth it; in the case where we ALTERNATIVE out the retpoline, the
    first instruction at __x86.indirect_thunk.rax is going to be a bare
    jmp *%rax anyway.

    Signed-off-by: David Woodhouse
    Signed-off-by: Thomas Gleixner
    Acked-by: Ingo Molnar
    Acked-by: Arjan van de Ven
    Cc: gnomes@lxorguk.ukuu.org.uk
    Cc: Rik van Riel
    Cc: Andi Kleen
    Cc: Josh Poimboeuf
    Cc: thomas.lendacky@amd.com
    Cc: Peter Zijlstra
    Cc: Linus Torvalds
    Cc: Jiri Kosina
    Cc: Andy Lutomirski
    Cc: Dave Hansen
    Cc: Kees Cook
    Cc: Tim Chen
    Cc: Greg Kroah-Hartman
    Cc: Paul Turner
    Link: https://lkml.kernel.org/r/1515707194-20531-7-git-send-email-dwmw@amazon.co.uk
    Signed-off-by: Greg Kroah-Hartman

    David Woodhouse
     

05 Jan, 2018

1 commit

  • commit d7732ba55c4b6a2da339bb12589c515830cfac2c upstream.

    The preparation for PTI which added CR3 switching to the entry code
    misplaced the CR3 switch in entry_SYSCALL_compat().

    With PTI enabled the entry code tries to access a per cpu variable after
    switching to kernel GS. This fails because that variable is not mapped to
    user space. This results in a double fault and in the worst case a kernel
    crash.

    Move the switch ahead of the access and clobber RSP which has been saved
    already.

    Fixes: 8a09317b895f ("x86/mm/pti: Prepare the x86/entry assembly code for entry/exit CR3 switching")
    Reported-by: Lars Wendler
    Reported-by: Laura Abbott
    Signed-off-by: Thomas Gleixner
    Cc: Borislav Petkov
    Cc: Andy Lutomirski
    Cc: Dave Hansen
    Cc: Peter Zijlstra
    Cc: Greg KH
    Cc: Boris Ostrovsky
    Cc: Juergen Gross
    Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801031949200.1957@nanos
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner
     

03 Jan, 2018

4 commits

  • commit 21e94459110252d41b45c0c8ba50fd72a664d50c upstream.

    Most NMI/paranoid exceptions will not in fact change pagetables and would
    thus not require TLB flushing; however, RESTORE_CR3 uses flushing CR3
    writes.

    Restores to kernel PCIDs can be NOFLUSH, because we explicitly flush the
    kernel mappings and now that we track which user PCIDs need flushing we can
    avoid those too when possible.

    This does mean RESTORE_CR3 needs an additional scratch_reg, luckily both
    sites have plenty available.

    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Cc: Andy Lutomirski
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: David Laight
    Cc: Denys Vlasenko
    Cc: Eduardo Valentin
    Cc: Greg KH
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Will Deacon
    Cc: aliguori@amazon.com
    Cc: daniel.gruss@iaik.tugraz.at
    Cc: hughd@google.com
    Cc: keescook@google.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Peter Zijlstra
     
  • commit 6fd166aae78c0ab738d49bda653cbd9e3b1491cf upstream.

    We can use PCID to retain the TLBs across CR3 switches; including those now
    part of the user/kernel switch. This increases performance of kernel
    entry/exit at the cost of more expensive/complicated TLB flushing.

    Now that we have two address spaces, one for kernel and one for user space,
    we need two PCIDs per mm. We use the top PCID bit to indicate a user PCID
    (just like we use the PFN LSB for the PGD). Since we do TLB invalidation
    from kernel space, the existing code will only invalidate the kernel PCID,
    we augment that by marking the corresponding user PCID invalid, and upon
    switching back to userspace, use a flushing CR3 write for the switch.

    In order to access the user_pcid_flush_mask we use PER_CPU storage, which
    means the previously established SWAPGS vs CR3 ordering is now mandatory
    and required.

    Having to do this memory access does require additional registers. Most
    sites have a functioning stack and can spill one (RAX); sites without a
    functional stack need to provide the second scratch register by other means.

    Note: PCID is generally available on Intel Sandybridge and later CPUs.
    Note: Up until this point TLB flushing was broken in this series.

    Based-on-code-from: Dave Hansen
    Signed-off-by: Peter Zijlstra (Intel)
    Signed-off-by: Thomas Gleixner
    Cc: Andy Lutomirski
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: David Laight
    Cc: Denys Vlasenko
    Cc: Eduardo Valentin
    Cc: Greg KH
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Will Deacon
    Cc: aliguori@amazon.com
    Cc: daniel.gruss@iaik.tugraz.at
    Cc: hughd@google.com
    Cc: keescook@google.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Peter Zijlstra
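
    An illustrative sketch of the PCID numbering scheme described above (the
    +1 offset and the bit position mirror the scheme, but the names are
    assumptions rather than copies of tlbflush.h): each ASID maps to a kernel
    PCID and a user PCID that differ only in the top PCID bit, so the user
    variant is derived with a single OR while building the user CR3 value:

    #include <stdint.h>
    #include <stdio.h>

    #define PCID_USER_BIT   11                      /* "top PCID bit" marks user PCIDs */

    static uint64_t kern_pcid(uint16_t asid)
    {
            return asid + 1ULL;                     /* PCID 0 stays reserved for non-PCID use */
    }

    static uint64_t user_pcid(uint16_t asid)
    {
            return kern_pcid(asid) | (1ULL << PCID_USER_BIT);
    }

    int main(void)
    {
            for (unsigned asid = 0; asid < 3; asid++)
                    printf("asid %u -> kernel PCID %#llx, user PCID %#llx\n",
                           asid,
                           (unsigned long long)kern_pcid(asid),
                           (unsigned long long)user_pcid(asid));
            return 0;
    }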
     
  • commit 85900ea51577e31b186e523c8f4e068c79ecc7d3 upstream.

    Make VSYSCALLs work fully in PTI mode by mapping them properly to the user
    space visible page tables.

    [ tglx: Hide unused functions (Patch by Arnd Bergmann) ]

    Signed-off-by: Andy Lutomirski
    Signed-off-by: Thomas Gleixner
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: David Laight
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Kees Cook
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Andy Lutomirski
     
  • commit aa8c6248f8c75acfd610fe15d8cae23cf70d9d09 upstream.

    Add the initial files for kernel page table isolation, with a minimal init
    function and the boot time detection for this misfeature.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: Borislav Petkov
    Cc: Andy Lutomirski
    Cc: Boris Ostrovsky
    Cc: Borislav Petkov
    Cc: Brian Gerst
    Cc: Dave Hansen
    Cc: David Laight
    Cc: Denys Vlasenko
    Cc: Eduardo Valentin
    Cc: Greg KH
    Cc: H. Peter Anvin
    Cc: Josh Poimboeuf
    Cc: Juergen Gross
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Will Deacon
    Cc: aliguori@amazon.com
    Cc: daniel.gruss@iaik.tugraz.at
    Cc: hughd@google.com
    Cc: keescook@google.com
    Signed-off-by: Ingo Molnar
    Signed-off-by: Greg Kroah-Hartman

    Thomas Gleixner