Eric Lee / smarc-fsl-linux-kernel

30 Sep, 2016

1 commit

1ef55be16 x86/asm: Get rid of __read_cr4_safe() ... Browse Code »

We use __read_cr4() vs __read_cr4_safe() inconsistently. On
CR4-less CPUs, all CR4 bits are effectively clear, so we can make
the code simpler and more robust by making __read_cr4() always fix
up faults on 32-bit kernels.

This may fix some bugs on old 486-like CPUs, but I don't have any
easy way to test that.

Signed-off-by: Andy Lutomirski
Cc: Brian Gerst
Cc: Borislav Petkov
Cc: david@saggiorato.net
Link: http://lkml.kernel.org/r/ea647033d357d9ce2ad2bbde5a631045f5052fb6.1475178370.git.luto@kernel.org
Signed-off-by: Thomas Gleixner

Andy Lutomirski
2016-09-30 18:40:12 +0800

16 Aug, 2016

1 commit

5d87f493d x86/power/64: Use __pa() for physical address computation ... Browse Code »

The value of temp_level4_pgt is the physical address of the
top-level page directory, so use __pa() to compute it.

Signed-off-by: Rafael J. Wysocki
Acked-by: Ingo Molnar

Rafael J. Wysocki
2016-08-16 06:39:37 +0800

09 Aug, 2016

1 commit

e4630fdd4 x86/power/64: Always create temporary identity mapping correctly ... Browse Code »

The low-level resume-from-hibernation code on x86-64 uses
kernel_ident_mapping_init() to create the temoprary identity mapping,
but that function assumes that the offset between kernel virtual
addresses and physical addresses is aligned on the PGD level.

However, with a randomized identity mapping base, it may be aligned
on the PUD level and if that happens, the temporary identity mapping
created by set_up_temporary_mappings() will not reflect the actual
kernel identity mapping and the image restoration will fail as a
result (leading to a kernel panic most of the time).

To fix this problem, rework kernel_ident_mapping_init() to support
unaligned offsets between KVA and PA up to the PMD level and make
set_up_temporary_mappings() use it as approprtiate.

Reported-and-tested-by: Thomas Garnier
Reported-by: Borislav Petkov
Suggested-by: Yinghai Lu
Signed-off-by: Rafael J. Wysocki
Acked-by: Yinghai Lu

Rafael J. Wysocki
2016-08-09 04:04:30 +0800

03 Aug, 2016

1 commit

c226fab47 x86/power/64: Do not refer to __PAGE_OFFSET from assembly code ... Browse Code »

When CONFIG_RANDOMIZE_MEMORY is set on x86-64, __PAGE_OFFSET becomes
a variable and using it as a symbol in the image memory restoration
assembly code under core_restore_code is not correct any more.

To avoid that problem, modify set_up_temporary_mappings() to compute
the physical address of the temporary page tables and store it in
temp_level4_pgt, so that the value of that variable is ready to be
written into CR3. Then, the assembly code doesn't have to worry
about converting that value into a physical address and things work
regardless of whether or not CONFIG_RANDOMIZE_MEMORY is set.

Reported-and-tested-by: Thomas Garnier
Signed-off-by: Rafael J. Wysocki

Rafael J. Wysocki
2016-08-03 07:35:38 +0800

29 Jul, 2016

1 commit

4ce827b4c x86/power/64: Fix hibernation return address corruption ... Browse Code »

In kernel bug 150021, a kernel panic was reported when restoring a
hibernate image. Only a picture of the oops was reported, so I can't
paste the whole thing here. But here are the most interesting parts:

kernel tried to execute NX-protected page - exploit attempt? (uid: 0)
BUG: unable to handle kernel paging request at ffff8804615cfd78
...
RIP: ffff8804615cfd78
RSP: ffff8804615f0000
RBP: ffff8804615cfdc0
...
Call Trace:
do_signal+0x23
exit_to_usermode_loop+0x64
...

The RIP is on the same page as RBP, so it apparently started executing
on the stack.

The bug was bisected to commit ef0f3ed5a4ac (x86/asm/power: Create
stack frames in hibernate_asm_64.S), which in retrospect seems quite
dangerous, since that code saves and restores the stack pointer from a
global variable ('saved_context').

There are a lot of moving parts in the hibernate save and restore paths,
so I don't know exactly what caused the panic. Presumably, a FRAME_END
was executed without the corresponding FRAME_BEGIN, or vice versa. That
would corrupt the return address on the stack and would be consistent
with the details of the above panic.

[ rjw: One major problem is that by the time the FRAME_BEGIN in
restore_registers() is executed, the stack pointer value may not
be valid any more. Namely, the stack area pointed to by it
previously may have been overwritten by some image memory contents
and that page frame may now be used for whatever different purpose
it had been allocated for before hibernation. In that case, the
FRAME_BEGIN will corrupt that memory. ]

Instead of doing the frame pointer save/restore around the bounds of the
affected functions, just do it around the call to swsusp_save().

That has the same effect of ensuring that if swsusp_save() sleeps, the
frame pointers will be correct. It's also a much more obviously safe
way to do it than the original patch. And objtool still doesn't report
any warnings.

Fixes: ef0f3ed5a4ac (x86/asm/power: Create stack frames in hibernate_asm_64.S)
Link: https://bugzilla.kernel.org/show_bug.cgi?id=150021
Cc: 4.6+ # 4.6+
Reported-by: Andre Reinke
Tested-by: Andre Reinke
Signed-off-by: Josh Poimboeuf
Acked-by: Ingo Molnar
Signed-off-by: Rafael J. Wysocki

Josh Poimboeuf
2016-07-29 19:38:59 +0800

16 Jul, 2016

1 commit

406f992e4 x86 / hibernate: Use hlt_play_dead() when resuming from hibernation ... Browse Code »

On Intel hardware, native_play_dead() uses mwait_play_dead() by
default and only falls back to the other methods if that fails.
That also happens during resume from hibernation, when the restore
(boot) kernel runs disable_nonboot_cpus() to take all of the CPUs
except for the boot one offline.

However, that is problematic, because the address passed to
__monitor() in mwait_play_dead() is likely to be written to in the
last phase of hibernate image restoration and that causes the "dead"
CPU to start executing instructions again. Unfortunately, the page
containing the address in that CPU's instruction pointer may not be
valid any more at that point.

First, that page may have been overwritten with image kernel memory
contents already, so the instructions the CPU attempts to execute may
simply be invalid. Second, the page tables previously used by that
CPU may have been overwritten by image kernel memory contents, so the
address in its instruction pointer is impossible to resolve then.

A report from Varun Koyyalagunta and investigation carried out by
Chen Yu show that the latter sometimes happens in practice.

To prevent it from happening, temporarily change the smp_ops.play_dead
pointer during resume from hibernation so that it points to a special
"play dead" routine which uses hlt_play_dead() and avoids the
inadvertent "revivals" of "dead" CPUs this way.

A slightly unpleasant consequence of this change is that if the
system is hibernated with one or more CPUs offline, it will generally
draw more power after resume than it did before hibernation, because
the physical state entered by CPUs via hlt_play_dead() is higher-power
than the mwait_play_dead() one in the majority of cases. It is
possible to work around this, but it is unclear how much of a problem
that's going to be in practice, so the workaround will be implemented
later if it turns out to be necessary.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=106371
Reported-by: Varun Koyyalagunta
Original-by: Chen Yu
Tested-by: Chen Yu
Signed-off-by: Rafael J. Wysocki
Acked-by: Ingo Molnar

Rafael J. Wysocki
2016-07-16 04:42:48 +0800

01 Jul, 2016

1 commit

65c0554b7 x86/power/64: Fix kernel text mapping corruption during image restoration ... Browse Code »

Logan Gunthorpe reports that hibernation stopped working reliably for
him after commit ab76f7b4ab23 (x86/mm: Set NX on gap between __ex_table
and rodata).

That turns out to be a consequence of a long-standing issue with the
64-bit image restoration code on x86, which is that the temporary
page tables set up by it to avoid page tables corruption when the
last bits of the image kernel's memory contents are copied into
their original page frames re-use the boot kernel's text mapping,
but that mapping may very well get corrupted just like any other
part of the page tables. Of course, if that happens, the final
jump to the image kernel's entry point will go to nowhere.

The exact reason why commit ab76f7b4ab23 matters here is that it
sometimes causes a PMD of a large page to be split into PTEs
that are allocated dynamically and get corrupted during image
restoration as described above.

To fix that issue note that the code copying the last bits of the
image kernel's memory contents to the page frames occupied by them
previoulsy doesn't use the kernel text mapping, because it runs from
a special page covered by the identity mapping set up for that code
from scratch. Hence, the kernel text mapping is only needed before
that code starts to run and then it will only be used just for the
final jump to the image kernel's entry point.

Accordingly, the temporary page tables set up in swsusp_arch_resume()
on x86-64 need to contain the kernel text mapping too. That mapping
is only going to be used for the final jump to the image kernel, so
it only needs to cover the image kernel's entry point, because the
first thing the image kernel does after getting control back is to
switch over to its own original page tables. Moreover, the virtual
address of the image kernel's entry point in that mapping has to be
the same as the one mapped by the image kernel's page tables.

With that in mind, modify the x86-64's arch_hibernation_header_save()
and arch_hibernation_header_restore() routines to pass the physical
address of the image kernel's entry point (in addition to its virtual
address) to the boot kernel (a small piece of assembly code involved
in passing the entry point's virtual address to the image kernel is
not necessary any more after that, so drop it). Update RESTORE_MAGIC
too to reflect the image header format change.

Next, in set_up_temporary_mappings(), use the physical and virtual
addresses of the image kernel's entry point passed in the image
header to set up a minimum kernel text mapping (using memory pages
that won't be overwritten by the image kernel's memory contents) that
will map those addresses to each other as appropriate.

This makes the concern about the possible corruption of the original
boot kernel text mapping go away and if the the minimum kernel text
mapping used for the final jump marks the image kernel's entry point
memory as executable, the jump to it is guaraneed to succeed.

Fixes: ab76f7b4ab23 (x86/mm: Set NX on gap between __ex_table and rodata)
Link: http://marc.info/?l=linux-pm&m=146372852823760&w=2
Reported-by: Logan Gunthorpe
Reported-and-tested-by: Borislav Petkov
Tested-by: Kees Cook
Signed-off-by: Rafael J. Wysocki

Rafael J. Wysocki
2016-07-01 05:57:15 +0800

31 Mar, 2016

1 commit

16bf92261 x86/cpufeature: Remove cpu_has_pse ... Browse Code »

Signed-off-by: Borislav Petkov
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/1459266123-21878-11-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar

Borislav Petkov
2016-03-31 19:35:10 +0800

24 Feb, 2016

1 commit

ef0f3ed5a x86/asm/power: Create stack frames in hibernate_asm_64.S ... Browse Code »

swsusp_arch_suspend() and restore_registers() are callable non-leaf
functions which don't honor CONFIG_FRAME_POINTER, which can result in
bad stack traces. Also they aren't annotated as ELF callable functions
which can confuse tooling.

Create a stack frame for them when CONFIG_FRAME_POINTER is enabled and
give them proper ELF function annotations.

Signed-off-by: Josh Poimboeuf
Reviewed-by: Borislav Petkov
Acked-by: Pavel Machek
Acked-by: Rafael J. Wysocki
Cc: Andrew Morton
Cc: Andy Lutomirski
Cc: Andy Lutomirski
Cc: Arnaldo Carvalho de Melo
Cc: Bernd Petrovitsch
Cc: Borislav Petkov
Cc: Brian Gerst
Cc: Chris J Arges
Cc: Denys Vlasenko
Cc: H. Peter Anvin
Cc: Jiri Slaby
Cc: Linus Torvalds
Cc: Michal Marek
Cc: Namhyung Kim
Cc: Pedro Alves
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: live-patching@vger.kernel.org
Link: http://lkml.kernel.org/r/bdad00205897dc707aebe9e9e39757085e2bf999.1453405861.git.jpoimboe@redhat.com
Signed-off-by: Ingo Molnar

Josh Poimboeuf
2016-02-24 15:35:43 +0800

26 Nov, 2015

1 commit

7a9c2dd08 x86/pm: Introduce quirk framework to save/restore extra MSR registers around suspend/resume ... Browse Code »

A bug was reported that on certain Broadwell platforms, after
resuming from S3, the CPU is running at an anomalously low
speed.

It turns out that the BIOS has modified the value of the
THERM_CONTROL register during S3, and changed it from 0 to 0x10,
thus enabled clock modulation(bit4), but with undefined CPU Duty
Cycle(bit1:3) - which causes the problem.

Here is a simple scenario to reproduce the issue:

1. Boot up the system
2. Get MSR 0x19a, it should be 0
3. Put the system into sleep, then wake it up
4. Get MSR 0x19a, it shows 0x10, while it should be 0

Although some BIOSen want to change the CPU Duty Cycle during
S3, in our case we don't want the BIOS to do any modification.

Fix this issue by introducing a more generic x86 framework to
save/restore specified MSR registers(THERM_CONTROL in this case)
for suspend/resume. This allows us to fix similar bugs in a much
simpler way in the future.

When the kernel wants to protect certain MSRs during suspending,
we simply add a quirk entry in msr_save_dmi_table, and customize
the MSR registers inside the quirk callback, for example:

u32 msr_id_need_to_save[] = {MSR_ID0, MSR_ID1, MSR_ID2...};

and the quirk mechanism ensures that, once resumed from suspend,
the MSRs indicated by these IDs will be restored to their
original, pre-suspend values.

Since both 64-bit and 32-bit kernels are affected, this patch
covers the common 64/32-bit suspend/resume code path. And
because the MSRs specified by the user might not be available or
readable in any situation, we use rdmsrl_safe() to safely save
these MSRs.

Reported-and-tested-by: Marcin Kaszewski
Signed-off-by: Chen Yu
Acked-by: Rafael J. Wysocki
Acked-by: Pavel Machek
Cc: Andy Lutomirski
Cc: Borislav Petkov
Cc: Brian Gerst
Cc: Denys Vlasenko
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: bp@suse.de
Cc: len.brown@intel.com
Cc: linux@horizon.com
Cc: luto@kernel.org
Cc: rjw@rjwysocki.net
Link: http://lkml.kernel.org/r/c9abdcbc173dd2f57e8990e304376f19287e92ba.1448382971.git.yu.c.chen@intel.com
[ More edits to the naming of data structures. ]
Signed-off-by: Ingo Molnar

Chen Yu
2015-11-26 17:04:53 +0800

31 Jul, 2015

1 commit

37868fe11 x86/ldt: Make modify_ldt synchronous ... Browse Code »

modify_ldt() has questionable locking and does not synchronize
threads. Improve it: redesign the locking and synchronize all
threads' LDTs using an IPI on all modifications.

This will dramatically slow down modify_ldt in multithreaded
programs, but there shouldn't be any multithreaded programs that
care about modify_ldt's performance in the first place.

This fixes some fallout from the CVE-2015-5157 fixes.

Signed-off-by: Andy Lutomirski
Reviewed-by: Borislav Petkov
Cc: Andrew Cooper
Cc: Andy Lutomirski
Cc: Boris Ostrovsky
Cc: Borislav Petkov
Cc: Brian Gerst
Cc: Denys Vlasenko
Cc: H. Peter Anvin
Cc: Jan Beulich
Cc: Konrad Rzeszutek Wilk
Cc: Linus Torvalds
Cc: Peter Zijlstra
Cc: Sasha Levin
Cc: Steven Rostedt
Cc: Thomas Gleixner
Cc: security@kernel.org
Cc:
Cc: xen-devel
Link: http://lkml.kernel.org/r/4c6978476782160600471bd865b318db34c7b628.1438291540.git.luto@kernel.org
Signed-off-by: Ingo Molnar

Andy Lutomirski
2015-07-31 16:23:23 +0800

23 Jun, 2015

1 commit

d70b3ef54 Merge branch 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull x86 core updates from Ingo Molnar:
"There were so many changes in the x86/asm, x86/apic and x86/mm topics
in this cycle that the topical separation of -tip broke down somewhat -
so the result is a more traditional architecture pull request,
collected into the 'x86/core' topic.

The topics were still maintained separately as far as possible, so
bisectability and conceptual separation should still be pretty good -
but there were a handful of merge points to avoid excessive
dependencies (and conflicts) that would have been poorly tested in the
end.

The next cycle will hopefully be much more quiet (or at least will
have fewer dependencies).

The main changes in this cycle were:

* x86/apic changes, with related IRQ core changes: (Jiang Liu, Thomas
Gleixner)

- This is the second and most intrusive part of changes to the x86
interrupt handling - full conversion to hierarchical interrupt
domains:

[IOAPIC domain] -----
|
[MSI domain] --------[Remapping domain] ----- [ Vector domain ]
| (optional) |
[HPET MSI domain] ----- |
|
[DMAR domain] -----------------------------
|
[Legacy domain] -----------------------------

This now reflects the actual hardware and allowed us to distangle
the domain specific code from the underlying parent domain, which
can be optional in the case of interrupt remapping. It's a clear
separation of functionality and removes quite some duct tape
constructs which plugged the remap code between ioapic/msi/hpet
and the vector management.

- Intel IOMMU IRQ remapping enhancements, to allow direct interrupt
injection into guests (Feng Wu)

* x86/asm changes:

- Tons of cleanups and small speedups, micro-optimizations. This
is in preparation to move a good chunk of the low level entry
code from assembly to C code (Denys Vlasenko, Andy Lutomirski,
Brian Gerst)

- Moved all system entry related code to a new home under
arch/x86/entry/ (Ingo Molnar)

- Removal of the fragile and ugly CFI dwarf debuginfo annotations.
Conversion to C will reintroduce many of them - but meanwhile
they are only getting in the way, and the upstream kernel does
not rely on them (Ingo Molnar)

- NOP handling refinements. (Borislav Petkov)

* x86/mm changes:

- Big PAT and MTRR rework: making the code more robust and
preparing to phase out exposing direct MTRR interfaces to drivers -
in favor of using PAT driven interfaces (Toshi Kani, Luis R
Rodriguez, Borislav Petkov)

- New ioremap_wt()/set_memory_wt() interfaces to support
Write-Through cached memory mappings. This is especially
important for good performance on NVDIMM hardware (Toshi Kani)

* x86/ras changes:

- Add support for deferred errors on AMD (Aravind Gopalakrishnan)

This is an important RAS feature which adds hardware support for
poisoned data. That means roughly that the hardware marks data
which it has detected as corrupted but wasn't able to correct, as
poisoned data and raises an APIC interrupt to signal that in the
form of a deferred error. It is the OS's responsibility then to
take proper recovery action and thus prolonge system lifetime as
far as possible.

- Add support for Intel "Local MCE"s: upcoming CPUs will support
CPU-local MCE interrupts, as opposed to the traditional system-
wide broadcasted MCE interrupts (Ashok Raj)

- Misc cleanups (Borislav Petkov)

* x86/platform changes:

- Intel Atom SoC updates

... and lots of other cleanups, fixlets and other changes - see the
shortlog and the Git log for details"

* 'x86-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (222 commits)
x86/hpet: Use proper hpet device number for MSI allocation
x86/hpet: Check for irq==0 when allocating hpet MSI interrupts
x86/mm/pat, drivers/infiniband/ipath: Use arch_phys_wc_add() and require PAT disabled
x86/mm/pat, drivers/media/ivtv: Use arch_phys_wc_add() and require PAT disabled
x86/platform/intel/baytrail: Add comments about why we disabled HPET on Baytrail
genirq: Prevent crash in irq_move_irq()
genirq: Enhance irq_data_to_desc() to support hierarchy irqdomain
iommu, x86: Properly handle posted interrupts for IOMMU hotplug
iommu, x86: Provide irq_remapping_cap() interface
iommu, x86: Setup Posted-Interrupts capability for Intel iommu
iommu, x86: Add cap_pi_support() to detect VT-d PI capability
iommu, x86: Avoid migrating VT-d posted interrupts
iommu, x86: Save the mode (posted or remapped) of an IRTE
iommu, x86: Implement irq_set_vcpu_affinity for intel_ir_chip
iommu: dmar: Provide helper to copy shared irte fields
iommu: dmar: Extend struct irte for VT-d Posted-Interrupts
iommu: Add new member capability to struct irq_remap_ops
x86/asm/entry/64: Disentangle error_entry/exit gsbase/ebx/usermode code
x86/asm/entry/32: Shorten __audit_syscall_entry() args preparation
x86/asm/entry/32: Explain reloading of registers after __audit_syscall_entry()
...

Linus Torvalds
2015-06-23 08:59:09 +0800

19 May, 2015

4 commits

952f07ecb x86/fpu: Move various internal function prototypes to fpu/internal.h ... Browse Code »

There are a number of FPU internal function prototypes and an inline function
in fpu/api.h, mostly placed so historically as the code grew over the years.

Move them over into fpu/internal.h where they belong. (Add sched.h include
to stackprotector.h which incorrectly relied on getting it from fpu/api.h.)

fpu/api.h is now a pure file that only contains FPU APIs intended for driver
use.

Reviewed-by: Borislav Petkov
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: Fenghua Yu
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Signed-off-by: Ingo Molnar

Ingo Molnar
2015-05-19 21:47:48 +0800
9254aaa0f x86/fpu: Move XCR0 manipulation to the FPU code proper ... Browse Code »

The suspend code accesses FPU state internals, add a helper for
it and isolate it.

Reviewed-by: Borislav Petkov
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: Fenghua Yu
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Signed-off-by: Ingo Molnar

Ingo Molnar
2015-05-19 21:47:33 +0800
614df7fb8 x86/fpu: Rename 'pcntxt_mask' to 'xfeatures_mask' ... Browse Code »

So the 'pcntxt_mask' is a misnomer, it's essentially meaningless to anyone
who doesn't know what it does exactly.

Name it more descriptively as 'xfeatures_mask'.

Reviewed-by: Borislav Petkov
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: Fenghua Yu
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Signed-off-by: Ingo Molnar

Ingo Molnar
2015-05-19 21:47:33 +0800
78f7f1e54 x86/fpu: Rename fpu-internal.h to fpu/internal.h ... Browse Code »

This unifies all the FPU related header files under a unified, hiearchical
naming scheme:

- asm/fpu/types.h: FPU related data types, needed for 'struct task_struct',
widely included in almost all kernel code, and hence kept
as small as possible.

- asm/fpu/api.h: FPU related 'public' methods exported to other subsystems.

- asm/fpu/internal.h: FPU subsystem internal methods

- asm/fpu/xsave.h: XSAVE support internal methods

(Also standardize the header guard in asm/fpu/internal.h.)

Reviewed-by: Borislav Petkov
Cc: Andy Lutomirski
Cc: Dave Hansen
Cc: Fenghua Yu
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Oleg Nesterov
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Signed-off-by: Ingo Molnar

Ingo Molnar
2015-05-19 21:47:31 +0800

15 Apr, 2015

1 commit

ff22e2010 x86/asm, x86/power/hibernate: Use local labels in asm ... Browse Code »

... so that they don't appear in the object file and thus in
objdump output. They're local anyway and have a meaning only
within that file.

No functionality change.

Signed-off-by: Borislav Petkov
Acked-by: Rafael J. Wysocki
Acked-by: Pavel Machek
Cc: H. Peter Anvin
Cc: Rafael J. Wysocki
Cc: Thomas Gleixner
Cc: linux-pm@vger.kernel.org
Link: http://lkml.kernel.org/r/1428867906-12016-1-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar

Borislav Petkov
2015-04-15 17:37:51 +0800

06 Mar, 2015

1 commit

24933b82c x86/asm/entry: Rename 'init_tss' to 'cpu_tss' ... Browse Code »

It has nothing to do with init -- there's only one TSS per cpu.

Other names considered include:

- current_tss: Confusing because we never switch the tss.
- singleton_tss: Too long.

This patch was generated with 's/init_tss/cpu_tss/g'. Followup
patches will fix INIT_TSS and INIT_TSS_IST by hand.

Signed-off-by: Andy Lutomirski
Cc: Borislav Petkov
Cc: Denys Vlasenko
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Oleg Nesterov
Cc: Thomas Gleixner
Link: http://lkml.kernel.org/r/da29fb2a793e4f649d93ce2d1ed320ebe8516262.1425611534.git.luto@amacapital.net
Signed-off-by: Ingo Molnar

Andy Lutomirski
2015-03-06 15:32:58 +0800

04 Feb, 2015

1 commit

1e02ce4cc x86: Store a per-cpu shadow copy of CR4 ... Browse Code »

Context switches and TLB flushes can change individual bits of CR4.
CR4 reads take several cycles, so store a shadow copy of CR4 in a
per-cpu variable.

To avoid wasting a cache line, I added the CR4 shadow to
cpu_tlbstate, which is already touched in switch_mm. The heaviest
users of the cr4 shadow will be switch_mm and __switch_to_xtra, and
__switch_to_xtra is called shortly after switch_mm during context
switch, so the cacheline is likely to be hot.

Signed-off-by: Andy Lutomirski
Reviewed-by: Thomas Gleixner
Signed-off-by: Peter Zijlstra (Intel)
Cc: Kees Cook
Cc: Andrea Arcangeli
Cc: Vince Weaver
Cc: "hillf.zj"
Cc: Valdis Kletnieks
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Cc: Linus Torvalds
Link: http://lkml.kernel.org/r/3a54dd3353fffbf84804398e00dfdc5b7c1afd7d.1414190806.git.luto@amacapital.net
Signed-off-by: Ingo Molnar

Andy Lutomirski
2015-02-04 19:10:42 +0800

10 Oct, 2014

1 commit

7f8998c7a nosave: consolidate __nosave_{begin,end} in <asm/sections.h> ... Browse Code »

The different architectures used their own (and different) declarations:

extern __visible const void __nosave_begin, __nosave_end;
extern const void __nosave_begin, __nosave_end;
extern long __nosave_begin, __nosave_end;

Consolidate them using the first variant in .

Signed-off-by: Geert Uytterhoeven
Cc: Russell King
Cc: Ralf Baechle
Cc: Benjamin Herrenschmidt
Cc: Martin Schwidefsky
Cc: "David S. Miller"
Cc: Guan Xuetao
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

Geert Uytterhoeven
2014-10-10 10:26:04 +0800

17 Jul, 2014

1 commit

b8f99b3e0 x86, power, suspend: Annotate restore_processor_state() with notrace ... Browse Code »

ftrace_stop() is used to stop function tracing during suspend and resume
which removes a lot of possible debugging opportunities with tracing.
The reason was that some function in the resume path was causing a triple
fault if it were to be traced. The issue I found was that doing something
as simple as calling smp_processor_id() would reboot the box!

When function tracing was first created I didn't have a good way to figure
out what function was having issues, or it looked to be multiple ones. To
fix it, we just created a big hammer approach to the problem which was to
add a flag in the mcount trampoline that could be checked and not call
the traced functions.

Lately I developed better ways to find problem functions and I can bisect
down to see what function is causing the issue. I removed the flag that
stopped tracing and proceeded to find the problem function and it ended
up being restore_processor_state(). This function makes sense as when the
CPU comes back online from a suspend it calls this function to set up
registers, amongst them the GS register, which stores things such as
what CPU the processor is (if you call smp_processor_id() without this
set up properly, it would fault).

By making restore_processor_state() notrace, the system can suspend and
resume without the need of the big hammer tracing to stop.

Link: http://lkml.kernel.org/r/3577662.BSnUZfboWb@vostro.rjw.lan

Acked-by: "Rafael J. Wysocki"
Reviewed-by: Masami Hiramatsu
Signed-off-by: Steven Rostedt

Steven Rostedt (Red Hat)
2014-07-17 21:45:05 +0800

06 May, 2014

1 commit

2605fc216 asmlinkage, x86: Add explicit __visible to arch/x86/* ... Browse Code »

As requested by Linus add explicit __visible to the asmlinkage users.
This marks all functions visible to assembler.

Tree sweep for arch/x86/*

Signed-off-by: Andi Kleen
Link: http://lkml.kernel.org/r/1398984278-29319-3-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin

Andi Kleen
2014-05-06 07:07:44 +0800

07 Aug, 2013

1 commit

d6efc2f72 x86, asmlinkage, power: Make various symbols used by the suspend asm code visible ... Browse Code »

Signed-off-by: Andi Kleen
Link: http://lkml.kernel.org/r/1375740170-7446-16-git-send-email-andi@firstfloor.org
Signed-off-by: H. Peter Anvin

Andi Kleen
2013-08-07 05:21:03 +0800

03 May, 2013

1 commit

cc456c4e7 x86, gdt, hibernate: Store/load GDT for hibernate path. ... Browse Code »

The git commite7a5cd063c7b4c58417f674821d63f5eb6747e37
("x86-64, gdt: Store/load GDT for ACPI S3 or hibernate/resume path
is not needed.") assumes that for the hibernate path the booting
kernel and the resuming kernel MUST be the same. That is certainly
the case for a 32-bit kernel (see check_image_kernel and
CONFIG_ARCH_HIBERNATION_HEADER config option).

However for 64-bit kernels it is OK to have a different kernel
version (and size of the image) of the booting and resuming kernels.
Hence the above mentioned git commit introduces an regression.

This patch fixes it by introducing a 'struct desc_ptr gdt_desc'
back in the 'struct saved_context'. However instead of having in the
'save_processor_state' and 'restore_processor_state' the
store/load_gdt calls, we are only saving the GDT in the
save_processor_state.

For the restore path the lgdt operation is done in
hibernate_asm_[32|64].S in the 'restore_registers' path.

The apt reader of this description will recognize that only 64-bit
kernels need this treatment, not 32-bit. This patch adds the logic
in the 32-bit path to be more similar to 64-bit so that in the future
the unification process can take advantage of this.

[ hpa: this also reverts an inadvertent on-disk format change ]

Suggested-by: "H. Peter Anvin"
Acked-by: "Rafael J. Wysocki"
Signed-off-by: Konrad Rzeszutek Wilk
Link: http://lkml.kernel.org/r/1367459610-9656-2-git-send-email-konrad.wilk@oracle.com
Signed-off-by: H. Peter Anvin

Konrad Rzeszutek Wilk
2013-05-03 02:27:35 +0800

30 Apr, 2013

1 commit

1e2f5b598 Merge branch 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip ... Browse Code »

Pull x86 paravirt update from Ingo Molnar:
"Various paravirtualization related changes - the biggest one makes
guest support optional via CONFIG_HYPERVISOR_GUEST"

* 'x86-paravirt-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, wakeup, sleep: Use pvops functions for changing GDT entries
x86, xen, gdt: Remove the pvops variant of store_gdt.
x86-32, gdt: Store/load GDT for ACPI S3 or hibernation/resume path is not needed
x86-64, gdt: Store/load GDT for ACPI S3 or hibernate/resume path is not needed.
x86: Make Linux guest support optional
x86, Kconfig: Move PARAVIRT_DEBUG into the paravirt menu

Linus Torvalds
2013-04-30 23:41:21 +0800

12 Apr, 2013

3 commits

4d681be3c x86, wakeup, sleep: Use pvops functions for changing GDT entries ... Browse Code »

We check the TSS descriptor before we try to dereference it.
Also we document what the value '9' actually means using the
AMD64 Architecture Programmer's Manual Volume 2, pg 90:
"Hex value 9: Available 64-bit TSS" and pg 91:
"The available 32-bit TSS (09h), which is redefined as the
available 64-bit TSS."

Without this, on Xen, where the GDT is available as R/O (to
protect the hypervisor from the guest modifying it), we end up
with a pagetable fault.

Signed-off-by: Konrad Rzeszutek Wilk
Link: http://lkml.kernel.org/r/1365194544-14648-5-git-send-email-konrad.wilk@oracle.com
Cc: Rafael J. Wysocki
Signed-off-by: H. Peter Anvin

konrad@kernel.org
2013-04-12 06:41:15 +0800
84e70971e x86-32, gdt: Store/load GDT for ACPI S3 or hibernation/resume path is not needed ... Browse Code »

During the ACPI S3 suspend, we store the GDT in the wakup_header (see
wakeup_asm.s) field called 'pmode_gdt'.

Which is then used during the resume path and has the same exact
value as what the store/load_gdt do with the saved_context
(which is saved/restored via save/restore_processor_state()).

The flow during resume from ACPI S3 is simpler than the 64-bit
counterpart. We only use the early bootstrap once (wakeup_gdt) and
do various checks in real mode.

After the checks are completed, we load the saved GDT ('pmode_gdt') and
continue on with the resume (by heading to startup_32 in trampoline_32.S) -
which quickly jumps to what was saved in 'pmode_entry'
aka 'wakeup_pmode_return'.

The 'wakeup_pmode_return' restores the GDT (saved_gdt) again (which was
saved in do_suspend_lowlevel initially). After that it ends up calling
the 'ret_point' which calls 'restore_processor_state()'.

We have two opportunities to remove code where we restore the same GDT
twice.

Here is the call chain:
wakeup_start
|- lgdtl wakeup_gdt [the work-around broken BIOSes]
|
| - lgdtl pmode_gdt [the real one]
|
\-- startup_32 (in trampoline_32.S)
\-- wakeup_pmode_return (in wakeup_32.S)
|- lgdtl saved_gdt [the real one]
\-- ret_point
|..
|- call restore_processor_state

The hibernate path is much simpler. During the saving of the hibernation
image we call save_processor_state() and save the contents of that
along with the rest of the kernel in the hibernation image destination.
We save the EIP of 'restore_registers' (restore_jump_address) and
cr3 (restore_cr3).

During hibernate resume, the 'restore_registers' (via the
'restore_jump_address) in hibernate_asm_32.S is invoked which
restores the contents of most registers. Naturally the resume path benefits
from already being in 32-bit mode, so it does not have to reload the GDT.

It only reloads the cr3 (from restore_cr3) and continues on. Note
that the restoration of the restore image page-tables is done prior to
this.

After the 'restore_registers' it returns and we end up called
restore_processor_state() - where we reload the GDT. The reload of
the GDT is not needed as bootup kernel has already loaded the GDT
which is at the same physical location as the the restored kernel.

Note that the hibernation path assumes the GDT is correct during its
'restore_registers'. The assumption in the code is that the restored
image is the same as saved - meaning we are not trying to restore
an different kernel in the virtual address space of a new kernel.

Signed-off-by: Konrad Rzeszutek Wilk
Link: http://lkml.kernel.org/r/1365194544-14648-3-git-send-email-konrad.wilk@oracle.com
Cc: Rafael J. Wysocki
Signed-off-by: H. Peter Anvin

Konrad Rzeszutek Wilk
2013-04-12 06:40:17 +0800
e7a5cd063 x86-64, gdt: Store/load GDT for ACPI S3 or hibernate/resume path is not needed. ... Browse Code »

During the ACPI S3 resume path the trampoline code handles it already.

During the ACPI S3 suspend phase (acpi_suspend_lowlevel) we set:
early_gdt_descr.address = (..)get_cpu_gdt_table(smp_processor_id());

which is then used during the resume path and has the same exact
value as what the store/load_gdt do with the saved_context
(which is saved/restored via save/restore_processor_state()).

The flow during resume is complex and for 64-bit kernels we use three GDTs
- one early bootstrap GDT (wakeup_igdt) that we load to workaround
broken BIOSes, an early Protected Mode to Long Mode transition one
(tr_gdt), and the final one - early_gdt_descr (which points to the real GDT).

The early ('wakeup_gdt') is loaded in 'trampoline_start' for working
around broken BIOSes, and then when we end up in Protected Mode in the
startup_32 (in trampoline_64.s, not head_32.s) we use the 'tr_gdt'
(still in trampoline_64.s). This 'tr_gdt' has a a 32-bit code segment,
64-bit code segment with L=1, and a 32-bit data segment.

Once we have transitioned from Protected Mode to Long Mode we then
set the GDT to 'early_gdt_desc' and then via an iretq emerge in
wakeup_long64 (set via 'initial_code' variable in acpi_suspend_lowlevel).

In the wakeup_long64 we end up restoring the %rip (which is set to
'resume_point') and jump there.

In 'resume_point' we call 'restore_processor_state' which does
the load_gdt on the saved context. This load_gdt is redundant as the
GDT loaded via early_gdt_desc is the same.

Here is the call-chain:
wakeup_start
|- lgdtl wakeup_gdt [the work-around broken BIOSes]
|
\-- trampoline_start (trampoline_64.S)
|- lgdtl tr_gdt
|
\-- startup_32 (trampoline_64.S)
|
\-- startup_64 (trampoline_64.S)
|
\-- secondary_startup_64
|- lgdtl early_gdt_desc
| ...
|- movq initial_code(%rip), %eax
|-.. lretq
\-- wakeup_64
|-- other registers are reloaded
|-- call restore_processor_state

The hibernate path is much simpler. During the saving of the hibernation
image we call save_processor_state() and save the contents of that along
with the rest of the kernel in the hibernation image destination.
We save the EIP of 'restore_registers' (restore_jump_address) and cr3
(restore_cr3).

During hibernate resume, the 'restore_registers' (via the
'restore_jump_address) in hibernate_asm_64.S is invoked which restores
the contents of most registers. Naturally the resume path benefits from
already being in 64-bit mode, so it does not have to load the GDT.

It only reloads the cr3 (from restore_cr3) and continues on. Note that
the restoration of the restore image page-tables is done prior to this.

After the 'restore_registers' it returns and we end up called
restore_processor_state() - where we reload the GDT. The reload of
the GDT is not needed as bootup kernel has already loaded the GDT which
is at the same physical location as the the restored kernel.

Note that the hibernation path assumes the GDT is correct during its
'restore_registers'. The assumption in the code is that the restored
image is the same as saved - meaning we are not trying to restore
an different kernel in the virtual address space of a new kernel.

Signed-off-by: Konrad Rzeszutek Wilk
Link: http://lkml.kernel.org/r/1365194544-14648-2-git-send-email-konrad.wilk@oracle.com
Cc: Rafael J. Wysocki
Signed-off-by: H. Peter Anvin

Konrad Rzeszutek Wilk
2013-04-12 06:39:38 +0800

16 Mar, 2013

1 commit

1d9d8639c perf,x86: fix kernel crash with PEBS/BTS after suspend/resume ... Browse Code »

This patch fixes a kernel crash when using precise sampling (PEBS)
after a suspend/resume. Turns out the CPU notifier code is not invoked
on CPU0 (BP). Therefore, the DS_AREA (used by PEBS) is not restored properly
by the kernel and keeps it power-on/resume value of 0 causing any PEBS
measurement to crash when running on CPU0.

The workaround is to add a hook in the actual resume code to restore
the DS Area MSR value. It is invoked for all CPUS. So for all but CPU0,
the DS_AREA will be restored twice but this is harmless.

Reported-by: Linus Torvalds
Signed-off-by: Stephane Eranian
Signed-off-by: Linus Torvalds

Stephane Eranian
2013-03-16 00:26:35 +0800

01 Feb, 2013

2 commits

68d00bbeb Merge remote-tracking branch 'origin/x86/mm' into x86/mm2 ... Browse Code »

Explicitly merging these two branches due to nontrivial conflicts and
to allow further work.

Resolved Conflicts:
arch/x86/kernel/head32.c
arch/x86/kernel/head64.c
arch/x86/mm/init_64.c
arch/x86/realmode/init.c

Signed-off-by: H. Peter Anvin

H. Peter Anvin
2013-02-01 18:28:36 +0800
bb112aec5 x86-32, mm: Remove reference to resume_map_numa_kva() ... Browse Code »

Remove reference to removed function resume_map_numa_kva().

Signed-off-by: H. Peter Anvin
Cc: Dave Hansen
Cc:
Link: http://lkml.kernel.org/r/20130131005616.1C79F411@kernel.stglabs.ibm.com

H. Peter Anvin
2013-02-01 06:12:30 +0800

30 Jan, 2013

1 commit

8b78c21d7 x86, 64bit, mm: hibernate use generic mapping_init ... Browse Code »

We should set mappings only for usable memory ranges under max_pfn
Otherwise causes same problem that is fixed by

x86, mm: Only direct map addresses that are marked as E820_RAM

Make it only map range in pfn_mapped array.

Signed-off-by: Yinghai Lu
Link: http://lkml.kernel.org/r/1359058816-7615-34-git-send-email-yinghai@kernel.org
Cc: Pavel Machek
Cc: Rafael J. Wysocki
Cc: linux-pm@vger.kernel.org
Signed-off-by: H. Peter Anvin

Yinghai Lu
2013-01-30 11:32:59 +0800

15 Nov, 2012

2 commits

a71c8bc5d x86, topology: Debug CPU0 hotplug ... Browse Code »

CONFIG_DEBUG_HOTPLUG_CPU0 is for debugging the CPU0 hotplug feature. The switch
offlines CPU0 as soon as possible and boots userspace up with CPU0 offlined.
User can online CPU0 back after boot time. The default value of the switch is
off.

To debug CPU0 hotplug, you need to enable CPU0 offline/online feature by either
turning on CONFIG_BOOTPARAM_HOTPLUG_CPU0 during compilation or giving
cpu0_hotplug kernel parameter at boot.

It's safe and early place to take down CPU0 after all hotplug notifiers
are installed and SMP is booted.

Please note that some applications or drivers, e.g. some versions of udevd,
during boot time may put CPU0 online again in this CPU0 hotplug debug mode.

In this debug mode, setup_local_APIC() may report a warning on max_loops
Link: http://lkml.kernel.org/r/1352835171-3958-15-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin

Fenghua Yu
2012-11-15 07:28:11 +0800
209efae12 x86, hotplug, suspend: Online CPU0 for suspend or hibernate ... Browse Code »

Because x86 BIOS requires CPU0 to resume from sleep, suspend or hibernate can't
be executed if CPU0 is detected offline. To make suspend or hibernate and
further resume succeed, CPU0 must be online.

Signed-off-by: Fenghua Yu
Link: http://lkml.kernel.org/r/1352835171-3958-6-git-send-email-fenghua.yu@intel.com
Signed-off-by: H. Peter Anvin

Fenghua Yu
2012-11-15 01:39:49 +0800

02 Apr, 2012

1 commit

dba69d109 x86, kvm: Call restore_sched_clock_state() only after %gs is initialized ... Browse Code »

s2ram broke due to this KVM commit:

b74f05d61b73 x86: kvmclock: abstract save/restore sched_clock_state

restore_sched_clock_state() methods use percpu data, therefore
they must run after %gs is initialized, but before mtrr_bp_restore()
(due to lockstat using sched_clock).

Move it to the correct place.

Reported-and-tested-by: Konstantin Khlebnikov
Signed-off-by: Marcelo Tosatti
Cc: Avi Kivity
Signed-off-by: Ingo Molnar

Marcelo Tosatti
2012-04-02 19:53:00 +0800

29 Mar, 2012

3 commits

0195c0024 Merge tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/sc… ... Browse Code »
86

…m/linux/kernel/git/dhowells/linux-asm_system

Pull "Disintegrate and delete asm/system.h" from David Howells:
"Here are a bunch of patches to disintegrate asm/system.h into a set of
separate bits to relieve the problem of circular inclusion
dependencies.

I've built all the working defconfigs from all the arches that I can
and made sure that they don't break.

The reason for these patches is that I recently encountered a circular
dependency problem that came about when I produced some patches to
optimise get_order() by rewriting it to use ilog2().

This uses bitops - and on the SH arch asm/bitops.h drags in
asm-generic/get_order.h by a circuituous route involving asm/system.h.

The main difficulty seems to be asm/system.h. It holds a number of
low level bits with no/few dependencies that are commonly used (eg.
memory barriers) and a number of bits with more dependencies that
aren't used in many places (eg. switch_to()).

These patches break asm/system.h up into the following core pieces:

(1) asm/barrier.h

Move memory barriers here. This already done for MIPS and Alpha.

(2) asm/switch_to.h

Move switch_to() and related stuff here.

(3) asm/exec.h

Move arch_align_stack() here. Other process execution related bits
could perhaps go here from asm/processor.h.

(4) asm/cmpxchg.h

Move xchg() and cmpxchg() here as they're full word atomic ops and
frequently used by atomic_xchg() and atomic_cmpxchg().

(5) asm/bug.h

Move die() and related bits.

(6) asm/auxvec.h

Move AT_VECTOR_SIZE_ARCH here.

Other arch headers are created as needed on a per-arch basis."

Fixed up some conflicts from other header file cleanups and moving code
around that has happened in the meantime, so David's testing is somewhat
weakened by that. We'll find out anything that got broken and fix it..

* tag 'split-asm_system_h-for-linus-20120328' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-asm_system: (38 commits)
Delete all instances of asm/system.h
Remove all #inclusions of asm/system.h
Add #includes needed to permit the removal of asm/system.h
Move all declarations of free_initmem() to linux/mm.h
Disintegrate asm/system.h for OpenRISC
Split arch_align_stack() out from asm-generic/system.h
Split the switch_to() wrapper out of asm-generic/system.h
Move the asm-generic/system.h xchg() implementation to asm-generic/cmpxchg.h
Create asm-generic/barrier.h
Make asm-generic/cmpxchg.h #include asm-generic/cmpxchg-local.h
Disintegrate asm/system.h for Xtensa
Disintegrate asm/system.h for Unicore32 [based on ver #3, changed by gxt]
Disintegrate asm/system.h for Tile
Disintegrate asm/system.h for Sparc
Disintegrate asm/system.h for SH
Disintegrate asm/system.h for Score
Disintegrate asm/system.h for S390
Disintegrate asm/system.h for PowerPC
Disintegrate asm/system.h for PA-RISC
Disintegrate asm/system.h for MN10300
...

Linus Torvalds
2012-03-29 06:58:21 +0800
2e7580b0e Merge branch 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm ... Browse Code »

Pull kvm updates from Avi Kivity:
"Changes include timekeeping improvements, support for assigning host
PCI devices that share interrupt lines, s390 user-controlled guests, a
large ppc update, and random fixes."

This is with the sign-off's fixed, hopefully next merge window we won't
have rebased commits.

* 'kvm-updates/3.4' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (130 commits)
KVM: Convert intx_mask_lock to spin lock
KVM: x86: fix kvm_write_tsc() TSC matching thinko
x86: kvmclock: abstract save/restore sched_clock_state
KVM: nVMX: Fix erroneous exception bitmap check
KVM: Ignore the writes to MSR_K7_HWCR(3)
KVM: MMU: make use of ->root_level in reset_rsvds_bits_mask
KVM: PMU: add proper support for fixed counter 2
KVM: PMU: Fix raw event check
KVM: PMU: warn when pin control is set in eventsel msr
KVM: VMX: Fix delayed load of shared MSRs
KVM: use correct tlbs dirty type in cmpxchg
KVM: Allow host IRQ sharing for assigned PCI 2.3 devices
KVM: Ensure all vcpus are consistent with in-kernel irqchip settings
KVM: x86 emulator: Allow PM/VM86 switch during task switch
KVM: SVM: Fix CPL updates
KVM: x86 emulator: VM86 segments must have DPL 3
KVM: x86 emulator: Fix task switch privilege checks
arch/powerpc/kvm/book3s_hv.c: included linux/sched.h twice
KVM: x86 emulator: correctly mask pmc index bits in RDPMC instruction emulation
KVM: mmu_notifier: Flush TLBs before releasing mmu_lock
...

Linus Torvalds
2012-03-29 05:35:31 +0800
f05e798ad Disintegrate asm/system.h for X86 ... Browse Code »

Disintegrate asm/system.h for X86.

Signed-off-by: David Howells
Acked-by: H. Peter Anvin
cc: x86@kernel.org

David Howells
2012-03-29 01:11:12 +0800

20 Mar, 2012

1 commit

b74f05d61 x86: kvmclock: abstract save/restore sched_clock_state ... Browse Code »
43

Upon resume from hibernation, CPU 0's hvclock area contains the old
values for system_time and tsc_timestamp. It is necessary for the
hypervisor to update these values with uptodate ones before the CPU uses
them.

Abstract TSC's save/restore sched_clock_state functions and use
restore_state to write to KVM_SYSTEM_TIME MSR, forcing an update.

Also move restore_sched_clock_state before __restore_processor_state,
since the later calls CONFIG_LOCK_STAT's lockstat_clock (also for TSC).
Thanks to Igor Mammedov for tracking it down.

Fixes suspend-to-disk with kvmclock.

Reviewed-by: Thomas Gleixner
Signed-off-by: Marcelo Tosatti
Signed-off-by: Avi Kivity

Marcelo Tosatti
2012-03-20 18:37:45 +0800

22 Feb, 2012

1 commit

1361b83a1 i387: Split up <asm/i387.h> into exported and internal interfaces ... Browse Code »

While various modules include to get access to things we
actually *intend* for them to use, most of that header file was really
pretty low-level internal stuff that we really don't want to expose to
others.

So split the header file into two: the small exported interfaces remain
in , while the internal definitions that are only used by
core architecture code are now in .

The guiding principle for this was to expose functions that we export to
modules, and leave them in , while stuff that is used by
task switching or was marked GPL-only is in .

The fpu-internal.h file could be further split up too, especially since
arch/x86/kvm/ uses some of the remaining stuff for its module. But that
kvm usage should probably be abstracted out a bit, and at least now the
internal FPU accessor functions are much more contained. Even if it
isn't perhaps as contained as it _could_ be.

Signed-off-by: Linus Torvalds
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1202211340330.5354@i5.linux-foundation.org
Signed-off-by: H. Peter Anvin

Linus Torvalds
2012-02-22 06:12:54 +0800