29 Apr, 2008

1 commit


28 Apr, 2008

9 commits

  • If there's no VSA2 (ie, if we're using tinybios or OpenFirmware), use the
    GLIU's P2D Range Offset Descriptor to determine how much memory we have
    available for the framebuffer.

    Originally based on a patch by Jordan Crouse. Tested with OpenFirmware;
    Pascal informs me that tinybios has a stub that fills in P2D_RO0.

    Signed-off-by: Andres Salomon
    Cc: Jordan Crouse
    Cc: "Antonino A. Daplas"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andres Salomon
     
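    A minimal sketch of the computation described above. The MSR name and bit
    layout follow the GX/LX data sheets as I read them (PMAX in bits 39:20,
    PMIN in bits 19:0, both 4K page numbers); treat the details as assumptions
    rather than a quote from the patch:

        static unsigned int fb_size_from_p2d_ro0(void)
        {
            u32 lo, hi;
            unsigned int pages;

            /* P2D Range Offset Descriptor 0 describes the pages routed to
             * the framebuffer; the page count is (PMAX - PMIN) + 1 */
            rdmsr(MSR_GLIU_P2D_RO0, lo, hi);

            pages = ((hi & 0xff) << 12) | (lo >> 20);   /* PMAX */
            pages -= lo & 0x000fffff;                   /* PMIN */

            return (pages + 1) << 12;                   /* 4K pages -> bytes */
        }
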
  • ..Rather than using magic constants.

    Signed-off-by: Andres Salomon
    Cc: Jordan Crouse
    Cc: "Antonino A. Daplas"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andres Salomon
     
  • This is generic VSA2 detection. It's used by OLPC to determine whether or not
    the BIOS contains VSA2, but since other BIOSes are coming out that don't use
    the VSA (ie, tinybios), it might end up being useful for others.

    Signed-off-by: Andres Salomon
    Acked-by: Alan Cox
    Cc: Jordan Crouse
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andres Salomon
     
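    Roughly what the detection looks like: the VSA exposes virtual registers
    through an I/O-port index/data pair, and querying the signature register
    tells us whether VSA2 is present. The port numbers and signature value
    below are from memory and should be treated as assumptions:

        #define VSA_VRC_INDEX     0xAC1C
        #define VSA_VRC_DATA      0xAC1E
        #define VSA_VR_UNLOCK     0xFC53   /* unlock the virtual register index */
        #define VSA_VR_SIGNATURE  0x0003
        #define VSA_SIG           0x4132   /* ASCII '2A' */

        static int geode_has_vsa2(void)
        {
            u16 val;

            /* select and read the VSA signature virtual register */
            outw(VSA_VR_UNLOCK, VSA_VRC_INDEX);
            outw(VSA_VR_SIGNATURE, VSA_VRC_INDEX);
            val = inw(VSA_VRC_DATA);

            return val == VSA_SIG;
        }
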
  • This cleans up a few MSR-using drivers in the following manner:
      - Ensures MSRs are all defined in asm/geode.h, rather than in misc
        places
      - Makes the naming consistent: cs553[56] ones begin with MSR_,
        GX-specific ones start with MSR_GX_, and LX-specific ones start
        with MSR_LX_. Also, the names now match the data sheet.
      - Uses MSR names rather than numbers in source code
      - Documents the fact that the LX's MSR_PADSEL has the wrong value
        in the data sheet. That's, uh, good to note.

    Signed-off-by: Andres Salomon
    Acked-by: Jordan Crouse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andres Salomon
     
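    The shape of the change, with a hypothetical register (the name and value
    below are illustrative only, not quoted from the patch):

        /* before: a magic number in driver code */
        rdmsrl(0x48002001, val);                    /* which register is this? */

        /* after: a data-sheet name, defined once in asm/geode.h */
        #define MSR_GX_GLD_MSR_CONFIG  0x48002001   /* hypothetical value */
        rdmsrl(MSR_GX_GLD_MSR_CONFIG, val);
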
  • Huge ptes have a special type on s390 and cannot be handled with the standard
    pte functions in certain cases, e.g. because of a different location of the
    invalid bit. This patch adds some new architecture-specific functions to
    hugetlb common code, as a prerequisite for the s390 large page support.

    This won't affect other architectures in functionality, but I need to add some
    new dummy inline functions to the headers.

    Acked-by: Martin Schwidefsky
    Signed-off-by: Gerald Schaefer
    Cc: Paul Mundt
    Cc: "Luck, Tony"
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gerald Schaefer
     
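    For architectures that need nothing special, the new hooks reduce to the
    generic pte operations. A sketch of the dummy inlines (signatures inferred
    from the description, not copied from the patch):

        /* include/asm-foo/hugetlb.h */
        static inline void set_huge_pte_at(struct mm_struct *mm,
                                           unsigned long addr,
                                           pte_t *ptep, pte_t pte)
        {
            set_pte_at(mm, addr, ptep, pte);
        }

        static inline pte_t huge_ptep_get(pte_t *ptep)
        {
            return *ptep;
        }

        static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm,
                                                    unsigned long addr,
                                                    pte_t *ptep)
        {
            return ptep_get_and_clear(mm, addr, ptep);
        }
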
  • A COW break on a hugetlbfs page with page_count > 1 will set a new pte with
    set_huge_pte_at(), without any TLB flush operation. The old pte will remain
    in the TLB, and subsequent write access to the page will result in a page
    fault loop for as long as it may take until the TLB is flushed from
    somewhere else. This patch introduces an architecture-specific
    huge_ptep_clear_flush() function, which is called before the
    set_huge_pte_at() in hugetlb_cow().

    ATTENTION: This is just a nop on all architectures for now, the s390
    implementation will come with our large page patch later. Other architectures
    should define their own huge_ptep_clear_flush() if needed.

    Acked-by: Martin Schwidefsky
    Signed-off-by: Gerald Schaefer
    Cc: Paul Mundt
    Cc: "Luck, Tony"
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gerald Schaefer
     
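    A sketch of the no-op hook and its call site, per the text above (the
    hugetlb_cow() excerpt is abridged):

        /* include/asm-foo/hugetlb.h: no-op until an arch needs a real flush */
        static inline void huge_ptep_clear_flush(struct vm_area_struct *vma,
                                                 unsigned long addr, pte_t *ptep)
        {
        }

        /* mm/hugetlb.c, in hugetlb_cow(): flush the old pte before the new
         * one becomes visible */
        huge_ptep_clear_flush(vma, address, ptep);
        set_huge_pte_at(mm, address, ptep, make_huge_pte(vma, new_page, 1));
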
  • This patch moves all architecture functions for hugetlb to architecture header
    files (include/asm-foo/hugetlb.h) and converts all macros to inline functions.
    It also removes (!) ARCH_HAS_HUGEPAGE_ONLY_RANGE,
    ARCH_HAS_HUGETLB_FREE_PGD_RANGE, ARCH_HAS_PREPARE_HUGEPAGE_RANGE,
    ARCH_HAS_SETCLEAR_HUGE_PTE and ARCH_HAS_HUGETLB_PREFAULT_HOOK.

    Getting rid of the ARCH_HAS_xxx #ifdef and macro fugliness should increase
    readability and maintainability, at the price of some code duplication. An
    asm-generic common part would have reduced the loc, but we would end up with
    new ARCH_HAS_xxx defines eventually.

    Acked-by: Martin Schwidefsky
    Signed-off-by: Gerald Schaefer
    Cc: Paul Mundt
    Cc: "Luck, Tony"
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gerald Schaefer
     
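    The flavor of the conversion, using the prefault hook as an example (this
    is illustrative; the exact before/after text varies per architecture):

        /* before: opt-in define plus a macro */
        #define ARCH_HAS_HUGETLB_PREFAULT_HOOK
        #define hugetlb_prefault_arch_hook(mm)  do { } while (0)

        /* after: a typed inline in include/asm-foo/hugetlb.h, no ARCH_HAS_xxx */
        static inline void hugetlb_prefault_arch_hook(struct mm_struct *mm)
        {
        }
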
  • s390, for one, cannot implement VM_MIXEDMAP with pfn_valid, due to its memory
    model (which is more dynamic than most). Instead, the s390 people had
    proposed to implement it with an additional path through vm_normal_page(),
    using a bit in the pte to determine whether or not the page should be
    refcounted:

    vm_normal_page()
    {
        ...
        if (unlikely(vma->vm_flags & (VM_PFNMAP|VM_MIXEDMAP))) {
            if (vma->vm_flags & VM_MIXEDMAP) {
    #ifdef s390
                if (!mixedmap_refcount_pte(pte))
                    return NULL;
    #else
                if (!pfn_valid(pfn))
                    return NULL;
    #endif
                goto out;
            }
            ...
        }
    }

    This is fine; however, if we are allowed to use a bit in the pte to determine
    refcountedness, we can use that to _completely_ replace all the vma-based
    schemes. So instead of adding more cases to the already complex vma-based
    scheme, we can have a clearly separate and simple pte-based scheme (and get
    slightly better code generation in the process):

    vm_normal_page()
    {
    #ifdef s390
        if (!mixedmap_refcount_pte(pte))
            return NULL;
        return pte_page(pte);
    #else
        ...
    #endif
    }

    And finally, we would rather make this concept usable by any architecture
    than make it s390-only, so implement a new type of pte state for this.
    Unfortunately the old vma-based code must stay, because some architectures
    may not be able to spare pte bits. This makes vm_normal_page a little bit
    more ugly than we would like, but the two cases are clearly separate.

    So introduce a pte_special pte state, and use it in mm/memory.c. It is
    currently a noop for all architectures, so this doesn't actually result in any
    compiled code changes to mm/memory.o.

    BTW:
    I haven't put vm_normal_page() into arch code as-per an earlier suggestion.
    The reason is that, regardless of where vm_normal_page is actually
    implemented, the *abstraction* is still exactly the same. Also, while it
    depends on whether the architecture has pte_special or not, those are the
    only two possible cases, and it really isn't an arch-specific function --
    the role of the arch code should be to provide primitive functions and
    accessors with which to build the core code; pte_special does that. We do
    not want architectures to know or care about vm_normal_page itself, and
    we definitely don't want them being able to invent something new there
    out of sight of mm/ code. If we made vm_normal_page an arch function, then
    we have to make vm_insert_mixed (next patch) an arch function too. So I
    don't think moving it to arch code fundamentally improves any abstractions,
    while it does practically make the code more difficult to follow, for both
    mm and arch developers, and easier to misuse.

    [akpm@linux-foundation.org: build fix]
    Signed-off-by: Nick Piggin
    Acked-by: Carsten Otte
    Cc: Jared Hulbert
    Cc: Martin Schwidefsky
    Cc: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
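    The noop state amounts to two accessors that every architecture now
    provides in its pgtable.h; a sketch of the default versions:

        /* no pte bit to spare: nothing is ever special */
        static inline int pte_special(pte_t pte)
        {
            return 0;
        }

        static inline pte_t pte_mkspecial(pte_t pte)
        {
            return pte;
        }

    An architecture that can spare a pte bit implements these for real and
    thereby opts in to the pte-based scheme.
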
  • * 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm: (147 commits)
    KVM: kill file->f_count abuse in kvm
    KVM: MMU: kvm_pv_mmu_op should not take mmap_sem
    KVM: SVM: remove selective CR0 comment
    KVM: SVM: remove now obsolete FIXME comment
    KVM: SVM: disable CR8 intercept when tpr is not masking interrupts
    KVM: SVM: sync V_TPR with LAPIC.TPR if CR8 write intercept is disabled
    KVM: export kvm_lapic_set_tpr() to modules
    KVM: SVM: sync TPR value to V_TPR field in the VMCB
    KVM: ppc: PowerPC 440 KVM implementation
    KVM: Add MAINTAINERS entry for PowerPC KVM
    KVM: ppc: Add DCR access information to struct kvm_run
    ppc: Export tlb_44x_hwater for KVM
    KVM: Rename debugfs_dir to kvm_debugfs_dir
    KVM: x86 emulator: fix lea to really get the effective address
    KVM: x86 emulator: fix smsw and lmsw with a memory operand
    KVM: x86 emulator: initialize src.val and dst.val for register operands
    KVM: SVM: force a new asid when initializing the vmcb
    KVM: fix kvm_vcpu_kick vs __vcpu_run race
    KVM: add ioctls to save/store mpstate
    KVM: Rename VCPU_MP_STATE_* to KVM_MP_STATE_*
    ...

    Linus Torvalds
     

27 Apr, 2008

30 commits

  • So userspace can save/restore the mpstate during migration.

    [avi: export the #define constants describing the value]
    [christian: add s390 stubs]
    [avi: ditto for ia64]

    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Christian Borntraeger
    Signed-off-by: Carsten Otte
    Signed-off-by: Avi Kivity

    Marcelo Tosatti
     
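    Userspace usage, roughly (the ioctl names are from the series; the error
    handling is illustrative):

        struct kvm_mp_state mp_state;

        /* save on the source host ... */
        if (ioctl(vcpu_fd, KVM_GET_MP_STATE, &mp_state) < 0)
            err(1, "KVM_GET_MP_STATE");

        /* ... restore on the destination after migration */
        if (ioctl(vcpu_fd, KVM_SET_MP_STATE, &mp_state) < 0)
            err(1, "KVM_SET_MP_STATE");
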
  • We wish to export it to userspace, so move it into the kvm namespace.

    Signed-off-by: Avi Kivity

    Avi Kivity
     
  • Trace markers allow userspace to trace execution of a virtual machine
    in order to monitor its performance.

    Signed-off-by: Feng (Eric) Liu
    Signed-off-by: Avi Kivity

    Feng (Eric) Liu
     
  • To properly forward an MCE that occurred while the guest is running to the
    host, we have to intercept this exception and call the host handler by hand.
    This patch implements that.

    Signed-off-by: Joerg Roedel
    Signed-off-by: Avi Kivity

    Joerg Roedel
     
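    The "by hand" part is small; a sketch of the SVM intercept handler
    (vector 18 is #MC; the function and struct names are assumptions based on
    the SVM code of the time):

        static int mc_interception(struct vcpu_svm *svm, struct kvm_run *kvm_run)
        {
            /* the host's #MC handler is not invoked automatically on an
             * intercepted machine check, so raise the vector ourselves */
            asm volatile("int $0x12");
            return 1;
        }
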
  • This patch introduces a gfn_to_pfn() function and corresponding functions like
    kvm_release_pfn_dirty(). Using these new functions, we can modify the x86
    MMU to no longer assume that it can always get a struct page for any given gfn.

    We don't want to eliminate gfn_to_page() entirely because a number of places
    assume they can do gfn_to_page() and then kmap() the results. When we support
    IO memory, gfn_to_page() will fail for IO pages although gfn_to_pfn() will
    succeed.

    This does not implement support for avoiding reference counting for reserved
    RAM or for IO memory. However, it should make those things pretty
    straightforward.

    Since we're only introducing new common symbols, I don't think it will break
    the non-x86 architectures but I haven't tested those. I've tested Intel,
    AMD, NPT, and hugetlbfs with Windows and Linux guests.

    [avi: fix overflow when shifting left pfns by adding casts]

    Signed-off-by: Anthony Liguori
    Signed-off-by: Avi Kivity

    Anthony Liguori
     
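    The shape of the new API, with gfn_to_page() reduced to a wrapper over it
    (a sketch; the IO-page fallback is abridged):

        pfn_t gfn_to_pfn(struct kvm *kvm, gfn_t gfn);
        void kvm_release_pfn_clean(pfn_t pfn);
        void kvm_release_pfn_dirty(pfn_t pfn);

        /* kept for callers that go on to kmap() the result */
        struct page *gfn_to_page(struct kvm *kvm, gfn_t gfn)
        {
            pfn_t pfn = gfn_to_pfn(kvm, gfn);

            if (pfn_valid(pfn))
                return pfn_to_page(pfn);

            /* IO memory: there is no struct page to hand out */
            return bad_page;
        }
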
  • The kvm_host.h file for x86 declares the functions kvm_set_cr[0348]. In the
    header file, their second parameter is named cr0 in all cases. This patch
    renames the parameters so that they match the function names.

    Signed-off-by: Joerg Roedel
    Signed-off-by: Avi Kivity

    Joerg Roedel
     
  • Unify slots_lock acquisition around vcpu_run(). This is simpler and less
    error-prone.

    Also fix some callsites that were not grabbing the lock properly.

    [avi: drop slots_lock while in guest mode to avoid holding the lock
    for indefinite periods]

    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Marcelo Tosatti
     
  • This emulates the x86 hardware task-switch mechanism in software, as it is
    not supported by either vmx or svm. It allows operating systems which use
    it, like FreeDOS, to run as kvm guests.

    Signed-off-by: Izik Eidus
    Signed-off-by: Avi Kivity

    Izik Eidus
     
  • Signed-off-by: Izik Eidus
    Signed-off-by: Avi Kivity

    Izik Eidus
     
  • Signed-off-by: Avi Kivity

    Avi Kivity
     
  • It will allow external users to call it. It is mainly useful for routines
    that override the machine_ops field for their own special purposes but
    want to call the normal shutdown routine after they're done (see the
    sketch below).

    Signed-off-by: Glauber Costa
    Signed-off-by: Avi Kivity

    Glauber Costa
     
  • This patch allows machine_crash_shutdown to be replaced, just like any of
    the other functions in machine_ops (a sketch of such an override follows
    below).

    Signed-off-by: Glauber Costa
    Signed-off-by: Avi Kivity

    Glauber Costa
     
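    How the two pieces fit together: a client can now install its own hook in
    machine_ops and still chain to the exported native routine. A sketch
    (kvm_disable_clock() is hypothetical cleanup, not from these patches):

        static void kvm_crash_shutdown(struct pt_regs *regs)
        {
            kvm_disable_clock();                    /* hypothetical cleanup */
            native_machine_crash_shutdown(regs);    /* exported native routine */
        }

        /* at init time */
        machine_ops.crash_shutdown = kvm_crash_shutdown;
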
  • Hypercall based pte updates are faster than faults, and also allow use
    of the lazy MMU mode to batch operations.

    Don't report the feature if two dimensional paging is enabled.

    [avi:
    - one mmu_op hypercall instead of one per op
    - allow 64-bit gpa on hypercall
    - don't pass host errors (-ENOMEM) to guest]

    [akpm: warning fix on i386]

    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Andrew Morton
    Signed-off-by: Avi Kivity

    Marcelo Tosatti
     
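    The ABI, roughly as merged: a single hypercall carries a buffer of
    operations, each led by a header (struct layout per the series; treat the
    exact values as assumptions):

        #define KVM_HC_MMU_OP            2

        #define KVM_MMU_OP_WRITE_PTE     1
        #define KVM_MMU_OP_FLUSH_TLB     2
        #define KVM_MMU_OP_RELEASE_PT    3

        struct kvm_mmu_op_header {
            __u32 op;
            __u32 pad;
        };

        struct kvm_mmu_op_write_pte {
            struct kvm_mmu_op_header header;
            __u64 pte_phys;
            __u64 pte_val;
        };
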
  • Signed-off-by: Avi Kivity

    Avi Kivity
     
  • Add basic KVM paravirt support. Avoid vm-exits on IO delays.

    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Marcelo Tosatti
     
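    The IO-delay part is tiny; a sketch of the guest side (function names as
    in the guest series):

        /* port-0x80 writes become pure no-ops under KVM */
        static void kvm_io_delay(void)
        {
        }

        static void __init paravirt_ops_setup(void)
        {
            pv_info.name = "KVM";

            if (kvm_para_has_feature(KVM_FEATURE_NOP_IO_DELAY))
                pv_cpu_ops.io_delay = kvm_io_delay;
        }
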
  • Signed-off-by: Sheng Yang
    Signed-off-by: Avi Kivity

    Sheng Yang
     
  • The patch moves the PIT model from userspace to kernel, and increases
    the timer accuracy greatly.

    [marcelo: make last_injected_time per-guest]

    Signed-off-by: Sheng Yang
    Signed-off-by: Marcelo Tosatti
    Tested-and-Acked-by: Alex Davis
    Signed-off-by: Avi Kivity

    Sheng Yang
     
  • Names like 'set_cr3()' look dangerously close to affecting the host.

    Signed-off-by: Avi Kivity

    Avi Kivity
     
  • Create large page mappings if the guest PTEs are marked as such and
    the underlying memory is hugetlbfs backed. If the largepage contains
    write-protected pages, a large pte is not used.

    Gives a consistent 2% improvement for data copies on ram mounted
    filesystem, without NPT/EPT.

    Anthony measures a 4% improvement on 4-way kernbench, with NPT.

    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Marcelo Tosatti
     
  • Mark zapped root pagetables as invalid and ignore such pages during lookup.

    This is a problem with the cr3-target feature, where a zapped root table fools
    the faulting code into creating a read-only mapping. The result is a lockup
    if the instruction can't be emulated.

    Signed-off-by: Marcelo Tosatti
    Cc: Anthony Liguori
    Signed-off-by: Avi Kivity

    Marcelo Tosatti
     
  • Signed-off-by: Amit Shah
    Signed-off-by: Avi Kivity

    Amit Shah
     
  • This is the host part of kvm clocksource implementation. As it does
    not include clockevents, it is a fairly simple implementation. We
    only have to register a per-vcpu area, and start writing to it periodically.

    The area is binary compatible with xen, as we use the same shadow_info
    structure.

    [marcelo: fix bad_page on MSR_KVM_SYSTEM_TIME]
    [avi: save full value of the msr, even if enable bit is clear]
    [avi: clear previous value of time_page]

    Signed-off-by: Glauber de Oliveira Costa
    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Glauber de Oliveira Costa
     
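    For context, what a guest does with this: it hands the host the physical
    address of a per-vcpu area via an MSR, with the low bit set to enable
    periodic updates. A sketch (MSR and struct names per the companion guest
    series; the details here are assumptions):

        static DEFINE_PER_CPU(struct kvm_vcpu_time_info, hv_clock);

        static void kvm_register_clock(void)
        {
            u64 pa = __pa(&per_cpu(hv_clock, smp_processor_id()));

            /* bit 0 asks the host to start writing the area periodically */
            wrmsrl(MSR_KVM_SYSTEM_TIME, pa | 1);
        }
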
  • The load_pdptrs() function is required in the SVM module for NPT support.

    Signed-off-by: Joerg Roedel
    Signed-off-by: Avi Kivity

    Joerg Roedel
     
  • The generic x86 code has to know if the specific implementation uses Nested
    Paging. In the generic code Nested Paging is called Two Dimensional Paging
    (TDP) to avoid confusion with (future) TDP implementations of other vendors.
    This patch exports the availability of TDP to the generic x86 code.

    Signed-off-by: Joerg Roedel
    Signed-off-by: Avi Kivity

    Joerg Roedel
     
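    The export is deliberately minimal; a sketch of the generic side and the
    SVM caller:

        /* arch/x86/kvm/mmu.c */
        bool tdp_enabled = false;

        void kvm_enable_tdp(void)
        {
            tdp_enabled = true;
        }
        EXPORT_SYMBOL_GPL(kvm_enable_tdp);

        /* svm.c, at hardware setup time */
        if (npt_enabled)
            kvm_enable_tdp();

    The generic MMU then keys its behavior off tdp_enabled instead of asking
    the vendor module.
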
  • This patch gives the SVM and VMX implementations the ability to add some
    bits the guest can set in its EFER register.

    Signed-off-by: Joerg Roedel
    Signed-off-by: Avi Kivity

    Joerg Roedel
     
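    A sketch of the mechanism: generic code keeps a reserved-bits mask for
    EFER, and vendor modules clear bits out of it (the initial mask value is
    abridged here):

        static u64 __read_mostly efer_reserved_bits = ~0ULL;   /* abridged */

        void kvm_enable_efer_bits(u64 mask)
        {
            /* the guest may now set these bits in EFER */
            efer_reserved_bits &= ~mask;
        }
        EXPORT_SYMBOL_GPL(kvm_enable_efer_bits);
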
  • To allow TLB entries to be retained across VM entry and VM exit, the VMM
    can now identify distinct address spaces through a new virtual-processor ID
    (VPID) field of the VMCS.

    [avi: drop vpid_sync_all()]
    [avi: add "cc" to asm constraints]

    Signed-off-by: Sheng Yang
    Signed-off-by: Avi Kivity

    Sheng Yang
     
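    Each vcpu gets a VPID out of a global bitmap, with 0 reserved to mean "no
    VPID"; a sketch (names per the vmx code of the time, treated as
    assumptions):

        static DEFINE_SPINLOCK(vmx_vpid_lock);
        static DECLARE_BITMAP(vmx_vpid_bitmap, VMX_NR_VPIDS);

        static void allocate_vpid(struct vcpu_vmx *vmx)
        {
            int vpid;

            vmx->vpid = 0;    /* 0 = share the host's TLB tag */
            spin_lock(&vmx_vpid_lock);
            vpid = find_first_zero_bit(vmx_vpid_bitmap, VMX_NR_VPIDS);
            if (vpid < VMX_NR_VPIDS) {
                vmx->vpid = vpid;
                __set_bit(vpid, vmx_vpid_bitmap);
            }
            spin_unlock(&vmx_vpid_lock);
        }
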
  • Signed-off-by: Yaozu (Eddie) Dong
    Signed-off-by: Avi Kivity

    Dong, Eddie
     
  • OK, so 25-mm1 gave a lockdep error which made me look into this.

    The first thing that I noticed was the horrible mess; the second thing I
    saw was hacks like: 71e93d15612c61c2e26a169567becf088e71b8ff

    The problem is that arch idle routines are somewhat inconsistent with
    their IRQ state handling and, instead of fixing _that_, we go and paper
    over the problem.

    So the thing I've tried to do is set a standard for idle routines and
    fix them all up to adhere to it. The rules are:

    idle routines are entered with IRQs disabled
    idle routines will exit with IRQs enabled

    Nearly all already did this in one form or another.

    Merge the 32 and 64 bit bits so they no longer have different bugs.

    As for the actual lockdep warning; __sti_mwait() did a plainly un-annotated
    irq-enable.

    Signed-off-by: Peter Zijlstra
    Tested-by: Bob Copeland
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     
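    The contract in code form, as a sketch of what each idle routine now
    looks like:

        static void some_idle(void)
        {
            /* entered with IRQs disabled */
            if (!need_resched())
                safe_halt();        /* sti; hlt -- wakes up with IRQs enabled */
            else
                local_irq_enable();
            /* exits with IRQs enabled */
        }
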
  • …nux-2.6-x86-bigbox-bootmem-v3

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/x86/linux-2.6-x86-bigbox-bootmem-v3:
    x86_64/mm: check and print vmemmap allocation continuous
    x86_64: fix setup_node_bootmem to support big mem excluding with memmap
    x86_64: make reserve_bootmem_generic() use new reserve_bootmem()
    mm: allow reserve_bootmem() cross nodes
    mm: offset align in alloc_bootmem()
    mm: fix alloc_bootmem_core to use fast searching for all nodes
    mm: make mem_map allocation continuous

    Linus Torvalds
     
  • Typical case: a four-socket system, every node has 4G of RAM, and we are
    using:

    memmap=10g$4g

    to mask out memory on node1 and node2.

    When NUMA is enabled, early_node_mem is used to get node_data and
    node_bootmap.

    If it cannot get memory from the same node with find_e820_area(), it will
    use alloc_bootmem to get a buffer from previous nodes.

    So check for that case and print out some info about it.

    We need to move early_res_to_bootmem into every setup_node_bootmem call,
    so that it takes the range that the node has; otherwise alloc_bootmem
    could return an address that was reserved early.

    Depends on "mm: make reserve_bootmem can crossed the nodes".

    Signed-off-by: Yinghai Lu
    Signed-off-by: Ingo Molnar

    Yinghai Lu