21 Sep, 2010
1 commit
-
We used to have a hypercall which reloaded the entire GDT, then we
switched to one which loaded a single entry (to match the IDT code).Some comments were not updated, so fix them.
Signed-off-by: Rusty Russell
Reported by: Eviatar Khen
29 May, 2010
1 commit
14 Apr, 2010
1 commit
-
This is a partial revert of 4cd8b5e2a159 "lguest: use KVM hypercalls";
we revert to using (just as questionable but more reliable) int $15 for
hypercalls. I didn't revert the register mapping, so we still use the
same calling convention as kvm.KVM in more recent incarnations stopped injecting a fault when a guest
tried to use the VMCALL instruction from ring 1, so lguest under kvm
fails to make hypercalls. It was nice to share code with our KVM
cousins, but this was overreach.Signed-off-by: Rusty Russell
Cc: Matias Zabaljauregui
Cc: Avi Kivity
15 Mar, 2010
1 commit
-
acpi=ht was important in 2003 -- before ACPI was
universally deployed and enabled by default in
the major Linux distributions.At that time, there were a fair number of people who
or chose to, or needed to, run with acpi=off,
yet also wanted access to Hyper-threading.Today we find that many invocations of "acpi=ht"
are accidental, and thus is it possible that it
is doing more harm than good.In 2.6.34, we warn on invocation of acpi=ht.
In 2.6.35, we delete the boot option.Signed-off-by: Len Brown
23 Sep, 2009
1 commit
-
We used to defer it, so lockdep was happy. We now init lockdep early
anyway, so just do it after that.Signed-off-by: Rusty Russell
16 Sep, 2009
1 commit
-
get/set_wallclock() have already a set of platform dependent
implementations (default, EFI, paravirt). MRST will add another
variant.Moving them to platform ops simplifies the existing code and minimizes
the effort to integrate new variants.Signed-off-by: Feng Tang
LKML-Reference:
Signed-off-by: Thomas Gleixner
31 Aug, 2009
3 commits
-
TSC calibration is modified by the vmware hypervisor and paravirt by
separate means. Moorestown wants to add its own calibration routine as
well. So make calibrate_tsc a proper x86_init_ops function and
override it by paravirt or by the early setup of the vmware
hypervisor.Signed-off-by: Thomas Gleixner
-
The timer init code is convoluted with several quirks and the paravirt
timer chooser. Figuring out which code path is actually taken is not
for the faint hearted.Move the numaq TSC quirk to tsc_pre_init x86_init_ops function and
replace the paravirt time chooser and the remaining x86 quirk with a
simple x86_init_ops function.Signed-off-by: Thomas Gleixner
-
irq_init is overridden by x86_quirks and by paravirts. Unify the whole
mess and make it an unconditional x86_init_ops function which defaults
to the standard function and can be overridden by the early platform
code.Signed-off-by: Thomas Gleixner
27 Aug, 2009
1 commit
-
memory_setup is overridden by x86_quirks and by paravirts with weak
functions and quirks. Unify the whole mess and make it an
unconditional x86_init_ops function which defaults to the standard
function and can be overridden by the early platform code.Signed-off-by: Thomas Gleixner
30 Jul, 2009
2 commits
-
Every so often, after code shuffles, I need to go through and unbitrot
the Lguest Journey (see drivers/lguest/README). Since we now use RCU in
a simple form in one place I took the opportunity to expand that explanation.Signed-off-by: Rusty Russell
Cc: Ingo Molnar
Cc: Paul McKenney -
I don't really notice it (except to begrudge the extra vertical
space), but Ingo does. And he pointed out that one excuse of lguest
is as a teaching tool, it should set a good example.Signed-off-by: Rusty Russell
Cc: Ingo Molnar
17 Jul, 2009
2 commits
-
Avoid the following:
[ 0.012093] WARNING: at arch/x86/kernel/apic/apic.c:249 native_apic_write_dummy+0x2f/0x40()Rather than chase each new cpuid-detected feature, just lie about the highest
valid CPUID so this code is never run.Signed-off-by: Rusty Russell
-
fix: "make Guest" was complaining about duplicated G:032
Signed-off-by: Matias Zabaljauregui
Signed-off-by: Rusty Russell
12 Jun, 2009
7 commits
-
This version requires that host and guest have the same PAE status.
NX cap is not offered to the guest, yet.Signed-off-by: Matias Zabaljauregui
Signed-off-by: Rusty Russell -
Add support for kvm_hypercall4(); PAE wants it.
Signed-off-by: Matias Zabaljauregui
Signed-off-by: Rusty Russell -
replace LHCALL_SET_PMD with LHCALL_SET_PGD hypercall name
(That's really what it is, and the confusion gets worse with PAE support)Signed-off-by: Matias Zabaljauregui
Signed-off-by: Rusty Russell
Reported-by: Jeremy Fitzhardinge -
Some cleanups and replace direct assignment with native_set_* macros which properly handle 64-bit entries when PAE is activated
Signed-off-by: Matias Zabaljauregui
Signed-off-by: Rusty Russell -
The downside of the last patch which made restore_flags and irq_enable
check interrupts is that they are now too big to be patched directly
into the callsites, so the C versions are always used.But the C versions go via PV_CALLEE_SAVE_REGS_THUNK which saves all
the registers. In fact, we don't need any registers in the fast path,
so we can do better than this if we actually code them in assembler.The results are in the noise, but since it's about the same amount of
code, it's worth applying.1GB Guest->Host: input(suppressed),output(suppressed)
Before:
Seconds: 0:16.53
Packets: 377268,753673
Interrupts: 22461,24297
Notifications: 1(5245),21303(732370)
Net IRQs triggered: 377023(245),42578(711095)After:
Seconds: 0:16.48
Packets: 377289,753673
Interrupts: 22281,24465
Notifications: 1(5245),21296(732377)
Net IRQs triggered: 377060(229),42564(711109)Signed-off-by: Rusty Russell
-
lguest never checked for pending interrupts when enabling interrupts, and
things still worked. However, it makes a significant difference to TCP
performance, so it's time we fixed it by introducing a pending_irq flag
and checking it on irq_restore and irq_enable.These two routines are now too big to patch into the 8/10 bytes
patch space, so we drop that code.Note: The high latency on interrupt delivery had a very curious
effect: once everything else was optimized, networking without GSO was
faster than networking with GSO, since more interrupts were sent and
hence a greater chance of one getting through to the Guest!Note2: (Almost) Closing the same loophole for iret doesn't have any
measurable effect, so I'm leaving that patch for the moment.Before:
1GB tcpblast Guest->Host: 30.7 seconds
1GB tcpblast Guest->Host (no GSO): 76.0 secondsAfter:
1GB tcpblast Guest->Host: 6.8 seconds
1GB tcpblast Guest->Host (no GSO): 27.8 secondsSigned-off-by: Rusty Russell
-
Copy from arch/x86/kernel/irqinit_32.c: we don't use the vectors beyond
LGUEST_IRQS (if any), but we might as well set them all.Signed-off-by: Rusty Russell
11 Jun, 2009
2 commits
-
* 'x86-xen-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (42 commits)
xen: cache cr0 value to avoid trap'n'emulate for read_cr0
xen/x86-64: clean up warnings about IST-using traps
xen/x86-64: fix breakpoints and hardware watchpoints
xen: reserve Xen start_info rather than e820 reserving
xen: add FIX_TEXT_POKE to fixmap
lguest: update lazy mmu changes to match lguest's use of kvm hypercalls
xen: honour VCPU availability on boot
xen: add "capabilities" file
xen: drop kexec bits from /sys/hypervisor since kexec isn't implemented yet
xen/sys/hypervisor: change writable_pt to features
xen: add /sys/hypervisor support
xen/xenbus: export xenbus_dev_changed
xen: use device model for suspending xenbus devices
xen: remove suspend_cancel hook
xen/dev-evtchn: clean up locking in evtchn
xen: export ioctl headers to userspace
xen: add /dev/xen/evtchn driver
xen: add irq_from_evtchn
xen: clean up gate trap/interrupt constants
xen: set _PAGE_NX in __supported_pte_mask before pagetable construction
... -
* 'irq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (76 commits)
x86, apic: Fix dummy apic read operation together with broken MP handling
x86, apic: Restore irqs on fail paths
x86: Print real IOAPIC version for x86-64
x86: enable_update_mptable should be a macro
sparseirq: Allow early irq_desc allocation
x86, io-apic: Don't mark pin_programmed early
x86, irq: don't call mp_config_acpi_gsi() if update_mptable is not enabled
x86, irq: update_mptable needs pci_routeirq
x86: don't call read_apic_id if !cpu_has_apic
x86, apic: introduce io_apic_irq_attr
x86/pci: add 4 more return parameters to IO_APIC_get_PCI_irq_vector(), fix
x86: read apic ID in the !acpi_lapic case
x86: apic: Fixmap apic address even if apic disabled
x86: display extended apic registers with print_local_APIC and cpu_debug code
x86: read apic ID in the !acpi_lapic case
x86: clean up and fix setup_clear/force_cpu_cap handling
x86: apic: Check rev 3 fadt correctly for physical_apic bit
x86/pci: update pirq_enable_irq() to setup io apic routing
x86/acpi: move setup io apic routing out of CONFIG_ACPI scope
x86/pci: add 4 more return parameters to IO_APIC_get_PCI_irq_vector()
...
05 Jun, 2009
1 commit
-
We don't set up the canary; let's disable stack protector on boot.c so
we can get into lguest_init, then set it up. As a side effect,
switch_to_new_gdt() sets up %fs for us properly too.Signed-off-by: Rusty Russell
Acked-by: Tejun Heo
Signed-off-by: Linus Torvalds
08 May, 2009
1 commit
-
Conflicts:
arch/frv/include/asm/pgtable.h
arch/x86/include/asm/required-features.h
arch/x86/xen/mmu.cMerge reason: x86/xen was on a .29 base still, move it to a fresher
branch and pick up Xen fixes as well, plus resolve
conflictsSigned-off-by: Ingo Molnar
28 Apr, 2009
1 commit
-
This simplifies the node awareness of the code. All our allocators
only deal with a NUMA node ID locality not with CPU ids anyway - so
there's no need to maintain (and transform) a CPU id all across the
IRq layer.v2: keep move_irq_desc related
[ Impact: cleanup, prepare IRQ code to be NUMA-aware ]
Signed-off-by: Yinghai Lu
Cc: Andrew Morton
Cc: Suresh Siddha
Cc: "Eric W. Biederman"
Cc: Rusty Russell
Cc: Jeremy Fitzhardinge
LKML-Reference:
Signed-off-by: Ingo Molnar
22 Apr, 2009
1 commit
-
Pass clocksource pointer to the read() callback for clocksources. This
allows us to share the callback between multiple instances.[hugh@veritas.com: fix powerpc build of clocksource pass clocksource mods]
[akpm@linux-foundation.org: cleanup]
Signed-off-by: Magnus Damm
Acked-by: John Stultz
Cc: Thomas Gleixner
Signed-off-by: Hugh Dickins
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
19 Apr, 2009
1 commit
-
Fixes guest crash 'lguest: bad read address 0x4800000 len 256'
The new per-cpu allocator ends up handing a non-linear address to
write_gdt_entry. We do __pa() on it, and hand it to the host, which
kills us.I've long wanted to make the hypercall "LOAD_GDT_ENTRY" to match the IDT
code, but had no pressing reason until now.Signed-off-by: Rusty Russell
Cc: lguest@ozlabs.org
08 Apr, 2009
2 commits
-
Duplicate hcall -> kvm_hypercall0 convertion from "lguest: use KVM
hypercalls".Signed-off-by: Jeremy Fitzhardinge
Cc: Matias Zabaljauregui
Cc: Rusty Russell -
* commit 'origin/master': (4825 commits)
Fix build errors due to CONFIG_BRANCH_TRACER=y
parport: Use the PCI IRQ if offered
tty: jsm cleanups
Adjust path to gpio headers
KGDB_SERIAL_CONSOLE check for module
Change KCONFIG name
tty: Blackin CTS/RTS
Change hardware flow control from poll to interrupt driven
Add support for the MAX3100 SPI UART.
lanana: assign a device name and numbering for MAX3100
serqt: initial clean up pass for tty side
tty: Use the generic RS485 ioctl on CRIS
tty: Correct inline types for tty_driver_kref_get()
splice: fix deadlock in splicing to file
nilfs2: support nanosecond timestamp
nilfs2: introduce secondary super block
nilfs2: simplify handling of active state of segments
nilfs2: mark minor flag for checkpoint created by internal operation
nilfs2: clean up sketch file
nilfs2: super block operations fix endian bug
...Conflicts:
arch/x86/include/asm/thread_info.h
arch/x86/lguest/boot.c
drivers/xen/manage.c
31 Mar, 2009
1 commit
-
* git://git.kernel.org/pub/scm/linux/kernel/git/rusty/linux-2.6-lguest-and-virtio:
lguest: barrier me harder
lguest: use bool instead of int
lguest: use KVM hypercalls
lguest: wire up pte_update/pte_update_defer
lguest: fix spurious BUG_ON() on invalid guest stack.
virtio: more neatening of virtio_ring macros.
virtio: fix BAD_RING, START_US and END_USE macros
30 Mar, 2009
4 commits
-
Impact: cleanup
This patch allow us to use KVM hypercalls
Signed-off-by: Matias Zabaljauregui
Signed-off-by: Rusty Russell -
Impact: intermittent guest segv/crash fix
I've been seeing random guest bad address crashes and segmentation faults:
bisect led to 4f98a2fee8 (vmscan: split LRU lists into anon & file sets),
but that's a red herring.It turns out that lguest never hooked up the pte_update/pte_update_defer
calls, so our ptes were not always in sync. After the vmscan commit, the
bug became reproducible; now a fsck in a 64MB guest causes reproducible
pagetable corruption.Signed-off-by: Rusty Russell
Cc: jeremy@xensource.com
Cc: virtualization@lists.osdl.org
Cc: stable@kernel.org -
Impact: fix lazy context switch API
Pass the previous and next tasks into the context switch start
end calls, so that the called functions can properly access the
task state (esp in end_context_switch, in which the next task
is not yet completely current).Signed-off-by: Jeremy Fitzhardinge
Acked-by: Peter Zijlstra -
Impact: allow preemption during lazy mmu updates
If we're in lazy mmu mode when context switching, leave
lazy mmu mode, but remember the task's state in
TIF_LAZY_MMU_UPDATES. When we resume the task, check this
flag and re-enter lazy mmu mode if its set.This sets things up for allowing lazy mmu mode while preemptible,
though that won't actually be active until the next change.Signed-off-by: Jeremy Fitzhardinge
Acked-by: Peter Zijlstra
28 Mar, 2009
1 commit
-
Conflicts:
arch/parisc/kernel/irq.c
arch/x86/include/asm/fixmap_64.h
arch/x86/include/asm/setup.h
kernel/irq/handle.cSemantic merge:
arch/x86/include/asm/fixmap.hSigned-off-by: Ingo Molnar
15 Mar, 2009
1 commit
-
Impact: use new interface instead of previous ad hoc implementation
Rather than having special purpose init_pg_table_start/end variables
to delimit the kernel pagetable built by head_32.S, just use the brk
mechanism to extend the bss for the new pagetable.This patch removes init_pg_table_start/end and pg0, defines __brk_base
(which is page-aligned and immediately follows _end), initializes
the brk region to start there, and uses it for the 32-bit pagetable.Signed-off-by: Jeremy Fitzhardinge
Signed-off-by: H. Peter Anvin
11 Mar, 2009
1 commit
09 Mar, 2009
2 commits
-
Impact: remove lots of lguest boot WARN_ON() when CONFIG_SPARSE_IRQ=y
We now need to call irq_to_desc_alloc_cpu() before
set_irq_chip_and_handler_name(), but we can't do that from init_IRQ (no
kmalloc available).So do it as we use interrupts instead. Also means we only alloc for
irqs we use, which was the intent of CONFIG_SPARSE_IRQ anyway.Signed-off-by: Rusty Russell
Cc: Ingo Molnar -
Impact: fix lguest boot crash on modern Intel machines
The code in early_init_intel does:
if (c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xd)) {
u64 misc_enable;rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);
And that rdmsr faults (not allowed from non-0 PL). We can get around
this by mugging the family ID part of the cpuid. 5 seems like a good
number.Of course, this is a hack (how very lguest!). We could just indicate
that we don't support MSRs, or implement lguest_rdmst.Reported-by: Patrick McHardy
Signed-off-by: Rusty Russell
Tested-by: Patrick McHardy