Eric Lee / smarc-fsl-linux-kernel

16 Dec, 2011

2 commits

42ebfc61c Merge branch 'stable/for-linus-fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

* 'stable/for-linus-fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/swiotlb: Use page alignment for early buffer allocation.
xen: only limit memory map to maximum reservation for domain 0.

Linus Torvalds
2011-12-16 02:52:40 +0800
d3db72812 xen: only limit memory map to maximum reservation for domain 0.

d312ae878b6a "xen: use maximum reservation to limit amount of usable RAM"
clamped the total amount of RAM to the current maximum reservation. This is
correct for dom0 but is not correct for guest domains. To boot a guest
"pre-ballooned" (e.g. with memory=1G but maxmem=2G) so that its memory can be
expanded later, the guest must derive max_pfn from the e820 provided by
the toolstack and not from the current maximum reservation (which reflects only
the current maximum, not the guest lifetime maximum). The existing algorithm
already behaves correctly if we do not artificially limit the maximum
number of pages for the guest case.

For a guest booted with maxmem=512, memory=128 this results in:
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] Xen: 0000000000000000 - 00000000000a0000 (usable)
[ 0.000000] Xen: 00000000000a0000 - 0000000000100000 (reserved)
-[ 0.000000] Xen: 0000000000100000 - 0000000008100000 (usable)
-[ 0.000000] Xen: 0000000008100000 - 0000000020800000 (unusable)
+[ 0.000000] Xen: 0000000000100000 - 0000000020800000 (usable)
...
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] DMI not present or invalid.
[ 0.000000] e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
[ 0.000000] e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
-[ 0.000000] last_pfn = 0x8100 max_arch_pfn = 0x1000000
+[ 0.000000] last_pfn = 0x20800 max_arch_pfn = 0x1000000
[ 0.000000] initial memory mapped : 0 - 027ff000
[ 0.000000] Base memory trampoline at [c009f000] 9f000 size 4096
-[ 0.000000] init_memory_mapping: 0000000000000000-0000000008100000
-[ 0.000000] 0000000000 - 0008100000 page 4k
-[ 0.000000] kernel direct mapping tables up to 8100000 @ 27bb000-27ff000
+[ 0.000000] init_memory_mapping: 0000000000000000-0000000020800000
+[ 0.000000] 0000000000 - 0020800000 page 4k
+[ 0.000000] kernel direct mapping tables up to 20800000 @ 26f8000-27ff000
[ 0.000000] xen: setting RW the range 27e8000 - 27ff000
[ 0.000000] 0MB HIGHMEM available.
-[ 0.000000] 129MB LOWMEM available.
-[ 0.000000] mapped low ram: 0 - 08100000
-[ 0.000000] low ram: 0 - 08100000
+[ 0.000000] 520MB LOWMEM available.
+[ 0.000000] mapped low ram: 0 - 20800000
+[ 0.000000] low ram: 0 - 20800000

With this change "xl mem-set 512M" will successfully increase the
guest RAM (by reducing the balloon).

There is no change for dom0.
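
The numbers in the boot log above can be checked with a little arithmetic, and the rule itself sketched as a toy model. This is only an illustration in Python: `usable_limit_pages` and its parameters are hypothetical names, not kernel identifiers, and 4 KiB pages are assumed.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on x86


def pfn_to_mib(pfn):
    """Convert a page frame number (a count of pages) to MiB."""
    return pfn * PAGE_SIZE // (1024 * 1024)


def usable_limit_pages(e820_pages, max_reservation_pages, is_dom0):
    """Toy model of the rule described in this commit (hypothetical helper).

    Only dom0 is clamped to the current maximum reservation; a guest
    trusts the e820 map supplied by the toolstack.
    """
    if is_dom0:
        return min(e820_pages, max_reservation_pages)
    return e820_pages


# last_pfn values taken from the boot log above
print(pfn_to_mib(0x8100))   # 129
print(pfn_to_mib(0x20800))  # 520
```

The two results match the "129MB LOWMEM" and "520MB LOWMEM" lines before and after the change.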

Reported-and-Tested-by: George Shuklin
Signed-off-by: Ian Campbell
Cc: stable@kernel.org
Reviewed-by: David Vrabel
Signed-off-by: Konrad Rzeszutek Wilk

Ian Campbell
2011-12-16 00:24:02 +0800

04 Dec, 2011

1 commit

e5fd47bfa xen/pm_idle: Make pm_idle be default_idle under Xen.

The idea behind commit d91ee5863b71 ("cpuidle: replace xen access to x86
pm_idle and default_idle") was to have one call - disable_cpuidle()
which would make pm_idle not be molested by other code. It disallows
cpuidle_idle_call to be set to pm_idle (which is excellent).

But in select_idle_routine() and idle_setup(), pm_idle can still
be set to one of amd_e400_idle, mwait_idle or default_idle. This
depends on some CPU flags (MWAIT) and, in the AMD case, on the type of CPU.

In the case of mwait_idle we can hit some instances where the hypervisor
(Amazon EC2 specifically) sets the MWAIT flag and we get:

Brought up 2 CPUs
invalid opcode: 0000 [#1] SMP

Pid: 0, comm: swapper Not tainted 3.1.0-0.rc6.git0.3.fc16.x86_64 #1
RIP: e030:[] [] mwait_idle+0x6f/0xb4
...
Call Trace:
[] cpu_idle+0xae/0xe8
[] cpu_bringup_and_idle+0xe/0x10
RIP [] mwait_idle+0x6f/0xb4
RSP

In the case of amd_e400_idle we don't get such spectacular crashes, but we
do end up making an MSR access which is trapped in the hypervisor, and then
follow it up with a yield hypercall. Meaning we end up going to the
hypervisor twice instead of just once.

The previous behavior before v3.0 was that pm_idle was set to
default_idle regardless of select_idle_routine/idle_setup.

We want to do that, but only for one specific case: Xen. This patch
does that.
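
The intended outcome can be pictured with a small decision function. This is a simplified Python model of the result described above, not of how the patch is implemented; `select_idle` and its boolean parameters are hypothetical, and the order of the non-Xen checks is an assumption.

```python
def select_idle(has_mwait, is_amd_e400, running_on_xen):
    """Return the name of the idle routine pm_idle would end up as.

    Hypothetical model: under Xen the CPU flags are ignored and
    default_idle is always chosen.
    """
    if running_on_xen:
        return "default_idle"
    if has_mwait:
        return "mwait_idle"
    if is_amd_e400:
        return "amd_e400_idle"
    return "default_idle"


# A Xen guest whose hypervisor exposes the MWAIT flag no longer gets mwait_idle.
print(select_idle(has_mwait=True, is_amd_e400=False, running_on_xen=True))
```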

Fixes RH BZ #739499 and Ubuntu #881076
Reported-by: Stefan Bader
Signed-off-by: Konrad Rzeszutek Wilk
Signed-off-by: Linus Torvalds

Konrad Rzeszutek Wilk
2011-12-04 02:49:58 +0800

17 Nov, 2011

2 commits

90d4f5534 xen:pvhvm: enable PVHVM VCPU placement when using more than 32 CPUs.

PVHVM guests running with more than 32 vcpus and pv_irq/pv_time enabled
need VCPU placement to work, or else they will softlockup.

CC: stable@kernel.org
Acked-by: Stefano Stabellini
Signed-off-by: Zhenzhong Duan
Signed-off-by: Konrad Rzeszutek Wilk

Zhenzhong Duan
2011-11-17 01:13:44 +0800
cd12909cb xen: map foreign pages for shared rings by updating the PTEs directly

When mapping a foreign page with xenbus_map_ring_valloc() with the
GNTTABOP_map_grant_ref hypercall, set the GNTMAP_contains_pte flag and
pass a pointer to the PTE (in init_mm).

After the page is mapped, the usual fault mechanism can be used to
update additional MMs. This allows the vmalloc_sync_all() to be
removed from alloc_vm_area().

Signed-off-by: David Vrabel
Acked-by: Andrew Morton
[v1: Squashed fix by Michal for no-mmu case]
Signed-off-by: Konrad Rzeszutek Wilk
Signed-off-by: Michal Simek

David Vrabel
2011-11-17 01:13:08 +0800

07 Nov, 2011

2 commits

403299a85 Merge branch 'upstream/xen-settime' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen

* 'upstream/xen-settime' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
xen/dom0: set wallclock time in Xen
xen: add dom0_op hypercall
xen/acpi: Domain0 acpi parser related platform hypercall

Linus Torvalds
2011-11-07 12:15:05 +0800
06d381484 Merge branch 'stable/vmalloc-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

* 'stable/vmalloc-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
net: xen-netback: use API provided by xenbus module to map rings
block: xen-blkback: use API provided by xenbus module to map rings
xen: use generic functions instead of xen_{alloc, free}_vm_area()

Linus Torvalds
2011-11-07 10:31:36 +0800

25 Oct, 2011

2 commits

04a875248 Merge branches 'stable/drivers-3.2', 'stable/drivers.bugfixes-3.2' and 'stable/pci.fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

* 'stable/drivers-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xenbus: don't rely on xen_initial_domain to detect local xenstore
xenbus: Fix loopback event channel assuming domain 0
xen/pv-on-hvm:kexec: Fix implicit declaration of function 'xen_hvm_domain'
xen/pv-on-hvm kexec: add xs_reset_watches to shutdown watches from old kernel
xen/pv-on-hvm kexec: update xs_wire.h:xsd_sockmsg_type from xen-unstable
xen/pv-on-hvm kexec+kdump: reset PV devices in kexec or crash kernel
xen/pv-on-hvm kexec: rebind virqs to existing eventchannel ports
xen/pv-on-hvm kexec: prevent crash in xenwatch_thread() when stale watch events arrive

* 'stable/drivers.bugfixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/pciback: Check if the device is found instead of blindly assuming so.
xen/pciback: Do not dereference psdev during printk when it is NULL.
xen: remove XEN_PLATFORM_PCI config option
xen: XEN_PVHVM depends on PCI
xen/pciback: double lock typo
xen/pciback: use mutex rather than spinlock in vpci backend
xen/pciback: Use mutexes when working with Xenbus state transitions.
xen/pciback: miscellaneous adjustments
xen/pciback: use mutex rather than spinlock in passthrough backend
xen/pciback: use resource_size()

* 'stable/pci.fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/pci: support multi-segment systems
xen-swiotlb: When doing coherent alloc/dealloc check before swizzling the MFNs.
xen/pci: make bus notifier handler return sane values
xen-swiotlb: fix printk and panic args
xen-swiotlb: Fix wrong panic.
xen-swiotlb: Retry up three times to allocate Xen-SWIOTLB
xen-pcifront: Update warning comment to use 'e820_host' option.

Linus Torvalds
2011-10-25 15:19:36 +0800
31018acd4 Merge branches 'stable/bug.fixes-3.2' and 'stable/mmu.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

* 'stable/bug.fixes-3.2' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/p2m/debugfs: Make type_name more obvious.
xen/p2m/debugfs: Fix potential pointer exception.
xen/enlighten: Fix compile warnings and set cx to known value.
xen/xenbus: Remove the unnecessary check.
xen/irq: If we fail during msi_capability_init return proper error code.
xen/events: Don't check the info for NULL as it is already done.
xen/events: BUG() when we can't allocate our event->irq array.

* 'stable/mmu.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: Fix selfballooning and ensure it doesn't go too far
xen/gntdev: Fix sleep-inside-spinlock
xen: modify kernel mappings corresponding to granted pages
xen: add an "highmem" parameter to alloc_xenballooned_pages
xen/p2m: Use SetPagePrivate and its friends for M2P overrides.
xen/p2m: Make debug/xen/mmu/p2m visible again.
Revert "xen/debug: WARN_ON when identity PFN has no _PAGE_IOMAP flag set."

Linus Torvalds
2011-10-25 15:17:47 +0800

20 Oct, 2011

3 commits

a491dbef5 xen/p2m/debugfs: Make type_name more obvious.

Per Ian Campbell's suggestion, to defend against future breakage
in case we expand the P2M values, incorporate the defines
in the string array.

Suggested-by: Ian Campbell
Signed-off-by: Konrad Rzeszutek Wilk

Konrad Rzeszutek Wilk
2011-10-20 05:03:34 +0800
8404877ee xen/p2m/debugfs: Fix potential pointer exception.

We could be referencing the last + 1 element of the level_name[]
array, which would cause a pointer exception, because of the
initial setup of lvl=4.

[v1: No need to do this for type_name, pointed out by Ian Campbell]
Signed-off-by: Konrad Rzeszutek Wilk

Konrad Rzeszutek Wilk
2011-10-20 05:03:32 +0800
5e2878301 xen/enlighten: Fix compile warnings and set cx to known value.

We get:
linux/arch/x86/xen/enlighten.c: In function ‘xen_start_kernel’:
linux/arch/x86/xen/enlighten.c:226: warning: ‘cx’ may be used uninitialized in this function
linux/arch/x86/xen/enlighten.c:240: note: ‘cx’ was declared here

and cx really is not set before it is passed to xen_cpuid, which
masks the value with the masked_ecx returned from cpuid. This
can potentially lead to invalid data being stored in cx.

Signed-off-by: Konrad Rzeszutek Wilk

Konrad Rzeszutek Wilk
2011-10-20 05:03:31 +0800

30 Sep, 2011

1 commit

4dcaebbf6 xen: use generic functions instead of xen_{alloc, free}_vm_area()

Replace calls to the Xen-specific xen_alloc_vm_area() and
xen_free_vm_area() functions with the generic equivalent
(alloc_vm_area() and free_vm_area()).

On x86, these were identical already.

Signed-off-by: David Vrabel
Signed-off-by: Konrad Rzeszutek Wilk

David Vrabel
2011-09-30 03:02:18 +0800

29 Sep, 2011

6 commits

f3f436e33 xen: release all pages within 1-1 p2m mappings

In xen_memory_setup() all reserved regions and gaps are set to an
identity (1-1) p2m mapping. If an available page has a PFN within one
of these 1-1 mappings it will become inaccessible (as its MFN is lost)
so release them before setting up the mapping.

This can make an additional 256 MiB or more of RAM available
(depending on the size of the reserved regions in the memory map) if
the initial pages overlap with reserved regions.

The 1:1 p2m mappings are also extended to cover partial pages. This
fixes an issue with (for example) systems with a BIOS that puts the
DMI tables in a reserved region that begins on a non-page boundary.
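
Covering partial pages amounts to rounding the start of a region down and its end up to page boundaries. A minimal Python sketch, assuming 4 KiB pages; the helper names mirror the kernel's PFN_DOWN/PFN_UP macros, while `identity_pfn_range` and the example addresses are hypothetical.

```python
PAGE_SHIFT = 12                # assumption: 4 KiB pages
PAGE_SIZE = 1 << PAGE_SHIFT


def pfn_down(addr):
    """PFN of the page containing addr (round down)."""
    return addr >> PAGE_SHIFT


def pfn_up(addr):
    """First PFN at or after addr (round up)."""
    return (addr + PAGE_SIZE - 1) >> PAGE_SHIFT


def identity_pfn_range(start, end):
    """Half-open PFN range covering every page touched by bytes [start, end)."""
    return pfn_down(start), pfn_up(end)


# A reserved region beginning in the middle of page 0x9f still covers that page.
first, last = identity_pfn_range(0x9F400, 0xA0000)
print(hex(first), hex(last))  # 0x9f 0xa0
```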

Signed-off-by: David Vrabel
Signed-off-by: Konrad Rzeszutek Wilk

David Vrabel
2011-09-29 23:12:15 +0800
dc91c728f xen: allow extra memory to be in multiple regions

Allow the extra memory (used by the balloon driver) to be in multiple
regions (typically two regions, one for low memory and one for high
memory). This allows the balloon driver to increase the number of
available low pages (if the initial number of pages is small).

As a side effect, the algorithm for building the e820 memory map is
simpler and more obviously correct as the map supplied by the
hypervisor is (almost) used as is (in particular, all reserved regions
and gaps are preserved). Only RAM regions are altered and RAM regions
above max_pfn + extra_pages are marked as unused (the region is split
in two if necessary).
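
The map-building rule described above can be sketched as follows. This is an illustrative Python model, not the kernel code: `clamp_ram` is a hypothetical helper, regions are plain `(start, end, type)` tuples, `limit` is the byte address corresponding to max_pfn + extra_pages, and the "unusable" label is an assumption borrowed from the E820 log shown in the later commit on this page.

```python
def clamp_ram(regions, limit):
    """Keep the supplied map as is, except for usable RAM above limit.

    Reserved regions and gaps pass through untouched. A usable region
    that straddles the limit is split in two.
    """
    out = []
    for start, end, kind in regions:
        if kind != "usable" or end <= limit:
            out.append((start, end, kind))
        elif start >= limit:
            out.append((start, end, "unusable"))
        else:
            out.append((start, limit, "usable"))
            out.append((limit, end, "unusable"))
    return out


e820 = [
    (0x0, 0xA0000, "usable"),
    (0xA0000, 0x100000, "reserved"),
    (0x100000, 0x20800000, "usable"),
]
for start, end, kind in clamp_ram(e820, 0x8100000):
    print(f"{start:016x} - {end:016x} ({kind})")
```

With these inputs the last region is split at 0x8100000 into a usable and an unusable part.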

Signed-off-by: David Vrabel
Signed-off-by: Konrad Rzeszutek Wilk

David Vrabel
2011-09-29 23:12:10 +0800
8b5d44a5a xen: allow balloon driver to use more than one memory region

Allow the xen balloon driver to populate its list of extra pages from
more than one region of memory. This will allow platforms to provide
(for example) a region of low memory and a region of high memory.

The maximum possible number of extra regions is 128 (== E820MAX) which
is quite large so xen_extra_mem is placed in __initdata. This is safe
as both xen_memory_setup() and balloon_init() are in __init.

The balloon regions themselves are not altered (i.e., there is still
only the one region).

Signed-off-by: David Vrabel
Signed-off-by: Konrad Rzeszutek Wilk

David Vrabel
2011-09-29 23:12:10 +0800
aa24411b6 xen/balloon: account for pages released during memory setup

In xen_memory_setup() pages that occur in gaps in the memory map are
released back to Xen. This reduces the domain's current page count in
the hypervisor. The Xen balloon driver does not correctly decrease
its initial current_pages count to reflect this. If 'delta' pages are
released and the target is adjusted the resulting reservation is
always 'delta' less than the requested target.

This affects dom0 if the initial allocation of pages overlaps the PCI
memory region but won't affect most domU guests that have been setup
with pseudo-physical memory maps that don't have gaps.

Fix this by accounting for the released pages when starting the balloon
driver.

If the domain's targets are managed by xapi, the domain may eventually
run out of memory and die because xapi currently gets its target
calculations wrong and whenever it is restarted it always reduces the
target by 'delta'.
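
The 'delta' shortfall can be reproduced with a toy model. This Python sketch uses hypothetical names and assumes the driver adjusts the reservation by the difference between the target and its own `current_pages` count.

```python
def reservation_after_target(initial_pages, released, target, account_for_released):
    """Reservation held in Xen after the balloon driver moves to target.

    Hypothetical model: 'released' pages were handed back to Xen during
    memory setup, so Xen's real count is initial_pages - released.
    """
    actual = initial_pages - released
    # Before the fix the driver still believed it had initial_pages.
    current_pages = actual if account_for_released else initial_pages
    return actual + (target - current_pages)


print(reservation_after_target(1000, 10, 800, account_for_released=False))  # 790
print(reservation_after_target(1000, 10, 800, account_for_released=True))   # 800
```

Without the accounting, the result is always `released` pages short of the requested target.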

Signed-off-by: David Vrabel
Signed-off-by: Konrad Rzeszutek Wilk

David Vrabel
2011-09-29 23:12:09 +0800
b17d0b5c0 xen: XEN_PVHVM depends on PCI

Xen PV on HVM guests require PCI support because they need the
xen-platform-pci driver in order to initialize xenbus.

Signed-off-by: Stefano Stabellini
Signed-off-by: Konrad Rzeszutek Wilk

Stefano Stabellini
2011-09-29 22:52:16 +0800
0930bba67 xen: modify kernel mappings corresponding to granted pages

If we want to use granted pages for AIO, changing the mappings of a user
vma and the corresponding p2m is not enough, we also need to update the
kernel mappings accordingly.
Currently this is only needed for pages that are created for user usages
through /dev/xen/gntdev. As in, pages that have been in use by the
kernel and use the P2M will not need this special mapping.
However there are no guarantees that in the future the kernel won't
start accessing pages through the 1:1 even for internal usage.

In order to avoid the complexity of dealing with highmem, we allocate
the pages in lowmem.
We issue a HYPERVISOR_grant_table_op right away in
m2p_add_override and we remove the mappings using another
HYPERVISOR_grant_table_op in m2p_remove_override.
Considering that m2p_add_override and m2p_remove_override are called
once per page we use multicalls and hypercall batching.

Use the kmap_op pointer directly as argument to do the mapping as it is
guaranteed to be present up until the unmapping is done.
Before issuing any unmapping multicalls, we need to make sure that the
mapping has already been done, because we need the kmap->handle to be
set correctly.

Signed-off-by: Stefano Stabellini
[v1: Removed GRANT_FRAME_BIT usage]
Signed-off-by: Konrad Rzeszutek Wilk

Stefano Stabellini
2011-09-29 22:32:58 +0800

27 Sep, 2011

1 commit

fdb9eb9f1 xen/dom0: set wallclock time in Xen

Signed-off-by: Jeremy Fitzhardinge

Jeremy Fitzhardinge
2011-09-27 02:04:39 +0800

24 Sep, 2011

2 commits

0f4b49eaf xen/p2m: Use SetPagePrivate and its friends for M2P overrides.

We use the page->private field and hence should use the proper
macros and set proper bits. Also WARN_ON in case somebody
tries to overwrite our data.

Signed-off-by: Konrad Rzeszutek Wilk

Konrad Rzeszutek Wilk
2011-09-24 10:22:33 +0800
a867db10e xen/p2m: Make debug/xen/mmu/p2m visible again.

We dropped a lot of the MMU debugfs in favour of using
tracing API - but there is one which just provides
mostly static information that was made invisible by this change.

Bring it back.

Signed-off-by: Konrad Rzeszutek Wilk

Konrad Rzeszutek Wilk
2011-09-24 10:22:32 +0800

17 Sep, 2011

1 commit

abbe0d3c2 Merge branch 'stable/bug.fixes' of git://oss.oracle.com/git/kwilk/xen

* 'stable/bug.fixes' of git://oss.oracle.com/git/kwilk/xen:
xen/i386: follow-up to "replace order-based range checking of M2P table by linear one"
xen/irq: Alter the locking to use a mutex instead of a spinlock.
xen/e820: if there is no dom0_mem=, don't tweak extra_pages.
xen: disable PV spinlocks on HVM

Linus Torvalds
2011-09-17 02:28:11 +0800

15 Sep, 2011

1 commit

61cca2fab xen/i386: follow-up to "replace order-based range checking of M2P table by linear one"

The numbers obtained from the hypervisor really can't ever lead to an
overflow here, only the original calculation going through the order
of the range could have. This avoids the (as Jeremy points out)
somewhat ugly NULL-based calculation here.

Signed-off-by: Jan Beulich
Signed-off-by: Konrad Rzeszutek Wilk

Jan Beulich
2011-09-15 16:39:46 +0800

13 Sep, 2011

2 commits

e3b73c4a2 xen/e820: if there is no dom0_mem=, don't tweak extra_pages.

The patch "xen: use maximum reservation to limit amount of usable RAM"
(d312ae878b6aed3912e1acaaf5d0b2a9d08a4f11) breaks machines that
do not use 'dom0_mem=' argument with:

reserve RAM buffer: 000000133f2e2000 - 000000133fffffff
(XEN) mm.c:4976:d0 Global bit is set to kernel page fffff8117e
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
...

The reason being that the last E820 entry is created using the
'extra_pages' (which is based on how many pages have been freed).
The mentioned git commit sets the initial value of 'extra_pages'
using a hypercall which returns the number of pages (if dom0_mem
has been used) or -1 otherwise. In the latter case we return with
MAX_DOMAIN_PAGES as the basis for the calculation:

    return min(max_pages, MAX_DOMAIN_PAGES);

and use it:

    extra_limit = xen_get_max_pages();
    if (extra_limit >= max_pfn)
        extra_pages = extra_limit - max_pfn;
    else
        extra_pages = 0;

which means we end up with extra_pages = 128GB in PFNs (33554432)
- 8GB in PFNs (2097152, on this specific box, can be larger or smaller),
and then we add that value to the E820 making it:

Xen: 00000000ff000000 - 0000000100000000 (reserved)
Xen: 0000000100000000 - 000000133f2e2000 (usable)

which is clearly wrong. It should look as so:

Xen: 00000000ff000000 - 0000000100000000 (reserved)
Xen: 0000000100000000 - 000000027fbda000 (usable)

Naturally this problem does not present itself if dom0_mem=max:X
is used.
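
The arithmetic behind the oversized E820 entry can be replayed in a few lines. This Python sketch models only what the message states: `get_max_pages` is a hypothetical stand-in for xen_get_max_pages(), the -1 return for "no dom0_mem=" is taken from the text above, and 4 KiB pages are assumed.

```python
PAGE_SIZE = 4096
GIB = 1 << 30
MAX_DOMAIN_PAGES = 128 * GIB // PAGE_SIZE  # "128GB in PFNs" per the message


def get_max_pages(hypercall_result):
    """Model: a non-positive result (no dom0_mem=) falls back to MAX_DOMAIN_PAGES."""
    max_pages = MAX_DOMAIN_PAGES
    if hypercall_result > 0:
        max_pages = hypercall_result
    return min(max_pages, MAX_DOMAIN_PAGES)


def extra_pages(extra_limit, max_pfn):
    """Same branch structure as the C fragment quoted above."""
    return extra_limit - max_pfn if extra_limit >= max_pfn else 0


max_pfn = 8 * GIB // PAGE_SIZE                  # the 8GB box from the message
extra = extra_pages(get_max_pages(-1), max_pfn)
print(MAX_DOMAIN_PAGES, max_pfn, extra)         # 33554432 2097152 31457280
print(extra * PAGE_SIZE // GIB)                 # 120
```

So without dom0_mem= the model adds about 120 GiB worth of extra pages to an 8 GiB machine.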

CC: stable@kernel.org
Signed-off-by: David Vrabel
Signed-off-by: Konrad Rzeszutek Wilk

David Vrabel
2011-09-13 22:17:32 +0800
d9543314e Merge branch 'upstream/bugfix' of git://github.com/jsgf/linux-xen

* 'upstream/bugfix' of git://github.com/jsgf/linux-xen:
xen: use non-tracing preempt in xen_clocksource_read()

Linus Torvalds
2011-09-13 08:22:31 +0800

09 Sep, 2011

1 commit

f10cd522c xen: disable PV spinlocks on HVM

PV spinlocks cannot possibly work with the current code because they are
enabled after pvops patching has already been done, and because PV
spinlocks use a different data structure than native spinlocks so we
cannot switch between them dynamically. A spinlock that has been taken
once by the native code (__ticket_spin_lock) cannot be taken by
__xen_spin_lock even after it has been released.

Reported-and-Tested-by: Stefan Bader
Signed-off-by: Stefano Stabellini
Signed-off-by: Konrad Rzeszutek Wilk

Stefano Stabellini
2011-09-09 01:59:06 +0800

07 Sep, 2011

1 commit

115452675 Merge branch 'stable/bug.fixes' of git://oss.oracle.com/git/kwilk/xen

* 'stable/bug.fixes' of git://oss.oracle.com/git/kwilk/xen:
xen/smp: Warn user why they keel over - nosmp or noapic and what to use instead.
xen: x86_32: do not enable iterrupts when returning from exception in interrupt context
xen: use maximum reservation to limit amount of usable RAM

Linus Torvalds
2011-09-07 22:46:48 +0800

02 Sep, 2011

2 commits

ed467e69f xen/smp: Warn user why they keel over - nosmp or noapic and what to use instead.

We have hit a couple of customer bugs where they would like to
use those parameters to run a UP kernel - but both of those
options turn off important sources of interrupt information so
we end up not being able to boot. The correct way is to
pass in 'dom0_max_vcpus=1' on the Xen hypervisor line and
the kernel will patch itself to be a UP kernel.

Fixes bug: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=637308

CC: stable@kernel.org
Acked-by: Ian Campbell
Signed-off-by: Konrad Rzeszutek Wilk

Konrad Rzeszutek Wilk
2011-09-02 00:54:49 +0800
d198d4991 xen: x86_32: do not enable iterrupts when returning from exception in interrupt context

If a vmalloc page fault happens inside an interrupt handler with interrupts
disabled, then on the exit path from the exception handler, when there are no
pending interrupts, the following code (arch/x86/xen/xen-asm_32.S:112):

    cmpw $0x0001, XEN_vcpu_info_pending(%eax)
    sete XEN_vcpu_info_mask(%eax)

will enable interrupts even if they have been previously disabled according to
eflags from the bounce frame (arch/x86/xen/xen-asm_32.S:99):

    testb $X86_EFLAGS_IF>>8, 8+1+ESP_OFFSET(%esp)
    setz XEN_vcpu_info_mask(%eax)

The solution is to set XEN_vcpu_info_mask only when it should be set
according to

    cmpw $0x0001, XEN_vcpu_info_pending(%eax)

but not to clear it if there aren't any pending events.

Reproducer for bug is attached to RHBZ 707552
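
The difference between the old and new exit paths can be traced with a small truth-table model. This Python sketch is an interpretation of the assembly quoted above, not the kernel code: the function names are hypothetical, and it assumes the 16-bit compare reads the pending byte as the low byte and the mask byte as the high byte.

```python
def exit_mask_old(if_set, pending):
    """Mask value after the old sequence (1 = events masked, i.e. interrupts off)."""
    mask = 0 if if_set else 1            # testb/setz on the saved EFLAGS.IF
    word = pending | (mask << 8)         # assumed layout: pending low, mask high
    mask = 1 if word == 0x0001 else 0    # cmpw/sete overwrites mask either way
    return mask


def exit_mask_fixed(if_set, pending):
    """Mask value when the compare may only set the mask, never clear it."""
    mask = 0 if if_set else 1
    word = pending | (mask << 8)
    if word == 0x0001:
        mask = 1
    return mask


# Interrupts were disabled (IF clear) and nothing is pending:
print(exit_mask_old(False, 0))    # 0
print(exit_mask_fixed(False, 0))  # 1
```

In the model, the old sequence unmasks events whenever IF was clear, whereas the fixed one leaves them masked.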

CC: stable@kernel.org
Signed-off-by: Igor Mammedov
Acked-by: Jeremy Fitzhardinge
Signed-off-by: Konrad Rzeszutek Wilk

Igor Mammedov
2011-09-02 00:54:42 +0800

01 Sep, 2011

1 commit

d312ae878 xen: use maximum reservation to limit amount of usable RAM

Use the domain's maximum reservation to limit the amount of extra RAM
for the memory balloon. This reduces the size of the page tables and
the amount of reserved low memory (which defaults to about 1/32 of the
total RAM).

On a system with 8 GiB of RAM with the domain limited to 1 GiB the
kernel reports:

Before:

Memory: 549740k/11132224k available

After:

Memory: 627792k/4472000k available

An increase of about 76 MiB (~1.5% of the unused 7 GiB). The reserved
low memory is also reduced from 253 MiB to 32 MiB. The total
additional usable RAM is 329 MiB.

For dom0, this requires a patch to Xen ('x86: use 'dom0_mem' to limit
the number of pages for dom0') (c/s 23790)
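
The 76 MiB figure quoted above is the difference between the two available-memory values in the "Memory:" lines (627792k and 549740k). A quick Python check, assuming the "k" suffix means KiB:

```python
def kib_to_mib(kib):
    """Convert KiB to MiB."""
    return kib / 1024


diff_kib = 627792 - 549740
print(diff_kib, round(kib_to_mib(diff_kib), 1))  # 78052 76.2
```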

CC: stable@kernel.org
Signed-off-by: David Vrabel
Signed-off-by: Konrad Rzeszutek Wilk

David Vrabel
2011-09-01 21:41:40 +0800

25 Aug, 2011

1 commit

f1c39625d xen: use non-tracing preempt in xen_clocksource_read()

The tracing code used sched_clock() to get tracing timestamps, which
ends up calling xen_clocksource_read(). xen_clocksource_read() must
disable preemption, but if preemption tracing is enabled, this results
in infinite recursion.

I've only noticed this when boot-time tracing tests are enabled, but it
seems like a generic bug. It looks like it would also affect
kvm_clocksource_read().

Reported-by: Konrad Rzeszutek Wilk
Signed-off-by: Jeremy Fitzhardinge
Cc: Avi Kivity
Cc: Marcelo Tosatti

Jeremy Fitzhardinge
2011-08-25 00:54:24 +0800

23 Aug, 2011

1 commit

4762e252f Merge branch 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

* 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/tracing: Fix tracing config option properly
xen: Do not enable PV IPIs when vector callback not present
xen/x86: replace order-based range checking of M2P table by linear one
xen: xen-selfballoon.c needs more header files

Linus Torvalds
2011-08-23 02:25:44 +0800

22 Aug, 2011

2 commits

60c5f08e1 xen/tracing: Fix tracing config option properly

Steven Rostedt says we should use CONFIG_EVENT_TRACING.

Cc: Steven Rostedt
Signed-off-by: Jeremy Fitzhardinge
Signed-off-by: Konrad Rzeszutek Wilk

Jeremy Fitzhardinge
2011-08-22 23:28:33 +0800
3c05c4bed xen: Do not enable PV IPIs when vector callback not present

Fix regression for HVM case on older (
Date: Thu Dec 2 17:55:10 2010 +0000

xen: PV on HVM: support PV spinlocks and IPIs

This change replaced the SMP operations with event based handlers without
taking into account that this only works when the hypervisor supports
callback vectors. This causes unexplainable hangs early on boot for
HVM guests with more than one CPU.

BugLink: http://bugs.launchpad.net/bugs/791850

CC: stable@kernel.org
Signed-off-by: Stefan Bader
Signed-off-by: Stefano Stabellini
Tested-and-Reported-by: Stefan Bader
Signed-off-by: Konrad Rzeszutek Wilk

Stefano Stabellini
2011-08-22 23:28:09 +0800

17 Aug, 2011

1 commit

ccbcdf7cf xen/x86: replace order-based range checking of M2P table by linear one

The order-based approach is not only less efficient (requiring a shift
and a compare, typical generated code looking like this

mov eax, [machine_to_phys_order]
mov ecx, eax
shr ebx, cl
test ebx, ebx
jnz ...

whereas a direct check requires just a compare, like in

cmp ebx, [machine_to_phys_nr]
jae ...

), but also slightly dangerous in the 32-on-64 case - the element
address calculation can wrap if the next power of two boundary is
sufficiently far away from the actual upper limit of the table, and
hence can result in user space addresses being accessed (with it being
unknown what may actually be mapped there).

Additionally, the elimination of the mistaken use of fls() here (should
have been __fls()) fixes a latent issue on x86-64 that would trigger
if the code was run on a system with memory extending beyond the 44-bit
boundary.
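
The gap between the two checks shows up whenever the table size is not a power of two. A minimal Python model, with hypothetical helper names and an arbitrary example size, assuming the order was the table size rounded up to the next power of two:

```python
def order_for(nr_entries):
    """Smallest order such that 2**order >= nr_entries (assumed rounding)."""
    return (nr_entries - 1).bit_length()


def in_range_order(mfn, order):
    """Old style: shift by the order and test for zero."""
    return (mfn >> order) == 0


def in_range_linear(mfn, nr_entries):
    """New style: a single compare against the real number of entries."""
    return mfn < nr_entries


nr = 0x28000            # example table size, not a power of two
order = order_for(nr)   # 18, so the order check accepts anything below 0x40000
mfn = 0x30000           # beyond the table, but below the power-of-two bound

print(order, in_range_order(mfn, order), in_range_linear(mfn, nr))  # 18 True False
```

Every MFN between the real table size and the next power of two passes the order-based test but is rejected by the linear one.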

CC: stable@kernel.org
Signed-off-by: Jan Beulich
[v1: Based on Jeremy's feedback]
Signed-off-by: Konrad Rzeszutek Wilk

Jan Beulich
2011-08-17 22:26:48 +0800

13 Aug, 2011

1 commit

06e727d2a Merge branch 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-tip

* 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-tip:
x86-64: Rework vsyscall emulation and add vsyscall= parameter
x86-64: Wire up getcpu syscall
x86: Remove unnecessary compile flag tweaks for vsyscall code
x86-64: Add vsyscall:emulate_vsyscall trace event
x86-64: Add user_64bit_mode paravirt op
x86-64, xen: Enable the vvar mapping
x86-64: Work around gold bug 13023
x86-64: Move the "user" vsyscall segment out of the data segment.
x86-64: Pad vDSO to a page boundary

Linus Torvalds
2011-08-13 11:46:24 +0800

10 Aug, 2011

1 commit

10fe570fc Revert "xen/debug: WARN_ON when identity PFN has no _PAGE_IOMAP flag set."

We don't use it anymore and there are more false positives.

This reverts commit fc25151d9ac7d809239fe68de0a1490b504bb94a.

Signed-off-by: Konrad Rzeszutek Wilk

Konrad Rzeszutek Wilk
2011-08-10 01:04:08 +0800

07 Aug, 2011

1 commit

45a05f948 Merge branch 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

* 'stable/bug.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/trace: Fix compile error when CONFIG_XEN_PRIVILEGED_GUEST is not set
xen: Fix misleading WARN message at xen_release_chunk
xen: Fix printk() format in xen/setup.c
xen/tracing: it looks like we wanted CONFIG_FTRACE
xen/self-balloon: Add dependency on tmem.
xen/balloon: Fix compile errors - missing header files.
xen/grant: Fix compile warning.
xen/pciback: remove duplicated #include

Linus Torvalds
2011-08-07 03:22:30 +0800

05 Aug, 2011

1 commit

c00c8aa2d xen/trace: Fix compile error when CONFIG_XEN_PRIVILEGED_GUEST is not set

With CONFIG_XEN and CONFIG_FTRACE set we get this:

arch/x86/xen/trace.c:22: error: ‘__HYPERVISOR_console_io’ undeclared here (not in a function)
arch/x86/xen/trace.c:22: error: array index in initializer not of integer type
arch/x86/xen/trace.c:22: error: (near initialization for ‘xen_hypercall_names’)
arch/x86/xen/trace.c:23: error: ‘__HYPERVISOR_physdev_op_compat’ undeclared here (not in a function)

The issue was that the __HYPERVISOR_* definitions were not pulled in
if CONFIG_XEN_PRIVILEGED_GUEST was not set.

Reported-by: Randy Dunlap
Acked-by: Randy Dunlap
Acked-by: Ingo Molnar
Signed-off-by: Konrad Rzeszutek Wilk

Konrad Rzeszutek Wilk
2011-08-05 21:43:02 +0800