Doug / smarc-fsl-linux-kernel | Embedian Git Server

27 Oct, 2012

1 commit

622f202a4 Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Ingo Molnar:
"This fixes a couple of nasty page table initialization bugs which were
causing kdump regressions. A clean rearchitecturing of the code is in
the works - meanwhile these are reverts that restore the
best-known-working state of the kernel.

There's also EFI fixes and other small fixes."

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86, mm: Undo incorrect revert in arch/x86/mm/init.c
x86: efi: Turn off efi_enabled after setup on mixed fw/kernel
x86, mm: Find_early_table_space based on ranges that are actually being mapped
x86, mm: Use memblock memory loop instead of e820_RAM
x86, mm: Trim memory in memblock to be page aligned
x86/irq/ioapic: Check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt
x86/efi: Fix oops caused by incorrect set_memory_uc() usage
x86-64: Fix page table accounting
Revert "x86/mm: Fix the size calculation of mapping tables"
MAINTAINERS: Add EFI git repository location

Linus Torvalds
2012-10-27 00:35:46 +0800

26 Oct, 2012

3 commits

8b724e2a1 Merge tag 'efi-for-3.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi into x86/urgent

Pull EFI fixes from Matt Fleming:

"Fix oops with EFI variables on mixed 32/64-bit firmware/kernels and
document EFI git repository location on kernel.org."

Conflicts:
arch/x86/include/asm/efi.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Ingo Molnar
2012-10-26 16:17:38 +0800
f82f64dd9 x86, mm: Undo incorrect revert in arch/x86/mm/init.c

Commit

844ab6f9 x86, mm: Find_early_table_space based on ranges that are actually being mapped

wrongly added back some lines that had been removed in commit

7b16bbf97 Revert "x86/mm: Fix the size calculation of mapping tables"

Remove them again.

Signed-off-by: Yinghai Lu
Link: http://lkml.kernel.org/r/CAE9FiQW_vuaYQbmagVnxT2DGsYc=9tNeAbdBq53sYkitPOwxSQ@mail.gmail.com
Acked-by: Jacob Shin
Signed-off-by: H. Peter Anvin

Yinghai Lu
2012-10-26 06:45:45 +0800
5189c2a7c x86: efi: Turn off efi_enabled after setup on mixed fw/kernel

When 32-bit EFI is used with 64-bit kernel (or vice versa), turn off
efi_enabled once setup is done. Beyond setup, it is normally used to
determine if runtime services are available and we will have none.

This will resolve issues stemming from efivars modprobe panicking on a
32/64-bit setup, as well as some reboot issues on similar setups.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=45991

Reported-by: Marko Kohtala
Reported-by: Maxim Kammerer
Signed-off-by: Olof Johansson
Acked-by: Maarten Lankhorst
Cc: stable@kernel.org # 3.4 - 3.6
Cc: Matthew Garrett
Signed-off-by: Matt Fleming

Olof Johansson
2012-10-26 02:09:40 +0800

25 Oct, 2012

3 commits

844ab6f99 x86, mm: Find_early_table_space based on ranges that are actually being mapped

Current logic finds enough space for direct mapping page tables from 0
to end. Instead, we only need to find enough space to cover mr[0].start
to mr[nr_range].end -- the range that is actually being mapped by
init_memory_mapping().
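
The effect of sizing for only the mapped span can be illustrated with a deliberately simplified model. It is a sketch, not the kernel's find_early_table_space(): it assumes 4 KiB pages only and counts just the last-level page-table pages (each covering 2 MiB on x86-64), and the function names are hypothetical.

```python
PAGE_SIZE = 4096
PTES_PER_TABLE = 512                      # entries in one last-level table
TABLE_SPAN = PAGE_SIZE * PTES_PER_TABLE   # 2 MiB covered by one table page


def pte_pages_for(start, end):
    """Count last-level page-table pages needed to map [start, end)."""
    if end <= start:
        return 0
    first = start // TABLE_SPAN           # index of the first table touched
    last = (end - 1) // TABLE_SPAN        # index of the last table touched
    return last - first + 1


def table_bytes(start, end):
    """Bytes of last-level page tables needed for [start, end)."""
    return pte_pages_for(start, end) * PAGE_SIZE
```

In this model, mapping 4 GiB to 5 GiB needs 512 table pages, while sizing for 0 to 5 GiB would reserve 2560.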

This is needed after 1bbbbe779aabe1f0768c2bf8f8c0a5583679b54a, to address
the panic reported here:

https://lkml.org/lkml/2012/10/20/160
https://lkml.org/lkml/2012/10/21/157

Signed-off-by: Jacob Shin
Link: http://lkml.kernel.org/r/20121024195311.GB11779@jshin-Toonie
Tested-by: Tom Rini
Signed-off-by: H. Peter Anvin

Jacob Shin
2012-10-25 04:37:04 +0800
1f2ff682a x86, mm: Use memblock memory loop instead of e820_RAM

We need to handle E820_RAM and E820_RESERVED_KERNEL at the same time.

Also, memblock has page-aligned ranges for RAM, so we can avoid mapping
partial pages.

Signed-off-by: Yinghai Lu
Link: http://lkml.kernel.org/r/CAE9FiQVZirvaBMFYRfXMmWEcHbKSicQEHz4VAwUv0xFCk51ZNw@mail.gmail.com
Acked-by: Jacob Shin
Signed-off-by: H. Peter Anvin
Cc:

Yinghai Lu
2012-10-25 02:52:36 +0800
6ede1fd3c x86, mm: Trim memory in memblock to be page aligned

We will not map partial pages, so we need to make sure that memblock
allocation will not hand out those bytes.

Also, we will use for_each_mem_pfn_range() to loop over and map memory
ranges, to keep them consistent.
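
The trimming idea can be sketched as follows: round the start of a region up and its end down to page boundaries, and drop the region if no whole page remains. This is a minimal model assuming 4 KiB pages; trim_to_pages() is a hypothetical name, not the memblock API.

```python
PAGE_SIZE = 4096


def trim_to_pages(base, size, page_size=PAGE_SIZE):
    """Shrink [base, base + size) to whole pages.

    Returns (aligned_base, aligned_size), or None if no whole page is left.
    """
    start = -(-base // page_size) * page_size        # round start up
    end = ((base + size) // page_size) * page_size   # round end down
    if end <= start:
        return None                                  # only partial pages
    return start, end - start
```

A region that is already aligned comes back unchanged, so applying the trim more than once is harmless.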

Signed-off-by: Yinghai Lu
Link: http://lkml.kernel.org/r/CAE9FiQVZirvaBMFYRfXMmWEcHbKSicQEHz4VAwUv0xFCk51ZNw@mail.gmail.com
Acked-by: Jacob Shin
Signed-off-by: H. Peter Anvin
Cc:

Yinghai Lu
2012-10-25 02:52:21 +0800

24 Oct, 2012

15 commits

94777fc51 x86/irq/ioapic: Check for valid irq_cfg pointer in smp_irq_move_cleanup_interrupt

Posting this patch to fix an issue concerning sparse irq's that
I raised a while back. There was discussion about adding
refcounting to sparse irqs (to fix other potential race
conditions), but that does not appear to have been addressed
yet. This covers the only issue of this type that I've
encountered in this area.

A NULL pointer dereference can occur in
smp_irq_move_cleanup_interrupt() if we haven't yet set up the
irq_cfg pointer in the irq_desc.irq_data.chip_data.

In create_irq_nr() there is a window where we have set
vector_irq in __assign_irq_vector(), but not yet called
irq_set_chip_data() to set the irq_cfg pointer.

Should an IRQ_MOVE_CLEANUP_VECTOR hit the cpu in question during
this time, smp_irq_move_cleanup_interrupt() will attempt to
process the aforementioned irq, but panic when accessing
irq_cfg.

Only continue processing the irq if irq_cfg is non-NULL.
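
The guard can be modelled roughly as below. This is an illustrative sketch only: the dictionaries stand in for the per-CPU vector table and the per-irq chip data, and vectors_to_clean() is a hypothetical name.

```python
def vectors_to_clean(vector_irq, chip_data):
    """Return (vector, irq) pairs that are safe to process.

    vector_irq: dict vector -> irq number (-1 for unused vectors)
    chip_data:  dict irq -> per-irq config object, or None if not yet set
    """
    result = []
    for vector, irq in sorted(vector_irq.items()):
        if irq < 0:
            continue          # vector not assigned to any irq
        cfg = chip_data.get(irq)
        if cfg is None:
            continue          # config not set up yet: skip, never dereference
        result.append((vector, irq))
    return result
```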

Signed-off-by: Dimitri Sivanich
Cc: Suresh Siddha
Cc: Joerg Roedel
Cc: Yinghai Lu
Cc: Alexander Gordeev
Link: http://lkml.kernel.org/r/20121016125021.GA22935@sgi.com
Signed-off-by: Ingo Molnar

Dimitri Sivanich
2012-10-24 18:53:51 +0800
64dfab8e8 perf/x86: Remove unused variable in nhmex_rbox_alter_er()

The variable port is initialized but never used
otherwise, so remove the unused variable.

dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)

Signed-off-by: Wei Yongjun
Cc: Yan, Zheng
Cc: a.p.zijlstra@chello.nl
Cc: paulus@samba.org
Cc: acme@ghostprotocols.net
Link: http://lkml.kernel.org/r/CAPgLHd8NZkYSkZm22FpZxiEh6HcA0q-V%3D29vdnheiDhgrJZ%2Byw@mail.gmail.com
Signed-off-by: Ingo Molnar

Wei Yongjun
2012-10-24 18:51:40 +0800
3e8fa263a x86/efi: Fix oops caused by incorrect set_memory_uc() usage

Calling __pa() with an ioremap'd address is invalid. If we
encounter an efi_memory_desc_t without EFI_MEMORY_WB set in
->attribute we currently call set_memory_uc(), which in turn
calls __pa() on a potentially ioremap'd address.

On CONFIG_X86_32 this results in the following oops:

BUG: unable to handle kernel paging request at f7f22280
IP: [] reserve_ram_pages_type+0x89/0x210
*pdpt = 0000000001978001 *pde = 0000000001ffb067 *pte = 0000000000000000
Oops: 0000 [#1] PREEMPT SMP
Modules linked in:

Pid: 0, comm: swapper Not tainted 3.0.0-acpi-efi-0805 #3
EIP: 0060:[] EFLAGS: 00010202 CPU: 0
EIP is at reserve_ram_pages_type+0x89/0x210
EAX: 0070e280 EBX: 38714000 ECX: f7814000 EDX: 00000000
ESI: 00000000 EDI: 38715000 EBP: c189fef0 ESP: c189fea8
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
Process swapper (pid: 0, ti=c189e000 task=c18bbe60 task.ti=c189e000)
Stack:
80000200 ff108000 00000000 c189ff00 00038714 00000000 00000000 c189fed0
c104f8ca 00038714 00000000 00038715 00000000 00000000 00038715 00000000
00000010 38715000 c189ff48 c1025aff 38715000 00000000 00000010 00000000
Call Trace:
[] ? page_is_ram+0x1a/0x40
[] reserve_memtype+0xdf/0x2f0
[] set_memory_uc+0x49/0xa0
[] efi_enter_virtual_mode+0x1c2/0x3aa
[] start_kernel+0x291/0x2f2
[] ? loglevel+0x1b/0x1b
[] i386_start_kernel+0xbf/0xc8

The only time we can call set_memory_uc() for a memory region is
when it is part of the direct kernel mapping. For the case where
we ioremap a memory region we must leave it alone.

This patch reimplements the fix from e8c7106280a3 ("x86, efi:
Calling __pa() with an ioremap()ed address is invalid") which
was reverted in e1ad783b12ec because it caused a regression on
some MacBooks (they hung at boot). The regression was caused
because the commit only marked EFI_RUNTIME_SERVICES_DATA as
E820_RESERVED_EFI, when it should have marked all regions that
have the EFI_MEMORY_RUNTIME attribute.

Despite first impressions, it's not possible to use
ioremap_cache() to map all cached memory regions on
CONFIG_X86_64 because of the way that the memory map might be
configured as detailed in the following bug report,

https://bugzilla.redhat.com/show_bug.cgi?id=748516

e.g. some of the EFI memory regions *need* to be mapped as part
of the direct kernel mapping.

Signed-off-by: Matt Fleming
Cc: Matthew Garrett
Cc: Zhang Rui
Cc: Huang Ying
Cc: Keith Packard
Cc: Linus Torvalds
Cc: Andrew Morton
Link: http://lkml.kernel.org/r/1350649546-23541-1-git-send-email-matt@console-pimps.org
Signed-off-by: Ingo Molnar

Matt Fleming
2012-10-24 18:48:47 +0800
e4074b304 perf/x86: Enable overflow on Intel KNC with a custom knc_pmu_handle_irq()

Although based on the Intel P6 design, the interrupt mechanism
for KNC more closely resembles the Intel architectural
perfmon one.

We can't just re-use that code though, because KNC has different
MSR numbers for the status and ack registers.

In this case we just cut-and-paste from perf_event_intel.c
with some minor changes, as it looks like it would not be
worth the trouble to change that code to be MSR-configurable.

Signed-off-by: Vince Weaver
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Cc: eranian@gmail.com
Cc: Meadows Lawrence F
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1210171304410.23243@vincent-weaver-1.um.maine.edu
[ Small stylistic edits. ]
Signed-off-by: Ingo Molnar

Vince Weaver
2012-10-24 18:00:49 +0800
7d011962a perf/x86: Remove cpuc->enable check on Intel KNC event enable/disable

x86_pmu.enable() is called from x86_pmu_enable() with
cpuc->enabled set to 0. This means we weren't re-enabling the
counters after a context switch.

This patch just removes the check, as it shouldn't be necessary
(and the equivalent x86_ generic code does not have the checks).

The origin of this problem is the KNC driver being based on the
P6 one. The P6 driver also has this issue, but works anyway
due to various lucky accidents.

Signed-off-by: Vince Weaver
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Cc: eranian@gmail.com
Cc: Meadows Lawrence F
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1210171303290.23243@vincent-weaver-1.um.maine.edu
Signed-off-by: Ingo Molnar

Vince Weaver
2012-10-24 18:00:49 +0800
ae5ba47a9 perf/x86: Make Intel KNC use full 40-bit width of counters

Early versions of Intel KNC chips have a bug where bits above 32
were not properly set. We worked around this by only using the
bottom 32 bits (out of 40 that should be available).

It turns out this workaround breaks overflow handling.

The buggy silicon will in theory never be used in production
systems, so remove this workaround so we get proper overflow
support.
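
The commit does not spell out the failure mode, so the following is only one plausible reading, labelled as an assumption: if the start value is masked to 32 bits but the hardware counter is 40 bits wide, the counter no longer wraps after the requested number of events. The helper names are hypothetical.

```python
def initial_count(period, width):
    """Start value intended to make a counter overflow after `period` events."""
    mask = (1 << width) - 1
    return (-period) & mask


def events_until_overflow(value, hw_width):
    """Increments needed before a hw_width-bit counter wraps to zero."""
    return (1 << hw_width) - value
```

With a 40-bit mask a period of 1000 overflows after 1000 events. With the 32-bit mask on a 40-bit counter it would take 2^40 - 2^32 + 1000 events.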

Signed-off-by: Vince Weaver
Cc: Peter Zijlstra
Cc: Paul Mackerras
Cc: Arnaldo Carvalho de Melo
Cc: eranian@gmail.com
Cc: Meadows Lawrence F
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1210171302140.23243@vincent-weaver-1.um.maine.edu
Signed-off-by: Ingo Molnar

Vince Weaver
2012-10-24 18:00:48 +0800
032c3851f perf/x86/uncore: Handle pci_read_config_dword() errors

This, beyond handling corner cases, also fixes some build warnings:

arch/x86/kernel/cpu/perf_event_intel_uncore.c: In function ‘snbep_uncore_pci_disable_box’:
arch/x86/kernel/cpu/perf_event_intel_uncore.c:124:9: warning: ‘config’ is used uninitialized in this function [-Wuninitialized]
arch/x86/kernel/cpu/perf_event_intel_uncore.c: In function ‘snbep_uncore_pci_enable_box’:
arch/x86/kernel/cpu/perf_event_intel_uncore.c:135:9: warning: ‘config’ is used uninitialized in this function [-Wuninitialized]
arch/x86/kernel/cpu/perf_event_intel_uncore.c: In function ‘snbep_uncore_pci_read_counter’:
arch/x86/kernel/cpu/perf_event_intel_uncore.c:164:2: warning: ‘count’ is used uninitialized in this function [-Wuninitialized]

Signed-off-by: Yan, Zheng
Cc: a.p.zijlstra@chello.nl
Link: http://lkml.kernel.org/r/1351068140-13456-1-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar

Yan, Zheng
2012-10-24 16:57:03 +0800
876ee61aa x86-64: Fix page table accounting

Commit 20167d3421a089a1bf1bd680b150dc69c9506810 ("x86-64: Fix
accounting in kernel_physical_mapping_init()") went a little too
far by entirely removing the counting of pre-populated page
tables: this should be done at boot time (to cover the page
tables set up in early boot code), but shouldn't be done during
memory hot add.

Hence, re-add the removed increments of "pages", but make them
and the one in phys_pte_init() conditional upon !after_bootmem.

Reported-Acked-and-Tested-by: Hugh Dickins
Signed-off-by: Jan Beulich
Cc:
Link: http://lkml.kernel.org/r/506DAFBA020000780009FA8C@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar

Jan Beulich
2012-10-24 16:50:25 +0800
58e9eaf06 perf/x86: Remove P6 cpuc->enabled check

Between 2.6.33 and 2.6.34 the PMU code was made modular.

The x86_pmu_enable() call was extended to disable cpuc->enabled
and iterate the counters, enabling one at a time, before calling
enable_all() at the end, followed by re-enabling cpuc->enabled.

Since cpuc->enabled was set to 0, that change effectively caused
the "val |= ARCH_PERFMON_EVENTSEL_ENABLE;" code in p6_pmu_enable_event()
and p6_pmu_disable_event() to be dead code that was never called.

This change removes this code (which was confusing) and adds some
extra commentary to make it more clear what is going on.

Signed-off-by: Vince Weaver
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1210191732000.14552@vincent-weaver-1.um.maine.edu
Signed-off-by: Ingo Molnar

Vince Weaver
2012-10-24 16:32:00 +0800
e09df4788 perf/x86: Update/fix generic events on P6 PMU

This patch updates the generic events on p6, including some new
extended cache events.

Values for these events were taken from the equivalent PAPI
predefined events.

Tested on a Pentium II.

Signed-off-by: Vince Weaver
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1210191730080.14552@vincent-weaver-1.um.maine.edu
Signed-off-by: Ingo Molnar

Vince Weaver
2012-10-24 16:31:58 +0800
7991c9ca4 perf/x86: Fix P6 FP_ASSIST event constraint

According to Intel SDM Volume 3B, FP_ASSIST is limited to Counter 1 only,
not Counter 0.

Tested on a Pentium II.
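
A counter constraint of this kind is commonly expressed as a bitmask in which bit N means "counter N may be used". Assuming that encoding (the constants below are illustrative, not quoted from the patch), the difference between "Counter 0 only" and "Counter 1 only" is a single bit:

```python
COUNTER_0_ONLY = 0x1   # bit 0 set
COUNTER_1_ONLY = 0x2   # bit 1 set


def allowed_counters(counter_mask):
    """Return the counter indices whose bits are set in counter_mask."""
    return [i for i in range(counter_mask.bit_length())
            if (counter_mask >> i) & 1]
```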

Signed-off-by: Vince Weaver
Signed-off-by: Peter Zijlstra
Link: http://lkml.kernel.org/r/alpine.DEB.2.02.1210191728570.14552@vincent-weaver-1.um.maine.edu
Signed-off-by: Ingo Molnar

Vince Weaver
2012-10-24 16:31:57 +0800
7b16bbf97 Revert "x86/mm: Fix the size calculation of mapping tables"

Commit:

722bc6b16771 x86/mm: Fix the size calculation of mapping tables

Tried to address the issue that the first 2/4M should use 4k pages
if PSE enabled, but extra counts should only be valid for x86_32.

This commit caused a kdump regression: the kdump kernel hangs.

Work is in progress to fundamentally fix the various page table
initialization issues that we have, via the design suggested
by H. Peter Anvin, but it's not ready yet to be merged.

So, to get a working kdump revert to the last known working version,
which is the revert of this commit and of a followup fix (which was
incomplete):

bd2753b2dda7 x86/mm: Only add extra pages count for the first memory range during pre-allocation

Tested kdump on physical and virtual machines.

Signed-off-by: Dave Young
Acked-by: Yinghai Lu
Acked-by: Cong Wang
Acked-by: Flavio Leitner
Tested-by: Flavio Leitner
Cc: Dan Carpenter
Cc: Cong Wang
Cc: Flavio Leitner
Cc: Tejun Heo
Cc: ianfang.cn@gmail.com
Cc: Vivek Goyal
Cc: Linus Torvalds
Cc: Andrew Morton
Cc:
Signed-off-by: Ingo Molnar

Dave Young
2012-10-24 15:38:25 +0800
bffd5fc26 x86/perf: Fix virtualization sanity check

In check_hw_exists() we try to detect non-emulated MSR accesses
by writing an arbitrary value into one of the PMU registers
and checking whether its value after a readout is still the same.
This algorithm silently assumes that the register does not contain
the magic value already, which is wrong in at least one situation.

Fix the algorithm to really do a read-modify-write cycle. This fixes
a warning under Xen under some circumstances on AMD family 10h CPUs.

The reasons in more detail actually sound like a story from
Believe It or Not!:

First you need an AMD family 10h/12h CPU. These do not reset the
PERF_CTR registers on a reboot.
Now you boot bare metal Linux, which goes successfully through this
check, but leaves the magic value of 0xabcd in the register. You
don't use the performance counters, but do a reboot (warm reset).
Then you choose to boot Xen. The check will be triggered with a
recent Linux kernel as Dom0 again, trying to write 0xabcd into the
MSR. Xen silently drops the write (expected), but the subsequent read
will return the value in the register, which just happens to be the
expected magic value. Thus the test misleadingly succeeds, leaving
the kernel in the belief that the PMU is available. This will trigger
the following message:

[ 0.020294] ------------[ cut here ]------------
[ 0.020311] WARNING: at arch/x86/xen/enlighten.c:730 xen_apic_write+0x15/0x17()
[ 0.020318] Hardware name: empty
[ 0.020323] Modules linked in:
[ 0.020334] Pid: 1, comm: swapper/0 Not tainted 3.3.8 #7
[ 0.020340] Call Trace:
[ 0.020354] [] warn_slowpath_common+0x80/0x98
[ 0.020369] [] warn_slowpath_null+0x15/0x17
[ 0.020378] [] xen_apic_write+0x15/0x17
[ 0.020392] [] perf_events_lapic_init+0x2e/0x30
[ 0.020410] [] init_hw_perf_events+0x250/0x407
[ 0.020419] [] ? check_bugs+0x2d/0x2d
[ 0.020430] [] do_one_initcall+0x7a/0x131
[ 0.020444] [] kernel_init+0x91/0x15d
[ 0.020456] [] kernel_thread_helper+0x4/0x10
[ 0.020471] [] ? retint_restore_args+0x5/0x6
[ 0.020481] [] ? gs_change+0x13/0x13
[ 0.020500] ---[ end trace a7919e7f17c0a725 ]---

The new code changes each of the 16 low bits read from the
register and tries to write and read back that modified number
from the MSR.
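
The difference between the two checks can be reproduced with a toy register model. This is a sketch under stated assumptions: FakeMsr is a hypothetical stand-in for an MSR whose writes may be silently dropped (as described for Xen above), and old_check()/new_check() only mirror the logic described in this message, not the kernel source.

```python
MAGIC = 0xABCD


class FakeMsr:
    """Toy register; drops_writes models a hypervisor ignoring writes."""

    def __init__(self, value, drops_writes):
        self.value = value
        self.drops_writes = drops_writes

    def write(self, value):
        if not self.drops_writes:
            self.value = value

    def read(self):
        return self.value


def old_check(msr):
    """Write a fixed magic value and compare the read-back."""
    msr.write(MAGIC)
    return msr.read() == MAGIC


def new_check(msr):
    """Read, flip the low 16 bits, write, and compare the read-back."""
    val = msr.read() ^ 0xFFFF
    msr.write(val)
    return msr.read() == val
```

A register that still holds the stale magic value and ignores writes passes old_check() but fails new_check(), which is the scenario in the story above.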

Signed-off-by: Andre Przywara
Signed-off-by: Peter Zijlstra
Cc: Arnaldo Carvalho de Melo
Cc: Avi Kivity
Link: http://lkml.kernel.org/r/1349797115-28346-2-git-send-email-andre.przywara@amd.com
Signed-off-by: Ingo Molnar

Andre Przywara
2012-10-24 14:53:13 +0800
0e9e3e306 Merge tag 'stable/for-linus-3.7-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

Pull xen bug-fixes from Konrad Rzeszutek Wilk:
- Fix mysterious SIGSEGV or SIGKILL in applications due to corruption
of %eip when returning from a signal handler.
- Fix various ARM compile issues after the merge fallout.
- Continue on making more of the Xen generic code usable by ARM
platform.
- Fix SR-IOV passthrough to mirror multifunction PCI devices.
- Fix various compile warnings.
- Remove hypercalls that don't exist anymore.

* tag 'stable/for-linus-3.7-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen: dbgp: Fix warning when CONFIG_PCI is not enabled.
xen: arm: comment on why 64-bit xen_pfn_t is safe even on 32 bit
xen: balloon: use correct type for frame_list
xen/x86: don't corrupt %eip when returning from a signal handler
xen: arm: make p2m operations NOPs
xen: balloon: don't include e820.h
xen: grant: use xen_pfn_t type for frame_list.
xen: events: pirq_check_eoi_map is X86 specific
xen: XENMEM_translate_gpfn_list was removed ages ago and is unused.
xen: sysfs: fix build warning.
xen: sysfs: include err.h for PTR_ERR etc
xen: xenbus: quirk uses x86 specific cpuid
xen PV passthru: assign SR-IOV virtual functions to separate virtual slots
xen/xenbus: Fix compile warning.
xen/x86: remove duplicated include from enlighten.c

Linus Torvalds
2012-10-24 10:17:27 +0800
3d0ceac12 Merge tag 'kvm-3.7-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Avi Kivity:
"KVM updates for 3.7-rc2"

* tag 'kvm-3.7-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM guest: exit idleness when handling KVM_PV_REASON_PAGE_NOT_PRESENT
KVM: apic: fix LDR calculation in x2apic mode
KVM: MMU: fix release noslot pfn

Linus Torvalds
2012-10-24 09:08:42 +0800

23 Oct, 2012

3 commits

c5e015d49 KVM guest: exit idleness when handling KVM_PV_REASON_PAGE_NOT_PRESENT

KVM_PV_REASON_PAGE_NOT_PRESENT kicks cpu out of idleness, but we haven't
marked that spot as an exit from idleness.

Not doing so can cause RCU warnings such as:

[ 732.788386] ===============================
[ 732.789803] [ INFO: suspicious RCU usage. ]
[ 732.790032] 3.7.0-rc1-next-20121019-sasha-00002-g6d8d02d-dirty #63 Tainted: G W
[ 732.790032] -------------------------------
[ 732.790032] include/linux/rcupdate.h:738 rcu_read_lock() used illegally while idle!
[ 732.790032]
[ 732.790032] other info that might help us debug this:
[ 732.790032]
[ 732.790032]
[ 732.790032] RCU used illegally from idle CPU!
[ 732.790032] rcu_scheduler_active = 1, debug_locks = 1
[ 732.790032] RCU used illegally from extended quiescent state!
[ 732.790032] 2 locks held by trinity-child31/8252:
[ 732.790032] #0: (&rq->lock){-.-.-.}, at: [] __schedule+0x178/0x8f0
[ 732.790032] #1: (rcu_read_lock){.+.+..}, at: [] cpuacct_charge+0xe/0x200
[ 732.790032]
[ 732.790032] stack backtrace:
[ 732.790032] Pid: 8252, comm: trinity-child31 Tainted: G W 3.7.0-rc1-next-20121019-sasha-00002-g6d8d02d-dirty #63
[ 732.790032] Call Trace:
[ 732.790032] [] lockdep_rcu_suspicious+0x10b/0x120
[ 732.790032] [] cpuacct_charge+0x90/0x200
[ 732.790032] [] ? cpuacct_charge+0xe/0x200
[ 732.790032] [] update_curr+0x1a3/0x270
[ 732.790032] [] dequeue_entity+0x2a/0x210
[ 732.790032] [] dequeue_task_fair+0x45/0x130
[ 732.790032] [] dequeue_task+0x89/0xa0
[ 732.790032] [] deactivate_task+0x1e/0x20
[ 732.790032] [] __schedule+0x879/0x8f0
[ 732.790032] [] ? trace_hardirqs_off+0xd/0x10
[ 732.790032] [] ? kvm_async_pf_task_wait+0x1d5/0x2b0
[ 732.790032] [] schedule+0x55/0x60
[ 732.790032] [] kvm_async_pf_task_wait+0x1f4/0x2b0
[ 732.790032] [] ? abort_exclusive_wait+0xb0/0xb0
[ 732.790032] [] ? prepare_to_wait+0x25/0x90
[ 732.790032] [] do_async_page_fault+0x56/0xa0
[ 732.790032] [] async_page_fault+0x28/0x30

Signed-off-by: Sasha Levin
Acked-by: Gleb Natapov
Acked-by: Paul E. McKenney
Signed-off-by: Avi Kivity

Sasha Levin
2012-10-23 00:03:28 +0800
7f46ddbd4 KVM: apic: fix LDR calculation in x2apic mode

Signed-off-by: Gleb Natapov
Reviewed-by: Chegu Vinod
Tested-by: Chegu Vinod
Signed-off-by: Avi Kivity

Gleb Natapov
2012-10-23 00:03:27 +0800
f3ac1a4b6 KVM: MMU: fix release noslot pfn

We cannot directly call kvm_release_pfn_clean() to release the pfn,
since we may encounter a noslot pfn, which is used to cache mmio info
in the spte.

Signed-off-by: Xiao Guangrong
Cc: stable@vger.kernel.org
Signed-off-by: Avi Kivity

Xiao Guangrong
2012-10-23 00:03:25 +0800

22 Oct, 2012

2 commits

f38787f4f Merge branch 'uprobes/core' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc into perf/urgent

Pull various uprobes bugfixes from Oleg Nesterov - mostly race and
failure path fixes.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Ingo Molnar
2012-10-22 00:18:17 +0800
957b9095e Merge branch 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into perf/urgent

Pull event-wrapping Oprofile fix from Robert Richter.

Signed-off-by: Ingo Molnar <mingo@kernel.org>

Ingo Molnar
2012-10-22 00:11:20 +0800

20 Oct, 2012

8 commits

a05123bdd perf/x86: Disable uncore on virtualized CPUs

Initializing the uncore PMU on a virtualized CPU may hang the kernel.
This is because kvm does not emulate the entire hardware. There
are lots of uncore-related MSRs; making kvm enumerate them all
is a non-trivial task. So just disable uncore on virtualized CPUs.

Signed-off-by: Yan, Zheng
Tested-by: Pekka Enberg
Cc: a.p.zijlstra@chello.nl
Cc: eranian@google.com
Cc: andi@firstfloor.org
Cc: avi@redhat.com
Link: http://lkml.kernel.org/r/1345540117-14164-1-git-send-email-zheng.z.yan@intel.com
Signed-off-by: Ingo Molnar

Yan, Zheng
2012-10-20 16:07:02 +0800
8c1bee685 Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
"Assorted small fixes"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf python: Properly link with libtraceevent
perf hists browser: Add back callchain folding symbol
perf tools: Fix build on sparc.
perf python: Link with libtraceevent
perf python: Initialize 'page_size' variable
tools lib traceevent: Fix missed freeing of subargs in free_arg() in filter
lib tools traceevent: Add back pevent assignment in __pevent_parse_format()
perf hists browser: Fix off-by-two bug on the first column
perf tools: Remove warnings on JIT samples for srcline sort key
perf tools: Fix segfault when using srcline sort key
perf: Require exclude_guest to use PEBS - kernel side enforcement
perf tool: Precise mode requires exclude_guest

Linus Torvalds
2012-10-20 09:39:36 +0800
a448a0318 Merge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

* The python binding needs to link with libtraceevent and to initialize
the 'page_size' variable so that mmaping works again.

* The callchain folding character that appears on the TUI just before
the overhead had disappeared due to recent changes, add it back.

* Intel PEBS in VT-x context uses the DS address as a guest linear address,
even though it's programmed by the host as a host linear address. This
results in guest memory corruption and/or the hardware faulting and 'crashing'
the virtual machine. Therefore we have to disable PEBS on VT-x enter and
re-enable on VT-x exit, enforcing a strict exclude_guest.

Kernel side enforcement fix by Peter Zijlstra, tooling side fix by David Ahern.

* Fix build on sparc due to UAPI, fix from David Miller.

* Fixes for the srcline sort key for unresolved symbols and when processing
samples in JITted code, where we don't have an ELF file, just a special
symbol table, fixes from Namhyung Kim.

* Fix some leaks in libtraceevent, from Steven Rostedt.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>

Ingo Molnar
2012-10-20 08:40:26 +0800
e05dacd71 Merge commit 'v3.7-rc1' into stable/for-linus-3.7

* commit 'v3.7-rc1': (10892 commits)
Linux 3.7-rc1
x86, boot: Explicitly include autoconf.h for hostprogs
perf: Fix UAPI fallout
ARM: config: make sure that platforms are ordered by option string
ARM: config: sort select statements alphanumerically
UAPI: (Scripted) Disintegrate include/linux/byteorder
UAPI: (Scripted) Disintegrate include/linux
UAPI: Unexport linux/blk_types.h
UAPI: Unexport part of linux/ppp-comp.h
perf: Handle new rbtree implementation
procfs: don't need a PATH_MAX allocation to hold a string representation of an int
vfs: embed struct filename inside of names_cache allocation if possible
audit: make audit_inode take struct filename
vfs: make path_openat take a struct filename pointer
vfs: turn do_path_lookup into wrapper around struct filename variant
audit: allow audit code to satisfy getname requests from its names_list
vfs: define struct filename and have getname() return it
btrfs: Fix compilation with user namespace support enabled
userns: Fix posix_acl_file_xattr_userns gid conversion
userns: Properly print bluetooth socket uids
...

Konrad Rzeszutek Wilk
2012-10-20 03:19:19 +0800
a349e23d1 xen/x86: don't corrupt %eip when returning from a signal handler

In 32 bit guests, if a userspace process has %eax == -ERESTARTSYS
(-512) or -ERESTARTNOINTR (-513) when it is interrupted by an event
/and/ the process has a pending signal then %eip (and %eax) are
corrupted when returning to the main process after handling the
signal. The application may then crash with SIGSEGV or a SIGILL or it
may have subtly incorrect behaviour (depending on what instruction it
returned to).

This occurs because handle_signal() incorrectly thinks that there
is a system call that needs to be restarted, so it adjusts %eip and %eax
to re-execute the system call instruction (even though user space had
not done a system call).

If %eax == -ERESTARTNOHAND (-514) or -ERESTART_RESTARTBLOCK (-516)
then handle_signal() only corrupted %eax (by setting it to
-EINTR). This may cause the application to crash or have incorrect
behaviour.

handle_signal() assumes that regs->orig_ax >= 0 means a system call so
any kernel entry point that is not for a system call must push a
negative value for orig_ax. For example, for physical interrupts on
bare metal the inverse of the vector is pushed and page_fault() sets
regs->orig_ax to -1, overwriting the hardware provided error code.

xen_hypervisor_callback() was incorrectly pushing 0 for orig_ax
instead of -1.

Classic Xen kernels pushed %eax, which works as %eax cannot be both
non-negative and -ERESTARTSYS (etc.), but using -1 is consistent with
other non-system call entry points and avoids some of the tests in
handle_signal().

There were similar bugs in xen_failsafe_callback() of both 32 and
64-bit guests. If the fault was corrected and the normal return path
was used then 0 was incorrectly pushed as the value for orig_ax.
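
The restart fix-up described in this message can be modelled in a few lines. This is a sketch, not the kernel's handle_signal(): the register dictionary and the adjust_for_restart() name are hypothetical, the error numbers are the ones quoted above, and the 2-byte rewind assumes a two-byte system call instruction.

```python
EINTR = 4
ERESTARTSYS = 512
ERESTARTNOINTR = 513
ERESTARTNOHAND = 514
ERESTART_RESTARTBLOCK = 516


def adjust_for_restart(regs, sa_restart=True):
    """Return a copy of regs ('orig_ax', 'ax', 'ip') after the restart fix-up."""
    regs = dict(regs)
    if regs["orig_ax"] < 0:
        return regs                      # not a system call: leave untouched
    if regs["ax"] in (-ERESTARTNOHAND, -ERESTART_RESTARTBLOCK):
        regs["ax"] = -EINTR
    elif regs["ax"] == -ERESTARTSYS and not sa_restart:
        regs["ax"] = -EINTR
    elif regs["ax"] in (-ERESTARTSYS, -ERESTARTNOINTR):
        regs["ax"] = regs["orig_ax"]     # reload the system call number
        regs["ip"] -= 2                  # rewind over the system call instruction
    return regs
```

With orig_ax pushed as 0, a user-space %eax of -513 gets both ax and ip rewritten. With orig_ax pushed as -1, the registers come back unchanged.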

Signed-off-by: David Vrabel
Acked-by: Jan Beulich
Acked-by: Ian Campbell
Cc: stable@vger.kernel.org
Signed-off-by: Konrad Rzeszutek Wilk

David Vrabel
2012-10-20 03:17:59 +0800
ef32f8929 xen: grant: use xen_pfn_t type for frame_list.

This correctly sizes it as 64 bit on ARM but leaves it as unsigned
long on x86 (therefore no intended change on x86).

The long and ulong guest handles are now unused (and a bit dangerous)
so remove them.

Acked-by: Stefano Stabellini
Signed-off-by: Ian Campbell
Signed-off-by: Konrad Rzeszutek Wilk

Ian Campbell
2012-10-20 03:17:55 +0800
37ea0fcb6 xen: sysfs: fix build warning.

Define PRI macros for xen_ulong_t and xen_pfn_t and use them to fix:
drivers/xen/sys-hypervisor.c:288:4: warning: format '%lx' expects argument of type 'long unsigned int', but argument 3 has type 'xen_ulong_t' [-Wformat]

Ideally this would use PRIx64 on ARM but these (or equivalent) don't
seem to be available in the kernel.

Acked-by: Stefano Stabellini
Signed-off-by: Ian Campbell
Signed-off-by: Konrad Rzeszutek Wilk

Ian Campbell
2012-10-20 03:17:51 +0800
c2103b7ef xen/x86: remove duplicated include from enlighten.c

Remove duplicated include.

dpatch engine is used to auto generate this patch.
(https://github.com/weiyj/dpatch)

CC: stable@vger.kernel.org
Signed-off-by: Wei Yongjun
Signed-off-by: Konrad Rzeszutek Wilk

Wei Yongjun
2012-10-20 03:17:30 +0800

19 Oct, 2012

3 commits

4533d8627 Merge commit '5bc66170dc486556a1e36fd384463536573f4b82 ' into x86/urgent

From Borislav Petkov:

Below is a RAS fix which reverts the addition of a sysfs attribute
which we agreed is not needed, post-factum. And this should go in now
because that sysfs attribute is going to end up in 3.7 otherwise and
thus exposed to userspace; removing it then would be a lot harder.

This is done as a merge rather than a simple patch/cherry-pick since
the baseline for this patch was not in the previous x86/urgent.

Signed-off-by: H. Peter Anvin

H. Peter Anvin
2012-10-19 22:55:09 +0800
5bc66170d x86, MCE: Remove bios_cmci_threshold sysfs attribute

450cc201038f3 ("x86/mce: Provide boot argument to honour bios-set CMCI
threshold") added the bios_cmci_threshold sysfs attribute which was
supposed to communicate to userspace tools that BIOS CMCI threshold has
been honoured.

However, this info is not of any importance to userspace - it should
rather get the actual error count, which has already been thresholded,
from MCi_STATUS[38:52].
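
Reading the MCi_STATUS[38:52] field named above amounts to a shift and a mask. A minimal sketch, with a hypothetical function name and a plain integer standing in for the 64-bit register value:

```python
def corrected_error_count(mci_status):
    """Extract bits 38..52 (inclusive) of a 64-bit MCi_STATUS value."""
    width = 52 - 38 + 1                  # 15-bit field
    return (mci_status >> 38) & ((1 << width) - 1)
```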

So drop this before it becomes a used interface (good thing we caught
this early in 3.7-rc1, right after the merge window closed).

Cc: Naveen N. Rao
Acked-by: Tony Luck
Link: http://lkml.kernel.org/r/20121017105940.GA14590@x1.osrc.amd.com
Signed-off-by: Borislav Petkov

Borislav Petkov
2012-10-19 21:22:29 +0800
32bec973a crypto: aesni - fix XTS mode on x86-32, add wrapper function for asmlinkage aesni_enc()

Calling convention for internal functions and 'asmlinkage' functions is
different on x86-32. Therefore do not directly cast aesni_enc as XTS tweak
function, but use wrapper function in between. Fixes crash with "XTS +
aesni_intel + x86-32" combination.

Cc: stable@vger.kernel.org
Reported-by: Krzysztof Kolasa
Signed-off-by: Jussi Kivilinna
Acked-by: David S. Miller
Signed-off-by: Linus Torvalds

Jussi Kivilinna
2012-10-19 05:01:33 +0800

18 Oct, 2012

2 commits

21c5e50e1 x86, amd, mce: Avoid NULL pointer reference on CPU northbridge lookup

When booting on a federated multi-server system (NumaScale), the
processor Northbridge lookup returns NULL; add guards to prevent this
causing an oops.

On those systems, the northbridge is accessed through MMIO and the
"normal" northbridge enumeration in amd_nb.c doesn't work since we're
generating the northbridge ID from the initial APIC ID and the last
is not unique on those systems. Long story short, we end up without
northbridge descriptors.

Signed-off-by: Daniel J Blueman
Cc: stable@vger.kernel.org # 3.6
Link: http://lkml.kernel.org/r/1349073725-14093-1-git-send-email-daniel@numascale-asia.com
[ Boris: beef up commit message ]
Signed-off-by: Borislav Petkov
Signed-off-by: H. Peter Anvin

Daniel J Blueman
2012-10-18 02:25:32 +0800
1bbbbe779 x86: Exclude E820_RESERVED regions and memory holes above 4 GB from direct mapping.

On systems with very large memory (1 TB in our case), BIOS may report a
reserved region or a hole in the E820 map, even above the 4 GB range. Exclude
these from the direct mapping.

[ hpa: this should be done not just for > 4 GB but for everything above the legacy
region (1 MB), at the very least. That, however, turns out to require significant
restructuring. That work is well underway, but is not suitable for rc/stable. ]

Cc: stable@kernel.org # > 2.6.32
Signed-off-by: Jacob Shin
Link: http://lkml.kernel.org/r/1319145326-13902-1-git-send-email-jacob.shin@amd.com
Signed-off-by: H. Peter Anvin

Jacob Shin
2012-10-18 01:59:39 +0800