Doug / smarc-fsl-linux-kernel | Embedian Git Server

11 Sep, 2010

1 commit

5ee5e97ee x86, tsc: Fix a preemption leak in restore_sched_clock_state()

A real life genuine preemption leak..

Reported-and-tested-by: Jeff Chua
Signed-off-by: Peter Zijlstra
Acked-by: Suresh Siddha
Signed-off-by: Linus Torvalds

Peter Zijlstra
2010-09-11 09:17:45 +0800

10 Sep, 2010

1 commit

be6200aac Merge branch 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm

* 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Perform hardware_enable in CPU_STARTING callback
KVM: i8259: fix migration
KVM: fix i8259 oops when no vcpus are online
KVM: x86 emulator: fix regression with cmpxchg8b on i386 hosts

Linus Torvalds
2010-09-10 23:02:45 +0800

09 Sep, 2010

5 commits

1faa6ec8c Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, mcheck: Avoid duplicate sysfs links/files for thresholding banks
io-mapping: Fix the address space annotations
x86: Fix the address space annotations of iomap_atomic_prot_pfn()
x86, mm: Fix CONFIG_VMSPLIT_1G and 2G_OPT trampoline
x86, hwmon: Fix unsafe smp_processor_id() in thermal_throttle_add_dev

Linus Torvalds
2010-09-09 02:14:10 +0800
899edae61 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf, x86: Try to handle unknown nmis with an enabled PMU
perf, x86: Fix handle_irq return values
perf, x86: Fix accidentally ack'ing a second event on intel perf counter
oprofile, x86: fix init_sysfs() function stub
lockup_detector: Sync touch_*_watchdog back to old semantics
tracing: Fix a race in function profile
oprofile, x86: fix init_sysfs error handling
perf_events: Fix time tracking for events with pid != -1 and cpu != -1
perf: Initialize callchains roots's childen hits
oprofile: fix crash when accessing freed task structs

Linus Torvalds
2010-09-09 02:13:16 +0800
eebb5f31b KVM: i8259: fix migration

Top of kvm_kpic_state structure should have the same memory layout as
kvm_pic_state since it is copied by memcpy.
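The layout requirement can be illustrated with a minimal sketch. The two structs below are hypothetical, simplified stand-ins (the real kvm_pic_state and kvm_kpic_state carry more fields); the point is only that a memcpy between them is safe when the shared fields sit at identical offsets, which compile-time assertions can check.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace-visible snapshot (stand-in for kvm_pic_state). */
struct pic_state_user {
    uint8_t last_irr;
    uint8_t irr;
    uint8_t imr;
    uint8_t isr;
};

/* Hypothetical in-kernel state (stand-in for kvm_kpic_state). */
struct pic_state_kernel {
    /* Shared prefix: must mirror struct pic_state_user field for field. */
    uint8_t last_irr;
    uint8_t irr;
    uint8_t imr;
    uint8_t isr;
    /* Kernel-only fields may only follow the shared prefix. */
    void *owner;
};

/* Fail the build if the shared prefix ever drifts apart. */
_Static_assert(offsetof(struct pic_state_kernel, last_irr) ==
               offsetof(struct pic_state_user, last_irr), "last_irr offset");
_Static_assert(offsetof(struct pic_state_kernel, irr) ==
               offsetof(struct pic_state_user, irr), "irr offset");
_Static_assert(offsetof(struct pic_state_kernel, imr) ==
               offsetof(struct pic_state_user, imr), "imr offset");
_Static_assert(offsetof(struct pic_state_kernel, isr) ==
               offsetof(struct pic_state_user, isr), "isr offset");

/* Copy only the shared prefix, as a migration save path might. */
static void pic_save(struct pic_state_user *dst,
                     const struct pic_state_kernel *src)
{
    memcpy(dst, src, sizeof(*dst));
}
```

Inserting a kernel-only field in the middle of the prefix would trip the assertions instead of silently corrupting migrated state.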

Signed-off-by: Gleb Natapov
Signed-off-by: Avi Kivity

Gleb Natapov
2010-09-09 01:50:58 +0800
ae0635b35 KVM: fix i8259 oops when no vcpus are online

If there are no vcpus, found will be NULL. Check before doing anything with
it.

Signed-off-by: Avi Kivity

Avi Kivity
2010-09-09 01:50:56 +0800
16518d5ad KVM: x86 emulator: fix regression with cmpxchg8b on i386 hosts

operand::val and operand::orig_val are 32-bit on i386, whereas cmpxchg8b
operands are 64-bit.

Fix by adding val64 and orig_val64 union members to struct operand, and
using them where needed.
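A rough sketch of the idea follows. The struct and helper names are hypothetical, and uint32_t stands in for the 32-bit unsigned long of an i386 host; only the val64/orig_val64 union members come from the description above.

```c
#include <stdint.h>

/* Hypothetical, reduced version of an emulator operand. */
struct operand_sketch {
    union {
        uint32_t val;        /* stand-in for unsigned long on i386 */
        uint64_t val64;      /* wide enough for a cmpxchg8b operand */
    };
    union {
        uint32_t orig_val;
        uint64_t orig_val64;
    };
};

/*
 * Compare the 64-bit destination with *expected (EDX:EAX in the real
 * instruction). On a match store the replacement (ECX:EBX) and return 1;
 * otherwise load the current value into *expected and return 0.
 */
static int cmpxchg8b_sketch(struct operand_sketch *dst,
                            uint64_t *expected, uint64_t replacement)
{
    dst->orig_val64 = dst->val64;   /* keep all 64 bits of the old value */
    if (dst->val64 == *expected) {
        dst->val64 = replacement;
        return 1;
    }
    *expected = dst->val64;
    return 0;
}
```

Going through the 32-bit members instead would drop the upper half of the operand, which is the kind of truncation the commit describes.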

Signed-off-by: Avi Kivity
Signed-off-by: Marcelo Tosatti

Avi Kivity
2010-09-09 01:50:55 +0800

08 Sep, 2010

1 commit

d56557af1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: bus speed strings should be const
PCI hotplug: Fix build with CONFIG_ACPI unset
PCI: PCIe: Remove the port driver module exit routine
PCI: PCIe: Move PCIe PME code to the pcie directory
PCI: PCIe: Disable PCIe port services during port initialization
PCI: PCIe: Ask BIOS for control of all native services at once
ACPI/PCI: Negotiate _OSC control bits before requesting them
ACPI/PCI: Do not preserve _OSC control bits returned by a query
ACPI/PCI: Make acpi_pci_query_osc() return control bits
ACPI/PCI: Reorder checks in acpi_pci_osc_control_set()
PCI: PCIe: Introduce commad line switch for disabling port services
PCI: PCIe AER: Introduce pci_aer_available()
x86/PCI: only define pci_domain_nr if PCI and PCI_DOMAINS are set
PCI: provide stub pci_domain_nr function for !CONFIG_PCI configs

Linus Torvalds
2010-09-08 07:00:17 +0800

05 Sep, 2010

2 commits

1389298f7 x86, mcheck: Avoid duplicate sysfs links/files for thresholding banks

kobject_add_internal failed for threshold_bank2 with -EEXIST,
don't try to register things with the same name in the same
directory:

Pid: 1, comm: swapper Tainted: G W 2.6.31 #1
Call Trace:
[] ? kobject_add_internal+0x156/0x180
[] ? kobject_add+0x66/0x6b
[] ? kobject_init+0x42/0x82
[] ? kobject_create_and_add+0x34/0x63
[] ? threshold_create_bank+0x14f/0x259
[] ? mce_create_device+0x8d/0x1b8
[] ? threshold_init_device+0x3f/0x80
[] ? threshold_init_device+0x0/0x80
[] ? do_one_initcall+0x4f/0x143
[] ? kernel_init+0x14c/0x1a2
[] ? child_rip+0xa/0x20
[] ? kernel_init+0x0/0x1a2
[] ? child_rip+0x0/0x20
kobject_create_and_add: kobject_add error: -17

(Probably the for_each_cpu loop should be entirely removed.)

Signed-off-by: Andreas Herrmann
LKML-Reference:
Signed-off-by: Ingo Molnar

Andreas Herrmann
2010-09-05 20:35:49 +0800
cc1a8e523 x86: Fix the address space annotations of iomap_atomic_prot_pfn()

This patch fixes the sparse warnings when the return pointer of
iomap_atomic_prot_pfn() is used as an argument of iowrite32()
and friends.

Signed-off-by: Francisco Jerez
LKML-Reference:
Cc: Andrew Morton
Signed-off-by: Ingo Molnar

Francisco Jerez
2010-09-05 20:26:14 +0800

03 Sep, 2010

3 commits

4177c42a6 perf, x86: Try to handle unknown nmis with an enabled PMU

When the PMU is enabled it is valid to have unhandled nmis, two
events could trigger 'simultaneously' raising two back-to-back
NMIs. If the first NMI handles both, the latter will be empty
and daze the CPU.

The solution to avoid an 'unknown nmi' message in this case was
simply to stop the nmi handler chain when the PMU is enabled by
stating the nmi was handled. This has the drawback that a) we
can not detect unknown nmis anymore, and b) subsequent nmi
handlers are not called.

This patch addresses this. Now, we check this unknown NMI if it
could be a PMU back-to-back NMI. Otherwise we pass it and let
the kernel handle the unknown nmi.

This is a debug log:

cpu #6, nmi #32333, skip_nmi #32330, handled = 1, time = 1934364430
cpu #6, nmi #32334, skip_nmi #32330, handled = 1, time = 1934704616
cpu #6, nmi #32335, skip_nmi #32336, handled = 2, time = 1936032320
cpu #6, nmi #32336, skip_nmi #32336, handled = 0, time = 1936034139
cpu #6, nmi #32337, skip_nmi #32336, handled = 1, time = 1936120100
cpu #6, nmi #32338, skip_nmi #32336, handled = 1, time = 1936404607
cpu #6, nmi #32339, skip_nmi #32336, handled = 1, time = 1937983416
cpu #6, nmi #32340, skip_nmi #32341, handled = 2, time = 1938201032
cpu #6, nmi #32341, skip_nmi #32341, handled = 0, time = 1938202830
cpu #6, nmi #32342, skip_nmi #32341, handled = 1, time = 1938443743
cpu #6, nmi #32343, skip_nmi #32341, handled = 1, time = 1939956552
cpu #6, nmi #32344, skip_nmi #32341, handled = 1, time = 1940073224
cpu #6, nmi #32345, skip_nmi #32341, handled = 1, time = 1940485677
cpu #6, nmi #32346, skip_nmi #32347, handled = 2, time = 1941947772
cpu #6, nmi #32347, skip_nmi #32347, handled = 1, time = 1941949818
cpu #6, nmi #32348, skip_nmi #32347, handled = 0, time = 1941951591
Uhhuh. NMI received for unknown reason 00 on CPU 6.
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue

Deltas:

nmi #32334 340186
nmi #32335 1327704
nmi #32336 1819 <<<< back-to-back nmi [1]
nmi #32337 85961
nmi #32338 284507
nmi #32339 1578809
nmi #32340 217616
nmi #32341 1798 <<<< back-to-back nmi [2]
nmi #32342 240913
nmi #32343 1512809
nmi #32344 116672
nmi #32345 412453
nmi #32346 1462095 <<<< 1st nmi (standard) handling 2 counters
nmi #32347 2046 <<<< 2nd nmi (back-to-back) handling one counter
nmi #32348 1773 <<<< 3rd nmi (back-to-back) handling no counter! [3]

For back-to-back nmi detection there are the following rules:

The PMU nmi handler was handling more than one counter and no
counter was handled in the subsequent nmi (see [1] and [2]
above).

There is another case if there are two subsequent back-to-back
nmis [3]. The 2nd is detected as back-to-back because the first
handled more than one counter. If the second handles one counter
and the 3rd handles nothing, we drop the 3rd nmi because it
could be a back-to-back nmi.
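The detection rules above can be sketched as a small state machine. This is an illustration under stated assumptions, not the patch itself: the struct, enum and function names are hypothetical, and the per-CPU bookkeeping of the real handler is reduced to a single state object.

```c
/* Hypothetical per-CPU bookkeeping for back-to-back NMI detection. */
struct pmu_nmi_state {
    unsigned long marked;  /* NMI number that may be a back-to-back NMI */
    int handled;           /* counters handled by the NMI that set the mark */
};

enum nmi_verdict { NMI_HANDLED, NMI_SWALLOWED, NMI_UNKNOWN };

/*
 * this_nmi: running NMI count on this CPU.
 * handled:  number of counters the PMU handler serviced for this NMI.
 */
static enum nmi_verdict classify_nmi(struct pmu_nmi_state *st,
                                     unsigned long this_nmi, int handled)
{
    if (handled == 0) {
        /* Empty NMI: drop it only if a predecessor marked it. */
        return st->marked == this_nmi ? NMI_SWALLOWED : NMI_UNKNOWN;
    }

    /*
     * Mark the next NMI as a possible back-to-back NMI when this one
     * handled several counters (cases [1] and [2]), or when this one
     * was itself marked by an NMI that handled several (case [3]).
     */
    if (handled > 1 || (st->marked == this_nmi && st->handled > 1)) {
        st->marked = this_nmi + 1;
        st->handled = handled;
    }
    return NMI_HANDLED;
}
```

Feeding in the sequence from case [3] (two counters, then one, then none) swallows the third NMI, while a later empty NMI that nothing marked is still reported as unknown.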

Signed-off-by: Robert Richter
Signed-off-by: Peter Zijlstra
[ renamed nmi variable to pmu_nmi to avoid clash with .nmi in entry.S ]
Signed-off-by: Don Zickus
Cc: peterz@infradead.org
Cc: gorcunov@gmail.com
Cc: fweisbec@gmail.com
Cc: ying.huang@intel.com
Cc: ming.m.lin@intel.com
Cc: eranian@google.com
LKML-Reference:
Signed-off-by: Ingo Molnar

Robert Richter
2010-09-03 14:05:18 +0800
de725dec9 perf, x86: Fix handle_irq return values

Now that we rely on the number of handled overflows, ensure all
handle_irq implementations actually return the right number.

Signed-off-by: Peter Zijlstra
Signed-off-by: Don Zickus
Cc: peterz@infradead.org
Cc: robert.richter@amd.com
Cc: gorcunov@gmail.com
Cc: fweisbec@gmail.com
Cc: ying.huang@intel.com
Cc: ming.m.lin@intel.com
Cc: eranian@google.com
LKML-Reference:
Signed-off-by: Ingo Molnar

Peter Zijlstra
2010-09-03 14:05:18 +0800
2e556b5b3 perf, x86: Fix accidentally ack'ing a second event on intel perf counter

During testing of a patch to stop having the perf subsystem
swallow nmis, it was uncovered that Nehalem boxes were randomly
getting unknown nmis when using the perf tool.

Moving the ack'ing of the PMI closer to when we get the status
allows the hardware to properly re-set the PMU bit signaling
another PMI was triggered during the processing of the first
PMI. This allows the new logic for dealing with the
shortcomings of multiple PMIs to handle the extra NMI by
'eat'ing it later.

Now one can wonder why we are getting a second PMI when we
disable all the PMUs at the beginning of the NMI handler to
prevent such a case. That I do not know, but I know the fix
below helps deal with this quirk.

Tested on multiple Nehalems where the problem was occurring.
With the patch, the code now loops a second time to handle the
second PMI (whereas before it was not).

Signed-off-by: Don Zickus
Cc: peterz@infradead.org
Cc: robert.richter@amd.com
Cc: gorcunov@gmail.com
Cc: fweisbec@gmail.com
Cc: ying.huang@intel.com
Cc: ming.m.lin@intel.com
Cc: eranian@google.com
LKML-Reference:
Signed-off-by: Ingo Molnar

Don Zickus
2010-09-03 14:05:17 +0800

02 Sep, 2010

2 commits

b4c69d45c Merge branch 'urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/rric/oprofile into perf/urgent

Ingo Molnar
2010-09-02 04:31:07 +0800
269f45c25 oprofile, x86: fix init_sysfs() function stub

The use of the return value of init_sysfs() with commit

10f0412 oprofile, x86: fix init_sysfs error handling

discovered the following build error for !CONFIG_PM:

.../linux/arch/x86/oprofile/nmi_int.c: In function ‘op_nmi_init’:
.../linux/arch/x86/oprofile/nmi_int.c:784: error: expected expression before ‘do’
make[2]: *** [arch/x86/oprofile/nmi_int.o] Error 1
make[1]: *** [arch/x86/oprofile] Error 2

This patch fixes this.
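The error message suggests a statement-style stub (something like a `do { } while (0)` macro) ended up where an expression was needed. A minimal sketch of that pitfall and one common way around it is below; the `_sketch` names and the HAVE_PM_SKETCH switch are hypothetical, and the actual patch may differ in detail.

```c
/* HAVE_PM_SKETCH is a hypothetical stand-in for CONFIG_PM. */
#ifdef HAVE_PM_SKETCH
int init_sysfs_sketch(void);   /* real implementation lives elsewhere */
#else
/*
 * A stub written as
 *     #define init_sysfs_sketch() do { } while (0)
 * is a statement, so "error = init_sysfs_sketch();" fails to compile
 * with "expected expression before 'do'". A static inline function
 * that returns a value works in both statement and expression context.
 */
static inline int init_sysfs_sketch(void)
{
    return 0;
}
#endif

/* Caller that checks the return value, as the earlier fix started doing. */
static int op_init_sketch(void)
{
    int error = init_sysfs_sketch();

    if (error)
        return error;
    return 0;
}
```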

Reported-by: Ingo Molnar
Cc: stable@kernel.org
Signed-off-by: Robert Richter

Robert Richter
2010-09-02 03:23:01 +0800

31 Aug, 2010

1 commit

10f0412f5 oprofile, x86: fix init_sysfs error handling

On failure, init_sysfs() might not properly free resources. The error
code of the function is not checked. And, when reinitializing, the exit
function might be called twice. This patch fixes all this.

Cc: stable@kernel.org
Signed-off-by: Robert Richter

Robert Richter
2010-08-31 16:26:26 +0800

26 Aug, 2010

1 commit

d4348c678 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf, x86, Pentium4: Clear the P4_CCCR_FORCE_OVF flag
tracing/trace_stack: Fix stack trace on ppc64

Linus Torvalds
2010-08-26 01:50:07 +0800

25 Aug, 2010

3 commits

5e686019d Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, tsc, sched: Recompute cyc2ns_offset's during resume from sleep states
sched: Fix rq->clock synchronization when migrating tasks

Linus Torvalds
2010-08-25 23:40:56 +0800
8d3309199 perf, x86, Pentium4: Clear the P4_CCCR_FORCE_OVF flag

If on Pentium4 CPUs the FORCE_OVF flag is set then an NMI happens
on every event, which can generate a flood of NMIs. Clear it.
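Clearing a single control bit before a value is programmed can be sketched as follows. The bit position used here and the helper name are assumptions for illustration only; the authoritative definition of P4_CCCR_FORCE_OVF is in the kernel's Pentium 4 perf headers.

```c
#include <stdint.h>

/* Assumed bit position, for illustration only. */
#define FORCE_OVF_SKETCH 0x02000000u

/*
 * Hypothetical helper: strip the force-overflow bit from a CCCR value
 * so that an NMI is not raised on every single event.
 */
static uint32_t p4_sanitize_cccr_sketch(uint32_t cccr)
{
    return cccr & ~FORCE_OVF_SKETCH;
}
```

All other bits pass through unchanged, so a caller-supplied configuration keeps its meaning apart from the forced overflow.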

Reported-by: Vince Weaver
Signed-off-by: Lin Ming
Signed-off-by: Cyrill Gorcunov
Cc: Frederic Weisbecker
Cc: Peter Zijlstra
Cc:
Signed-off-by: Ingo Molnar

Lin Ming
2010-08-25 21:15:33 +0800
b7d460897 x86, mm: Fix CONFIG_VMSPLIT_1G and 2G_OPT trampoline

rc2 kernel crashes when booting second cpu on this CONFIG_VMSPLIT_2G_OPT
laptop: whereas cloning from kernel to low mappings pgd range does need
to limit by both KERNEL_PGD_PTRS and KERNEL_PGD_BOUNDARY, cloning kernel
pgd range itself must not be limited by the smaller KERNEL_PGD_BOUNDARY.

Signed-off-by: Hugh Dickins
LKML-Reference:
Signed-off-by: H. Peter Anvin

Hugh Dickins
2010-08-25 14:05:17 +0800

24 Aug, 2010

1 commit

c05e1e23b Merge branch 'for-upstream/pvhvm' of git://xenbits.xensource.com/people/ianc/linux-2.6

* 'for-upstream/pvhvm' of git://xenbits.xensource.com/people/ianc/linux-2.6:
xen: pvhvm: make it clearer that XEN_UNPLUG_* define bits in a bitfield
xen: pvhvm: rename xen_emul_unplug=ignore to =unnnecessary
xen: pvhvm: allow user to request no emulated device unplug

Linus Torvalds
2010-08-24 09:29:18 +0800

23 Aug, 2010

3 commits

1dc7ce99b xen: pvhvm: rename xen_emul_unplug=ignore to =unnnecessary

It is not immediately clear what this option causes to become
ignored. The actual meaning is that it is not necessary to unplug the
emulated devices to safely use the PV ones, even if the platform does
not support the unplug protocol. (Presumably the user will only add
this option if they have ensured that their domain configuration is
safe.)

I think xen_emul_unplug=unnecessary better captures this.

Signed-off-by: Ian Campbell
Acked-by: Jeremy Fitzhardinge
Acked-by: Stefano Stabellini

Ian Campbell
2010-08-23 18:59:29 +0800
c93a4dfb3 xen: pvhvm: allow user to request no emulated device unplug

This allows the user to disable pvhvm and revert to emulated devices
in case of a system misconfiguration (e.g. initramfs with only
emulated drivers in it).

Signed-off-by: Ian Campbell
Acked-by: Jeremy Fitzhardinge
Acked-by: Stefano Stabellini

Ian Campbell
2010-08-23 18:59:28 +0800
3dc8d7f07 Merge branch 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm

* 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: PIT: free irq source id in handling error path
KVM: destroy workqueue on kvm_create_pit() failures
KVM: fix poison overwritten caused by using wrong xstate size

Linus Torvalds
2010-08-23 02:27:36 +0800

22 Aug, 2010

1 commit

ddb0c5a68 Replace Configure with Enable in description of MAXSMP

The "Configure" word tends to make users believe they have to say 'yes'
to be able to choose the number of procs/nodes. "Enable" should be
unambiguous enough.

Signed-off-by: Samuel Thibault
Signed-off-by: Linus Torvalds

Samuel Thibault
2010-08-22 03:38:58 +0800

21 Aug, 2010

2 commits

51e3c1b55 x86, hwmon: Fix unsafe smp_processor_id() in thermal_throttle_add_dev

Fix BUG: using smp_processor_id() in preemptible thermal_throttle_add_dev.
We know the cpu number when calling thermal_throttle_add_dev, so we can
remove smp_processor_id call in thermal_throttle_add_dev by supplying
the cpu number as argument.

This should resolve kernel bugzilla 16615/16629.

Signed-off-by: Sergey Senozhatsky
LKML-Reference:
Cc: Fenghua Yu
Cc: Joerg Roedel
Cc: Maciej Rutecki
Signed-off-by: H. Peter Anvin

Sergey Senozhatsky
2010-08-21 10:56:00 +0800
36423a5ed Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, apic: Fix apic=debug boot crash
x86, hotplug: Serialize CPU hotplug to avoid bringup concurrency issues
x86-32: Fix dummy trampoline-related inline stubs
x86-32: Separate 1:1 pagetables from swapper_pg_dir
x86, cpu: Fix regression in AMD errata checking code

Linus Torvalds
2010-08-21 05:25:08 +0800

20 Aug, 2010

4 commits

cd7240c0b x86, tsc, sched: Recompute cyc2ns_offset's during resume from sleep states

TSCs get reset after suspend/resume (even on CPUs with an invariant TSC,
which runs at a constant rate across ACPI P-, C- and T-states). On
some systems the BIOS also seems to reinitialize the TSC to an arbitrary
large value (still synced across CPUs) during resume.

This can leave the scheduler's rq->clock (sched_clock_cpu()) less
than rq->age_stamp (introduced in 2.6.32). scale_rt_power() then returns
a big value, and the resulting big group power set by
update_group_power() causes improper load balancing between busy and
idle CPUs after suspend/resume.

As a result, multi-threaded workloads (like kernel compilation) ran
slower after a suspend/resume cycle on Core i5 laptops.

Fix this by recomputing cyc2ns_offset's during resume, so that
sched_clock() continues from the point where it was left off during
suspend.
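The offset recomputation can be sketched with a simple fixed-point clock model. Everything below is a hedged illustration: the struct, the helper names and the shift value are hypothetical, and per-CPU handling and preemption control are left out entirely.

```c
#include <stdint.h>

#define CYC2NS_SHIFT_SKETCH 10   /* fixed-point shift; illustrative value */

struct clock_sketch {
    uint64_t scale;       /* ns per cycle, scaled by 2^CYC2NS_SHIFT_SKETCH */
    uint64_t offset;      /* added after scaling; wraps modulo 2^64 */
    uint64_t suspend_ns;  /* clock reading captured before suspend */
};

/* Convert a raw cycle count to nanoseconds, including the offset. */
static uint64_t clock_ns(const struct clock_sketch *c, uint64_t tsc)
{
    return ((tsc * c->scale) >> CYC2NS_SHIFT_SKETCH) + c->offset;
}

/* Before suspend: remember where the clock stood. */
static void clock_save(struct clock_sketch *c, uint64_t tsc)
{
    c->suspend_ns = clock_ns(c, tsc);
}

/*
 * After resume: the TSC may have been reset or set to an arbitrary
 * value. Pick the offset so that the clock continues from the saved
 * reading. Unsigned wraparound makes this work in both directions.
 */
static void clock_restore(struct clock_sketch *c, uint64_t tsc)
{
    c->offset = 0;
    c->offset = c->suspend_ns - clock_ns(c, tsc);
}
```

With this offset the first reading after resume equals the last reading before suspend, so the clock never appears to jump backwards or far ahead, whichever way the TSC moved.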

Reported-by: Florian Pritz
Signed-off-by: Suresh Siddha
Cc: # [v2.6.32+]
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar

Suresh Siddha
2010-08-20 20:59:02 +0800
05e407603 x86, apic: Fix apic=debug boot crash

Fix a boot crash when apic=debug is used and the APIC is
not properly initialized.

This issue appears during Xen Dom0 kernel boot but the
fix is generic and the crash could occur on real hardware
as well.

Signed-off-by: Daniel Kiper
Cc: xen-devel@lists.xensource.com
Cc: konrad.wilk@oracle.com
Cc: jeremy@goop.org
Cc: # .35.x, .34.x, .33.x, .32.x
LKML-Reference:
Signed-off-by: Ingo Molnar

Daniel Kiper
2010-08-20 16:18:28 +0800
d7c53c9e8 x86, hotplug: Serialize CPU hotplug to avoid bringup concurrency issues

When testing cpu hotplug code on 32-bit we kept hitting the "CPU%d:
Stuck ??" message due to multiple cores concurrently accessing the
cpu_callin_mask, among others.

Since these codepaths are not protected from concurrent access due to
the fact that there's no sane reason for making an already complex
code unnecessarily more complex - we hit the issue only when insanely
switching cores off- and online - serialize hotplugging cores on the
sysfs level and be done with it.

[ v2.1: fix !HOTPLUG_CPU build ]

Cc:
Signed-off-by: Borislav Petkov
LKML-Reference:
Signed-off-by: H. Peter Anvin

Borislav Petkov
2010-08-20 05:47:43 +0800
b3ea36b7a Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
kprobes/x86: Fix the return address of multiple kretprobes
perf tools: Fix build error on read only source.
perf, x86: Fix Intel-nhm PMU programming errata workaround

Linus Torvalds
2010-08-20 00:06:49 +0800

19 Aug, 2010

4 commits

737480a0d kprobes/x86: Fix the return address of multiple kretprobes

Fix the return address of subsequent kretprobes when multiple
kretprobes are set on the same function.

For example:

# cd /sys/kernel/debug/tracing
# echo "r:event1 sys_symlink" > kprobe_events
# echo "r:event2 sys_symlink" >> kprobe_events
# echo 1 > events/kprobes/enable
# ln -s /tmp/foo /tmp/bar

(without this patch)

# cat trace
ln-897 [000] 20404.133727: event1: (kretprobe_trampoline+0x0/0x4c
Reviewed-by: Masami Hiramatsu
Cc: Frederic Weisbecker
Cc: Ananth N Mavinakayanahalli
Cc: Peter Zijlstra
Cc: YOSHIFUJI Hideaki
LKML-Reference:
Signed-off-by: Ingo Molnar

KUMANO Syuhei
2010-08-19 18:49:56 +0800
8848a9106 x86-32: Fix dummy trampoline-related inline stubs

Fix dummy inline stubs for trampoline-related functions when no
trampolines exist (until we get rid of the no-trampoline case
entirely.)

Signed-off-by: H. Peter Anvin
Cc: Joerg Roedel
Cc: Borislav Petkov
LKML-Reference:

H. Peter Anvin
2010-08-19 03:42:24 +0800
fd89a1379 x86-32: Separate 1:1 pagetables from swapper_pg_dir

This patch fixes machine crashes which occur when heavily exercising the
CPU hotplug codepaths on a 32-bit kernel. These crashes are caused by
AMD Erratum 383 and result in a fatal machine check exception. Here's
the scenario:

1. On 32-bit, the swapper_pg_dir page table is used as the initial page
table for booting a secondary CPU.

2. To make this work, swapper_pg_dir needs a direct mapping of physical
memory in it (the low mappings). By adding those low, large page (2M)
mappings (PAE kernel), we create the necessary conditions for Erratum
383 to occur.

3. Other CPUs which do not participate in the off- and onlining game may
use swapper_pg_dir while the low mappings are present (when leave_mm is
called). For all steps below, the CPU referred to is a CPU that is using
swapper_pg_dir, and not the CPU which is being onlined.

4. The presence of the low mappings in swapper_pg_dir can result
in TLB entries for addresses below __PAGE_OFFSET being established
speculatively. These TLB entries are marked global and large.

5. When the CPU with such TLB entry switches to another page table, this
TLB entry remains because it is global.

6. The process then generates an access to an address covered by the
above TLB entry but there is a permission mismatch - the TLB entry
covers a large global page not accessible to userspace.

7. Due to this permission mismatch a new 4kb, user TLB entry gets
established. Further, Erratum 383 provides for a small window of time
where both TLB entries are present. This results in an uncorrectable
machine check exception signalling a TLB multimatch which panics the
machine.

There are two ways to fix this issue:

1. Always do a global TLB flush when a new cr3 is loaded and the
old page table was swapper_pg_dir. I consider this a hack that is hard
to understand and has performance implications.

2. Do not use swapper_pg_dir to boot secondary CPUs like 64-bit
does.

This patch implements solution 2. It introduces a trampoline_pg_dir
which has the same layout as swapper_pg_dir with low_mappings. This page
table is used as the initial page table of the booting CPU. Later in the
bringup process, it switches to swapper_pg_dir and does a global TLB
flush. This fixes the crashes in our test cases.

-v2: switch to swapper_pg_dir right after entering start_secondary() so
that we are able to access percpu data which might not be mapped in the
trampoline page table.

Signed-off-by: Joerg Roedel
LKML-Reference:
Signed-off-by: Borislav Petkov
Signed-off-by: H. Peter Anvin

Joerg Roedel
2010-08-19 00:17:20 +0800
07a7795ca x86, cpu: Fix regression in AMD errata checking code

A bug in the family-model-stepping matching code caused the presence of
errata to go undetected when OSVW was not used. This causes hangs on
some K8 systems because the E400 workaround is not enabled.

Signed-off-by: Hans Rosenfeld
LKML-Reference:
Signed-off-by: H. Peter Anvin

Hans Rosenfeld
2010-08-19 00:16:28 +0800

18 Aug, 2010

4 commits

351af0725 perf, x86: Fix Intel-nhm PMU programming errata workaround

Fix the Errata AAK100/AAP53/BD53 workaround. The officially documented
workaround we implemented in:

11164cd: perf, x86: Add Nehelem PMU programming errata workaround

doesn't actually work fully and causes a stuck PMU state
under load and non-functioning perf profiling.

A functional workaround was found by trial & error.

Affects all Nehalem-class Intel PMUs.

Signed-off-by: Zhang Yanmin
Signed-off-by: Peter Zijlstra
LKML-Reference:
Cc: Arjan van de Ven
Cc: "H. Peter Anvin"
Cc: # .35.x
Signed-off-by: Ingo Molnar

Zhang, Yanmin
2010-08-18 17:17:39 +0800
392abeea5 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb

* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
vt,console,kdb: preserve console_blanked while in kdb
vt: fix regression warnings from KMS merge
arm,kgdb: fix GDB_MAX_REGS no longer used
kgdb: add missing __percpu markup in arch/x86/kernel/kgdb.c
kdb: fix compile error without CONFIG_KALLSYMS

Linus Torvalds
2010-08-18 09:36:19 +0800
d7627467b Make do_execve() take a const filename pointer

Make do_execve() take a const filename pointer so that kernel_execve() compiles
correctly on ARM:

arch/arm/kernel/sys_arm.c:88: warning: passing argument 1 of 'do_execve' discards qualifiers from pointer target type

This also requires the argv and envp arguments to be consted twice, once for
the pointer array and once for the strings the array points to. This is
because do_execve() passes a pointer to the filename (now const) to
copy_strings_kernel(). A simpler alternative would be to cast the filename
pointer in do_execve() when it's passed to copy_strings_kernel().

do_execve() may not change any of the strings it is passed as part of the argv
or envp lists, as some of them are in .rodata, so marking these strings as
const should be fine.
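The "consted twice" signature can be sketched as follows. The function names are hypothetical and the body only counts strings; it is meant to show the pointer types, not what do_execve() actually does.

```c
#include <stddef.h>

/* Count the entries of a NULL-terminated string array without
 * modifying either the array or the strings it points to. */
static int count_strings_sketch(const char *const *list)
{
    int n = 0;

    while (list[n] != NULL)
        n++;
    return n;
}

/*
 * Hypothetical exec-style entry point.
 *   const char *filename     : the name itself is read-only
 *   const char *const *argv  : read-only array of read-only strings
 *   const char *const *envp  : likewise
 * Returns -1 for an empty filename, else the total string count.
 */
static int do_execve_sketch(const char *filename,
                            const char *const *argv,
                            const char *const *envp)
{
    if (filename == NULL || *filename == '\0')
        return -1;
    return count_strings_sketch(argv) + count_strings_sketch(envp);
}
```

With both levels const, arrays of string literals that the compiler may place in read-only data can be passed without casts, and the callee cannot write through either level of indirection.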

Further, kernel_execve() and sys_execve() need to be changed to match.

This has been test built on x86_64, frv, arm and mips.

Signed-off-by: David Howells
Tested-by: Ralf Baechle
Acked-by: Russell King
Signed-off-by: Linus Torvalds

David Howells
2010-08-18 09:07:43 +0800
23b90cfd7 x86/PCI: only define pci_domain_nr if PCI and PCI_DOMAINS are set

Otherwise we'll duplicate definitions with the pci.h stubs.

Reported-by: Randy Dunlap
Acked-by: Randy Dunlap
Signed-off-by: Jesse Barnes

Jesse Barnes
2010-08-18 00:29:36 +0800

17 Aug, 2010

1 commit

6b5d7a9f6 KVM: PIT: free irq source id in handling error path

Free the irq source id if creating the PIT workqueue fails.

Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity

Xiao Guangrong
2010-08-17 17:04:23 +0800