Eric Lee / linux-smarc-t335x-v3.2

19 Oct, 2010

1 commit

e360adbe2 irq_work: Add generic hardirq context callbacks

Provide a mechanism that allows running code in IRQ context. It is
most useful for NMI code that needs to interact with the rest of the
system -- such as waking up a task to drain buffers.

Perf currently has such a mechanism, so extract that and provide it as
a generic feature, independent of perf so that others may also
benefit.

The IRQ context callback is generated through self-IPIs where
possible, or on architectures like powerpc the decrementer (the
built-in timer facility) is set to generate an interrupt immediately.

Architectures that don't have anything like this have to make do with a
callback from the timer tick. These architectures can call
irq_work_run() at the tail of any IRQ handlers that might enqueue such
work (like the perf IRQ handler) to avoid undue latencies in
processing the work.
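
The queue-from-NMI, run-from-IRQ idea described above can be sketched in user space roughly as follows. This is a minimal model, not the kernel implementation: struct work_item, work_queue(), work_run(), irq_raised and demo_callback() are hypothetical names, and the irq_raised flag merely stands in for the self-IPI (or decrementer) request.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; these names are not the kernel API. */
struct work_item {
    struct work_item *next;
    atomic_bool pending;               /* set while the item is queued */
    void (*func)(struct work_item *);
};

static _Atomic(struct work_item *) work_list = NULL;
static atomic_int irq_raised = 0;      /* stands in for a self-IPI request */

/* Callable from "NMI" context: no locks, only atomic operations. */
bool work_queue(struct work_item *w)
{
    assert(w && w->func);

    /* Claim the item; refuse to enqueue it a second time. */
    if (atomic_exchange(&w->pending, true))
        return false;

    /* Lock-free push onto the head of the list. */
    struct work_item *head = atomic_load(&work_list);
    do {
        w->next = head;
    } while (!atomic_compare_exchange_weak(&work_list, &head, w));

    atomic_store(&irq_raised, 1);      /* ask for an interrupt "soon" */
    return true;
}

/* Called from "IRQ" context: detach the whole list, then run callbacks. */
int work_run(void)
{
    int ran = 0;
    struct work_item *w = atomic_exchange(&work_list, NULL);

    atomic_store(&irq_raised, 0);
    while (w) {
        struct work_item *next = w->next;

        atomic_store(&w->pending, false);  /* may be re-queued from now on */
        w->func(w);
        ran++;
        w = next;
    }
    return ran;
}

/* Example callback used only for illustration. */
int demo_hits;
void demo_callback(struct work_item *w)
{
    (void)w;
    demo_hits++;
}
```

In this model an architecture without a self-IPI would simply call work_run() from its timer tick, or at the tail of a handler that may have queued work.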

Signed-off-by: Peter Zijlstra
Acked-by: Kyle McMartin
Acked-by: Martin Schwidefsky
[ various fixes ]
Signed-off-by: Huang Ying
LKML-Reference:
Signed-off-by: Ingo Molnar

Peter Zijlstra
2010-10-19 01:58:50 +0800

08 Oct, 2010

1 commit

7cd2541cf Merge commit 'v2.6.36-rc7' into perf/core

Conflicts:
arch/x86/kernel/module.c

Merge reason: Resolve the conflict, pick up fixes.

Signed-off-by: Ingo Molnar

Ingo Molnar
2010-10-08 16:46:27 +0800

28 Sep, 2010

2 commits

050026fea Merge branch 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: Avoid 'constant_test_bit()' misoptimization due to cast to non-volatile

Linus Torvalds
2010-09-28 12:19:27 +0800
6a6aa2b7e Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86/amd-iommu: Fix rounding-bug in __unmap_single
x86/amd-iommu: Work around S3 BIOS bug
x86/amd-iommu: Set iommu configuration flags in enable-loop
x86, setup: Fix earlyprintk=serial,0x3f8,115200
x86, setup: Fix earlyprintk=serial,ttyS0,115200

Linus Torvalds
2010-09-28 03:22:21 +0800

27 Sep, 2010

1 commit

c9e2fbd90 x86: Avoid 'constant_test_bit()' misoptimization due to cast to non-volatile

While debugging a bit_spin_lock() hang, it was tracked down to a gcc-4.4
misoptimization of the non-inlined constant_test_bit(): addr becomes
non-volatile when 'const volatile unsigned long *addr' is cast to
'unsigned long *', and the resulting unconditional jump to pause (and not
to the test) leads to the hang.

Compiling with gcc-4.3 or disabling CONFIG_OPTIMIZE_INLINING yields inlined
constant_test_bit() and correct jump, thus working around the kernel bug.

Arches other than x86 may implement this slightly differently;
2.6.29 mitigates the misoptimization by changing the function prototype
(commit c4295fbb6048d85f0b41c5ced5cbf63f6811c46c) but probably fixing the issue
itself is better.
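
A minimal sketch of a bit test that never casts the qualifier away, so every evaluation has to re-read memory. model_test_bit() and MODEL_BITS_PER_LONG are illustrative names, not the kernel's definitions.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's BITS_PER_LONG. */
#define MODEL_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * addr keeps its 'const volatile' qualifier all the way to the load, so
 * the compiler may not cache the word across calls in a spin loop.
 */
static inline int model_test_bit(unsigned int nr,
                                 const volatile unsigned long *addr)
{
    assert(addr != NULL);
    return ((1UL << (nr % MODEL_BITS_PER_LONG)) &
            addr[nr / MODEL_BITS_PER_LONG]) != 0;
}
```

A busy-wait loop built on such a helper reloads the word on each iteration instead of jumping back to the pause alone.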

Signed-off-by: Alexander Chumachenko
Signed-off-by: Michael Shigorin
Acked-by: Linus Torvalds
Signed-off-by: H. Peter Anvin

Alexander Chumachenko
2010-09-27 13:43:07 +0800

25 Sep, 2010

1 commit

a46590533 x86/hwmon: fix initialization of coretemp

Using cpuid_eax() to determine feature availability on other than
the current CPU is invalid. And feature availability should also be
checked in the hotplug code path.

Signed-off-by: Jan Beulich
Cc: Rudolf Marek
Cc: Fenghua Yu
Signed-off-by: Guenter Roeck

Jan Beulich
2010-09-25 02:44:19 +0800

24 Sep, 2010

2 commits

7329cf020 Merge branch 'amd-iommu/2.6.36' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/linux-2.6-iommu into x86/urgent

Ingo Molnar
2010-09-24 17:19:53 +0800
a5a2bad55 Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core

Ingo Molnar
2010-09-24 15:12:05 +0800

23 Sep, 2010

6 commits

4c894f47b x86/amd-iommu: Work around S3 BIOS bug

This patch adds a workaround for an IOMMU BIOS problem to
the AMD IOMMU driver. The result of the bug is that the
IOMMU does not execute commands anymore when the system
comes out of the S3 state resulting in system failure. The
bug in the BIOS is that it does not restore certain
hardware-specific registers correctly. This workaround reads out the
contents of these registers at boot time and restores them
on resume from S3. The workaround is limited to the specific
IOMMU chipset where this problem occurs.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel

Joerg Roedel
2010-09-23 22:26:03 +0800
e9bf51971 x86/amd-iommu: Set iommu configuration flags in enable-loop

This patch moves the setting of the configuration and
feature flags out of the ACPI table parsing path and moves
it into the iommu-enable path. This is needed to reliably
fix resume-from-s3.

Cc: stable@kernel.org
Signed-off-by: Joerg Roedel

Joerg Roedel
2010-09-23 22:24:50 +0800
95fccd465 jump label: Remove duplicate structure for x86

The structure in the x86 jump label code uses the typedef jump_label_t,
which is defined by the #ifdef arch type. The structure does not need
to be duplicated there.

Signed-off-by: Steven Rostedt

Steven Rostedt
2010-09-23 05:37:43 +0800
d9f5ab7b1 jump label: x86 support

Add x86 support for jump label. I'm keeping this patch separate so it's clear
to arch maintainers what was required for x86 to support this new feature.
Hopefully, it won't be too painful for other archs.

Signed-off-by: Jason Baron
LKML-Reference:

[ cleaned up some formatting ]

Signed-off-by: Steven Rostedt

Jason Baron
2010-09-23 04:33:03 +0800
bf5438fca jump label: Base patch for jump label

Base patch to implement 'jump labeling'. Based on a new 'asm goto' inline
assembly gcc mechanism, we can now branch to labels from an 'asm goto'
statement. This allows us to create a 'no-op' fastpath, which can subsequently
be patched with a jump to the slowpath code. This is useful for code which
might be rarely used, but which we'd like to be able to call, if needed.
Tracepoints are the current use case that these are being implemented for.

Acked-by: David S. Miller
Signed-off-by: Jason Baron
LKML-Reference:

[ cleaned up some formatting ]

Signed-off-by: Steven Rostedt

Jason Baron
2010-09-23 04:29:41 +0800
90edf27fb Merge branch 'linus' into perf/core

Conflicts:
kernel/hw_breakpoint.c

Merge reason: resolve the conflict.

Signed-off-by: Ingo Molnar

Ingo Molnar
2010-09-23 00:45:01 +0800

22 Sep, 2010

1 commit

87ac6fa26 Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
hw breakpoints: Fix pid namespace bug
x86: Fix instruction breakpoint encoding
oprofile: Add Support for Intel CPU Family 6 / Model 22 (Intel Celeron 540)
kprobes: Fix Kconfig dependency

Linus Torvalds
2010-09-22 04:21:42 +0800

21 Sep, 2010

3 commits

7ed569206 Merge commit 'v2.6.36-rc5' into perf/core

Merge reason: Pick up the latest fixes in -rc5.

Signed-off-by: Ingo Molnar

Ingo Molnar
2010-09-21 19:55:11 +0800
fa6f2cc77 jump label: Make text_poke_early() globally visible

Make text_poke_early available outside of alternative.c. The jump label
patchset wants to make use of it in order to set up the optimal no-op
sequences at run-time.

Signed-off-by: Jason Baron
LKML-Reference:
Signed-off-by: Steven Rostedt

Jason Baron
2010-09-21 06:19:51 +0800
f49aa4485 jump label: Make dynamic no-op selection available outside of ftrace

Move Steve's code for finding the best 5-byte no-op from ftrace.c to
alternative.c. The idea is that other consumers (in this case jump label)
want to make use of that code.

Signed-off-by: Jason Baron
LKML-Reference:
Signed-off-by: Steven Rostedt

Jason Baron
2010-09-21 06:19:39 +0800

17 Sep, 2010

2 commits

a5b617368 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86: hpet: Work around hardware stupidity
x86, build: Disable -fPIE when compiling with CONFIG_CC_STACKPROTECTOR=y
x86, cpufeature: Suppress compiler warning with gcc 3.x
x86, UV: Fix initialization of max_pnode

Linus Torvalds
2010-09-17 10:38:08 +0800
89e45aac4 x86: Fix instruction breakpoint encoding

Lengths and types of breakpoints are encoded in a half byte
into CPU registers. However when we extract these values
and store them, we add a high half byte part to them: 0x40 to the
length and 0x80 to the type.
When that gets reloaded to the CPU registers, the high part
is masked.

While making the instruction breakpoints available for perf,
I zapped that high part on instruction breakpoint encoding
and that broke the arch -> generic translation used by ptrace
instruction breakpoints. Writing dr7 to set an inst breakpoint
was then failing.

There is no apparent reason for these high parts so we could get
rid of them altogether. That's an invasive change though so let's
do that later and for now fix the problem by restoring that inst
breakpoint high part encoding in this sole patch.
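
The encoding described above can be sketched as follows, assuming only what the message states: the CPU register holds a half byte, and the stored form carries an extra high half byte (0x40 for lengths, 0x80 for types) that is masked off on reload. All MODEL_* and model_* names are hypothetical.

```c
#include <assert.h>

#define MODEL_LEN_HIGH  0x40u   /* high half byte added to stored lengths */
#define MODEL_TYPE_HIGH 0x80u   /* high half byte added to stored types */
#define MODEL_LOW_MASK  0x0fu   /* the half byte the CPU register holds */

/* Hardware length bits -> stored (arch) representation. */
unsigned int model_store_len(unsigned int hw_len)
{
    assert((hw_len & ~MODEL_LOW_MASK) == 0);
    return hw_len | MODEL_LEN_HIGH;
}

/* Hardware type bits -> stored (arch) representation. */
unsigned int model_store_type(unsigned int hw_type)
{
    assert((hw_type & ~MODEL_LOW_MASK) == 0);
    return hw_type | MODEL_TYPE_HIGH;
}

/* Stored representation -> bits to load back into the CPU register. */
unsigned int model_to_hw(unsigned int stored)
{
    return stored & MODEL_LOW_MASK;
}
```

Dropping the high part on one path only means a stored value no longer compares equal to the constant the other path expects, which is the kind of mismatch this fix restores.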

Reported-by: Kelvie Wong
Signed-off-by: Frederic Weisbecker
Cc: Prasad
Cc: Mahesh Salgaonkar
Cc: Will Deacon

Frederic Weisbecker
2010-09-17 09:24:13 +0800

15 Sep, 2010

3 commits

3aabae7d9 Merge branch 'tip/perf/core' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace into perf/core

Ingo Molnar
2010-09-15 16:27:31 +0800
c41d68a51 compat: Make compat_alloc_user_space() incorporate the access_ok()

compat_alloc_user_space() expects the caller to independently call
access_ok() to verify the returned area. A missing call could
introduce problems on some architectures.

This patch incorporates the access_ok() check into
compat_alloc_user_space() and also adds a sanity check on the length.
The existing compat_alloc_user_space() implementations are renamed
arch_compat_alloc_user_space() and are used as part of the
implementation of the new global function.

This patch assumes NULL will cause __get_user()/__put_user() to either
fail or access userspace on all architectures. This should be
followed by checking the return value of compat_alloc_user_space()
for NULL in the callers, at which time the access_ok() in the callers
can also be removed.
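
The shape of the new wrapper can be modeled in user space as below. This is a sketch under stated assumptions: addresses are plain 32-bit integers, 0 plays the role of NULL, and model_arch_alloc(), model_access_ok(), MODEL_USER_LIMIT and model_user_sp are hypothetical stand-ins for the per-architecture pieces.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_USER_LIMIT 0xC0000000u          /* hypothetical top of user space */
static uint32_t model_user_sp = 0xBFFF0000u;  /* hypothetical user stack pointer */

/* Stand-in for the renamed per-arch helper: carve space below the stack. */
static uint32_t model_arch_alloc(uint32_t len)
{
    return model_user_sp - len;               /* may wrap; checked by caller */
}

/* Stand-in for access_ok(): is [addr, addr + len) inside user space? */
static int model_access_ok(uint32_t addr, uint32_t len)
{
    return addr < MODEL_USER_LIMIT && len <= MODEL_USER_LIMIT - addr;
}

/* Global entry point: length sanity check, arch allocation, range check. */
uint32_t model_compat_alloc(uint32_t len)
{
    uint32_t ptr;

    /* Reject lengths above half of the 32-bit compat address space. */
    if (len > (UINT32_MAX >> 1))
        return 0;

    ptr = model_arch_alloc(len);
    if (!model_access_ok(ptr, len))
        return 0;                             /* 0 models a NULL return */

    assert((uint64_t)ptr + len <= MODEL_USER_LIMIT);
    return ptr;
}
```

Callers in this model only need to test the result against 0, which mirrors the follow-up the message asks for.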

Reported-by: Ben Hawkes
Signed-off-by: H. Peter Anvin
Acked-by: Benjamin Herrenschmidt
Acked-by: Chris Metcalf
Acked-by: David S. Miller
Acked-by: Ingo Molnar
Acked-by: Thomas Gleixner
Acked-by: Tony Luck
Cc: Andrew Morton
Cc: Arnd Bergmann
Cc: Fenghua Yu
Cc: H. Peter Anvin
Cc: Heiko Carstens
Cc: Helge Deller
Cc: James Bottomley
Cc: Kyle McMartin
Cc: Martin Schwidefsky
Cc: Paul Mackerras
Cc: Ralf Baechle
Cc:

H. Peter Anvin
2010-09-15 07:08:45 +0800
54ff7e595 x86: hpet: Work around hardware stupidity

This more or less reverts commits 08be979 (x86: Force HPET
readback_cmp for all ATI chipsets) and 30a564be (x86, hpet: Restrict
read back to affected ATI chipsets) to the status of commit 8da854c
(x86, hpet: Erratum workaround for read after write of HPET
comparator).

The delta to commit 8da854c is mostly comments and the change from
WARN_ONCE to printk_once as we know the call path of this function
already.

This really needs an in-depth explanation:

First of all the HPET design is a complete failure. Having a counter
compare register which generates an interrupt on matching values
forces the software to do at least one superfluous readback of the
counter register.

While it is nice in theory to program "absolute" time events it is
practically useless because the timer runs at some absurd frequency
which can never be matched to real world units. So we are forced to
calculate a relative delta and this forces a readout of the actual
counter value, adding the delta and programming the compare
register. When the delta is small enough we run into the danger that
we program a compare value which is already in the past. Due to the
compare for equal nature of HPET we need to read back the counter
value after writing the compare register (btw. this is necessary for
absolute timeouts as well) to make sure that we did not miss the timer
event. We try to work around that by setting the minimum delta to a
value which is larger than the theoretical time which elapses between
the counter readout and the compare register write, but that's only
true in theory. An NMI or SMI which hits between the readout and the
write can easily push us beyond that limit. This would result in
waiting for the next HPET timer interrupt until the 32bit wraparound
of the counter happens which takes about 306 seconds.

So we designed the next event function to look like:

match = read_cnt() + delta;
write_compare_ref(match);
return read_cnt() < match ? 0 : -ETIME;
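
For illustration, the three lines above can be turned into a small user-space model. Assumptions are labeled in the comments: the counter is simulated and advances on every read, write_compare() only records the value, and the final comparison uses a signed 32-bit difference instead of the plain '<' so that it stays valid across the counter wrap. The model_* names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#ifndef ETIME
#define ETIME 62                      /* fallback; the value Linux uses */
#endif

static uint32_t model_counter;        /* simulated main counter */
static uint32_t model_compare;        /* simulated compare register */
static uint32_t model_read_cost = 1;  /* ticks that elapse per counter read */

static uint32_t read_cnt(void)
{
    model_counter += model_read_cost;
    return model_counter;
}

static void write_compare(uint32_t value)
{
    model_compare = value;
}

/* Returns 0 if the event is still in the future, -ETIME if we missed it. */
int model_next_event(uint32_t delta)
{
    uint32_t match;

    assert(delta < 0x80000000u);      /* required for the signed comparison */

    match = read_cnt() + delta;
    write_compare(match);

    /* Assumption: wrap-safe signed difference replaces 'read_cnt() < match'. */
    return (int32_t)(read_cnt() - match) < 0 ? 0 : -ETIME;
}
```

With a delta that is too small, the second read already reaches the match value and the function reports -ETIME rather than waiting for a wraparound.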

At some point we got into trouble with certain ATI chipsets. Even the
above "safe" procedure failed. The reason was that the write to the
compare register was delayed probably for performance reasons. The
theory was that they wanted to avoid the synchronization of the write
with the HPET clock, which is understandable. So the write does not
hit the compare register directly; instead it goes to some intermediate
register which is copied to the real compare register in sync with the
HPET clock. That opens another window for hitting the dreaded "wait
for a wraparound" problem.

To work around that "optimization" we added a read back of the compare
register which either enforced the update of the just written value or
just delayed the readout of the counter enough to avoid the issue. We
unfortunately never got any affirmative info from ATI/AMD about this.

One thing is sure, that we nuked the performance "optimization" that
way completely and I'm pretty sure that the result is worse than
before some HW folks came up with those.

Just for paranoia reasons I added a check whether the read back
compare register value was the same as the value we wrote right
before. That paranoia check triggered a couple of years after it was
added on an Intel ICH9 chipset. Venki added a workaround (commit
8da854c) which was reading the compare register twice when the first
check failed. We considered this to be a penalty in general and
restricted the readback (thus the wasted CPU cycles) to the known to
be affected ATI chipsets.

This turned out to be an utterly wrong decision. 2.6.35 testers
experienced massive problems and finally one of them bisected it down
to commit 30a564be which spurred some further investigation.

Finally we got confirmation that the write to the compare register can
be delayed by up to two HPET clock cycles which explains the problems
nicely. All we can do about this is to go back to Venki's initial
workaround in a slightly modified version.

Just for the record I need to say, that all of this could have been
avoided if hardware designers and of course the HPET committee would
have thought about the consequences for a split second. It's out of my
comprehension why designing a working timer is so hard. There are two
ways to achieve it:

1) Use a counter wrap around aware compare_reg
Reported-by: Artur Skawina
Reported-by: Damien Wyart
Reported-by: John Drescher
Cc: Venkatesh Pallipadi
Cc: Ingo Molnar
Cc: H. Peter Anvin
Cc: Arjan van de Ven
Cc: Andreas Herrmann
Cc: Borislav Petkov
Cc: stable@kernel.org
Acked-by: Suresh Siddha
Signed-off-by: Thomas Gleixner

Thomas Gleixner
2010-09-15 06:55:13 +0800

14 Sep, 2010

1 commit

2fd818642 x86, cpufeature: Suppress compiler warning with gcc 3.x

Gcc 3.x generates a warning

arch/x86/include/asm/cpufeature.h: In function `__static_cpu_has':
arch/x86/include/asm/cpufeature.h:326: warning: asm operand 1 probably doesn't match constraints

on each file.
But static_cpu_has() for gcc 3.x does not need __static_cpu_has().

Signed-off-by: Tetsuo Handa
LKML-Reference:
Signed-off-by: H. Peter Anvin

Tetsuo Handa
2010-09-14 05:48:41 +0800

10 Sep, 2010

1 commit

be6200aac Merge branch 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm

* 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Perform hardware_enable in CPU_STARTING callback
KVM: i8259: fix migration
KVM: fix i8259 oops when no vcpus are online
KVM: x86 emulator: fix regression with cmpxchg8b on i386 hosts

Linus Torvalds
2010-09-10 23:02:45 +0800

09 Sep, 2010

2 commits

1faa6ec8c Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, mcheck: Avoid duplicate sysfs links/files for thresholding banks
io-mapping: Fix the address space annotations
x86: Fix the address space annotations of iomap_atomic_prot_pfn()
x86, mm: Fix CONFIG_VMSPLIT_1G and 2G_OPT trampoline
x86, hwmon: Fix unsafe smp_processor_id() in thermal_throttle_add_dev

Linus Torvalds
2010-09-09 02:14:10 +0800
16518d5ad KVM: x86 emulator: fix regression with cmpxchg8b on i386 hosts

operand::val and operand::orig_val are 32-bit on i386, whereas cmpxchg8b
operands are 64-bit.

Fix by adding val64 and orig_val64 union members to struct operand, and
using them where needed.
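
A rough sketch of the idea, assuming C11 anonymous unions: the 64-bit members overlay the native-width ones, so an 8-byte compare-and-exchange works on the full value even where unsigned long is 32 bits. struct model_operand, struct model_regs and model_cmpxchg8b() are hypothetical and are not the emulator's actual definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical operand: 64-bit members overlay the native-width ones. */
struct model_operand {
    union {
        unsigned long val;
        uint64_t val64;
    };
    union {
        unsigned long orig_val;
        uint64_t orig_val64;
    };
};

/* Hypothetical register file holding only what cmpxchg8b touches. */
struct model_regs {
    uint32_t eax, ebx, ecx, edx;
    int zf;
};

/*
 * Compare EDX:EAX with the original 64-bit operand. On a match, store
 * ECX:EBX and set ZF; otherwise load the operand into EDX:EAX and clear ZF.
 */
void model_cmpxchg8b(struct model_operand *dst, struct model_regs *r)
{
    uint64_t old;

    assert(dst && r);
    old = dst->orig_val64;

    if ((uint32_t)old != r->eax || (uint32_t)(old >> 32) != r->edx) {
        r->eax = (uint32_t)old;
        r->edx = (uint32_t)(old >> 32);
        r->zf = 0;
    } else {
        dst->val64 = ((uint64_t)r->ecx << 32) | r->ebx;
        r->zf = 1;
    }
}
```

Doing the same through val and orig_val alone would silently drop the upper half on a 32-bit host.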

Signed-off-by: Avi Kivity
Signed-off-by: Marcelo Tosatti

Avi Kivity
2010-09-09 01:50:55 +0800

08 Sep, 2010

1 commit

d56557af1 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: bus speed strings should be const
PCI hotplug: Fix build with CONFIG_ACPI unset
PCI: PCIe: Remove the port driver module exit routine
PCI: PCIe: Move PCIe PME code to the pcie directory
PCI: PCIe: Disable PCIe port services during port initialization
PCI: PCIe: Ask BIOS for control of all native services at once
ACPI/PCI: Negotiate _OSC control bits before requesting them
ACPI/PCI: Do not preserve _OSC control bits returned by a query
ACPI/PCI: Make acpi_pci_query_osc() return control bits
ACPI/PCI: Reorder checks in acpi_pci_osc_control_set()
PCI: PCIe: Introduce commad line switch for disabling port services
PCI: PCIe AER: Introduce pci_aer_available()
x86/PCI: only define pci_domain_nr if PCI and PCI_DOMAINS are set
PCI: provide stub pci_domain_nr function for !CONFIG_PCI configs

Linus Torvalds
2010-09-08 07:00:17 +0800

05 Sep, 2010

1 commit

cc1a8e523 x86: Fix the address space annotations of iomap_atomic_prot_pfn()

This patch fixes the sparse warnings when the return pointer of
iomap_atomic_prot_pfn() is used as an argument of iowrite32()
and friends.

Signed-off-by: Francisco Jerez
LKML-Reference:
Cc: Andrew Morton
Signed-off-by: Ingo Molnar

Francisco Jerez
2010-09-05 20:26:14 +0800

01 Sep, 2010

1 commit

c9cf4a019 perf, x86, Pentium4: Add RAW events verification

Implements verification of

- Bits of ESCR EventMask field (meaningful bits in field are hardware
predefined and other bits should be set to zero)

- INSTR_COMPLETED event (it is available on predefined cpu model only)

- Thread shared events (they should be guarded by the "perf_event_paranoid"
sysctl for security reasons). The side effect of this action is
that PERF_COUNT_HW_BUS_CYCLES becomes a "paranoid" general event.

Signed-off-by: Cyrill Gorcunov
Tested-by: Lin Ming
Cc: Frederic Weisbecker
Cc: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar

Cyrill Gorcunov
2010-09-01 14:26:56 +0800

25 Aug, 2010

1 commit

5e686019d Merge branch 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, tsc, sched: Recompute cyc2ns_offset's during resume from sleep states
sched: Fix rq->clock synchronization when migrating tasks

Linus Torvalds
2010-08-25 23:40:56 +0800

21 Aug, 2010

1 commit

36423a5ed Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, apic: Fix apic=debug boot crash
x86, hotplug: Serialize CPU hotplug to avoid bringup concurrency issues
x86-32: Fix dummy trampoline-related inline stubs
x86-32: Separate 1:1 pagetables from swapper_pg_dir
x86, cpu: Fix regression in AMD errata checking code

Linus Torvalds
2010-08-21 05:25:08 +0800

20 Aug, 2010

1 commit

cd7240c0b x86, tsc, sched: Recompute cyc2ns_offset's during resume from sleep states

TSCs get reset after suspend/resume (even on CPUs with an invariant TSC,
which runs at a constant rate across ACPI P-, C- and T-states). And on
some systems the BIOS seems to reinit the TSC to an arbitrary large value
(still sync'd across CPUs) during resume.

This leads to a scenario of scheduler rq->clock (sched_clock_cpu()) less
than rq->age_stamp (introduced in 2.6.32). This leads to a big value
returned by scale_rt_power() and the resulting big group power set by the
update_group_power() is causing improper load balancing between busy and
idle cpu's after suspend/resume.

This resulted in multi-threaded workloads (like kernel compilation) running
slower after a suspend/resume cycle on Core i5 laptops.

Fix this by recomputing cyc2ns_offset's during resume, so that
sched_clock() continues from the point where it was left off during
suspend.
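
The offset recomputation can be illustrated with a small fixed-point model. It is a sketch only: struct model_clock, MODEL_SCALE_SHIFT, model_sched_clock() and model_resume() are hypothetical names, and the offset is kept as an unsigned 64-bit value that is allowed to wrap.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_SCALE_SHIFT 10   /* hypothetical fixed-point shift */

struct model_clock {
    uint64_t scale;    /* ns per cycle, multiplied by 2^MODEL_SCALE_SHIFT */
    uint64_t offset;   /* ns added after scaling; wraps modulo 2^64 */
};

/* Convert a TSC reading to nanoseconds: scaled cycles plus the offset. */
uint64_t model_sched_clock(const struct model_clock *c, uint64_t tsc)
{
    assert(c);
    return ((tsc * c->scale) >> MODEL_SCALE_SHIFT) + c->offset;
}

/*
 * On resume, pick the offset that makes the clock read ns_at_suspend again
 * for the TSC value seen now, whatever the BIOS reset the TSC to.
 */
void model_resume(struct model_clock *c, uint64_t ns_at_suspend,
                  uint64_t tsc_now)
{
    assert(c);
    c->offset = 0;
    c->offset = ns_at_suspend - model_sched_clock(c, tsc_now);
}
```

Whether the TSC restarts near zero or at some large value, the clock in this model continues from the saved timestamp, so it never appears to run backwards relative to values recorded before suspend.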

Reported-by: Florian Pritz
Signed-off-by: Suresh Siddha
Cc: # [v2.6.32+]
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar

Suresh Siddha
2010-08-20 20:59:02 +0800

19 Aug, 2010

2 commits

8848a9106 x86-32: Fix dummy trampoline-related inline stubs

Fix dummy inline stubs for trampoline-related functions when no
trampolines exist (until we get rid of the no-trampoline case
entirely.)

Signed-off-by: H. Peter Anvin
Cc: Joerg Roedel
Cc: Borislav Petkov
LKML-Reference:

H. Peter Anvin
2010-08-19 03:42:24 +0800
fd89a1379 x86-32: Separate 1:1 pagetables from swapper_pg_dir

This patch fixes machine crashes which occur when heavily exercising the
CPU hotplug codepaths on a 32-bit kernel. These crashes are caused by
AMD Erratum 383 and result in a fatal machine check exception. Here's
the scenario:

1. On 32-bit, the swapper_pg_dir page table is used as the initial page
table for booting a secondary CPU.

2. To make this work, swapper_pg_dir needs a direct mapping of physical
memory in it (the low mappings). By adding those low, large page (2M)
mappings (PAE kernel), we create the necessary conditions for Erratum
383 to occur.

3. Other CPUs which do not participate in the off- and onlining game may
use swapper_pg_dir while the low mappings are present (when leave_mm is
called). For all steps below, the CPU referred to is a CPU that is using
swapper_pg_dir, and not the CPU which is being onlined.

4. The presence of the low mappings in swapper_pg_dir can result
in TLB entries for addresses below __PAGE_OFFSET to be established
speculatively. These TLB entries are marked global and large.

5. When the CPU with such TLB entry switches to another page table, this
TLB entry remains because it is global.

6. The process then generates an access to an address covered by the
above TLB entry but there is a permission mismatch - the TLB entry
covers a large global page not accessible to userspace.

7. Due to this permission mismatch a new 4kb, user TLB entry gets
established. Further, Erratum 383 provides for a small window of time
where both TLB entries are present. This results in an uncorrectable
machine check exception signalling a TLB multimatch which panics the
machine.

There are two ways to fix this issue:

1. Always do a global TLB flush when a new cr3 is loaded and the
old page table was swapper_pg_dir. I consider this a hack hard
to understand and with performance implications.

2. Do not use swapper_pg_dir to boot secondary CPUs like 64-bit
does.

This patch implements solution 2. It introduces a trampoline_pg_dir
which has the same layout as swapper_pg_dir with low_mappings. This page
table is used as the initial page table of the booting CPU. Later in the
bringup process, it switches to swapper_pg_dir and does a global TLB
flush. This fixes the crashes in our test cases.

-v2: switch to swapper_pg_dir right after entering start_secondary() so
that we are able to access percpu data which might not be mapped in the
trampoline page table.

Signed-off-by: Joerg Roedel
LKML-Reference:
Signed-off-by: Borislav Petkov
Signed-off-by: H. Peter Anvin

Joerg Roedel
2010-08-19 00:17:20 +0800

18 Aug, 2010

2 commits

d7627467b Make do_execve() take a const filename pointer

Make do_execve() take a const filename pointer so that kernel_execve() compiles
correctly on ARM:

arch/arm/kernel/sys_arm.c:88: warning: passing argument 1 of 'do_execve' discards qualifiers from pointer target type

This also requires the argv and envp arguments to be consted twice, once for
the pointer array and once for the strings the array points to. This is
because do_execve() passes a pointer to the filename (now const) to
copy_strings_kernel(). A simpler alternative would be to cast the filename
pointer in do_execve() when it's passed to copy_strings_kernel().

do_execve() may not change any of the strings it is passed as part of the argv
or envp lists as some of them are in .rodata, so marking these strings as
const should be fine.
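
What "consted twice" looks like in a signature, as a minimal sketch. model_count_strings() is a hypothetical helper, not a kernel function; it only shows a parameter through which neither the pointer array nor the strings can be modified.

```c
#include <assert.h>
#include <stddef.h>

/*
 * 'const char *const *vec': the first const protects the characters of
 * each string, the second protects the slots of the pointer array.
 */
size_t model_count_strings(const char *const *vec)
{
    size_t n = 0;

    assert(vec);
    while (vec[n])          /* the list is NULL-terminated, like argv/envp */
        n++;
    return n;
}
```

An array of string literals declared as 'const char *const args[]' can be passed to such a function without any cast.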

Further kernel_execve() and sys_execve() need to be changed to match.

This has been test built on x86_64, frv, arm and mips.

Signed-off-by: David Howells
Tested-by: Ralf Baechle
Acked-by: Russell King
Signed-off-by: Linus Torvalds

David Howells
2010-08-18 09:07:43 +0800
23b90cfd7 x86/PCI: only define pci_domain_nr if PCI and PCI_DOMAINS are set

Otherwise we'll duplicate definitions with the pci.h stubs.

Reported-by: Randy Dunlap
Acked-by: Randy Dunlap
Signed-off-by: Jesse Barnes

Jesse Barnes
2010-08-18 00:29:36 +0800

15 Aug, 2010

1 commit

bf56fba67 archs: replace unifdef-y with header-y

unifdef-y and header-y have the same semantics, so drop unifdef-y.

Signed-off-by: Sam Ravnborg

Sam Ravnborg
2010-08-15 04:26:51 +0800

14 Aug, 2010

2 commits

c206d44ff Merge branch 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

* 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, UV: Make kdump avoid stack dumps - fix !CONFIG_KEXEC breakage
x86, UV: Initialize BAU hub map
x86, UV: Make kdump avoid stack dumps

Linus Torvalds
2010-08-14 09:00:25 +0800
c78873252 Mark arguments to certain syscalls as being const

Mark arguments to certain system calls as being const where they should be but
aren't. The list includes:

(*) The filename arguments of various stat syscalls, execve(), various utimes
syscalls and some mount syscalls.

(*) The filename arguments of some syscall helpers relating to the above.

(*) The buffer argument of various write syscalls.

Signed-off-by: David Howells
Acked-by: David S. Miller
Signed-off-by: Linus Torvalds

David Howells
2010-08-14 07:53:13 +0800