11 Sep, 2010
1 commit
-
A real life genuine preemption leak..
Reported-and-tested-by: Jeff Chua
Signed-off-by: Peter Zijlstra
Acked-by: Suresh Siddha
Signed-off-by: Linus Torvalds
10 Sep, 2010
1 commit
-
* 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: Perform hardware_enable in CPU_STARTING callback
KVM: i8259: fix migration
KVM: fix i8259 oops when no vcpus are online
KVM: x86 emulator: fix regression with cmpxchg8b on i386 hosts
09 Sep, 2010
5 commits
-
…git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, mcheck: Avoid duplicate sysfs links/files for thresholding banks
io-mapping: Fix the address space annotations
x86: Fix the address space annotations of iomap_atomic_prot_pfn()
x86, mm: Fix CONFIG_VMSPLIT_1G and 2G_OPT trampoline
x86, hwmon: Fix unsafe smp_processor_id() in thermal_throttle_add_dev -
…/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf, x86: Try to handle unknown nmis with an enabled PMU
perf, x86: Fix handle_irq return values
perf, x86: Fix accidentally ack'ing a second event on intel perf counter
oprofile, x86: fix init_sysfs() function stub
lockup_detector: Sync touch_*_watchdog back to old semantics
tracing: Fix a race in function profile
oprofile, x86: fix init_sysfs error handling
perf_events: Fix time tracking for events with pid != -1 and cpu != -1
perf: Initialize callchains roots's childen hits
oprofile: fix crash when accessing freed task structs -
Top of kvm_kpic_state structure should have the same memory layout as
kvm_pic_state since it is copied by memcpy.Signed-off-by: Gleb Natapov
Signed-off-by: Avi Kivity -
If there are no vcpus, found will be NULL. Check before doing anything with
it.Signed-off-by: Avi Kivity
-
operand::val and operand::orig_val are 32-bit on i386, whereas cmpxchg8b
operands are 64-bit.Fix by adding val64 and orig_val64 union members to struct operand, and
using them where needed.Signed-off-by: Avi Kivity
Signed-off-by: Marcelo Tosatti
08 Sep, 2010
1 commit
-
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
PCI: bus speed strings should be const
PCI hotplug: Fix build with CONFIG_ACPI unset
PCI: PCIe: Remove the port driver module exit routine
PCI: PCIe: Move PCIe PME code to the pcie directory
PCI: PCIe: Disable PCIe port services during port initialization
PCI: PCIe: Ask BIOS for control of all native services at once
ACPI/PCI: Negotiate _OSC control bits before requesting them
ACPI/PCI: Do not preserve _OSC control bits returned by a query
ACPI/PCI: Make acpi_pci_query_osc() return control bits
ACPI/PCI: Reorder checks in acpi_pci_osc_control_set()
PCI: PCIe: Introduce commad line switch for disabling port services
PCI: PCIe AER: Introduce pci_aer_available()
x86/PCI: only define pci_domain_nr if PCI and PCI_DOMAINS are set
PCI: provide stub pci_domain_nr function for !CONFIG_PCI configs
05 Sep, 2010
2 commits
-
kobject_add_internal failed for threshold_bank2 with -EEXIST,
don't try to register things with the same name in the same
directory:Pid: 1, comm: swapper Tainted: G W 2.6.31 #1
Call Trace:
[] ? kobject_add_internal+0x156/0x180
[] ? kobject_add+0x66/0x6b
[] ? kobject_init+0x42/0x82
[] ? kobject_create_and_add+0x34/0x63
[] ? threshold_create_bank+0x14f/0x259
[] ? mce_create_device+0x8d/0x1b8
[] ? threshold_init_device+0x3f/0x80
[] ? threshold_init_device+0x0/0x80
[] ? do_one_initcall+0x4f/0x143
[] ? kernel_init+0x14c/0x1a2
[] ? child_rip+0xa/0x20
[] ? kernel_init+0x0/0x1a2
[] ? child_rip+0x0/0x20
kobject_create_and_add: kobject_add error: -17(Probably the for_each_cpu loop should be entirely removed.)
Signed-off-by: Andreas Herrmann
LKML-Reference:
Signed-off-by: Ingo Molnar -
This patch fixes the sparse warnings when the return pointer of
iomap_atomic_prot_pfn() is used as an argument of iowrite32()
and friends.Signed-off-by: Francisco Jerez
LKML-Reference:
Cc: Andrew Morton
Signed-off-by: Ingo Molnar
03 Sep, 2010
3 commits
-
When the PMU is enabled it is valid to have unhandled nmis, two
events could trigger 'simultaneously' raising two back-to-back
NMIs. If the first NMI handles both, the latter will be empty
and daze the CPU.The solution to avoid an 'unknown nmi' massage in this case was
simply to stop the nmi handler chain when the PMU is enabled by
stating the nmi was handled. This has the drawback that a) we
can not detect unknown nmis anymore, and b) subsequent nmi
handlers are not called.This patch addresses this. Now, we check this unknown NMI if it
could be a PMU back-to-back NMI. Otherwise we pass it and let
the kernel handle the unknown nmi.This is a debug log:
cpu #6, nmi #32333, skip_nmi #32330, handled = 1, time = 1934364430
cpu #6, nmi #32334, skip_nmi #32330, handled = 1, time = 1934704616
cpu #6, nmi #32335, skip_nmi #32336, handled = 2, time = 1936032320
cpu #6, nmi #32336, skip_nmi #32336, handled = 0, time = 1936034139
cpu #6, nmi #32337, skip_nmi #32336, handled = 1, time = 1936120100
cpu #6, nmi #32338, skip_nmi #32336, handled = 1, time = 1936404607
cpu #6, nmi #32339, skip_nmi #32336, handled = 1, time = 1937983416
cpu #6, nmi #32340, skip_nmi #32341, handled = 2, time = 1938201032
cpu #6, nmi #32341, skip_nmi #32341, handled = 0, time = 1938202830
cpu #6, nmi #32342, skip_nmi #32341, handled = 1, time = 1938443743
cpu #6, nmi #32343, skip_nmi #32341, handled = 1, time = 1939956552
cpu #6, nmi #32344, skip_nmi #32341, handled = 1, time = 1940073224
cpu #6, nmi #32345, skip_nmi #32341, handled = 1, time = 1940485677
cpu #6, nmi #32346, skip_nmi #32347, handled = 2, time = 1941947772
cpu #6, nmi #32347, skip_nmi #32347, handled = 1, time = 1941949818
cpu #6, nmi #32348, skip_nmi #32347, handled = 0, time = 1941951591
Uhhuh. NMI received for unknown reason 00 on CPU 6.
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continueDeltas:
nmi #32334 340186
nmi #32335 1327704
nmi #32336 1819 <<<< back-to-back nmi [1]
nmi #32337 85961
nmi #32338 284507
nmi #32339 1578809
nmi #32340 217616
nmi #32341 1798 <<<< back-to-back nmi [2]
nmi #32342 240913
nmi #32343 1512809
nmi #32344 116672
nmi #32345 412453
nmi #32346 1462095 <<<< 1st nmi (standard) handling 2 counters
nmi #32347 2046 <<<< 2nd nmi (back-to-back) handling one
counter nmi #32348 1773 <<<< 3rd nmi (back-to-back)
handling no counter! [3]For back-to-back nmi detection there are the following rules:
The PMU nmi handler was handling more than one counter and no
counter was handled in the subsequent nmi (see [1] and [2]
above).There is another case if there are two subsequent back-to-back
nmis [3]. The 2nd is detected as back-to-back because the first
handled more than one counter. If the second handles one counter
and the 3rd handles nothing, we drop the 3rd nmi because it
could be a back-to-back nmi.Signed-off-by: Robert Richter
Signed-off-by: Peter Zijlstra
[ renamed nmi variable to pmu_nmi to avoid clash with .nmi in entry.S ]
Signed-off-by: Don Zickus
Cc: peterz@infradead.org
Cc: gorcunov@gmail.com
Cc: fweisbec@gmail.com
Cc: ying.huang@intel.com
Cc: ming.m.lin@intel.com
Cc: eranian@google.com
LKML-Reference:
Signed-off-by: Ingo Molnar -
Now that we rely on the number of handled overflows, ensure all
handle_irq implementations actually return the right number.Signed-off-by: Peter Zijlstra
Signed-off-by: Don Zickus
Cc: peterz@infradead.org
Cc: robert.richter@amd.com
Cc: gorcunov@gmail.com
Cc: fweisbec@gmail.com
Cc: ying.huang@intel.com
Cc: ming.m.lin@intel.com
Cc: eranian@google.com
LKML-Reference:
Signed-off-by: Ingo Molnar -
During testing of a patch to stop having the perf subsytem
swallow nmis, it was uncovered that Nehalem boxes were randomly
getting unknown nmis when using the perf tool.Moving the ack'ing of the PMI closer to when we get the status
allows the hardware to properly re-set the PMU bit signaling
another PMI was triggered during the processing of the first
PMI. This allows the new logic for dealing with the
shortcomings of multiple PMIs to handle the extra NMI by
'eat'ing it later.Now one can wonder why are we getting a second PMI when we
disable all the PMUs in the begining of the NMI handler to
prevent such a case, for that I do not know. But I know the fix
below helps deal with this quirk.Tested on multiple Nehalems where the problem was occuring.
With the patch, the code now loops a second time to handle the
second PMI (whereas before it was not).Signed-off-by: Don Zickus
Cc: peterz@infradead.org
Cc: robert.richter@amd.com
Cc: gorcunov@gmail.com
Cc: fweisbec@gmail.com
Cc: ying.huang@intel.com
Cc: ming.m.lin@intel.com
Cc: eranian@google.com
LKML-Reference:
Signed-off-by: Ingo Molnar
02 Sep, 2010
2 commits
-
…file into perf/urgent
-
The use of the return value of init_sysfs() with commit
10f0412 oprofile, x86: fix init_sysfs error handling
discovered the following build error for !CONFIG_PM:
.../linux/arch/x86/oprofile/nmi_int.c: In function ‘op_nmi_init’:
.../linux/arch/x86/oprofile/nmi_int.c:784: error: expected expression before ‘do’
make[2]: *** [arch/x86/oprofile/nmi_int.o] Error 1
make[1]: *** [arch/x86/oprofile] Error 2This patch fixes this.
Reported-by: Ingo Molnar
Cc: stable@kernel.org
Signed-off-by: Robert Richter
31 Aug, 2010
1 commit
-
On failure init_sysfs() might not properly free resources. The error
code of the function is not checked. And, when reinitializing the exit
function might be called twice. This patch fixes all this.Cc: stable@kernel.org
Signed-off-by: Robert Richter
26 Aug, 2010
1 commit
-
…/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
perf, x86, Pentium4: Clear the P4_CCCR_FORCE_OVF flag
tracing/trace_stack: Fix stack trace on ppc64
25 Aug, 2010
3 commits
-
…l/git/tip/linux-2.6-tip
* 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, tsc, sched: Recompute cyc2ns_offset's during resume from sleep states
sched: Fix rq->clock synchronization when migrating tasks -
If on Pentium4 CPUs the FORCE_OVF flag is set then an NMI happens
on every event, which can generate a flood of NMIs. Clear it.Reported-by: Vince Weaver
Signed-off-by: Lin Ming
Signed-off-by: Cyrill Gorcunov
Cc: Frederic Weisbecker
Cc: Peter Zijlstra
Cc:
Signed-off-by: Ingo Molnar -
rc2 kernel crashes when booting second cpu on this CONFIG_VMSPLIT_2G_OPT
laptop: whereas cloning from kernel to low mappings pgd range does need
to limit by both KERNEL_PGD_PTRS and KERNEL_PGD_BOUNDARY, cloning kernel
pgd range itself must not be limited by the smaller KERNEL_PGD_BOUNDARY.Signed-off-by: Hugh Dickins
LKML-Reference:
Signed-off-by: H. Peter Anvin
24 Aug, 2010
1 commit
-
* 'for-upstream/pvhvm' of git://xenbits.xensource.com/people/ianc/linux-2.6:
xen: pvhvm: make it clearer that XEN_UNPLUG_* define bits in a bitfield
xen: pvhvm: rename xen_emul_unplug=ignore to =unnnecessary
xen: pvhvm: allow user to request no emulated device unplug
23 Aug, 2010
3 commits
-
It is not immediately clear what this option causes to become
ignored. The actual meaning is that it is not necessary to unplug the
emulated devices to safely use the PV ones, even if the platform does
not support the unplug protocol. (pressumably the user will only add
this option if they have ensured that their domain configuration is
safe).I think xen_emul_unplug=unnecessary better captures this.
Signed-off-by: Ian Campbell
Acked-by: Jeremy Fitzhardinge
Acked-by: Stefano Stabellini -
this allows the user to disable pvhvm and revert to emulated devices
in case of a system misconfiguration (e.g. initramfs with only
emulated drivers in it).Signed-off-by: Ian Campbell
Acked-by: Jeremy Fitzhardinge
Acked-by: Stefano Stabellini -
* 'kvm-updates/2.6.36' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: PIT: free irq source id in handling error path
KVM: destroy workqueue on kvm_create_pit() failures
KVM: fix poison overwritten caused by using wrong xstate size
22 Aug, 2010
1 commit
-
The "Configure" word tends to make user believe they have to say 'yes'
to be able to choose the number of procs/nodes. "Enable" should be
unambiguous enough.Signed-off-by: Samuel Thibault
Signed-off-by: Linus Torvalds
21 Aug, 2010
2 commits
-
Fix BUG: using smp_processor_id() in preemptible thermal_throttle_add_dev.
We know the cpu number when calling thermal_throttle_add_dev, so we can
remove smp_processor_id call in thermal_throttle_add_dev by supplying
the cpu number as argument.This should resolve kernel bugzilla 16615/16629.
Signed-off-by: Sergey Senozhatsky
LKML-Reference:
Cc: Fenghua Yu
Cc: Joerg Roedel
Cc: Maciej Rutecki
Signed-off-by: H. Peter Anvin -
…git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
x86, apic: Fix apic=debug boot crash
x86, hotplug: Serialize CPU hotplug to avoid bringup concurrency issues
x86-32: Fix dummy trampoline-related inline stubs
x86-32: Separate 1:1 pagetables from swapper_pg_dir
x86, cpu: Fix regression in AMD errata checking code
20 Aug, 2010
4 commits
-
TSC's get reset after suspend/resume (even on cpu's with invariant TSC
which runs at a constant rate across ACPI P-, C- and T-states). And in
some systems BIOS seem to reinit TSC to arbitrary large value (still
sync'd across cpu's) during resume.This leads to a scenario of scheduler rq->clock (sched_clock_cpu()) less
than rq->age_stamp (introduced in 2.6.32). This leads to a big value
returned by scale_rt_power() and the resulting big group power set by the
update_group_power() is causing improper load balancing between busy and
idle cpu's after suspend/resume.This resulted in multi-threaded workloads (like kernel-compilation) go
slower after suspend/resume cycle on core i5 laptops.Fix this by recomputing cyc2ns_offset's during resume, so that
sched_clock() continues from the point where it was left off during
suspend.Reported-by: Florian Pritz
Signed-off-by: Suresh Siddha
Cc: # [v2.6.32+]
Signed-off-by: Peter Zijlstra
LKML-Reference:
Signed-off-by: Ingo Molnar -
Fix a boot crash when apic=debug is used and the APIC is
not properly initialized.This issue appears during Xen Dom0 kernel boot but the
fix is generic and the crash could occur on real hardware
as well.Signed-off-by: Daniel Kiper
Cc: xen-devel@lists.xensource.com
Cc: konrad.wilk@oracle.com
Cc: jeremy@goop.org
Cc: # .35.x, .34.x, .33.x, .32.x
LKML-Reference:
Signed-off-by: Ingo Molnar -
When testing cpu hotplug code on 32-bit we kept hitting the "CPU%d:
Stuck ??" message due to multiple cores concurrently accessing the
cpu_callin_mask, among others.Since these codepaths are not protected from concurrent access due to
the fact that there's no sane reason for making an already complex
code unnecessarily more complex - we hit the issue only when insanely
switching cores off- and online - serialize hotplugging cores on the
sysfs level and be done with it.[ v2.1: fix !HOTPLUG_CPU build ]
Cc:
Signed-off-by: Borislav Petkov
LKML-Reference:
Signed-off-by: H. Peter Anvin -
…/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
kprobes/x86: Fix the return address of multiple kretprobes
perf tools: Fix build error on read only source.
perf, x86: Fix Intel-nhm PMU programming errata workaround
19 Aug, 2010
4 commits
-
Fix the return address of subsequent kretprobes when multiple
kretprobes are set on the same function.For example:
# cd /sys/kernel/debug/tracing
# echo "r:event1 sys_symlink" > kprobe_events
# echo "r:event2 sys_symlink" >> kprobe_events
# echo 1 > events/kprobes/enable
# ln -s /tmp/foo /tmp/bar(without this patch)
# cat trace
ln-897 [000] 20404.133727: event1: (kretprobe_trampoline+0x0/0x4c
Reviewed-by: Masami Hiramatsu
Cc: Frederic Weisbecker
Cc: Ananth N Mavinakayanahalli
Cc: Peter Zijlstra
Cc: YOSHIFUJI Hideaki
LKML-Reference:
Signed-off-by: Ingo Molnar -
Fix dummy inline stubs for trampoline-related functions when no
trampolines exist (until we get rid of the no-trampoline case
entirely.)Signed-off-by: H. Peter Anvin
Cc: Joerg Roedel
Cc: Borislav Petkov
LKML-Reference: -
This patch fixes machine crashes which occur when heavily exercising the
CPU hotplug codepaths on a 32-bit kernel. These crashes are caused by
AMD Erratum 383 and result in a fatal machine check exception. Here's
the scenario:1. On 32-bit, the swapper_pg_dir page table is used as the initial page
table for booting a secondary CPU.2. To make this work, swapper_pg_dir needs a direct mapping of physical
memory in it (the low mappings). By adding those low, large page (2M)
mappings (PAE kernel), we create the necessary conditions for Erratum
383 to occur.3. Other CPUs which do not participate in the off- and onlining game may
use swapper_pg_dir while the low mappings are present (when leave_mm is
called). For all steps below, the CPU referred to is a CPU that is using
swapper_pg_dir, and not the CPU which is being onlined.4. The presence of the low mappings in swapper_pg_dir can result
in TLB entries for addresses below __PAGE_OFFSET to be established
speculatively. These TLB entries are marked global and large.5. When the CPU with such TLB entry switches to another page table, this
TLB entry remains because it is global.6. The process then generates an access to an address covered by the
above TLB entry but there is a permission mismatch - the TLB entry
covers a large global page not accessible to userspace.7. Due to this permission mismatch a new 4kb, user TLB entry gets
established. Further, Erratum 383 provides for a small window of time
where both TLB entries are present. This results in an uncorrectable
machine check exception signalling a TLB multimatch which panics the
machine.There are two ways to fix this issue:
1. Always do a global TLB flush when a new cr3 is loaded and the
old page table was swapper_pg_dir. I consider this a hack hard
to understand and with performance implications2. Do not use swapper_pg_dir to boot secondary CPUs like 64-bit
does.This patch implements solution 2. It introduces a trampoline_pg_dir
which has the same layout as swapper_pg_dir with low_mappings. This page
table is used as the initial page table of the booting CPU. Later in the
bringup process, it switches to swapper_pg_dir and does a global TLB
flush. This fixes the crashes in our test cases.-v2: switch to swapper_pg_dir right after entering start_secondary() so
that we are able to access percpu data which might not be mapped in the
trampoline page table.Signed-off-by: Joerg Roedel
LKML-Reference:
Signed-off-by: Borislav Petkov
Signed-off-by: H. Peter Anvin -
A bug in the family-model-stepping matching code caused the presence of
errata to go undetected when OSVW was not used. This causes hangs on
some K8 systems because the E400 workaround is not enabled.Signed-off-by: Hans Rosenfeld
LKML-Reference:
Signed-off-by: H. Peter Anvin
18 Aug, 2010
4 commits
-
Fix the Errata AAK100/AAP53/BD53 workaround, the officialy documented
workaround we implemented in:11164cd: perf, x86: Add Nehelem PMU programming errata workaround
doesn't actually work fully and causes a stuck PMU state
under load and non-functioning perf profiling.A functional workaround was found by trial & error.
Affects all Nehalem-class Intel PMUs.
Signed-off-by: Zhang Yanmin
Signed-off-by: Peter Zijlstra
LKML-Reference:
Cc: Arjan van de Ven
Cc: "H. Peter Anvin"
Cc: # .35.x
Signed-off-by: Ingo Molnar -
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
vt,console,kdb: preserve console_blanked while in kdb
vt: fix regression warnings from KMS merge
arm,kgdb: fix GDB_MAX_REGS no longer used
kgdb: add missing __percpu markup in arch/x86/kernel/kgdb.c
kdb: fix compile error without CONFIG_KALLSYMS -
Make do_execve() take a const filename pointer so that kernel_execve() compiles
correctly on ARM:arch/arm/kernel/sys_arm.c:88: warning: passing argument 1 of 'do_execve' discards qualifiers from pointer target type
This also requires the argv and envp arguments to be consted twice, once for
the pointer array and once for the strings the array points to. This is
because do_execve() passes a pointer to the filename (now const) to
copy_strings_kernel(). A simpler alternative would be to cast the filename
pointer in do_execve() when it's passed to copy_strings_kernel().do_execve() may not change any of the strings it is passed as part of the argv
or envp lists as they are some of them in .rodata, so marking these strings as
const should be fine.Further kernel_execve() and sys_execve() need to be changed to match.
This has been test built on x86_64, frv, arm and mips.
Signed-off-by: David Howells
Tested-by: Ralf Baechle
Acked-by: Russell King
Signed-off-by: Linus Torvalds -
Otherwise we'll duplicate definitions with the pci.h stubs.
Reported-by: Randy Dunlap
Acked-by: Randy Dunlap
Signed-off-by: Jesse Barnes
17 Aug, 2010
1 commit
-
Free irq source id if create pit workqueue fail
Signed-off-by: Xiao Guangrong
Signed-off-by: Avi Kivity