05 Dec, 2011

1 commit

  • To make this work, we teach the page fault handler how to send
    signals on failed uaccess. This only works for user addresses
    (kernel addresses will never hit the page fault handler in the
    first place), so we need to generate signals for those
    separately.

    This gets the tricky case right: if the user buffer spans
    multiple pages and only the second page is invalid, we set
    cr2 and si_addr correctly. UML relies on this behavior to
    "fault in" pages as needed.

    We steal a bit from thread_info.uaccess_err to enable this.
    Before this change, uaccess_err was a 32-bit boolean value.

    This fixes issues with UML when vsyscall=emulate.
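
    The bit-stealing can be sketched in plain C. This is an illustrative
    layout only (hypothetical struct and field names, not the actual
    x86 thread_info definition): the old 32-bit boolean becomes a
    one-bit field, freeing a second bit for the new behavior.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Illustrative sketch only: pack the old 32-bit boolean uaccess_err
 * together with the new "send a signal on failed uaccess" bit as
 * one-bit fields, instead of spending a whole word on a boolean.
 * Names follow the commit's idea, not the exact kernel layout.
 */
struct thread_info_sketch {
        unsigned int uaccess_err        : 1;    /* was a 32-bit boolean */
        unsigned int sig_on_uaccess_err : 1;    /* the stolen bit */
};

/* Fixup path: raise a signal only when the task opted in. */
static bool should_signal(const struct thread_info_sketch *ti)
{
        return ti->uaccess_err && ti->sig_on_uaccess_err;
}
```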

    Reported-by: Adrian Bunk
    Signed-off-by: Andy Lutomirski
    Cc: richard -rw- weinberger
    Cc: H. Peter Anvin
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/4c8f91de7ec5cd2ef0f59521a04e1015f11e42b4.1320712291.git.luto@amacapital.net
    Signed-off-by: Ingo Molnar

    Andy Lutomirski
     

28 Oct, 2011

1 commit


29 Sep, 2011

1 commit

  • Erratum 93 applies to AMD K8 CPUs only, and its workaround
    (forcing the upper 32 bits of %rip to all be set under certain
    conditions) is actually getting in the way of analyzing page
    faults occurring during EFI physical mode runtime calls (in
    particular the page table walk shown is completely unrelated to
    the actual fault). This is because typically EFI runtime code
    lives in the space between 2G and 4G, which - modulo the above
    manipulation - is likely to overlap with the kernel or modules
    area.

    While the other errata workarounds could likewise be limited to
    just the affected CPUs, none of them appears to be destructive,
    and they are generally called only outside of performance-critical
    paths, so they are left untouched.

    Signed-off-by: Jan Beulich
    Link: http://lkml.kernel.org/r/4E835FE30200007800058464@nat28.tlf.novell.com
    Signed-off-by: Ingo Molnar

    Jan Beulich
     

16 Aug, 2011

2 commits

  • arch/x86/mm/fault.c now depends on having the symbol VSYSCALL_START
    defined, which is best handled by including <asm/fixmap.h> (it
    isn't unreasonable that we may want other fixed addresses in this
    file in the future, and it is cleaner than including
    <asm/vsyscall.h> directly).

    This addresses an x86-64 allnoconfig build failure. On other
    configurations it was masked by an indirect path:

    -> -> ->

    ... however, the first such include is conditional on CONFIG_X86_LOCAL_APIC.

    Originally-by: Randy Dunlap
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/CA%2B55aFxsOMc9=p02r8-QhJ=h=Mqwckk4_Pnx9LQt5%2BfqMp_exQ@mail.gmail.com
    Signed-off-by: H. Peter Anvin

    H. Peter Anvin
     
  • arch/x86/mm/fault.c needs to include asm/vsyscall.h to fix a
    build error:

    arch/x86/mm/fault.c: In function '__bad_area_nosemaphore':
    arch/x86/mm/fault.c:728: error: 'VSYSCALL_START' undeclared (first use in this function)

    Signed-off-by: Randy Dunlap
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

13 Aug, 2011

1 commit

  • * 'x86-vdso-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-tip:
    x86-64: Rework vsyscall emulation and add vsyscall= parameter
    x86-64: Wire up getcpu syscall
    x86: Remove unnecessary compile flag tweaks for vsyscall code
    x86-64: Add vsyscall:emulate_vsyscall trace event
    x86-64: Add user_64bit_mode paravirt op
    x86-64, xen: Enable the vvar mapping
    x86-64: Work around gold bug 13023
    x86-64: Move the "user" vsyscall segment out of the data segment.
    x86-64: Pad vDSO to a page boundary

    Linus Torvalds
     

11 Aug, 2011

1 commit

  • There are three choices:

    vsyscall=native: Vsyscalls are native code that issues the
    corresponding syscalls.

    vsyscall=emulate (default): Vsyscalls are emulated by instruction
    fault traps, tested in the bad_area path. The actual contents of
    the vsyscall page are the same as in the vsyscall=native case,
    except that the page is marked NX. This way programs that make
    assumptions about what the code in the page does will not be
    confused when they read that code.

    vsyscall=none: Trying to execute a vsyscall will segfault.
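
    The three-way switch can be sketched as a small parser. The enum
    and function here are illustrative stand-ins, not the kernel's
    actual code:

```c
#include <assert.h>
#include <string.h>

/*
 * Sketch of the three-way vsyscall= mode switch described above.
 * The enum and the parser are illustrative, not the kernel's code.
 */
enum vsyscall_mode_sketch { VSYSCALL_NATIVE, VSYSCALL_EMULATE, VSYSCALL_NONE };

static enum vsyscall_mode_sketch parse_vsyscall_sketch(const char *arg)
{
        if (strcmp(arg, "native") == 0)
                return VSYSCALL_NATIVE;
        if (strcmp(arg, "none") == 0)
                return VSYSCALL_NONE;
        return VSYSCALL_EMULATE;        /* the default */
}
```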

    Signed-off-by: Andy Lutomirski
    Link: http://lkml.kernel.org/r/8449fb3abf89851fd6b2260972666a6f82542284.1312988155.git.luto@mit.edu
    Signed-off-by: H. Peter Anvin

    Andy Lutomirski
     

05 Aug, 2011

1 commit

  • Three places in the kernel assume that the only long mode CPL 3
    selector is __USER_CS. This is not true on Xen -- Xen's sysretq
    changes cs to the magic value 0xe033.

    Two of the places are corner cases, but as of "x86-64: Improve
    vsyscall emulation CS and RIP handling"
    (c9712944b2a12373cb6ff8059afcfb7e826a6c54), vsyscalls will segfault
    if called with Xen's extra CS selector. This causes a panic when
    older init builds die.

    It seems impossible to make Xen use __USER_CS reliably without
    taking a performance hit on every system call, so this fixes the
    tests instead with a new paravirt op. It's a little ugly because
    ptrace.h can't include paravirt.h.
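
    The idea behind the new paravirt op can be sketched as follows.
    The selector values are illustrative (0x33 is the usual __USER_CS
    on x86-64, 0xe033 the extra selector Xen's sysretq installs), and
    the pv_info struct here is a stand-in, not the kernel's definition:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Sketch of a user_64bit_mode()-style test with a paravirt op. */
#define USER_CS_SKETCH 0x33u

struct pv_info_sketch {
        uint16_t extra_user_64bit_cs;   /* == __USER_CS on native */
};

/*
 * Native kernels leave the extra selector equal to __USER_CS, so the
 * second comparison is a no-op; Xen sets it to its magic 0xe033.
 */
static bool user_64bit_mode_sketch(uint16_t cs, const struct pv_info_sketch *pv)
{
        return cs == USER_CS_SKETCH || cs == pv->extra_user_64bit_cs;
}
```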

    Signed-off-by: Andy Lutomirski
    Link: http://lkml.kernel.org/r/f4fcb3947340d9e96ce1054a432f183f9da9db83.1312378163.git.luto@mit.edu
    Reported-by: Konrad Rzeszutek Wilk
    Signed-off-by: H. Peter Anvin

    Andy Lutomirski
     

01 Jul, 2011

1 commit

  • The nmi parameter indicated if we could do wakeups from the current
    context, if not, we would set some state and self-IPI and let the
    resulting interrupt do the wakeup.

    For the various event classes:

    - hardware: nmi=0; PMI is in fact an NMI or we run irq_work_run from
    the PMI-tail (ARM etc.)
    - tracepoint: nmi=0; since tracepoint could be from NMI context.
    - software: nmi=[0,1]; some, like the schedule thing cannot
    perform wakeups, and hence need 0.

    As one can see, there is very little nmi=1 usage, and the down-side of
    not using it is that on some platforms some software events can have a
    jiffy delay in wakeup (when arch_irq_work_raise isn't implemented).

    The up-side however is that we can remove the nmi parameter and save a
    bunch of conditionals in fast paths.
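
    The resulting always-deferred scheme can be sketched like this.
    The pending flag and run function stand in for the kernel's
    irq_work machinery; names are illustrative:

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch: instead of a per-call nmi flag choosing between a direct
 * wakeup and a deferred (self-IPI/irq_work) one, always defer. The
 * wakeup is then safe from any context, including NMI.
 */
static bool wakeup_pending;
static int  wakeups_done;

static void perf_event_wakeup_sketch(void)
{
        wakeup_pending = true;          /* always safe, even from NMI */
}

static void irq_work_run_sketch(void)
{
        if (wakeup_pending) {
                wakeup_pending = false; /* interrupt tail does the wakeup */
                wakeups_done++;
        }
}
```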

    Signed-off-by: Peter Zijlstra
    Cc: Michael Cree
    Cc: Will Deacon
    Cc: Deng-Cheng Zhu
    Cc: Anton Blanchard
    Cc: Eric B Munson
    Cc: Heiko Carstens
    Cc: Paul Mundt
    Cc: David S. Miller
    Cc: Frederic Weisbecker
    Cc: Jason Wessel
    Cc: Don Zickus
    Link: http://lkml.kernel.org/n/tip-agjev8eu666tvknpb3iaj0fg@git.kernel.org
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

26 May, 2011

1 commit

  • Ingo suggested that the SIGKILL check should be moved into the
    slowpath function. This reduces the page fault fastpath impact
    of this recent commit:

    37b23e0525d3: x86,mm: make pagefault killable

    Suggested-by: Ingo Molnar
    Signed-off-by: KOSAKI Motohiro
    Cc: kamezawa.hiroyu@jp.fujitsu.com
    Cc: minchan.kim@gmail.com
    Cc: willy@linux.intel.com
    Cc: Andrew Morton
    Cc: Linus Torvalds
    Link: http://lkml.kernel.org/r/4DDE0B5C.9050907@jp.fujitsu.com
    Signed-off-by: Ingo Molnar

    KOSAKI Motohiro
     

25 May, 2011

1 commit

  • When an oom killing occurs, almost all processes get stuck at one
    of the following two points:

    1) __alloc_pages_nodemask
    2) __lock_page_or_retry

    1) is not very problematic, because TIF_MEMDIE leads to an
    allocation failure and the task gets out of the page allocator.

    2) is more problematic. In an OOM situation, zones typically have
    no page cache at all, and memory starvation might lead to greatly
    reduced IO performance. When a fork bomb occurs, TIF_MEMDIE tasks
    don't die quickly, meaning that the fork bomb may create new
    processes faster than the oom-killer can kill them. The system may
    then become livelocked.

    This patch makes the pagefault interruptible by SIGKILL.
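
    The change can be sketched as follows: when the wait for a page is
    interrupted and a fatal signal is pending, the fault path bails out
    instead of looping. The names are illustrative stand-ins for
    fatal_signal_pending() and the page-wait path:

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch of making the page-wait in the fault path killable. */
enum fault_sketch { FAULT_DONE, FAULT_RETRY, FAULT_KILLED };

static enum fault_sketch wait_on_page_sketch(bool page_ready,
                                             bool sigkill_pending)
{
        if (page_ready)
                return FAULT_DONE;
        if (sigkill_pending)            /* new: bail out, task is dying */
                return FAULT_KILLED;
        return FAULT_RETRY;             /* old behaviour: keep waiting */
}
```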

    Signed-off-by: KOSAKI Motohiro
    Reviewed-by: KAMEZAWA Hiroyuki
    Cc: Minchan Kim
    Cc: Matthew Wilcox
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     

21 May, 2011

1 commit

  • Commit e66eed651fd1 ("list: remove prefetching from regular list
    iterators") removed the include of prefetch.h from list.h, which
    uncovered several cases that had apparently relied on that rather
    obscure header file dependency.

    So this fixes things up a bit, using

    grep -L linux/prefetch.h $(git grep -l '[^a-z_]prefetchw*(' -- '*.[ch]')
    grep -L 'prefetchw*(' $(git grep -l 'linux/prefetch.h' -- '*.[ch]')

    to guide us in finding files that either need linux/prefetch.h
    inclusion, or have it despite not needing it.

    There are more of them around (mostly network drivers), but this gets
    many core ones.

    Reported-by: Stephen Rothwell
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

10 Mar, 2011

2 commits

  • It's forbidden to take the page_table_lock with irqs disabled:
    if there's contention, the IPIs (for tlb flushes) sent with the
    page_table_lock held will never run, leading to a deadlock.

    Nobody takes the pgd_lock from irq context so the _irqsave can be
    removed.

    Signed-off-by: Andrea Arcangeli
    Acked-by: Rik van Riel
    Tested-by: Konrad Rzeszutek Wilk
    Signed-off-by: Andrew Morton
    Cc: Peter Zijlstra
    Cc: Linus Torvalds
    Cc:
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Andrea Arcangeli
     
  • mm_fault_error() should not invoke the oom-killer if the page
    fault occurred in kernel space, e.g. in
    copy_from_user()/copy_to_user().

    This would happen if we find ourselves in OOM on a
    copy_to_user(), or a copy_from_user() which faults.

    Without this patch, the kernel hangs in copy_from_user(), because
    the OOM killer sends SIGKILL to the current process, but the
    process can't handle a signal while in a syscall; the kernel then
    returns to copy_from_user(), re-executes the faulting instruction,
    and provokes the page fault again.

    With this patch the kernel returns -EFAULT from copy_from_user().

    The code which checks that the page fault occurred in kernel space
    has been copied from do_sigbus().

    This situation is handled the same way on powerpc, xtensa,
    tile, ...
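
    The decision can be sketched like this. PF_USER_SKETCH mirrors the
    hardware's "fault came from user mode" error-code bit; the names
    are illustrative, not the kernel's:

```c
#include <assert.h>
#include <errno.h>

/*
 * Sketch of the fix: a fault raised from kernel mode (e.g. inside
 * copy_from_user()) must take the exception-fixup path and end up as
 * -EFAULT, rather than invoking the OOM killer or raising a signal.
 */
#define PF_USER_SKETCH 0x4

static int mm_fault_error_sketch(unsigned long error_code)
{
        if (!(error_code & PF_USER_SKETCH))
                return -EFAULT;         /* kernel fault: fixup path */
        return 0;                       /* user fault: signal / OOM */
}
```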

    Signed-off-by: Andrey Vagin
    Signed-off-by: Andrew Morton
    Cc: "H. Peter Anvin"
    Cc: Linus Torvalds
    Cc:
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Andrey Vagin
     

27 Oct, 2010

2 commits

  • access_error() already takes error_code as an argument, so there is
    no need for an additional write flag.

    Signed-off-by: Michel Lespinasse
    Acked-by: Rik van Riel
    Cc: Nick Piggin
    Acked-by: Wu Fengguang
    Cc: Ying Han
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Acked-by: "H. Peter Anvin"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     
  • This change reduces mmap_sem hold times that are caused by waiting for
    disk transfers when accessing file mapped VMAs.

    It introduces the VM_FAULT_ALLOW_RETRY flag, which indicates that the call
    site wants mmap_sem to be released if blocking on a pending disk transfer.
    In that case, filemap_fault() returns the VM_FAULT_RETRY status bit and
    do_page_fault() will then re-acquire mmap_sem and retry the page fault.

    It is expected that the retry will hit the same page which will now be
    cached, and thus it will complete with a low mmap_sem hold time.

    Tests:

    - microbenchmark: thread A mmaps a large file and does random read accesses
    to the mmaped area - achieves about 55 iterations/s. Thread B does
    mmap/munmap in a loop at a separate location - achieves 55 iterations/s
    before, 15000 iterations/s after.

    - We are seeing related effects in some applications in house, which show
    significant performance regressions when running without this change.

    [akpm@linux-foundation.org: fix warning & crash]
    Signed-off-by: Michel Lespinasse
    Acked-by: Rik van Riel
    Acked-by: Linus Torvalds
    Cc: Nick Piggin
    Reviewed-by: Wu Fengguang
    Cc: Ying Han
    Cc: Peter Zijlstra
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Acked-by: "H. Peter Anvin"
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     

22 Oct, 2010

2 commits

  • Conflicts:
    mm/memory-failure.c

    Andi Kleen
     
  • * 'x86-mm-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86-32, percpu: Correct the ordering of the percpu readmostly section
    x86, mm: Enable ARCH_DMA_ADDR_T_64BIT with X86_64 || HIGHMEM64G
    x86: Spread tlb flush vector between nodes
    percpu: Introduce a read-mostly percpu API
    x86, mm: Fix incorrect data type in vmalloc_sync_all()
    x86, mm: Hold mm->page_table_lock while doing vmalloc_sync
    x86, mm: Fix bogus whitespace in sync_global_pgds()
    x86-32: Fix sparse warning for the __PHYSICAL_MASK calculation
    x86, mm: Add RESERVE_BRK_ARRAY() helper
    mm, x86: Saving vmcore with non-lazy freeing of vmas
    x86, kdump: Change copy_oldmem_page() to use cached addressing
    x86, mm: fix uninitialized addr in kernel_physical_mapping_init()
    x86, kmemcheck: Remove double test
    x86, mm: Make spurious_fault check explicitly check the PRESENT bit
    x86-64, mem: Update all PGDs for direct mapping and vmemmap mapping changes
    x86, mm: Separate x86_64 vmalloc_sync_all() into separate functions
    x86, mm: Avoid unnecessary TLB flush

    Linus Torvalds
     

21 Oct, 2010

1 commit


20 Oct, 2010

1 commit

  • Take mm->page_table_lock while syncing the vmalloc region. This prevents
    a race with the Xen pagetable pin/unpin code, which expects that the
    page_table_lock is already held. If this race occurs, then Xen can see
    an inconsistent page type (a page can either be read/write or a pagetable
    page, and pin/unpin converts it between them), which will cause either
    the pin or the set_p[gm]d to fail; either will crash the kernel.

    vmalloc_sync_all() should be called rarely, so this extra use of
    page_table_lock should not interfere with its normal users.

    The mm pointer is stashed in the pgd page's index field, as that won't
    be otherwise used for pgds.

    Reported-by: Ian Campbell
    Originally-by: Jan Beulich
    LKML-Reference:
    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: H. Peter Anvin

    Jeremy Fitzhardinge
     

15 Oct, 2010

1 commit

  • On x86, faults exit by executing the iret instruction, which
    re-enables NMIs if we faulted in NMI context. So if a fault
    happens in an NMI, another NMI can nest after the fault exits.

    But we don't yet support nested NMIs because we have only one NMI
    stack. To protect against that, check that vmalloc and kmemcheck
    faults don't happen in this context. Most of the other kernel
    faults in NMIs can be more easily spotted by finding explicit
    copy_{from,to}_user() calls on review.
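
    The guard can be sketched like this; the flag and counter are
    stand-ins for the kernel's in_nmi() test and WARN_ON_ONCE():

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Sketch: the vmalloc/kmemcheck fault paths must refuse to run in NMI
 * context, since the fault's iret would re-enable NMIs while only one
 * NMI stack exists.
 */
static int nmi_fault_warnings;

static bool vmalloc_fault_allowed_sketch(bool in_nmi)
{
        if (in_nmi) {           /* would be WARN_ON_ONCE(in_nmi()) */
                nmi_fault_warnings++;
                return false;
        }
        return true;
}
```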

    Signed-off-by: Frederic Weisbecker
    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: H. Peter Anvin
    Cc: Mathieu Desnoyers
    Cc: Peter Zijlstra

    Frederic Weisbecker
     

08 Oct, 2010

1 commit

  • An earlier patch fixed the hwpoison fault handling to encode the
    huge page size in the fault code of the page fault handler.

    This is needed to report this information in SIGBUS to user space.

    This is a straightforward patch to pass this information through
    to the signal handling in the x86-specific fault.c.
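
    The encoding can be sketched as packing a page-size shift into the
    upper bits of the fault-code word and unpacking it when filling in
    the SIGBUS info. The bit position and names are illustrative, not
    the kernel's actual VM_FAULT_* layout:

```c
#include <assert.h>

/* Sketch of carrying a huge page size shift inside the fault code. */
#define HINDEX_SHIFT 16

static unsigned int fault_set_shift(unsigned int fault,
                                    unsigned int page_shift)
{
        return fault | (page_shift << HINDEX_SHIFT);
}

/* Unpack at signal-delivery time, e.g. for si_addr_lsb. */
static unsigned int fault_get_shift(unsigned int fault)
{
        return fault >> HINDEX_SHIFT;
}
```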

    Cc: x86@kernel.org
    Cc: Naoya Horiguchi
    Cc: fengguang.wu@intel.com
    Signed-off-by: Andi Kleen

    Andi Kleen
     

27 Aug, 2010

2 commits


14 Aug, 2010

1 commit

  • It's wrong for several reasons, but the most direct one is that the
    fault may be for the stack accesses to set up a previous SIGBUS. When
    we have a kernel exception, the kernel exception handler does all the
    fixups, not some user-level signal handler.

    Even apart from the nested SIGBUS issue, it's also wrong to give out
    kernel fault addresses in the signal handler info block, or to send a
    SIGBUS when a system call already returns EFAULT.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

06 Dec, 2009

1 commit

  • * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86: Limit number of per cpu TSC sync messages
    x86: dumpstack, 64-bit: Disable preemption when walking the IRQ/exception stacks
    x86: dumpstack: Clean up the x86_stack_ids[][] initalization and other details
    x86, cpu: mv display_cacheinfo -> cpu_detect_cache_sizes
    x86: Suppress stack overrun message for init_task
    x86: Fix cpu_devs[] initialization in early_cpu_init()
    x86: Remove CPU cache size output for non-Intel too
    x86: Minimise printk spew from per-vendor init code
    x86: Remove the CPU cache size printk's
    cpumask: Avoid cpumask_t in arch/x86/kernel/apic/nmi.c
    x86: Make sure we also print a Code: line for show_regs()

    Linus Torvalds
     

23 Nov, 2009

1 commit


17 Oct, 2009

1 commit


24 Sep, 2009

2 commits

  • * 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6: (21 commits)
    HWPOISON: Enable error_remove_page on btrfs
    HWPOISON: Add simple debugfs interface to inject hwpoison on arbitary PFNs
    HWPOISON: Add madvise() based injector for hardware poisoned pages v4
    HWPOISON: Enable error_remove_page for NFS
    HWPOISON: Enable .remove_error_page for migration aware file systems
    HWPOISON: The high level memory error handler in the VM v7
    HWPOISON: Add PR_MCE_KILL prctl to control early kill behaviour per process
    HWPOISON: shmem: call set_page_dirty() with locked page
    HWPOISON: Define a new error_remove_page address space op for async truncation
    HWPOISON: Add invalidate_inode_page
    HWPOISON: Refactor truncate to allow direct truncating of page v2
    HWPOISON: check and isolate corrupted free pages v2
    HWPOISON: Handle hardware poisoned pages in try_to_unmap
    HWPOISON: Use bitmask/action code for try_to_unmap behaviour
    HWPOISON: x86: Add VM_FAULT_HWPOISON handling to x86 page fault handler v2
    HWPOISON: Add poison check to page fault handling
    HWPOISON: Add basic support for poisoned pages in fault handler v3
    HWPOISON: Add new SIGBUS error codes for hardware poison signals
    HWPOISON: Add support for poison swap entries v2
    HWPOISON: Export some rmap vma locking to outside world
    ...

    Linus Torvalds
     
  • Conflicts:
    kernel/trace/Makefile
    kernel/trace/trace.h
    kernel/trace/trace_event_types.h
    kernel/trace/trace_export.c

    Merge reason: sync with latest significant tracing core changes.

    Frederic Weisbecker
     

21 Sep, 2009

1 commit

  • Bye-bye Performance Counters, welcome Performance Events!

    In the past few months the perfcounters subsystem has grown out its
    initial role of counting hardware events, and has become (and is
    becoming) a much broader generic event enumeration, reporting, logging,
    monitoring, analysis facility.

    Naming its core object 'perf_counter' and naming the subsystem
    'perfcounters' has become more and more of a misnomer. With pending
    code like hw-breakpoints support the 'counter' name is less and
    less appropriate.

    All in one, we've decided to rename the subsystem to 'performance
    events' and to propagate this rename through all fields, variables
    and API names. (in an ABI compatible fashion)

    The word 'event' is also a bit shorter than 'counter' - which makes
    it slightly more convenient to write/handle as well.

    Thanks goes to Stephane Eranian who first observed this misnomer and
    suggested a rename.

    User-space tooling and ABI compatibility is not affected - this patch
    should be function-invariant. (Also, defconfigs were not touched to
    keep the size down.)

    This patch has been generated via the following script:

    FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')

    sed -i \
    -e 's/PERF_EVENT_/PERF_RECORD_/g' \
    -e 's/PERF_COUNTER/PERF_EVENT/g' \
    -e 's/perf_counter/perf_event/g' \
    -e 's/nb_counters/nb_events/g' \
    -e 's/swcounter/swevent/g' \
    -e 's/tpcounter_event/tp_event/g' \
    $FILES

    for N in $(find . -name perf_counter.[ch]); do
    M=$(echo $N | sed 's/perf_counter/perf_event/g')
    mv $N $M
    done

    FILES=$(find . -name perf_event.*)

    sed -i \
    -e 's/COUNTER_MASK/REG_MASK/g' \
    -e 's/COUNTER/EVENT/g' \
    -e 's/\<event\>/event_id/g' \
    -e 's/counter/event/g' \
    -e 's/Counter/Event/g' \
    $FILES

    ... to keep it as correct as possible. This script can also be
    used by anyone who has pending perfcounters patches - it converts
    a Linux kernel tree over to the new naming. We tried to time this
    change to the point in time where the amount of pending patches
    is the smallest: the end of the merge window.

    Namespace clashes were fixed up in a preparatory patch - and some
    stylistic fallout will be fixed up in a subsequent patch.

    ( NOTE: 'counters' are still the proper terminology when we deal
    with hardware registers - and these sed scripts are a bit
    over-eager in renaming them. I've undone some of that, but
    in case there's something left where 'counter' would be
    better than 'event' we can undo that on an individual basis
    instead of touching an otherwise nicely automated patch. )

    Suggested-by: Stephane Eranian
    Acked-by: Peter Zijlstra
    Acked-by: Paul Mackerras
    Reviewed-by: Arjan van de Ven
    Cc: Mike Galbraith
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Steven Rostedt
    Cc: Benjamin Herrenschmidt
    Cc: David Howells
    Cc: Kyle McMartin
    Cc: Martin Schwidefsky
    Cc: "David S. Miller"
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc:
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

16 Sep, 2009

1 commit


14 Sep, 2009

1 commit


30 Aug, 2009

1 commit

  • Add __kprobes to the functions which handle in-kernel fixable page
    faults. Since kprobes can cause those in-kernel page faults by accessing
    kprobe data structures, probing those fault functions will cause
    fault-int3-loop (do_page_fault has already been marked as __kprobes).

    Signed-off-by: Masami Hiramatsu
    Acked-by: Ananth N Mavinakayanahalli
    Cc: Ingo Molnar
    LKML-Reference:
    Signed-off-by: Frederic Weisbecker

    Masami Hiramatsu
     

11 Jul, 2009

1 commit

  • Since commit 5fd29d6c ("printk: clean up handling of log-levels
    and newlines"), the kernel logs segfaults like:

    <6>gnome-power-man[24509]: segfault at 20 ip 00007f9d4950465a sp 00007fffbb50fc70 error 4 in libgobject-2.0.so.0.2103.0[7f9d494f7000+45000]

    with the extra "<6>" being KERN_INFO. This happens because the
    printk in show_signal_msg() started with KERN_CONT and then
    used "%s" to pass in the real level; KERN_CONT is no longer
    an empty string, and printk only pays attention to the level at
    the very beginning of the format string.

    Therefore, remove the KERN_CONT from this printk, since it is
    now actively causing problems (and never really made any
    sense).

    Signed-off-by: Roland Dreier
    Cc: Linus Torvalds
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Roland Dreier
     

09 Jul, 2009

1 commit

  • Commit 5fd29d6ccbc98884569d6f3105aeca70858b3e0f ("printk: clean up
    handling of log-levels and newlines") changed printk semantics.
    printk lines with multiple KERN_<level> prefixes are no longer
    emitted as before the patch; <level> is now included in the
    output on each additional use.

    Remove all uses of multiple KERN_<level>s in formats.

    Signed-off-by: Joe Perches
    Signed-off-by: Linus Torvalds

    Joe Perches
     

29 Jun, 2009

1 commit

  • Use pgtable access helpers for the 32-bit version of
    dump_pagetable() and get rid of the __typeof__() operators. This
    requires making pmd_pfn() available for the 2-level pgtable.

    Also, remove some casts from the 64-bit version of
    dump_pagetable().
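
    A pmd_pfn() helper of the kind the commit needs can be sketched
    like this: the PFN is the physical address in the entry, shifted
    down by PAGE_SHIFT, with the low flag bits masked off. Types and
    masks are simplified stand-ins for the kernel's:

```c
#include <assert.h>

/* Sketch of pmd_pfn() for a simple 2-level pgtable layout. */
#define PAGE_SHIFT_SK   12
#define PTE_PFN_MASK_SK (~0xfffUL)      /* drop the low flag bits */

typedef struct { unsigned long pmd; } pmd_sk_t;

static unsigned long pmd_pfn_sketch(pmd_sk_t pmd)
{
        return (pmd.pmd & PTE_PFN_MASK_SK) >> PAGE_SHIFT_SK;
}
```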

    Signed-off-by: Akinobu Mita
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Akinobu Mita
     

22 Jun, 2009

1 commit

  • This allows the callers to now pass down the full set of FAULT_FLAG_xyz
    flags to handle_mm_fault(). All callers have been (mechanically)
    converted to the new calling convention, there's almost certainly room
    for architectures to clean up their code and then add FAULT_FLAG_RETRY
    when that support is added.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

21 Jun, 2009

1 commit

  • * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (45 commits)
    x86, mce: fix error path in mce_create_device()
    x86: use zalloc_cpumask_var for mce_dev_initialized
    x86: fix duplicated sysfs attribute
    x86: de-assembler-ize asm/desc.h
    i386: fix/simplify espfix stack switching, move it into assembly
    i386: fix return to 16-bit stack from NMI handler
    x86, ioapic: Don't call disconnect_bsp_APIC if no APIC present
    x86: Remove duplicated #include's
    x86: msr.h linux/types.h is only required for __KERNEL__
    x86: nmi: Add Intel processor 0x6f4 to NMI perfctr1 workaround
    x86, mce: mce_intel.c needs <asm/apic.h>
    x86: apic/io_apic.c: dmar_msi_type should be static
    x86, io_apic.c: Work around compiler warning
    x86: mce: Don't touch THERMAL_APIC_VECTOR if no active APIC present
    x86: mce: Handle banks == 0 case in K7 quirk
    x86, boot: use .code16gcc instead of .code16
    x86: correct the conversion of EFI memory types
    x86: cap iomem_resource to addressable physical memory
    x86, mce: rename _64.c files which are no longer 64-bit-specific
    x86, mce: mce.h cleanup
    ...

    Manually fix up trivial conflict in arch/x86/mm/fault.c

    Linus Torvalds
     

16 Jun, 2009

1 commit

  • Prefetch instructions can generate spurious faults on certain
    models of older CPUs. The faults themselves cannot be stopped
    and they can occur pretty much anywhere - so the way we solve
    them is that we detect certain patterns and ignore the fault.

    There is one small stretch of code where we must not take faults,
    though: the #PF handler execution leading up to the reading of
    CR2 (the faulting address). If we take a fault there, we destroy
    the CR2 value (replacing it with that of the prefetching
    instruction) and possibly mishandle user-space or kernel-space
    pagefaults.

    It turns out that in current upstream we do exactly that:

    prefetchw(&mm->mmap_sem);

    /* Get the faulting address: */
    address = read_cr2();

    This is not good.

    So reverse the order: first read cr2, then prefetch the lock
    address. Reading cr2 is plenty fast (2 cycles), so delaying the
    prefetch by this amount shouldn't be a big issue
    performance-wise.
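
    The reordering can be modeled in plain C. Here read_cr2 and
    prefetchw are stand-ins (a variable and a function that clobbers
    it, modeling a spurious prefetch fault overwriting CR2); the fix
    is simply to capture the address before issuing the prefetch:

```c
#include <assert.h>

/* Sketch of the fixed ordering: read CR2 first, then prefetch. */
static unsigned long fake_cr2;

static unsigned long read_cr2_sketch(void)
{
        return fake_cr2;
}

static void prefetchw_sketch(void *addr)
{
        (void)addr;
        fake_cr2 = 0;   /* model a spurious fault clobbering CR2 */
}

static unsigned long fault_entry_sketch(void)
{
        unsigned long address;
        int lock;

        address = read_cr2_sketch();    /* get the faulting address first */
        prefetchw_sketch(&lock);        /* then prefetch the lock */
        return address;
}
```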

    [ And this might explain a mystery fault.c warning that sometimes
    occurs on one an old AMD/Semptron based test-system i have -
    which does have such prefetch problems. ]

    Cc: Mathieu Desnoyers
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: Nick Piggin
    Cc: Pekka Enberg
    Cc: Vegard Nossum
    Cc: Jeremy Fitzhardinge
    Cc: Hugh Dickins
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar