03 May, 2007

40 commits

  • Add hooks to allow a paravirt implementation to track the lifetime of
    an mm. Paravirtualization requires three hooks, but only two are
    needed in common code. They are:

    arch_dup_mmap, which is called when a new mmap is created at fork

    arch_exit_mmap, which is called when the last process reference to an
    mm is dropped, which typically happens on exit and exec.

    The third hook is activate_mm, which is called from the arch-specific
    activate_mm() macro/function, and so doesn't need stub versions for
    other architectures. It's called when an mm is first used.
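    The hook pair can be pictured with a toy "paravirt implementation" that just counts live mms. Everything below (the stub struct, the counter) is an illustrative stand-in, not the kernel's actual code:

```c
#include <assert.h>

struct mm_struct { int dummy; };

/* toy paravirt implementation: track how many mms are alive */
static int live_mms;

/* called when a new mmap is created at fork */
static void arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm)
{
    (void)oldmm; (void)mm;
    live_mms++;
}

/* called when the last process reference to an mm is dropped
 * (typically on exit and exec) */
static void arch_exit_mmap(struct mm_struct *mm)
{
    (void)mm;
    live_mms--;
}
```

    A real implementation would register pagetables with the hypervisor here instead of bumping a counter.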

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Cc: linux-arch@vger.kernel.org
    Cc: James Bottomley
    Acked-by: Ingo Molnar

    Jeremy Fitzhardinge
     
  • Normally when running in PAE mode, the 4th PMD maps the kernel address space,
    which can be shared among all processes (since they all need the same kernel
    mappings).

    Xen, however, does not allow guests to have the kernel pmd shared between page
    tables, so parameterize pgtable.c to allow both modes of operation.

    There are several side-effects of this. One is that vmalloc will update the
    kernel address space mappings, and those updates need to be propagated into
    all processes if the kernel mappings are not intrinsically shared. In the
    non-PAE case, this is done by maintaining a pgd_list of all processes; this
    list is used when all process pagetables must be updated. pgd_list is
    threaded via otherwise unused entries in the page structure for the pgd, which
    means that the pgd must be page-sized for this to work.

    Normally the PAE pgd is only four 64-bit entries (32 bytes), but Xen requires
    the PAE pgd to be page-aligned anyway, so this patch forces the pgd to be
    page-aligned and page-sized when the kernel pmd is unshared, to accommodate
    both these requirements.

    Also, since there may be several distinct kernel pmds (if the user/kernel
    split is below 3G), there's no point in allocating them from a slab cache;
    they're just allocated with get_free_page and initialized appropriately. (Of
    course they could be cached if there is just a single kernel pmd - which is
    the default with a 3G user/kernel split - but it doesn't seem worthwhile to
    add yet another case into this code.)

    [ Many thanks to wli for review comments. ]
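    The pgd_list mechanism described above can be sketched as a toy model: every process pagetable goes on a list, and a kernel-space update is copied into each one. All names and sizes here are illustrative, not the kernel's code:

```c
#include <assert.h>
#include <stddef.h>

#define PTRS_PER_PGD   4
#define KERNEL_PGD_IDX 3            /* 4th entry maps the kernel */

struct toy_pgd {
    unsigned long entry[PTRS_PER_PGD];
    struct toy_pgd *next;           /* threaded list of all pgds */
};

static struct toy_pgd *pgd_list;

/* every new process pagetable is put on pgd_list */
static void pgd_ctor(struct toy_pgd *pgd)
{
    pgd->next = pgd_list;
    pgd_list  = pgd;
}

/* a kernel-space change (e.g. from vmalloc) is copied into every
 * process pagetable when the kernel pmd is not shared */
static void sync_kernel_mapping(unsigned long new_entry)
{
    struct toy_pgd *p;
    for (p = pgd_list; p; p = p->next)
        p->entry[KERNEL_PGD_IDX] = new_entry;
}
```

    In the kernel the list is threaded through otherwise-unused fields of the pgd's struct page, which is why the pgd must be page-sized.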

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: William Lee Irwin III
    Signed-off-by: Andi Kleen
    Cc: Zachary Amsden
    Cc: Christoph Lameter
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton

    Jeremy Fitzhardinge
     
  • Allocate a fixmap slot for use by a paravirt_ops implementation. This
    is intended for early-boot bootstrap mappings. Once the zones and
    allocator have been set up, it would be better to use get_vm_area() to
    allocate some virtual space.

    Xen uses this to map the hypervisor's shared info page, which doesn't
    have a pseudo-physical page number, and therefore can't be mapped
    ordinarily. It is needed early because it contains the vcpu state,
    including the interrupt mask.

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Acked-by: Ingo Molnar

    Jeremy Fitzhardinge
     
  • This patch introduces paravirt_ops hooks to control how the kernel's
    initial pagetable is set up.

    In the case of a native boot, the very early bootstrap code creates a
    simple non-PAE pagetable to map the kernel and physical memory. When
    the VM subsystem is initialized, it creates a proper pagetable which
    respects the PAE mode, large pages, etc.

    When booting under a hypervisor, there are many possibilities for what
    paging environment the hypervisor establishes for the guest kernel, so
    the construction of the kernel's pagetable depends on the hypervisor.

    In the case of Xen, the hypervisor boots the kernel with a fully
    constructed pagetable, which is already using PAE if necessary. Also,
    Xen requires particular care when constructing pagetables to make sure
    all pagetables are always mapped read-only.

    In order to make this easier, the kernel's initial pagetable construction
    has been changed to only allocate and initialize a pagetable page if
    there's no page already present in the pagetable. This allows the Xen
    paravirt backend to make a copy of the hypervisor-provided pagetable,
    allowing the kernel to establish any more mappings it needs while
    keeping the existing ones.

    A slightly subtle point which is worth highlighting here is that Xen
    requires all kernel mappings to share the same pte_t pages between all
    pagetables, so that updating a kernel page's mapping in one pagetable
    is reflected in all other pagetables. This makes it possible to
    allocate a page and attach it to a pagetable without having to
    explicitly enumerate that page's mapping in all pagetables.
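    The "only allocate if nothing is present" rule can be sketched with a toy two-level table; the structures and the function name are illustrative, and calloc stands in for the boot-time page allocator:

```c
#include <assert.h>
#include <stdlib.h>

#define ENTRIES 4

typedef struct { unsigned long pte[ENTRIES]; } pte_page;
typedef struct { pte_page *pmd[ENTRIES]; } toy_pgd;

/* allocate a pte page for slot i only if nothing is there yet,
 * so a pre-populated (e.g. hypervisor-provided) page survives */
static pte_page *one_page_table_init(toy_pgd *pgd, int i)
{
    if (!pgd->pmd[i])
        pgd->pmd[i] = calloc(1, sizeof(pte_page));
    return pgd->pmd[i];
}
```

    With this shape, the Xen backend can seed the table from the hypervisor-provided pagetable and the generic code will only fill the gaps.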

    And:

    From: "Eric W. Biederman"

    If we don't set the leaf page table entries it is quite possible that
    we will inherit an incorrect page table entry from the initial boot
    page table setup in head.S. So we need to redo the effort here so
    we pick up PSE, PGE and the like.

    Hypervisors like Xen require that their page tables be read-only,
    which is slightly incompatible with our low identity mappings; however,
    I discussed this with Jeremy and he has modified the Xen early set_pte
    function to avoid problems in this area.

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Acked-by: William Irwin
    Cc: Ingo Molnar

    Jeremy Fitzhardinge
     
    Add a set of accessors to pack, unpack and modify page table entries
    (at all levels). This allows a paravirt implementation to control the
    contents of pgd/pmd/pte entries. For example, Xen uses this to
    convert the (pseudo-)physical address into a machine address when
    populating a pagetable entry, and to convert back to a pseudo-physical
    address when an entry is read.
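    A minimal sketch of such an interposed pack/unpack pair, with a made-up four-entry pfn-to-mfn translation table standing in for Xen's real p2m machinery:

```c
#include <assert.h>

#define PAGE_SHIFT 12
#define FLAGS_MASK 0xfffUL

/* made-up pseudo-physical -> machine frame translation */
static const unsigned long p2m[4] = { 7, 3, 9, 1 };

/* pack: translate pfn to mfn while building the entry */
static unsigned long make_pte(unsigned long pfn, unsigned long flags)
{
    return (p2m[pfn] << PAGE_SHIFT) | (flags & FLAGS_MASK);
}

/* unpack: translate the mfn back to a pseudo-physical frame */
static unsigned long pte_pfn(unsigned long pte)
{
    unsigned long mfn = pte >> PAGE_SHIFT;
    unsigned long pfn;
    for (pfn = 0; pfn < 4; pfn++)
        if (p2m[pfn] == mfn)
            return pfn;
    return (unsigned long)-1;
}
```

    Native hardware would use the identity translation; only the paravirt backend needs the indirection.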

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Acked-by: Ingo Molnar

    Jeremy Fitzhardinge
     
    Add a _paravirt_nop function for use as a stub for no-op operations,
    and a paravirt_nop #define for a void * version to make using it
    easier (since all its uses are as a void *).

    This is useful to allow the patcher to automatically identify noop
    operations so it can simply nop out the callsite.
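    The shape of the stub and its alias, sketched in plain C (the is_noop helper is an illustration of how a patcher recognises no-op sites, not kernel code):

```c
#include <assert.h>
#include <stddef.h>

static void _paravirt_nop(void) { }

/* void * alias, since all uses are as a void * (and, as Ingo notes
 * below, this loses type checking) */
#define paravirt_nop ((void *)_paravirt_nop)

/* the patcher can recognise no-op operations by pointer identity
 * and simply nop out the callsite */
static int is_noop(void *op)
{
    return op == paravirt_nop;
}
```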

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Acked-by: Ingo Molnar
    [mingo] but only as a cleanup of the current open-coded (void *) casts.
    My problem with this is that it loses the types. Not that there is much
    to check for, but still, this adds some assumptions about what function
    calls look like.

    Jeremy Fitzhardinge
     
  • Remove CONFIG_DEBUG_PARAVIRT. When inlining code, this option
    attempts to trash registers in the patch-site's "clobber" field, on
    the grounds that this should find bugs with incorrect clobbers.
    Unfortunately, the clobber field really means "registers modified by
    this patch site", which includes return values.

    Because of this, this option has outlived its usefulness, so remove
    it.

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Cc: Rusty Russell

    Jeremy Fitzhardinge
     
  • Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Cc: Chris Wright
    Cc: Zachary Amsden
    Cc: Rusty Russell

    Jeremy Fitzhardinge
     
  • On Thu, 2007-03-29 at 13:16 +0200, Andi Kleen wrote:
    > Please clean it up properly with two structs.

    Not sure about this, now I've done it. Running it here.

    If you like it, I can do x86-64 as well.

    ==
    lguest defines its own TSS struct because the "struct tss_struct"
    contains linux-specific additions. Andi asked me to split the struct
    in processor.h.

    Unfortunately it makes usage a little awkward.
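    The split can be pictured like this; the field names are abbreviated placeholders, not the real layout:

```c
#include <assert.h>
#include <stddef.h>

/* hardware-defined part, usable by lguest as well
 * (architectural fields abbreviated) */
struct i386_hw_tss {
    unsigned long  sp0;
    unsigned short ss0, __ss0h;
    /* ... remaining architectural fields ... */
};

/* kernel's full TSS: the hardware part plus Linux-specific additions */
struct tss_struct {
    struct i386_hw_tss x86_tss;
    unsigned long io_bitmap[4];     /* linux-specific */
};
```

    The awkwardness is that every access gains a level of naming: tss->x86_tss.sp0 instead of tss->sp0.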

    Signed-off-by: Rusty Russell
    Signed-off-by: Andi Kleen

    Rusty Russell
     
    I have a 4-socket AMD Opteron system. The 2.6.18 kernel I have crashes
    when there is no memory in node0.

    AK: changed call to _nopanic

    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Andi Kleen

    James Puthukattukaran
     
    Setting the DEBUG_SIG flag breaks compilation due to a wrong
    struct access. Additionally, it raises two warnings. This is one
    patch to fix them all.

    Signed-off-by: Glauber de Oliveira Costa
    Signed-off-by: Andi Kleen

    Glauber de Oliveira Costa
     
  • Add "noreplace-smp" to disable SMP instruction replacement.

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen

    Jeremy Fitzhardinge
     
  • The .smp_altinstructions section and its corresponding symbols are
    completely unused, so remove them.

    Also, remove a stray #ifdef __KERNEL__ in asm-i386/alternative.h.

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Cc: Andi Kleen

    Jeremy Fitzhardinge
     
  • This patch is based on Rusty's recent cleanup of the EFLAGS-related
    macros; it extends the same kind of cleanup to control registers and
    MSRs.

    It also unifies these between i386 and x86-64; at least with regards
    to MSRs, the two had definitely gotten out of sync.

    Signed-off-by: H. Peter Anvin
    Signed-off-by: Andi Kleen

    H. Peter Anvin
     
  • Let's allow page-alignment in general for per-cpu data (wanted by Xen, and
    Ingo suggested KVM as well).

    Because larger alignments can use more room, we increase the max per-cpu
    memory to 64k rather than 32k: it's getting a little tight.

    Signed-off-by: Rusty Russell
    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Acked-by: Ingo Molnar
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton

    Jeremy Fitzhardinge
     
    As a bug workaround, bank 0 on K7s is normally disabled, but there's
    no need to do that on other AMD CPUs.

    Cc: davej@redhat.com

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • Update documentation for i386 smp_call_function* functions.

    As reported by Randy Dunlap

    [ I've posted this before but it seems to have been lost along the way. ]

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Cc: Randy Dunlap

    Jeremy Fitzhardinge
     
  • (I hope Andi is the right one to Cc, otherwise please add, thanks!)

    Use menuconfigs instead of menus, so the whole menu can be disabled at
    once instead of going through all options.

    Signed-off-by: Jan Engelhardt
    Signed-off-by: Andi Kleen

    Jan Engelhardt
     
    It doesn't put the CPU into deeper sleep states, so it's better to use the
    standard idle loop to save power. But allow re-enabling it anyway for
    benchmarking.

    I also removed the obsolete idle=halt on i386.

    Cc: andreas.herrmann@amd.com

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • Most of asm-x86_64/bugs.h is code which should be in a C file, so put it there.

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Cc: Andi Kleen
    Cc: Linus Torvalds

    Jeremy Fitzhardinge
     
  • Now that relocation of the VDSO for COMPAT_VDSO users is done at
    runtime rather than compile time, it is possible to enable/disable
    compat mode at runtime.

    This patch allows you to enable COMPAT_VDSO mode with "vdso=2" on the
    kernel command line, or via sysctl. (Switching on a running system
    shouldn't be done lightly; any process which was relying on the compat
    VDSO will be upset if it goes away.)

    The COMPAT_VDSO config option still exists, but if enabled it just
    makes vdso_enabled default to VDSO_COMPAT.

    From: Hugh Dickins

    Fix oops from i386-make-compat_vdso-runtime-selectable.patch.

    Even mingetty at system startup finds it easy to trigger an oops
    while reading /proc/PID/maps: though it has a good hold on the mm
    itself, that cannot stop exit_mm() from resetting tsk->mm to NULL.

    (It is usually show_map()'s call to get_gate_vma() which oopses,
    and I expect we could change that to check priv->tail_vma instead;
    but no matter, even m_start()'s call just after get_task_mm() is racy.)

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Cc: Zachary Amsden
    Cc: "Jan Beulich"
    Cc: Eric W. Biederman
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: Roland McGrath

    Jeremy Fitzhardinge
     
  • Some versions of libc can't deal with a VDSO which doesn't have its
    ELF headers matching its mapped address. COMPAT_VDSO maps the VDSO at
    a specific system-wide fixed address. Previously this was all done at
    build time, on the grounds that the fixed VDSO address is always at
    the top of the address space. However, a hypervisor may reserve some
    of that address space, pushing the fixmap address down.

    This patch does the adjustment dynamically at runtime, depending on
    the runtime location of the VDSO fixmap.

    [ Patch has been through several hands: Jan Beulich wrote the original
    version; Zach reworked it, and Jeremy converted it to relocate phdrs
    as well as sections. ]
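    The core of the runtime adjustment is just shifting addresses by the delta between the link-time location and the fixmap address chosen at boot. A minimal sketch with a simplified program-header struct (the real code walks actual ELF phdrs and section headers):

```c
#include <assert.h>

struct toy_phdr { unsigned long p_vaddr; };

/* shift every program header by the delta between the link-time
 * address and the fixmap address actually chosen at boot */
static void relocate_vdso(struct toy_phdr *phdr, int n, long delta)
{
    int i;
    for (i = 0; i < n; i++)
        phdr[i].p_vaddr += delta;
}
```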

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Cc: Zachary Amsden
    Cc: "Jan Beulich"
    Cc: Eric W. Biederman
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: Roland McGrath

    Jeremy Fitzhardinge
     
  • identify_cpu() is used to identify both the boot CPU and secondary
    CPUs, but it performs some actions which only apply to the boot CPU.
    Those functions are therefore really __init functions, but because
    they're called by identify_cpu(), they must be marked __cpuinit.

    This patch splits identify_cpu() into identify_boot_cpu() and
    identify_secondary_cpu(), and calls the appropriate init functions
    from each. Also, identify_boot_cpu() and all the functions it
    dominates are marked __init.
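    The structure of the split, sketched with placeholder bodies (a counter stands in for the boot-only __init work):

```c
#include <assert.h>

static int boot_only_work;   /* stands in for the __init-only actions */

/* common identification work shared by all CPUs */
static void identify_cpu(int cpu) { (void)cpu; }

static void identify_boot_cpu(void)
{
    identify_cpu(0);
    boot_only_work++;        /* done exactly once, at boot */
}

static void identify_secondary_cpu(int cpu)
{
    identify_cpu(cpu);       /* boot-only actions are not repeated */
}
```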

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen

    Jeremy Fitzhardinge
     
  • Most of asm-i386/bugs.h is code which should be in a C file, so put it there.

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Cc: Linus Torvalds

    Jeremy Fitzhardinge
     
    Ugly ifdef, but it should handle all 64-bit platforms that have suitable
    zones. On some, like Altix, it's probably impossible to get such memory
    without using an IOMMU.

    Andi Kleen
     
  • The xmm space on x86_64 is 256 bytes.

    Signed-off-by: Avi Kivity
    Signed-off-by: Andi Kleen

    Avi Kivity
     
  • As per i386 patch: move X86_EFLAGS_IF et al out to a new header:
    processor-flags.h, so we can include it from irqflags.h and use it in
    raw_irqs_disabled_flags().

    As a side-effect, we can now use these flags in .S files.

    Signed-off-by: Rusty Russell
    Signed-off-by: Andi Kleen

    Andi Kleen
     
    Under CONFIG_DISCONTIGMEM, assuming that !pfn_valid() implies that all
    subsequent pfns are also invalid is wrong. Thus replace this by
    explicitly checking against the E820 map.

    AK: make e820 on x86-64 not initdata
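    The idea of consulting the map rather than assuming contiguity can be sketched with an illustrative two-range "E820" table; the entries and helper name are made up:

```c
#include <assert.h>

struct e820_range { unsigned long start_pfn, end_pfn; };

/* illustrative map: low memory, a hole, then RAM again */
static const struct e820_range e820[] = {
    { 0x000, 0x09f },
    { 0x100, 0x800 },
};

/* check the map explicitly instead of assuming that one invalid
 * pfn means everything above it is invalid too */
static int e820_pfn_is_ram(unsigned long pfn)
{
    unsigned i;
    for (i = 0; i < sizeof(e820) / sizeof(e820[0]); i++)
        if (pfn >= e820[i].start_pfn && pfn < e820[i].end_pfn)
            return 1;
    return 0;
}
```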

    Signed-off-by: Jan Beulich
    Signed-off-by: Andi Kleen
    Acked-by: Mark Langsdorf

    Jan Beulich
     
  • Rather than using a single constant PERCPU_ENOUGH_ROOM, compute it as
    the sum of kernel_percpu + PERCPU_MODULE_RESERVE. This is now common
    to all architectures; if an architecture wants to set
    PERCPU_ENOUGH_ROOM to something special, then it may do so (ia64 is
    the only one which does).
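    The computation itself is a one-liner; the reserve value below is illustrative, not the kernel's constant:

```c
#include <assert.h>

#define PERCPU_MODULE_RESERVE 8192   /* illustrative value */

/* reserved space = kernel's own per-cpu data size + module reserve,
 * replacing a single hard-coded PERCPU_ENOUGH_ROOM constant */
static unsigned long percpu_enough_room(unsigned long kernel_percpu)
{
    return kernel_percpu + PERCPU_MODULE_RESERVE;
}
```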

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Cc: Rusty Russell
    Cc: Eric W. Biederman
    Cc: Andi Kleen

    Jeremy Fitzhardinge
     
  • All were already in some header
    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • Signed-off-by: Andi Kleen

    Andi Kleen
     
    machine_ops is an interface for the machine_* functions. This is
    intended to allow hypervisors to intercept the reboot process, but it
    could be used to implement other x86 subarchitecture reboots.

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen

    Jeremy Fitzhardinge
     
    Add a smp_ops interface. This abstracts the SMP API for use within
    arch/i386. The primary intent is that it be used by a
    paravirtualizing hypervisor to implement SMP, but it could also be
    used by non-APIC-using sub-architectures.

    This is related to CONFIG_PARAVIRT, but is implemented unconditionally
    since it is simpler that way and not a highly performance-sensitive
    interface.
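    Such an ops table is just a struct of function pointers the platform fills in. The member names below echo typical SMP entry points but are a sketch, with a toy backend wired in:

```c
#include <assert.h>
#include <stddef.h>

/* the table a platform (APIC code, a hypervisor backend, ...) fills in */
struct smp_ops {
    void (*smp_prepare_cpus)(unsigned int max_cpus);
    int  (*cpu_up)(unsigned int cpu);
    void (*smp_send_reschedule)(int cpu);
};

static int booted[4];

/* toy backend: record which CPUs were brought up */
static int toy_cpu_up(unsigned int cpu) { booted[cpu] = 1; return 0; }

static struct smp_ops smp_ops = { .cpu_up = toy_cpu_up };
```

    Since the interface is not performance-sensitive, plain indirect calls (rather than patched callsites) are enough.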

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Cc: James Bottomley

    Jeremy Fitzhardinge
     
  • Now we have an explicit per-cpu GDT variable, we don't need to keep the
    descriptors around to use them to find the GDT: expose cpu_gdt directly.

    We could go further and make load_gdt() pack the descriptor for us, or even
    assume it means "load the current cpu's GDT" which is what it always does.

    Signed-off-by: Rusty Russell
    Signed-off-by: Andi Kleen
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton

    Rusty Russell
     
  • Fix "Section mismatch" warnings in arch/x86_64/kernel/time.c

    Signed-off-by: Bernhard Walle
    Signed-off-by: Andi Kleen

    Bernhard Walle
     
  • commit 5e518d7672dea4cd7c60871e40d0490c52f01d13 did the same change to
    i386's variant.

    With this change, i386's and x86-64's versions are identical, raising
    the question whether the x86-64 one should go (just like there's only
    one instance of edd.S).

    Signed-off-by: Jan Beulich
    Signed-off-by: Andi Kleen

    Jan Beulich
     
    This patch touches the NMI watchdog every MAX_ORDER_NR_PAGES pages
    to inhibit the machine from triggering an NMI while the CPUs
    are locked. This situation happens on boxes with more
    than 64 CPUs and 128GB of RAM when Alt-SysRq-m is performed.

    It has been successfully tested for regression on uni, 2, 4, 8,
    32, and 64 CPU boxes with various memory configurations.
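    The shape of the fix, sketched in plain C: inside the long pfn walk done with the lock held, poke the watchdog once per MAX_ORDER block. touch_nmi_watchdog() is modeled as a counter here and the block size is illustrative:

```c
#include <assert.h>

#define MAX_ORDER_NR_PAGES 1024      /* illustrative block size */

static int watchdog_touches;
static void touch_nmi_watchdog(void) { watchdog_touches++; }

/* long pfn walk: poke the watchdog once per MAX_ORDER block so the
 * NMI timeout never fires, however slow the box is */
static void show_mem_walk(unsigned long nr_pages)
{
    unsigned long pfn;
    for (pfn = 0; pfn < nr_pages; pfn++) {
        if (pfn % MAX_ORDER_NR_PAGES == 0)
            touch_nmi_watchdog();
        /* ... examine the page ... */
    }
}
```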

    Signed-off-by: Andi Kleen

    Konrad Rzeszutek
     
    The current vsyscall_gtod_data is large (3 or 4 cache lines dirtied at
    timer interrupt). We can shrink it to exactly 64 bytes (one cache line
    on AMD64).

    Instead of copying a whole struct clocksource, we copy only the needed
    fields.

    I deleted an unused field: offset_base.

    This patch also fixes one oddity in vgettimeofday(): it can return a
    timeval with tv_usec = 1000000. Maybe not a bug, but why not do the
    right thing?
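    The oddity being fixed is a missing carry. A minimal normalization sketch (plain C, no vsyscall specifics):

```c
#include <assert.h>

struct toy_timeval { long tv_sec; long tv_usec; };

/* carry overflow microseconds into the seconds field so callers
 * never see tv_usec == 1000000 */
static void normalize(struct toy_timeval *tv)
{
    while (tv->tv_usec >= 1000000) {
        tv->tv_usec -= 1000000;
        tv->tv_sec++;
    }
}
```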

    Signed-off-by: Eric Dumazet
    Signed-off-by: Andi Kleen

    Eric Dumazet
     
    There is a tiny probability that the return value from vtime(time_t *t)
    is different from the value stored in *t.

    Using a temporary variable solves the problem and gives faster code.

        17: 48 85 ff              test   %rdi,%rdi
        1a: 48 8b 05 00 00 00 00  mov    0(%rip),%rax  # __vsyscall_gtod_data.wall_time_tv.tv_sec
        21: 74 03                 je     26
        23: 48 89 07              mov    %rax,(%rdi)
        26: c9                    leaveq
        27: c3                    retq

    Signed-off-by: Eric Dumazet
    Signed-off-by: Andi Kleen

    Eric Dumazet
     
    Many years ago, UNEXPECTED_IO_APIC() contained printk()s (but nothing
    more).

    Now that it has been completely empty for years, we can remove it.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andi Kleen

    Adrian Bunk