02 Oct, 2009

1 commit


27 Sep, 2009

1 commit

  • …git/tip/linux-2.6-tip

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86: Remove redundant non-NUMA topology functions
    x86: early_printk: Protect against using the same device twice
    x86: Reduce verbosity of "PAT enabled" kernel message
    x86: Reduce verbosity of "TSC is reliable" message
    x86: mce: Use safer ways to access MCE registers
    x86: mce, inject: Use real inject-msg in raise_local
    x86: mce: Fix thermal throttling message storm
    x86: mce: Clean up thermal throttling state tracking code
    x86: split NX setup into separate file to limit unstack-protected code
    xen: check EFER for NX before setting up GDT mapping
    x86: Cleanup linker script using new linker script macros.
    x86: Use section .data.page_aligned for the idt_table.
    x86: convert to use __HEAD and HEAD_TEXT macros.
    x86: convert compressed loader to use __HEAD and HEAD_TEXT macros.
    x86: fix fragile computation of vsyscall address

    Linus Torvalds
     

24 Sep, 2009

1 commit

  • Makes the code future-proof against the impending change to mm->cpu_vm_mask (to become a pointer).

    It's also a chance to use the new cpumask_ ops which take a pointer
    (the older ones are deprecated, but there's no hurry for arch code).

    Signed-off-by: Rusty Russell

    Rusty Russell
     

23 Sep, 2009

1 commit


22 Sep, 2009

1 commit

  • x86-64 assumes NX is available by default, so we need to
    explicitly check for it before using NX. Some first-generation
    Intel x86-64 processors didn't support NX, and even recent systems
    allow it to be disabled in BIOS.

    [ Impact: prevent Xen crash on NX-less 64-bit machines ]

    Signed-off-by: Jeremy Fitzhardinge
    Cc: Stable Kernel

    Jeremy Fitzhardinge
     

19 Sep, 2009

1 commit

  • …el/git/tip/linux-2.6-tip

    * 'x86-platform-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (38 commits)
    x86: Move get/set_wallclock to x86_platform_ops
    x86: platform: Fix section annotations
    x86: apic namespace cleanup
    x86: Distangle ioapic and i8259
    x86: Add Moorestown early detection
    x86: Add hardware_subarch ID for Moorestown
    x86: Add early platform detection
    x86: Move tsc_init to late_time_init
    x86: Move tsc_calibration to x86_init_ops
    x86: Replace the now identical time_32/64.c by time.c
    x86: time_32/64.c unify profile_pc
    x86: Move calibrate_cpu to tsc.c
    x86: Make timer setup and global variables the same in time_32/64.c
    x86: Remove mca bus ifdef from timer interrupt
    x86: Simplify timer_ack magic in time_32.c
    x86: Prepare unification of time_32/64.c
    x86: Remove do_timer hook
    x86: Add timer_init to x86_init_ops
    x86: Move percpu clockevents setup to x86_init_ops
    x86: Move xen_post_allocator_init into xen_pagetable_setup_done
    ...

    Fix up conflicts in arch/x86/include/asm/io_apic.h

    Linus Torvalds
     

16 Sep, 2009

1 commit

  • get/set_wallclock() already have a set of platform-dependent
    implementations (default, EFI, paravirt). MRST will add another
    variant.

    Moving them to platform ops simplifies the existing code and minimizes
    the effort to integrate new variants.

    Signed-off-by: Feng Tang
    LKML-Reference:
    Signed-off-by: Thomas Gleixner

    Feng Tang
     

15 Sep, 2009

1 commit


14 Sep, 2009

1 commit

  • * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (22 commits)
    x86: Fix code patching for paravirt-alternatives on 486
    x86, msr: change msr-reg.o to obj-y, and export its symbols
    x86: Use hard_smp_processor_id() to get apic id for AMD K8 cpus
    x86, sched: Workaround broken sched domain creation for AMD Magny-Cours
    x86, mcheck: Use correct cpumask for shared bank4
    x86, cacheinfo: Fixup L3 cache information for AMD multi-node processors
    x86: Fix CPU llc_shared_map information for AMD Magny-Cours
    x86, msr: Fix msr-reg.S compilation with gas 2.16.1, on 32-bit too
    x86: Move kernel_fpu_using to irq_fpu_usable in asm/i387.h
    x86, msr: fix msr-reg.S compilation with gas 2.16.1
    x86, msr: Export the register-setting MSR functions via /dev/*/msr
    x86, msr: Create _on_cpu helpers for {rw,wr}msr_safe_regs()
    x86, msr: Have the _safe MSR functions return -EIO, not -EFAULT
    x86, msr: CFI annotations, cleanups for msr-reg.S
    x86, asm: Make _ASM_EXTABLE() usable from assembly code
    x86, asm: Add 32-bit versions of the combined CFI macros
    x86, AMD: Disable wrongly set X86_FEATURE_LAHF_LM CPUID bit
    x86, msr: Rewrite AMD rd/wrmsr variants
    x86, msr: Add rd/wrmsr interfaces with preset registers
    x86: add specific support for Intel Atom architecture
    ...

    Linus Torvalds
     

10 Sep, 2009

3 commits

  • We need a stronger barrier between releasing the lock and checking
    for any waiting spinners. A compiler barrier is not sufficient
    because the CPU's ordering rules do not prevent the read of
    xl->spinners from happening before the unlock assignment, since
    they are different memory locations.

    We need to have an explicit barrier to enforce the write-read ordering
    to different memory locations.

    Because of this missing barrier, I couldn't bring up more than 4 HVM
    guests on one SMP machine.

    [ Code and commit comments expanded -J ]

    [ Impact: avoid deadlock when using Xen PV spinlocks ]

    Signed-off-by: Yang Xiaowei
    Signed-off-by: Jeremy Fitzhardinge

    Yang Xiaowei
     
  • Where possible we enable interrupts while waiting for a spinlock to
    become free, in order to reduce big latency spikes in interrupt handling.

    However, at present if we manage to pick up the spinlock just before
    blocking, we'll end up holding the lock with interrupts enabled for a
    while. This will cause a deadlock if we receive an interrupt in that
    window, and the interrupt handler tries to take the lock too.

    Solve this by shrinking the interrupt-enabled region to just around the
    blocking call.

    [ Impact: avoid race/deadlock when using Xen PV spinlocks ]

    Reported-by: "Yang, Xiaowei"
    Signed-off-by: Jeremy Fitzhardinge

    Jeremy Fitzhardinge
     
  • -fstack-protector uses a special per-cpu "stack canary" value.
    gcc generates special code in each function to test the canary to make
    sure that the function's stack hasn't been overrun.

    On x86-64, this is simply an offset of %gs, which is the usual per-cpu
    base segment register, so setting it up simply requires loading %gs's
    base as normal.

    On i386, the stack protector segment is %gs (rather than the usual kernel
    percpu %fs segment register). This requires setting up the full kernel
    GDT and then loading %gs accordingly. We also need to make sure %gs
    is initialized when bringing up secondary cpus.

    To keep things consistent, we do the full GDT/segment register setup on
    both architectures.

    Because we need to avoid stack-protected code before setting up the
    GDT, and because there's no way to disable the stack protector on a
    per-function basis, several files need to have it inhibited.

    [ Impact: allow Xen booting with stack-protector enabled ]

    Signed-off-by: Jeremy Fitzhardinge

    Jeremy Fitzhardinge
     

01 Sep, 2009

1 commit


31 Aug, 2009

8 commits


27 Aug, 2009

1 commit

  • memory_setup is overridden by x86_quirks and by paravirts with weak
    functions and quirks. Unify the whole mess and make it an
    unconditional x86_init_ops function which defaults to the standard
    function and can be overridden by the early platform code.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

26 Aug, 2009

2 commits

  • Initialize cx before calling xen_cpuid(), in order to suppress the
    "may be used uninitialized in this function" warning.

    Signed-off-by: H. Peter Anvin
    Cc: Jeremy Fitzhardinge

    H. Peter Anvin
     
  • Xen always runs on CPUs which properly support WP enforcement in
    privileged mode, so there's no need to test for it.

    This also works around a crash reported by Arnd Hannemann, though I
    think it's just a band-aid for that case.

    Reported-by: Arnd Hannemann
    Signed-off-by: Jeremy Fitzhardinge
    Acked-by: Pekka Enberg
    Signed-off-by: H. Peter Anvin

    Jeremy Fitzhardinge
     

20 Aug, 2009

2 commits


11 Jun, 2009

1 commit

  • * 'x86-xen-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (42 commits)
    xen: cache cr0 value to avoid trap'n'emulate for read_cr0
    xen/x86-64: clean up warnings about IST-using traps
    xen/x86-64: fix breakpoints and hardware watchpoints
    xen: reserve Xen start_info rather than e820 reserving
    xen: add FIX_TEXT_POKE to fixmap
    lguest: update lazy mmu changes to match lguest's use of kvm hypercalls
    xen: honour VCPU availability on boot
    xen: add "capabilities" file
    xen: drop kexec bits from /sys/hypervisor since kexec isn't implemented yet
    xen/sys/hypervisor: change writable_pt to features
    xen: add /sys/hypervisor support
    xen/xenbus: export xenbus_dev_changed
    xen: use device model for suspending xenbus devices
    xen: remove suspend_cancel hook
    xen/dev-evtchn: clean up locking in evtchn
    xen: export ioctl headers to userspace
    xen: add /dev/xen/evtchn driver
    xen: add irq_from_evtchn
    xen: clean up gate trap/interrupt constants
    xen: set _PAGE_NX in __supported_pte_mask before pagetable construction
    ...

    Linus Torvalds
     

16 May, 2009

1 commit

  • Xiaohui Xin and some other folks at Intel have been looking into what's
    behind the performance hit of paravirt_ops when running native.

    It appears that the hit is entirely due to the paravirtualized
    spinlocks introduced by:

    | commit 8efcbab674de2bee45a2e4cdf97de16b8e609ac8
    | Date: Mon Jul 7 12:07:51 2008 -0700
    |
    | paravirt: introduce a "lock-byte" spinlock implementation

    The extra call/return in the spinlock path is somehow
    causing an increase in the cycles/instruction of somewhere around 2-7%
    (seems to vary quite a lot from test to test). The working theory is
    that the CPU's pipeline is getting upset about the
    call->call->locked-op->return->return, and seems to be failing to
    speculate (though I haven't seen anything definitive about the precise
    reasons). This doesn't entirely make sense, because the performance
    hit is also visible on unlock and other operations which don't involve
    locked instructions. But spinlock operations clearly swamp all the
    other pvops operations, even though I can't imagine that they're
    nearly as common (there's only a .05% increase in instructions
    executed).

    If I disable just the pv-spinlock calls, my tests show that pvops is
    identical to non-pvops performance on native (my measurements show that
    it is actually about .1% faster, but Xiaohui shows a .05% slowdown).

    Summary of results, averaging 10 runs of the "mmperf" test, using a
    no-pvops build as baseline:

                     nopv     Pv-nospin   Pv-spin
    CPU cycles       100.00%    99.89%    102.18%
    instructions     100.00%   100.10%    100.15%
    CPI              100.00%    99.79%    102.03%
    cache ref        100.00%   100.84%    100.28%
    cache miss       100.00%    90.47%     88.56%
    cache miss rate  100.00%    89.72%     88.31%
    branches         100.00%    99.93%    100.04%
    branch miss      100.00%   103.66%    107.72%
    branch miss rt   100.00%   103.73%    107.67%
    wallclock        100.00%    99.90%    102.20%

    The clear effect here is that the 2% increase in CPI is
    directly reflected in the final wallclock time.

    (The other interesting effect is that the more ops are
    out of line calls via pvops, the lower the cache access
    and miss rates. Not too surprising, but it suggests that
    the non-pvops kernel is over-inlined. On the flipside,
    the branch misses go up correspondingly...)

    So, what's the fix?

    Paravirt patching turns all the pvops calls into direct calls, so
    _spin_lock etc do end up having direct calls. For example, the
    compiler-generated code for paravirtualized _spin_lock is:

        mov    %gs:0xb4c8,%rax
        incl   0xffffffffffffe044(%rax)
        callq  *0xffffffff805a5b30
        retq

    The indirect call will get patched to:

        mov    %gs:0xb4c8,%rax
        incl   0xffffffffffffe044(%rax)
        callq
        nop; nop    /* or whatever 2-byte nop */
        retq

    One possibility is to inline _spin_lock, etc, when building an
    optimised kernel (ie, when there's no spinlock/preempt
    instrumentation/debugging enabled). That will remove the outer
    call/return pair, returning the instruction stream to a single
    call/return, which will presumably execute the same as the non-pvops
    case. The downsides are: 1) it will replicate the
    preempt_disable/enable code at each lock/unlock callsite; this code
    is fairly small, but not nothing; and 2) the spinlock definitions
    are already a very heavily tangled mass of #ifdefs and other
    preprocessor magic, and making any changes will be non-trivial.

    The other obvious answer is to disable pv-spinlocks. Making them a
    separate config option is fairly easy, and it would be trivial to
    enable them only when Xen is enabled (as the only non-default user).
    But it doesn't really address the common case of a distro build which
    is going to have Xen support enabled, and leaves the open question of
    whether the native performance cost of pv-spinlocks is worth the
    performance improvement on a loaded Xen system (10% saving of overall
    system CPU when guests block rather than spin). Still, it is a
    reasonable short-term workaround.

    [ Impact: fix pvops performance regression when running native ]

    Analysed-by: "Xin Xiaohui"
    Analysed-by: "Li Xin"
    Analysed-by: "Nakajima Jun"
    Signed-off-by: Jeremy Fitzhardinge
    Acked-by: H. Peter Anvin
    Cc: Nick Piggin
    Cc: Xen-devel
    LKML-Reference:
    [ fixed the help text ]
    Signed-off-by: Ingo Molnar

    Jeremy Fitzhardinge
     

13 May, 2009

1 commit

  • mmu.c needs to #include module.h to prevent these warnings:

    arch/x86/xen/mmu.c:239: warning: data definition has no type or storage class
    arch/x86/xen/mmu.c:239: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL_GPL'
    arch/x86/xen/mmu.c:239: warning: parameter names (without types) in function declaration

    [ Impact: cleanup ]

    Signed-off-by: Randy Dunlap
    Acked-by: Jeremy Fitzhardinge
    Signed-off-by: Andrew Morton
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Randy Dunlap
     

09 May, 2009

3 commits

  • stts() is implemented in terms of read_cr0/write_cr0 to update the
    state of the TS bit. This happens during context switch, and so
    is fairly performance critical. Rather than falling back to
    a trap-and-emulate native read_cr0, implement our own by caching
    the last-written value from write_cr0 (the TS bit is the only one
    we really care about).

    Impact: optimise Xen context switches
    Signed-off-by: Jeremy Fitzhardinge

    Jeremy Fitzhardinge
     
  • Ignore known IST-using traps. Aside from the debugger traps, they're
    low-level faults which Xen will handle for us, so the kernel needn't
    worry about them. Keep the warning in case an unknown trap starts
    using IST.

    Impact: suppress spurious warnings
    Signed-off-by: Jeremy Fitzhardinge

    Jeremy Fitzhardinge
     
  • Native x86-64 uses the IST mechanism to run int3 and debug traps on
    an alternative stack. Xen does not do this, and so the frames were
    being misinterpreted by the ptrace code. This change special-cases
    these two exceptions by using Xen variants which run on the normal
    kernel stack properly.

    Impact: avoid crash or bad data when IST trap is invoked under Xen
    Signed-off-by: Jeremy Fitzhardinge

    Jeremy Fitzhardinge
     

08 May, 2009

3 commits

  • Use reserve_early rather than e820 reservations for Xen start info and mfn->pfn
    table, so that the memory use is a bit more self-documenting.

    [ Impact: cleanup ]

    Signed-off-by: Jeremy Fitzhardinge
    Cc: Xen-devel
    Cc: Linus Torvalds
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Jeremy Fitzhardinge
     
  • Conflicts:
    arch/frv/include/asm/pgtable.h
    arch/x86/include/asm/required-features.h
    arch/x86/xen/mmu.c

    Merge reason: x86/xen was on a .29 base still, move it to a fresher
    branch and pick up Xen fixes as well, plus resolve
    conflicts

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • The Xen pagetables are no longer implicitly reserved as part of the other
    i386_start_kernel reservations, so make sure we explicitly reserve them.
    This prevents them from being released into the general kernel free page
    pool and reused.

    [ Impact: fix Xen guest crash ]

    Also-Bisected-by: Bryan Donlan
    Signed-off-by: Jeremy Fitzhardinge
    Cc: Xen-devel
    Cc: Linus Torvalds
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Jeremy Fitzhardinge
     

22 Apr, 2009

1 commit

  • Pass clocksource pointer to the read() callback for clocksources. This
    allows us to share the callback between multiple instances.

    [hugh@veritas.com: fix powerpc build of clocksource pass clocksource mods]
    [akpm@linux-foundation.org: cleanup]
    Signed-off-by: Magnus Damm
    Acked-by: John Stultz
    Cc: Thomas Gleixner
    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Magnus Damm
     

14 Apr, 2009

1 commit

  • * 'for-rc1/xen/core' of git://git.kernel.org/pub/scm/linux/kernel/git/jeremy/xen:
    xen: add FIX_TEXT_POKE to fixmap
    xen: honour VCPU availability on boot
    xen: clean up gate trap/interrupt constants
    xen: set _PAGE_NX in __supported_pte_mask before pagetable construction
    xen: resume interrupts before system devices.
    xen/mmu: weaken flush_tlb_other test
    xen/mmu: some early pagetable cleanups
    Xen: Add virt_to_pfn helper function
    x86-64: remove PGE from must-have feature list
    xen: mask XSAVE from cpuid
    NULL noise: arch/x86/xen/smp.c
    xen: remove xen_load_gdt debug
    xen: make xen_load_gdt simpler
    xen: clean up xen_load_gdt
    xen: split construction of p2m mfn tables from registration
    xen: separate p2m allocation from setting
    xen: disable preempt for leave_lazy_mmu

    Linus Torvalds
     

10 Apr, 2009

2 commits


09 Apr, 2009

1 commit