24 Jul, 2008

1 commit

  • * 'cpus4096-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip: (31 commits)
    NR_CPUS: Replace NR_CPUS in speedstep-centrino.c
    cpumask: Provide a generic set of CPUMASK_ALLOC macros, FIXUP
    NR_CPUS: Replace NR_CPUS in cpufreq userspace routines
    NR_CPUS: Replace per_cpu(..., smp_processor_id()) with __get_cpu_var
    NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genapic_flat_64.c
    NR_CPUS: Replace NR_CPUS in arch/x86/kernel/genx2apic_uv_x.c
    NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/proc.c
    NR_CPUS: Replace NR_CPUS in arch/x86/kernel/cpu/mcheck/mce_64.c
    cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c, fix
    cpumask: Use optimized CPUMASK_ALLOC macros in the centrino_target
    cpumask: Provide a generic set of CPUMASK_ALLOC macros
    cpumask: Optimize cpumask_of_cpu in lib/smp_processor_id.c
    cpumask: Optimize cpumask_of_cpu in kernel/time/tick-common.c
    cpumask: Optimize cpumask_of_cpu in drivers/misc/sgi-xp/xpc_main.c
    cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/ldt.c
    cpumask: Optimize cpumask_of_cpu in arch/x86/kernel/io_apic_64.c
    cpumask: Replace cpumask_of_cpu with cpumask_of_cpu_ptr
    Revert "cpumask: introduce new APIs"
    cpumask: make for_each_cpu_mask a bit smaller
    net: Pass reference to cpumask variable in net/sunrpc/svc.c
    ...

    Fix up trivial conflicts in drivers/cpufreq/cpufreq.c manually

    Linus Torvalds
     

20 Jul, 2008

1 commit


19 Jul, 2008

1 commit


17 Jul, 2008

2 commits

  • "idle=nomwait" disables the use of the MWAIT
    instruction from both C1 (C1_FFH) and deeper (C2C3_FFH)
    C-states.

    When MWAIT is unavailable, the BIOS and OS generally
    negotiate to use the HALT instruction for C1,
    and use IO accesses for deeper C-states.

    This option is useful for power and performance
    comparisons, and also to work around BIOS bugs
    where broken MWAIT support is advertised.

    http://bugzilla.kernel.org/show_bug.cgi?id=10807
    http://bugzilla.kernel.org/show_bug.cgi?id=10914

    Signed-off-by: Zhao Yakui
    Signed-off-by: Li Shaohua
    Signed-off-by: Len Brown
    Signed-off-by: Andi Kleen

    Zhao Yakui
     
  • "idle=halt" limits the idle loop to using
    the halt instruction. No MWAIT, no IO accesses,
    no C-states deeper than C1.

    If something is broken in the idle code,
    "idle=halt" is a less severe workaround
    than "idle=poll" which disables all power savings.

    Signed-off-by: Zhao Yakui
    Signed-off-by: Len Brown
    Signed-off-by: Andi Kleen

    Zhao Yakui
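
The two commit messages above describe which idle methods each "idle=" value leaves available. A minimal user-space sketch of that mapping follows. The struct, field, and function names are hypothetical and are not kernel identifiers; the "poll" row follows the remark that it disables all power savings.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical capability flags; not taken from the kernel source. */
struct idle_caps_sim {
    int may_use_mwait;      /* MWAIT-based C-states (C1_FFH, C2C3_FFH) */
    int may_use_io_cstates; /* deeper C-states entered through IO accesses */
    int may_use_halt;       /* HALT instruction for C1 */
};

/*
 * Map the value of an "idle=" boot option to the idle methods it leaves
 * available. Unrecognized values are treated as "no restriction" here,
 * which is an assumption of this sketch.
 */
struct idle_caps_sim idle_caps_for_option_sim(const char *value)
{
    struct idle_caps_sim caps = { 1, 1, 1 };

    assert(value != NULL);
    if (strcmp(value, "nomwait") == 0) {
        /* No MWAIT; HALT for C1 and IO accesses for deeper states remain. */
        caps.may_use_mwait = 0;
    } else if (strcmp(value, "halt") == 0) {
        /* HALT only: no MWAIT, no IO accesses, nothing deeper than C1. */
        caps.may_use_mwait = 0;
        caps.may_use_io_cstates = 0;
    } else if (strcmp(value, "poll") == 0) {
        /* Busy polling: no power-saving method at all. */
        caps.may_use_mwait = 0;
        caps.may_use_io_cstates = 0;
        caps.may_use_halt = 0;
    }
    return caps;
}
```

Calling idle_caps_for_option_sim("halt") yields a set with only may_use_halt set, which is why the message calls it a less severe workaround than "idle=poll".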
     

08 Jul, 2008

4 commits


04 Jul, 2008

1 commit

  • The manual padding to align on cacheline size only worked in 32 bit.
    In 64 bit the structure was not aligned and contained wasted space.

    Use the compiler's ____cacheline_aligned to save space & properly align
    this structure.

    x86_64_default size goes from 9136 -> 8960
    x86_64_AMD size goes from 9136 -> 8896

    built & running on 2.6.26-rc8.

    Signed-off-by: Richard Kennedy
    Signed-off-by: Ingo Molnar

    Richard Kennedy
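
To illustrate the difference between hand-computed padding and a compiler alignment attribute, here is a small sketch. The struct layouts and the 64-byte line size are assumptions chosen for illustration, not the structure the commit touched. It relies on the GCC/Clang aligned attribute, which is the mechanism behind the kernel's ____cacheline_aligned macro.

```c
#include <assert.h>   /* static_assert (C11) */

#define CACHELINE_BYTES_SIM 64   /* assumed cache line size */

/*
 * Manual padding: the pad length was computed assuming three 4-byte
 * members (12 bytes). That is true on a 32-bit ABI, but on LP64 the
 * long and the pointer are 8 bytes each, so the total overshoots one
 * line. The type's alignment is also left at the natural member
 * alignment either way.
 */
struct manual_pad_sim {
    long  counter;
    void *owner;
    int   flags;
    char  pad[CACHELINE_BYTES_SIM - 12];
};

/*
 * Attribute-based alignment: the compiler aligns the type to the line
 * size and rounds its size up to a multiple of it on any ABI.
 */
struct attr_aligned_sim {
    long  counter;
    void *owner;
    int   flags;
} __attribute__((aligned(CACHELINE_BYTES_SIM)));

static_assert(sizeof(struct attr_aligned_sim) % CACHELINE_BYTES_SIM == 0,
              "size is a whole number of cache lines");
static_assert(_Alignof(struct attr_aligned_sim) == CACHELINE_BYTES_SIM,
              "type is aligned to the cache line");
```

The static assertions hold at compile time regardless of pointer width, which is the property the manual version cannot guarantee.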
     

01 May, 2008

1 commit


27 Apr, 2008

1 commit

  • OK, so 25-mm1 gave a lockdep error which made me look into this.

    The first thing that I noticed was the horrible mess; the second thing I
    saw was hacks like: 71e93d15612c61c2e26a169567becf088e71b8ff

    The problem is that arch idle routines are somewhat inconsistent with
    their IRQ state handling and instead of fixing _that_, we go paper over
    the problem.

    So the thing I've tried to do is set a standard for idle routines and
    fix them all up to adhere to that. So the rules are:

    idle routines are entered with IRQs disabled
    idle routines will exit with IRQs enabled

    Nearly all already did this in one form or another.

    Merge the 32 and 64 bit bits so they no longer have different bugs.

    As for the actual lockdep warning; __sti_mwait() did a plainly un-annotated
    irq-enable.

    Signed-off-by: Peter Zijlstra
    Tested-by: Bob Copeland
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

26 Apr, 2008

1 commit

  • It is claimed that NexGen CPUs were never shipped:

    http://lkml.org/lkml/2008/4/20/179

    Also, the kernel support for these chips has been broken for
    a long time, the code intended to support NexGen thereby being
    essentially dead.

    As an outcome of the discussion that can be found using the URL
    above, this patch removes the NexGen support altogether.

    The changes in this patch survived a defconfig build for i386, a
    couple of successful randconfig builds, as well as a runtime test,
    which consisted in booting a 32-bit x86 box up to the shell prompt.

    Signed-off-by: Dmitri Vorobiev
    Signed-off-by: Ingo Molnar

    Dmitri Vorobiev
     

20 Apr, 2008

3 commits

  • Only allocate the FPU area when the application actually uses FPU, i.e., in the
    first lazy FPU trap. This could save memory for non-fpu using apps.

    For example, on my system after boot there are around 300 processes, with
    only 17 using the FPU.

    Signed-off-by: Suresh Siddha
    Cc: Arjan van de Ven
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Suresh Siddha
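
The lazy allocation described above can be sketched in user space as follows. The task structure, the function names, and the use of calloc are assumptions for illustration; the kernel allocates from its own allocator inside the first FPU trap. The 512-byte size is the fixed save-area size mentioned in the companion commit.

```c
#include <assert.h>
#include <stdlib.h>

#define FPU_STATE_BYTES_SIM 512   /* fixed save-area size, for illustration */

/* Hypothetical task record: the FPU area is a pointer, not embedded. */
struct task_sim {
    int   pid;
    void *fpu_state;   /* NULL until the task first touches the FPU */
};

/*
 * Called on the first FPU use of a task (the "lazy FPU trap" in the
 * commit message). Allocates the save area only if it does not exist.
 * Returns 0 on success, -1 if the allocation failed.
 */
int task_first_fpu_use_sim(struct task_sim *t)
{
    assert(t != NULL);
    if (t->fpu_state != NULL)
        return 0;                       /* already allocated */
    t->fpu_state = calloc(1, FPU_STATE_BYTES_SIM);
    return t->fpu_state != NULL ? 0 : -1;
}

/* Release the save area when the task goes away. */
void task_release_fpu_sim(struct task_sim *t)
{
    assert(t != NULL);
    free(t->fpu_state);
    t->fpu_state = NULL;
}
```

A task that never executes an FPU instruction keeps fpu_state at NULL for its whole lifetime, which is where the memory saving comes from.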
     
  • Split the FPU save area from the task struct. This allows easy migration
    of FPU context, and it's generally cleaner. It also allows the following
    two optimizations:

    1) only allocate when the application actually uses FPU, so in the first
    lazy FPU trap. This could save memory for non-fpu using apps. Next patch
    does this lazy allocation.

    2) allocate the right size for the actual cpu rather than 512 bytes always.
    Patches enabling xsave/xrstor support (coming shortly) will take advantage
    of this.

    Signed-off-by: Suresh Siddha
    Signed-off-by: Arjan van de Ven
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Suresh Siddha
     
  • This patch implements the PR_GET_TSC and PR_SET_TSC prctl()
    commands on the x86 platform (both 32 and 64 bit.) These
    commands control the ability to read the timestamp counter
    from userspace (the RDTSC instruction.)

    While the RDTSC instruction is a useful profiling tool,
    it is also the source of some non-determinism in ring-3.
    For deterministic replay applications it is useful to be
    able to trap and emulate (and record the outcome of) this
    instruction.

    This patch uses code earlier used to disable the timestamp
    counter for the SECCOMP framework. A side-effect of this
    patch is that the SECCOMP environment will now also disable
    the timestamp counter on x86_64 due to the addition of the
    TIF_NOTSC define on this platform.

    The code which enables/disables the RDTSC instruction during
    context switches is in the __switch_to_xtra function, which
    already handles other unusual conditions, so normal
    performance should not have to suffer from this change.

    Signed-off-by: Erik Bosman
    Acked-by: Arjan van de Ven
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Erik Bosman
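
From userspace, the two commands are reached through prctl(2). The sketch below wraps them in two small helpers; the helper names are made up for this example, while PR_GET_TSC, PR_SET_TSC, PR_TSC_ENABLE and PR_TSC_SIGSEGV come from <sys/prctl.h>. On non-x86 kernels the calls are expected to fail, which the helpers report as -1.

```c
#include <assert.h>
#include <sys/prctl.h>

/*
 * Query the current TSC mode of the calling thread.
 * Returns PR_TSC_ENABLE or PR_TSC_SIGSEGV, or -1 if the kernel or
 * architecture does not support the command.
 */
int tsc_mode_get_sim(void)
{
    int mode = 0;

    if (prctl(PR_GET_TSC, (unsigned long)&mode, 0UL, 0UL, 0UL) != 0)
        return -1;
    return mode;
}

/*
 * Set the TSC mode. With PR_TSC_SIGSEGV, a later RDTSC in this thread
 * raises SIGSEGV, so a replay tool can trap and emulate it.
 * Returns 0 on success, -1 on failure.
 */
int tsc_mode_set_sim(int mode)
{
    assert(mode == PR_TSC_ENABLE || mode == PR_TSC_SIGSEGV);
    return prctl(PR_SET_TSC, (unsigned long)mode, 0UL, 0UL, 0UL) == 0 ? 0 : -1;
}
```

A deterministic-replay tool would typically install a SIGSEGV handler first, then call tsc_mode_set_sim(PR_TSC_SIGSEGV) and emulate each trapped RDTSC from its recorded log.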
     

17 Apr, 2008

10 commits


19 Feb, 2008

2 commits

  • Signed-off-by: Mike Travis
    Cc: Christoph Lameter
    Cc: Jack Steiner
    Cc: linux-mm@kvack.org
    Cc: Andrew Morton
    Cc: Andi Kleen
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Mike Travis
     
  • This patch removes the mca-pentium boot option that was a noop.

    besides the source code cleanup factor, this saves some text as well:

    arch/x86/kernel/cpu/bugs.o:
       text    data     bss     dec     hex filename
        651      77       4     732     2dc bugs.o.before
        631      53       4     688     2b0 bugs.o.after

    Signed-off-by: Adrian Bunk
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Adrian Bunk
     

09 Feb, 2008

1 commit


30 Jan, 2008

11 commits

  • There are already various options to disable specific cpuid bits
    on the command line. They all use their own variable. Add a generic
    mask to make this easier in the future.

    Signed-off-by: Andi Kleen
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Andi Kleen
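
One way to picture a generic mask is a bitmap of "cleared" feature bits that is applied to the detected capability words in a single pass. The sketch below assumes that shape; the array size, names, and bit numbering are hypothetical and are not copied from the kernel.

```c
#include <assert.h>
#include <stdint.h>

#define NCAP_WORDS_SIM 8   /* assumed number of 32-bit capability words */

/* Bits set here are features that were disabled on the command line. */
uint32_t cleared_caps_sim[NCAP_WORDS_SIM];

/* Record that feature number `bit` should be masked off. */
void clear_cap_sim(unsigned int bit)
{
    assert(bit < NCAP_WORDS_SIM * 32u);
    cleared_caps_sim[bit / 32u] |= (uint32_t)1 << (bit % 32u);
}

/* Apply the generic mask to a freshly detected set of capability words. */
void apply_cleared_caps_sim(uint32_t caps[NCAP_WORDS_SIM])
{
    for (unsigned int i = 0; i < NCAP_WORDS_SIM; i++)
        caps[i] &= ~cleared_caps_sim[i];
}
```

Each command-line option then only needs to call clear_cap_sim() with its feature number instead of keeping a private variable.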
     
  • Change the size of APICIDs from u8 to u16. This partially
    supports the new x2apic mode that will be present on future
    processor chips. (Chips actually support 32-bit APICIDs, but that
    change is more intrusive. Supporting 16-bit is sufficient for now).

    Signed-off-by: Jack Steiner

    I've included just the partial change from u8 to u16 apicids. The
    remaining x2apic changes will be in a separate patch.

    In addition, the fake_node_to_pxm_map[] and fake_apicid_to_node[]
    tables have been moved from local data to the __initdata section
    reducing stack pressure when MAX_NUMNODES and MAX_LOCAL_APIC are
    increased in size.

    Signed-off-by: Mike Travis
    Reviewed-by: Christoph Lameter
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    travis@sgi.com
     
  • Make sure pte_t, whatever its definition, has a pte element with type
    pteval_t. This allows common code to access it without needing to be
    specifically parameterised on what pagetable mode we're compiling for.
    For 32-bit, this means that pte_t becomes a union with "pte" and "{
    pte_low, pte_high }" (PAE) or just "pte_low" (non-PAE).

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Jeremy Fitzhardinge
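
For the PAE case, the union described above might look like the following sketch. The type names carry a _sim suffix to mark them as illustrative, the anonymous struct needs C11, and the helper that builds a value from two halves is an addition for this example.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

typedef uint64_t pteval_sim_t;   /* PAE: a 64-bit page table entry value */

/*
 * The same storage is reachable either as two 32-bit halves or as one
 * whole value through the common "pte" member.
 */
typedef union {
    struct {
        uint32_t pte_low;
        uint32_t pte_high;
    };
    pteval_sim_t pte;
} pte_sim_t;

static_assert(sizeof(pte_sim_t) == sizeof(pteval_sim_t),
              "both views cover the same storage");

/* Build an entry from its halves without relying on byte order. */
pte_sim_t pte_sim_make(uint32_t low, uint32_t high)
{
    pte_sim_t p;

    p.pte = ((pteval_sim_t)high << 32) | low;
    return p;
}

/* Common code only needs the "pte" member, whatever the pagetable mode. */
pteval_sim_t pte_sim_val(pte_sim_t p)
{
    return p.pte;
}
```

In the non-PAE case the union would hold a 32-bit pte next to a lone pte_low, and pte_sim_val() would stay unchanged apart from the width of the value type.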
     
  • migration helpers for KVM.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Moving things out of processor.h is always a good thing.

    Also needed to avoid include loop in later patch.

    Signed-off-by: Andi Kleen
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Andi Kleen
     
  • This patch adds paravirt hook for swapgs operation, which is a privileged
    operation in x86_64.

    Signed-off-by: Glauber de Oliveira Costa
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Glauber de Oliveira Costa
     
  • What's left in processor_32.h and processor_64.h cannot be cleanly
    integrated. However, it's just a couple of definitions. They are moved
    to processor.h around ifdefs, and the original files are deleted. Note that
    far fewer headers are included in the final version.

    Signed-off-by: Glauber de Oliveira Costa
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Glauber de Oliveira Costa
     
  • The i387_fxsave_struct formats really have the same layout
    on 32 and 64, with only some slightly different use of a few
    fields. The i387_fsave_struct and i387_soft_struct formats
    are never used by 64-bit kernels, but it doesn't hurt to
    have the unused types in the union and cuts down on the
    amount of #ifdef hair required throughout the i387 code.

    Signed-off-by: Roland McGrath
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Roland McGrath
     
  • This patch moves i387 definitions from processor_32.h and processor_64.h
    to processor.h. They are different. Very different. And there's apparently
    nothing we can do about it, so they're enclosed inside ifdefs.

    Signed-off-by: Glauber de Oliveira Costa
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Glauber de Oliveira Costa
     
  • There's only one difference between the NOPs used in asm code for i386 and x86_64:
    i386 has a lot more variants. The code is moved to processor.h, and adjusted
    accordingly.

    Signed-off-by: Glauber de Oliveira Costa
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Glauber de Oliveira Costa
     
  • This patch moves the prefetch[w]? functions to processor.h

    Signed-off-by: Glauber de Oliveira Costa
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Glauber de Oliveira Costa