21 Dec, 2011

1 commit

  • Several fields in struct cpuinfo_x86 were not defined for the
    !SMP case, presumably to save space. However, those fields still
    have meaning for UP, and keeping them allows some #ifdef
    removal from other files. The additional size of the UP kernel
    from this change is not significant enough to justify keeping
    the distinction:

       text   data    bss     dec    hex filename
    4737168 506459 972040 6215667 5ed7f3 vmlinux.o.before
    4737444 506459 972040 6215943 5ed907 vmlinux.o.after

    for a difference of 276 bytes for an example UP config.

    If someone wants those 276 bytes back badly then it should
    be implemented in a cleaner way.

    Signed-off-by: Kevin Winchester
    Cc: Steffen Persvold
    Link: http://lkml.kernel.org/r/1324428742-12498-1-git-send-email-kjwinchester@gmail.com
    Signed-off-by: Ingo Molnar

    Kevin Winchester
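The size figures above are standard size(1) output. As a sanity check, a small Python sketch (illustrative, not part of the patch) that parses such output and recovers the 276-byte delta quoted above:

```python
# Parse `size`-style output lines and compute the growth of the UP kernel.
# The two sample lines are taken verbatim from the commit message above.
before = "4737168  506459  972040  6215667  5ed7f3  vmlinux.o.before"
after  = "4737444  506459  972040  6215943  5ed907  vmlinux.o.after"

def parse_size_line(line):
    """Return (text, data, bss, dec) from one line of `size` output."""
    text, data, bss, dec = line.split()[:4]
    return int(text), int(data), int(bss), int(dec)

def growth(before_line, after_line):
    b = parse_size_line(before_line)
    a = parse_size_line(after_line)
    # dec is text + data + bss, so the delta in dec is the total growth
    return a[3] - b[3]

print(growth(before, after))  # 276 bytes, all of it in .text
```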
     

14 Oct, 2011

2 commits

  • Now that the CPU microcode update level is available, the Atom PSE
    erratum check can use it directly without reading the MSR again.

    Signed-off-by: Andi Kleen
    Acked-by: H. Peter Anvin
    Link: http://lkml.kernel.org/r/1318466795-7393-2-git-send-email-andi@firstfloor.org
    Signed-off-by: Ingo Molnar

    Andi Kleen
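A sketch of the check described above, in Python for illustration. The family/model/cutoff constants mirror the kernel's intel.c of the era (model 0x1c is Atom; 0x20e is the first microcode revision with the fix) but should be treated as illustrative, not authoritative:

```python
# Atom PSE erratum check using the microcode level cached in cpuinfo,
# instead of re-reading the microcode revision MSR.
ATOM_MODEL = 0x1c        # Intel Atom (Bonnell), family 6
FIXED_MICROCODE = 0x20e  # first revision with the erratum fixed (illustrative)

def must_disable_pse(family, model, microcode):
    """True if PSE should be cleared due to the Atom PSE erratum."""
    return family == 6 and model == ATOM_MODEL and microcode < FIXED_MICROCODE

print(must_disable_pse(6, 0x1c, 0x20d))  # affected: old microcode
print(must_disable_pse(6, 0x1c, 0x20e))  # fixed microcode, keep PSE
```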
     
  • I got a request to make it easier to determine the microcode
    update level on Intel CPUs. This patch adds a new "microcode"
    field to /proc/cpuinfo.

    The microcode level is also printed on fatal machine checks
    together with the other CPUID model information.

    I removed the respective code from the microcode update driver;
    it now just reads the field from cpu_data. When the microcode
    is updated, the new value is filled in as well.

    I had to add a memory barrier to native_cpuid to prevent it
    from being optimized away when the result is not used.

    This allows further cleanup of code that already gathered this
    information manually; that is done in follow-on patches.

    Signed-off-by: Andi Kleen
    Acked-by: H. Peter Anvin
    Link: http://lkml.kernel.org/r/1318466795-7393-1-git-send-email-andi@firstfloor.org
    Signed-off-by: Ingo Molnar

    Andi Kleen
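Once the field exists, the microcode level can be read like any other /proc/cpuinfo entry. A minimal Python parser over a /proc/cpuinfo-style text (the sample is made up for illustration, not captured from real hardware):

```python
# Extract the "microcode" field from /proc/cpuinfo-style text.
sample = """\
processor\t: 0
vendor_id\t: GenuineIntel
microcode\t: 0x20e
"""

def microcode_level(cpuinfo_text):
    """Return the microcode revision as an int, or None if absent."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "microcode":
            return int(value.strip(), 16)  # the kernel prints it in hex
    return None

print(hex(microcode_level(sample)))  # 0x20e
```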
     

16 Jul, 2011

1 commit


15 Jul, 2011

1 commit

    Since 2.6.36 (23016bf0d25), Linux has reported the presence of "epb" in /proc/cpuinfo.
    Since 2.6.38 (d5532ee7b40), the x86_energy_perf_policy(8) utility has
    been available in-tree to update MSR_IA32_ENERGY_PERF_BIAS.

    However, the typical BIOS fails to initialize the MSR, presumably
    because this is handled by high-volume shrink-wrap operating systems...

    Linux distros, on the other hand, do not yet invoke x86_energy_perf_policy(8).
    As a result, WSM-EP, SNB, and later hardware from Intel will run in its
    default hardware power-on state (performance), which assumes that users
    care about performance at all costs rather than energy efficiency.
    While that is fine for performance benchmarks, the hardware's intended default
    operating point is "normal" mode...

    Initialize the MSR to "normal" mode by default during kernel boot.

    x86_energy_perf_policy(8) is available to change the default after boot,
    should the user have a different preference.

    Signed-off-by: Len Brown
    Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1107140051020.18606@x980
    Acked-by: Rafael J. Wysocki
    Signed-off-by: H. Peter Anvin
    Cc:

    Len Brown
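A sketch of the boot-time policy described above, in Python for illustration. The hint encodings (0 = performance, 6 = normal, 15 = powersave) follow Intel's documented layout of the low 4 bits of the MSR; the point of the patch is to only touch a register the BIOS neglected:

```python
# Boot-time initialization of the energy/performance bias hint.
EPB_PERFORMANCE = 0   # hardware power-on default
EPB_NORMAL = 6

def epb_boot_value(current):
    """If the BIOS left the MSR at its power-on default (performance),
    move it to "normal"; otherwise respect whatever is already set."""
    hint = current & 0xf
    if hint == EPB_PERFORMANCE:
        return (current & ~0xf) | EPB_NORMAL
    return current

print(epb_boot_value(0))   # 6: BIOS did nothing, default to normal
print(epb_boot_value(15))  # 15: powersave was chosen, keep it
```

After boot, x86_energy_perf_policy(8) can still overwrite the value, matching the last paragraph of the commit message.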
     

20 May, 2011

1 commit

  • * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, cpu: Fix detection of Celeron Covington stepping A1 and B0
    Documentation, ABI: Update L3 cache index disable text
    x86, AMD, cacheinfo: Fix L3 cache index disable checks
    x86, AMD, cacheinfo: Fix fallout caused by max3 conversion
    x86, cpu: Change NOP selection for certain Intel CPUs
    x86, cpu: Clean up and unify the NOP selection infrastructure
    x86, percpu: Use ASM_NOP4 instead of hardcoding P6_NOP4
    x86, cpu: Move AMD Elan Kconfig under "Processor family"

    Fix up trivial conflicts in alternative handling (commit dc326fca2b64
    "x86, cpu: Clean up and unify the NOP selection infrastructure" removed
    some hacky 5-byte instruction stuff, while commit d430d3d7e646 "jump
    label: Introduce static_branch() interface" renamed HAVE_JUMP_LABEL to
    CONFIG_JUMP_LABEL in the code that went away)

    Linus Torvalds
     

18 May, 2011

1 commit

  • If the kernel intends to use enhanced REP MOVSB/STOSB, it must ensure that
    IA32_MISC_ENABLE.Fast_String_Enable (bit 0) is set and that CPUID.(EAX=07H, ECX=0H):
    EBX[bit 9] also reports 1.

    Signed-off-by: Fenghua Yu
    Link: http://lkml.kernel.org/r/1305671358-14478-3-git-send-email-fenghua.yu@intel.com
    Signed-off-by: H. Peter Anvin

    Fenghua Yu
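The two-part condition above is mechanical; a small Python sketch of it (bit positions are exactly those named in the commit message):

```python
# Enhanced REP MOVSB/STOSB (ERMS) usability check: both the MISC_ENABLE
# fast-string bit and the CPUID leaf 7 feature bit must be set.
MISC_ENABLE_FAST_STRING = 1 << 0   # IA32_MISC_ENABLE bit 0
CPUID7_EBX_ERMS = 1 << 9           # CPUID.(EAX=07H,ECX=0H):EBX bit 9

def erms_usable(misc_enable, cpuid7_ebx):
    return bool(misc_enable & MISC_ENABLE_FAST_STRING) and \
           bool(cpuid7_ebx & CPUID7_EBX_ERMS)

print(erms_usable(0x1, 1 << 9))  # True: both conditions hold
print(erms_usable(0x0, 1 << 9))  # False: fast strings disabled
```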
     

17 May, 2011

1 commit

  • Steppings A1 and B0 of Celeron Covington are currently misdetected as
    Pentium II (Dixon). Fix it by removing the stepping check.

    [ hpa: this fixes this specific bug... the CPUID documentation
    specifies that the L2 cache size can disambiguate additional CPUs;
    this patch does not fix that. ]

    Signed-off-by: Ondrej Zary
    Link: http://lkml.kernel.org/r/201105162138.15416.linux@rainbow-software.org
    Signed-off-by: H. Peter Anvin

    Ondrej Zary
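A sketch of the disambiguation hpa's note alludes to: within family 6 model 5, the L2 cache size tells the CPUs apart, not the stepping. Names and sizes follow the kernel's intel.c naming table of the time and are illustrative:

```python
# Family 6, model 5 name selection by L2 cache size, with no stepping
# check -- so Covington steppings A1 and B0 are detected correctly.
def p6_model5_name(l2_kb):
    if l2_kb == 0:
        return "Celeron (Covington)"        # Covington has no L2 at all
    if l2_kb == 256:
        return "Mobile Pentium II (Dixon)"
    return "Pentium II (Deschutes)"

print(p6_model5_name(0))    # any stepping, incl. A1/B0, is a Covington
print(p6_model5_name(512))
```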
     

28 Jan, 2011

2 commits

  • Unlike 64bit, 32bit has been using its own cpu_to_node_map[] for
    CPU -> NUMA node mapping. Replace it with early_percpu variable
    x86_cpu_to_node_map and share the mapping code with 64bit.

    * USE_PERCPU_NUMA_NODE_ID is now enabled for 32bit too.

    * x86_cpu_to_node_map and numa_set/clear_node() are moved from
    numa_64 to numa. For now, on 32bit, x86_cpu_to_node_map is initialized
    with 0 instead of NUMA_NO_NODE. This is to avoid introducing unexpected
    behavior change and will be updated once init path is unified.

    * srat_detect_node() is now enabled for x86_32 too. It calls
    numa_set_node() and initializes the mapping, making explicit
    cpu_to_node_map[] updates from map/unmap_cpu_to_node() unnecessary.

    Signed-off-by: Tejun Heo
    Cc: eric.dumazet@gmail.com
    Cc: yinghai@kernel.org
    Cc: brgerst@gmail.com
    Cc: gorcunov@gmail.com
    Cc: penberg@kernel.org
    Cc: shaohui.zheng@intel.com
    Cc: rientjes@google.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar
    Cc: David Rientjes

    Tejun Heo
     
  • The mapping between cpu/apicid and node is done via
    apicid_to_node[] on 64bit and apicid_2_node[] +
    apic->x86_32_numa_cpu_node() on 32bit. This difference makes it
    difficult to further unify 32 and 64bit NUMA handling.

    This patch unifies it by replacing both apicid_to_node[] and
    apicid_2_node[] with __apicid_to_node[] array, which is accessed
    by two accessors - set_apicid_to_node() and numa_cpu_node(). On
    64bit, numa_cpu_node() always consults __apicid_to_node[]
    directly while 32bit goes through apic->numa_cpu_node() method
    to allow apic implementations to override it.

    srat_detect_node() for AMD CPUs contains a workaround for broken
    NUMA configurations that assume a relationship between APIC ID,
    HT node ID and NUMA topology. Leave it accessing
    __apicid_to_node[] directly, as mapping through the CPU might
    result in an undesirable behavior change. The comment is
    reformatted and updated to note the ugliness.

    Signed-off-by: Tejun Heo
    Reviewed-by: Pekka Enberg
    Cc: eric.dumazet@gmail.com
    Cc: yinghai@kernel.org
    Cc: brgerst@gmail.com
    Cc: gorcunov@gmail.com
    Cc: shaohui.zheng@intel.com
    Cc: rientjes@google.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar
    Cc: David Rientjes

    Tejun Heo
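The unified scheme above boils down to one table and two accessors. A Python sketch of the 64-bit flavor (names follow the commit; the table size and sentinel are illustrative):

```python
# One __apicid_to_node[] table, accessed via set_apicid_to_node() and
# numa_cpu_node(), replacing the separate 32/64-bit mechanisms.
NUMA_NO_NODE = -1       # the kernel's "no node" sentinel
MAX_LOCAL_APIC = 256    # illustrative table size

__apicid_to_node = [NUMA_NO_NODE] * MAX_LOCAL_APIC

def set_apicid_to_node(apicid, node):
    __apicid_to_node[apicid] = node

def numa_cpu_node(cpu_to_apicid, cpu):
    """64-bit flavor: consult __apicid_to_node[] directly. (On 32-bit
    this would go through the apic->numa_cpu_node() method so apic
    implementations can override it.)"""
    apicid = cpu_to_apicid.get(cpu, NUMA_NO_NODE)
    if apicid == NUMA_NO_NODE:
        return NUMA_NO_NODE
    return __apicid_to_node[apicid]

set_apicid_to_node(0x10, 1)
print(numa_cpu_node({0: 0x10}, 0))  # 1
```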
     

23 Oct, 2010

1 commit

  • … 'x86-quirks-for-linus', 'x86-setup-for-linus', 'x86-uv-for-linus' and 'x86-vm86-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip

    * 'softirq-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    softirqs: Make wakeup_softirqd static

    * 'x86-debug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, asm: Restore parentheses around one pushl_cfi argument
    x86, asm: Fix ancient-GAS workaround
    x86, asm: Fix CFI macro invocations to deal with shortcomings in gas

    * 'x86-numa-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, numa: Assign CPUs to nodes in round-robin manner on fake NUMA

    * 'x86-quirks-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86: HPET force enable for CX700 / VIA Epia LT

    * 'x86-setup-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, setup: Use string copy operation to optimze copy in kernel compression

    * 'x86-uv-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, UV: Use allocated buffer in tlb_uv.c:tunables_read()

    * 'x86-vm86-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, vm86: Fix preemption bug for int1 debug and int3 breakpoint handlers.

    Linus Torvalds
     

22 Oct, 2010

1 commit


12 Oct, 2010

1 commit

  • commit d9c2d5ac6af87b4491bff107113aaf16f6c2b2d9 "x86, numa: Use near(er)
    online node instead of roundrobin for NUMA" changed NUMA initialization on
    Intel to choose the nearest online node or the first node. Fake NUMA would
    be better off with round-robin initialization instead of putting all CPUs
    on the first node. Change the choice of first node back to round-robin.

    For testing NUMA kernel behaviour without cpusets and NUMA-aware
    applications, it is better to have CPUs in different nodes rather
    than all in a single node. Otherwise, task migration scenarios with
    cpusets cannot be tested.

    I guess having it round-robin shouldn't affect the use cases that want
    all CPUs on the first node.

    The code comments in arch/x86/mm/numa_64.c:759 indicate that this used to
    be the case before commit d9c2d5ac6 changed it from round-robin to
    nearest-or-first node, and I couldn't find any reason for that change in
    its changelog.

    Signed-off-by: Nikanth Karthikesan
    Cc: David Rientjes
    Signed-off-by: Andrew Morton

    Nikanth Karthikesan
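The restored behavior is a plain cyclic walk over the online nodes. A Python sketch (illustrative):

```python
# Round-robin CPU -> node assignment for fake NUMA: spread CPUs across
# the online nodes cyclically instead of piling them onto one node.
def assign_round_robin(num_cpus, online_nodes):
    return {cpu: online_nodes[cpu % len(online_nodes)]
            for cpu in range(num_cpus)}

print(assign_round_robin(6, [0, 1, 2]))
# CPUs 0..5 land on nodes 0,1,2,0,1,2 -- every node gets some CPUs
```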
     

29 Sep, 2010

1 commit


13 Aug, 2010

1 commit

  • boot_cpu_id is there for historical reasons and was renamed to
    boot_cpu_physical_apicid in patch:

    c70dcb7 x86: change boot_cpu_id to boot_cpu_physical_apicid

    However, some occurrences of boot_cpu_id remain that are never
    updated in the kernel, so its value is always 0.

    This patch removes boot_cpu_id completely.

    Signed-off-by: Robert Richter
    LKML-Reference:
    Signed-off-by: H. Peter Anvin

    Robert Richter
     

18 May, 2010

1 commit

  • * 'x86-cpu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, hypervisor: add missing
    Modify the VMware balloon driver for the new x86_hyper API
    x86, hypervisor: Export the x86_hyper* symbols
    x86: Clean up the hypervisor layer
    x86, HyperV: fix up the license to mshyperv.c
    x86: Detect running on a Microsoft HyperV system
    x86, cpu: Make APERF/MPERF a normal table-driven flag
    x86, k8: Fix build error when K8_NB is disabled
    x86, cacheinfo: Disable index in all four subcaches
    x86, cacheinfo: Make L3 cache info per node
    x86, cacheinfo: Reorganize AMD L3 cache structure
    x86, cacheinfo: Turn off L3 cache index disable feature in virtualized environments
    x86, cacheinfo: Unify AMD L3 cache index disable checking
    cpufreq: Unify sysfs attribute definition macros
    powernow-k8: Fix frequency reporting
    x86, cpufreq: Add APERF/MPERF support for AMD processors
    x86: Unify APERF/MPERF support
    powernow-k8: Add core performance boost support
    x86, cpu: Add AMD core boosting feature flag to /proc/cpuinfo

    Fix up trivial conflicts in arch/x86/kernel/cpu/intel_cacheinfo.c and
    drivers/cpufreq/cpufreq_ondemand.c

    Linus Torvalds
     

09 May, 2010

1 commit


30 Apr, 2010

1 commit


24 Apr, 2010

1 commit

  • Atom erratum AAE44/AAF40/AAG38/AAH41:

    "If software clears the PS (page size) bit in a present PDE (page
    directory entry), that will cause linear addresses mapped through this
    PDE to use 4-KByte pages instead of using a large page after old TLB
    entries are invalidated. Due to this erratum, if a code fetch uses
    this PDE before the TLB entry for the large page is invalidated then
    it may fetch from a different physical address than specified by
    either the old large page translation or the new 4-KByte page
    translation. This erratum may also cause speculative code fetches from
    incorrect addresses."

    [http://download.intel.com/design/processor/specupdt/319536.pdf]

    While commit 211b3d03c7400f48a781977a50104c9d12f4e229 appears to
    work around erratum AAH41 (mixed 4K TLBs), it only reduces the
    window of opportunity for the bug and does not remove it entirely.
    This patch disables mixed 4K/4MB page tables completely, avoiding
    the page splitting and thus not tripping this processor issue.

    This is based on an original patch by Colin King.

    Originally-by: Colin Ian King
    Cc: Colin Ian King
    Cc: Ingo Molnar
    Signed-off-by: H. Peter Anvin
    LKML-Reference:
    Cc:

    H. Peter Anvin
     

10 Apr, 2010

1 commit

  • Initialize this CPUID flag feature in common code. It could be made a
    standalone function later, maybe, if more functionality is duplicated.

    Signed-off-by: Borislav Petkov
    LKML-Reference:
    Reviewed-by: Thomas Renninger
    Signed-off-by: H. Peter Anvin

    Borislav Petkov
     

26 Mar, 2010

1 commit

  • Support for the PMU's BTS features has been upstreamed in
    v2.6.32, but we still have the old and disabled ptrace-BTS,
    as Linus noticed not so long ago.

    It's buggy: TIF_DEBUGCTLMSR is trampling all over that MSR without
    regard for other uses (perf) and doesn't provide the flexibility
    needed for perf either.

    Its users are ptrace-block-step and ptrace-bts; ptrace-bts
    was never used, and ptrace-block-step can be implemented using a
    much simpler approach.

    So axe all 3000 lines of it. That includes the *locked_memory*()
    APIs in mm/mlock.c as well.

    Reported-by: Linus Torvalds
    Signed-off-by: Peter Zijlstra
    Cc: Roland McGrath
    Cc: Oleg Nesterov
    Cc: Markus Metzger
    Cc: Steven Rostedt
    Cc: Andrew Morton
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

02 Mar, 2010

1 commit

  • On UV systems, the TSC is not synchronized across blades. The
    sched_clock_cpu() function is returning values that can go
    backwards (I've seen as much as 8 seconds) when switching
    between cpus.

    As each cpu comes up, early_init_intel() will currently set the
    sched_clock_stable flag true. When mark_tsc_unstable() runs, it
    clears the flag, but this only occurs once (the first time a cpu
    comes up whose TSC is not synchronized with cpu 0). After this,
    early_init_intel() will set the flag again as the next cpu comes
    up.

    Only set sched_clock_stable if tsc has not been marked unstable.

    Signed-off-by: Dimitri Sivanich
    Acked-by: Venkatesh Pallipadi
    Acked-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Dimitri Sivanich
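The fix is a one-line guard on the per-CPU init path. A Python sketch of the flag interaction described above (class and method names are illustrative stand-ins for the kernel's globals):

```python
# sched_clock_stable must not be re-set by CPUs that come up after the
# TSC has already been marked unstable.
class ClockState:
    def __init__(self):
        self.tsc_unstable = False
        self.sched_clock_stable = False

    def mark_tsc_unstable(self):
        self.tsc_unstable = True
        self.sched_clock_stable = False

    def early_init_cpu(self):
        # before the fix this was an unconditional assignment
        if not self.tsc_unstable:
            self.sched_clock_stable = True

s = ClockState()
s.early_init_cpu()       # boot CPU: stable
s.mark_tsc_unstable()    # a blade with an unsynchronized TSC appears
s.early_init_cpu()       # a later CPU must NOT flip the flag back
print(s.sched_clock_stable)  # False
```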
     

18 Dec, 2009

1 commit

  • Commit 83ce4009 made the following change: "If the TSC is constant
    and non-stop, also set it reliable."

    However, there seem to be a few systems that end up with TSC warp
    across sockets, depending on how the CPUs come out of reset. Skipping
    the TSC sync test on such systems may result in time inconsistency
    later.

    So, re-enable the TSC sync test even on constant and non-stop TSC
    systems. Set sched_clock_stable to 1 by default and reset it in
    mark_tsc_unstable if the TSC sync test fails.

    This change still gives the perf benefit mentioned in 83ce4009 for
    systems where the TSC is reliable.

    Signed-off-by: Venkatesh Pallipadi
    Acked-by: Suresh Siddha
    LKML-Reference:
    Signed-off-by: H. Peter Anvin

    Pallipadi, Venkatesh
     

12 Dec, 2009

1 commit

  • When there are a large number of processors in a system, there
    is an excessive amount of messages sent to the system console.
    It's estimated that with 4096 processors in a system, and the
    console baudrate set to 56K, the startup messages will take
    about 84 minutes to clear the serial port.

    This set of patches limits the number of repetitious messages
    that contain no additional information. Much of this information
    is obtainable from /proc and /sys. Some of the messages are also
    sent to the kernel log buffer as KERN_DEBUG messages, so dmesg
    can be used to examine any details specific to a problem more
    closely.

    The new cpu bootup sequence for system_state == SYSTEM_BOOTING:

    Booting Node 0, Processors #1 #2 #3 #4 #5 #6 #7 Ok.
    Booting Node 1, Processors #8 #9 #10 #11 #12 #13 #14 #15 Ok.
    ...
    Booting Node 3, Processors #56 #57 #58 #59 #60 #61 #62 #63 Ok.
    Brought up 64 CPUs

    After the system is running, a single-line boot message is displayed
    when CPUs are hotplugged:

    Booting Node %d Processor %d APIC 0x%x

    Status of the following lines:

    CPU: Physical Processor ID:       printed once (for boot cpu)
    CPU: Processor Core ID:           printed once (for boot cpu)
    CPU: Hyper-Threading is disabled  printed once (for boot cpu)
    CPU: Thermal monitoring enabled   printed once (for boot cpu)
    CPU %d/0x%x -> Node %d:           removed
    CPU %d is now offline:            only if system_state == RUNNING
    Initializing CPU#%d:              KERN_DEBUG

    Signed-off-by: Mike Travis
    LKML-Reference:
    Signed-off-by: H. Peter Anvin

    Mike Travis
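The condensed per-node line quoted above is just a grouping of the brought-up CPUs by node. A Python sketch of the formatting (illustrative):

```python
# Produce one "Booting Node N, Processors ..." line per node, in the
# format quoted in the commit message above.
def bootup_lines(node_of_cpu):
    """node_of_cpu: dict mapping each secondary CPU to its node."""
    lines = []
    for node in sorted(set(node_of_cpu.values())):
        cpus = sorted(c for c, n in node_of_cpu.items() if n == node)
        procs = " ".join("#%d" % c for c in cpus)
        lines.append("Booting Node %d, Processors %s Ok." % (node, procs))
    return lines

for line in bootup_lines({1: 0, 2: 0, 3: 0, 8: 1, 9: 1}):
    print(line)
# Booting Node 0, Processors #1 #2 #3 Ok.
# Booting Node 1, Processors #8 #9 Ok.
```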
     

23 Nov, 2009

1 commit

  • CPU-to-node mapping is set via the following sequence:

    1. numa_init_array(): set up a round-robin mapping from CPU to
    online node.

    2. init_cpu_to_node(): set the mapping according to apicid_to_node[]
    from the SRAT, but only for nodes that are online; CPUs on nodes
    without RAM (i.e. not online) are left with the round-robin
    assignment.

    3. Later, srat_detect_node() for Intel/AMD uses the first online
    node or a nearby node.

    The problem is that setup_per_cpu_areas() is not called between 2
    and 3, so the per-cpu area for a CPU on a node with RAM may be
    allocated on a different node, possibly two hops away.

    So try to optimize this: add find_near_online_node() and call it
    from init_cpu_to_node().

    Signed-off-by: Yinghai Lu
    Cc: Tejun Heo
    Cc: Linus Torvalds
    Cc: Thomas Gleixner
    Cc: H. Peter Anvin
    Cc: Rusty Russell
    Cc: David Rientjes
    Cc: Andrew Morton
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Yinghai Lu
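find_near_online_node() amounts to picking the online node at minimum NUMA distance from the CPU's SRAT node. A Python sketch (the distance values are illustrative; per ACPI SLIT conventions, 10 means local):

```python
# For a CPU whose SRAT node is not online, choose the nearest online
# node by NUMA distance instead of leaving a round-robin assignment.
def find_near_online_node(node, online_nodes, distance):
    """distance[a][b]: NUMA distance matrix (10 = local, per ACPI SLIT)."""
    return min(online_nodes, key=lambda n: distance[node][n])

dist = {
    2: {0: 21, 1: 31},   # node 2 is one hop from node 0, two from node 1
}
print(find_near_online_node(2, [0, 1], dist))  # 0
```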
     

15 Sep, 2009

1 commit

  • Move the APERFMPERF capability into an X86_FEATURE flag so that it
    can be used outside of the acpi cpufreq driver.

    Cc: H. Peter Anvin
    Cc: Venkatesh Pallipadi
    Cc: Yanmin
    Cc: Dave Jones
    Cc: Len Brown
    Cc: Yinghai Lu
    Cc: cpufreq@vger.kernel.org
    Signed-off-by: Peter Zijlstra
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Peter Zijlstra
     

11 Jul, 2009

1 commit

  • No code changes except printk levels (although some of the K6
    mtrr code might be clearer if there were a few, as would
    splitting out some of the Intel cache code).

    Signed-off-by: Alan Cox
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Alan Cox
     

15 Jun, 2009

1 commit

  • The hooks that we modify are:
    - Page fault handler (to handle kmemcheck faults)
    - Debug exception handler (to hide pages after single-stepping
    the instruction that caused the page fault)

    Also redefine memset() to use the optimized version if kmemcheck is
    enabled.

    (Thanks to Pekka Enberg for minimizing the impact on the page fault
    handler.)

    As kmemcheck doesn't handle MMX/SSE instructions (yet), we also disable
    the optimized xor code, and rely instead on the generic C implementation
    in order to avoid false-positive warnings.

    Signed-off-by: Vegard Nossum

    [whitespace fixlet]
    Signed-off-by: Pekka Enberg
    Signed-off-by: Ingo Molnar

    [rebased for mainline inclusion]
    Signed-off-by: Vegard Nossum

    Vegard Nossum
     

18 May, 2009

1 commit


29 Mar, 2009

1 commit


28 Mar, 2009

1 commit


14 Mar, 2009

1 commit


13 Mar, 2009

1 commit


12 Mar, 2009

1 commit


08 Mar, 2009

1 commit

  • Impact: cleanup and code size reduction on 64-bit

    This code is only applied to Intel Pentium and AMD K7 32-bit cpus.

    Move those checks to intel_init()/amd_init() for 32-bit
    so 64-bit will not build this code.

    Also change to use a cpu_index check to decide whether we need to
    emit the warning.

    Signed-off-by: Yinghai Lu
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Yinghai Lu
     

27 Feb, 2009

1 commit

  • If the TSC is constant and non-stop, also set it reliable.

    (We will turn this off in DMI quirks for multi-chassis systems)

    The performance number on a 16-way Nehalem system running
    32 tasks that context-switch between each other is significant:

     sched_clock_stable=0    sched_clock_stable=1
     ....................    ....................
    22.456925 million/sec   24.306972 million/sec   [+8.2%]

    lmbench's "lat_ctx -s 0 2" goes from 0.63 microseconds to
    0.59 microseconds - a 6.7% increase in context-switching
    performance.

    Perfstat of 1 million pipe context switches between two tasks:

    Performance counter stats for './pipe-test-1m':

        [before]      [after]
    ............  ............
    37621.421089  36436.848378   task clock ticks (msecs)

               0             0   CPU migrations (events)
         2000274       2000189   context switches (events)
             194           193   pagefaults (events)
      8433799643    8171016416   CPU cycles (events)        -3.21%
      8370133368    8180999694   instructions (events)      -2.31%
         4158565       3895941   cache references (events)  -6.74%
           44312         46264   cache misses (events)

     2349.287976   2279.362465   wall-time (msecs)          -3.06%

    The speedup comes straight from the reduction in the instruction
    count. sched_clock_cpu() got simpler and the whole workload thus
    executes faster.

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

21 Feb, 2009

1 commit


20 Feb, 2009

1 commit


18 Feb, 2009

2 commits