22 May, 2011

2 commits

  • Apic probe now looks at the apic drivers listed in the
    .apicdrivers section. Remove apic_probe[] and make each apic
    driver static.

    Signed-off-by: Suresh Siddha
    Tested-by: Cyrill Gorcunov
    Cc: steiner@sgi.com
    Cc: gorcunov@openvz.org
    Cc: yinghai@kernel.org
    Link: http://lkml.kernel.org/r/20110521005526.341718626@sbsiddha-MOBL3.sc.intel.com
    Signed-off-by: Ingo Molnar

    Suresh Siddha
     
  • This will pave the way for each apic driver to be self-contained
    and eliminate the need for apic_probe[].

    The order in which apic drivers are listed in the .apicdrivers
    section is important, as it determines the apic probe order.
    This order is enforced by the ordering of apic driver files in
    the Makefile and by the macros apic_driver()/apic_drivers().

    Signed-off-by: Suresh Siddha
    Tested-by: Cyrill Gorcunov
    Cc: steiner@sgi.com
    Cc: gorcunov@openvz.org
    Cc: yinghai@kernel.org
    Link: http://lkml.kernel.org/r/20110521005526.068775085@sbsiddha-MOBL3.sc.intel.com
    Signed-off-by: Ingo Molnar

    Suresh Siddha
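
The probe-order rule described above can be sketched in plain C. This is only an illustration, not the kernel code: an ordinary array stands in for the .apicdrivers linker section, and struct apic_driver, the probe_* helpers, apic_drivers_table and pick_apic_driver() are all hypothetical names. The first driver whose probe succeeds wins, so list position decides the outcome.

```c
#include <stddef.h>

/* Hypothetical stand-in for an apic driver descriptor. */
struct apic_driver {
    const char *name;
    int (*probe)(void);   /* nonzero means "this driver claims the platform" */
};

static int probe_never(void)  { return 0; }
static int probe_always(void) { return 1; }

/*
 * Assumption: array order plays the role that Makefile/link order plays
 * for the real .apicdrivers section.
 */
static const struct apic_driver apic_drivers_table[] = {
    { "specialised",   probe_never  },
    { "default",       probe_always },
    { "never-reached", probe_always },
};

/* Walk the table in order and return the first driver that accepts. */
const struct apic_driver *pick_apic_driver(const struct apic_driver *tbl,
                                           size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].probe())
            return &tbl[i];
    return NULL;
}
```

With this table the second entry is selected even though the third would also accept, which is why a catch-all driver has to come last in the list.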
     

20 May, 2011

5 commits

  • To eliminate code duplication.

    Signed-off-by: Cyrill Gorcunov
    Signed-off-by: Suresh Siddha
    Cc: steiner@sgi.com
    Cc: yinghai@kernel.org
    Link: http://lkml.kernel.org/r/20110519234637.591426753@sbsiddha-MOBL3.sc.intel.com
    Signed-off-by: Ingo Molnar

    Cyrill Gorcunov
     
  • In the case of x2apic cluster mode we can group IPI register
    writes based on the cluster group instead of individual per-cpu
    destination messages.

    This reduces the apic register writes and reduces the amount of
    IPI messages (in the best case we can reduce it by a factor of
    16).

    With this change, the cost of flush_tlb_others(), with the flush
    tlb IPI being sent from a cpu in the socket-1 to all the logical
    cpus in socket-2 (on a Westmere-EX system that has 20 logical
    cpus in a socket) is 3x better now (compared to the former
    'send one-by-one' algorithm).

    Signed-off-by: Cyrill Gorcunov
    Signed-off-by: Suresh Siddha
    Cc: steiner@sgi.com
    Cc: yinghai@kernel.org
    Link: http://lkml.kernel.org/r/20110519234637.512271057@sbsiddha-MOBL3.sc.intel.com
    Signed-off-by: Ingo Molnar

    Cyrill Gorcunov
     
  • In the case of x2apic cluster mode, we can group IPI register
    writes based on the cluster group instead of individual per-cpu
    destination messages.

    For this purpose, track the cpus that belong to the same x2apic
    cluster.

    Signed-off-by: Cyrill Gorcunov
    Signed-off-by: Suresh Siddha
    Cc: steiner@sgi.com
    Cc: yinghai@kernel.org
    Link: http://lkml.kernel.org/r/20110519234637.421800999@sbsiddha-MOBL3.sc.intel.com
    Signed-off-by: Ingo Molnar

    Cyrill Gorcunov
     
  • Signed-off-by: Suresh Siddha
    Acked-by: Cyrill Gorcunov
    Cc: steiner@sgi.com
    Cc: yinghai@kernel.org
    Link: http://lkml.kernel.org/r/20110519234637.337024125@sbsiddha-MOBL3.sc.intel.com
    Signed-off-by: Ingo Molnar

    Suresh Siddha
     
  • Use the unused probe routine in the apic driver to finalize the
    apic model selection. This cleans up the
    default_setup_apic_routing() and this probe routine in future
    can also be used for doing any apic model specific
    initialisation.

    Signed-off-by: Suresh Siddha
    Acked-by: Cyrill Gorcunov
    Cc: steiner@sgi.com
    Cc: yinghai@kernel.org
    Link: http://lkml.kernel.org/r/20110519234637.247458931@sbsiddha-MOBL3.sc.intel.com
    Signed-off-by: Ingo Molnar

    Suresh Siddha
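
For the x2apic cluster-mode IPI grouping described in this day's commits, a rough sketch may help. In x2APIC mode the logical ID is derived from the APIC ID: bits 31:16 hold the cluster (APIC ID >> 4) and bits 15:0 hold a one-hot bit (1 << (APIC ID & 0xf)), so up to 16 CPUs of one cluster can share a single destination value. The function name group_ipi_destinations() and the contiguous APIC IDs used in the note below are assumptions for illustration; nothing here writes a real register.

```c
#include <stddef.h>
#include <stdint.h>

/* Cluster number of an x2APIC ID: the ID with its low 4 bits dropped. */
static uint32_t x2apic_cluster(uint32_t apicid)
{
    return apicid >> 4;
}

/* Logical x2APIC ID: cluster in bits 31:16, one-hot CPU bit in bits 15:0. */
static uint32_t x2apic_logical_id(uint32_t apicid)
{
    return (x2apic_cluster(apicid) << 16) | (1u << (apicid & 0xf));
}

/*
 * Hypothetical helper: merge per-cpu destinations into one value per
 * cluster. 'dests' must have room for n entries (worst case: every CPU
 * sits in its own cluster). Returns the number of destination values,
 * i.e. the number of ICR writes a sender would need.
 */
size_t group_ipi_destinations(const uint32_t *apicids, size_t n,
                              uint32_t *dests)
{
    size_t writes = 0;

    for (size_t i = 0; i < n; i++) {
        uint32_t logical = x2apic_logical_id(apicids[i]);
        size_t j;

        /* OR the CPU bit into an existing entry for the same cluster. */
        for (j = 0; j < writes; j++) {
            if ((dests[j] >> 16) == (logical >> 16)) {
                dests[j] |= logical & 0xffff;
                break;
            }
        }
        if (j == writes)
            dests[writes++] = logical;   /* first CPU seen in this cluster */
    }
    return writes;
}
```

Assuming 20 logical cpus with contiguous APIC IDs 0x20..0x33, the helper produces two destination values instead of 20 individual ones; a full cluster of 16 collapses into a single write, which is the best case mentioned in the commit message.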
     

28 Jan, 2011

2 commits

  • apic->apicid_to_node() is a 32bit specific apic operation which
    determines the NUMA node for a CPU. Depending on the APIC
    implementation, it can be easier to determine NUMA node from
    either physical or logical apicid. Currently,
    ->apicid_to_node() takes @logical_apicid and calls
    hard_smp_processor_id() if the physical apicid is needed.

    This prevents NUMA mapping from being queried from a different
    CPU, which in turn makes it impossible to initialize NUMA
    mapping before SMP bringup.

    This patch replaces apic->apicid_to_node() with
    ->x86_32_numa_cpu_node() which takes @cpu, from which both
    logical and physical apicids can easily be determined. While at
    it, drop duplicate implementations from bigsmp_32 and summit_32,
    and use the default one.

    Signed-off-by: Tejun Heo
    Reviewed-by: Pekka Enberg
    Cc: eric.dumazet@gmail.com
    Cc: yinghai@kernel.org
    Cc: brgerst@gmail.com
    Cc: gorcunov@gmail.com
    Cc: shaohui.zheng@intel.com
    Cc: rientjes@google.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tejun Heo
     
  • After the previous patch, apic->cpu_to_logical_apicid() is no
    longer used. Kill it.

    For apic types with custom cpu_to_logical_apicid() which is also
    used for other purposes, remove the function and modify its
    users to do the mapping directly.

    #ifdef's on CONFIG_SMP in es7000_32 and summit_32 are ignored
    during conversion as they are not used for UP kernels.

    Signed-off-by: Tejun Heo
    Cc: eric.dumazet@gmail.com
    Cc: yinghai@kernel.org
    Cc: brgerst@gmail.com
    Cc: gorcunov@gmail.com
    Cc: penberg@kernel.org
    Cc: shaohui.zheng@intel.com
    Cc: rientjes@google.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Tejun Heo
     

18 Dec, 2009

1 commit

  • John Blackwood reported:
    > on an older Dell PowerEdge 6650 system with 8 cpus (4 are hyper-threaded),
    > and 32 bit (x86) kernel, once you change the irq smp_affinity of an irq
    > to be less than all cpus in the system, you can never change really the
    > irq smp_affinity back to be all cpus in the system (0xff) again,
    > even though no error status is returned on the "/bin/echo ff >
    > /proc/irq/[n]/smp_affinity" operation.
    >
    > This is due to that fact that BAD_APICID has the same value as
    > all cpus (0xff) on 32bit kernels, and thus the value returned from
    > set_desc_affinity() via the cpu_mask_to_apicid_and() function is treated
    > as a failure in set_ioapic_affinity_irq_desc(), and no affinity changes
    > are made.

    set_desc_affinity() is already checking if the incoming cpu mask
    intersects with the cpu online mask or not. So there is no need
    for the apic op cpu_mask_to_apicid_and() to check again
    and return BAD_APICID.

    Remove the BAD_APICID return value from cpu_mask_to_apicid_and()
    and also fix set_desc_affinity() to return -1 instead of using BAD_APICID
    to represent error conditions (as cpu_mask_to_apicid_and() can return
    logical or physical apicid values and BAD_APICID is really to represent
    bad physical apic id).

    Reported-by: John Blackwood
    Root-caused-by: John Blackwood
    Signed-off-by: Suresh Siddha
    LKML-Reference:
    Signed-off-by: H. Peter Anvin

    Suresh Siddha
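
The collision John Blackwood reported can be reproduced in a few lines. This is a simplified sketch, not the kernel functions: old_set_affinity() and new_set_affinity() are hypothetical names, CPU masks are reduced to a single byte, and the logical APIC ID is assumed to be the mask itself (flat logical mode with 8 cpus).

```c
#include <stdint.h>

#define BAD_APICID 0xFFu   /* value on 32bit kernels, per the report above */

/* Assumption: flat logical mode, one destination bit per CPU. */
static unsigned int mask_to_logical_apicid(uint8_t cpu_mask)
{
    return cpu_mask;
}

/* Old style: the error is signalled in-band through the returned ID. */
unsigned int old_set_affinity(uint8_t requested, uint8_t online)
{
    if (!(requested & online))
        return BAD_APICID;
    return mask_to_logical_apicid(requested & online);
}

/* New style: 0 or -1 as status, destination returned through a pointer. */
int new_set_affinity(uint8_t requested, uint8_t online, unsigned int *dest)
{
    if (!(requested & online))
        return -1;
    *dest = mask_to_logical_apicid(requested & online);
    return 0;
}
```

With all 8 cpus online, a request for mask 0xff makes the old function return 0xff, which a caller cannot tell apart from BAD_APICID. The new function reports success separately and still hands back 0xff as a valid destination.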
     

08 Aug, 2009

1 commit

  • Found a system where x2apic reports an MSI-X irq initialization
    failure:

    [ 302.859446] igbvf 0000:81:10.4: enabling device (0000 -> 0002)
    [ 302.874369] igbvf 0000:81:10.4: using 64bit DMA mask
    [ 302.879023] igbvf 0000:81:10.4: using 64bit consistent DMA mask
    [ 302.894386] igbvf 0000:81:10.4: enabling bus mastering
    [ 302.898171] igbvf 0000:81:10.4: setting latency timer to 64
    [ 302.914050] reserve_memtype added 0xefb08000-0xefb0c000, track uncached-minus, req uncached-minus, ret uncached-minus
    [ 302.933839] reserve_memtype added 0xefb28000-0xefb29000, track uncached-minus, req uncached-minus, ret uncached-minus
    [ 302.940367] alloc irq_desc for 265 on node 4
    [ 302.956874] alloc kstat_irqs on node 4
    [ 302.959452] alloc irq_2_iommu on node 0
    [ 302.974328] igbvf 0000:81:10.4: irq 265 for MSI/MSI-X
    [ 302.977778] alloc irq_desc for 266 on node 4
    [ 302.980347] alloc kstat_irqs on node 4
    [ 302.995312] free_memtype request 0xefb28000-0xefb29000
    [ 302.998816] igbvf 0000:81:10.4: Failed to initialize MSI-X interrupts.

    ... it turns out that when trying to enable MSI-X,
    __assign_irq_vector(new, cfg_new, apic->target_cpus()) cannot
    get a vector, because for x2apic target_cpus() returns
    cpumask_of(0).

    Update that to the online mask, like xapic.

    Signed-off-by: Yinghai Lu
    Acked-by: Suresh Siddha
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Yinghai Lu
     

04 Aug, 2009

1 commit

  • One system has socket 1 come up as BSP.

    kexeced kernel reports BSP as:

    [ 1.524550] Initializing cgroup subsys cpuacct
    [ 1.536064] initial_apicid:20
    [ 1.537135] ht_mask_width:1
    [ 1.538128] core_select_mask:f
    [ 1.539126] core_plus_mask_width:5
    [ 1.558479] CPU: Physical Processor ID: 0
    [ 1.559501] CPU: Processor Core ID: 0
    [ 1.560539] CPU: L1 I cache: 32K, L1 D cache: 32K
    [ 1.579098] CPU: L2 cache: 256K
    [ 1.580085] CPU: L3 cache: 24576K
    [ 1.581108] CPU 0/0x20 -> Node 0
    [ 1.596193] CPU 0 microcode level: 0xffff0008

    It doesn't have correct physical processor id and will get an
    error:

    [ 38.840859] CPU0 attaching sched-domain:
    [ 38.848287] domain 0: span 0,8,72 level SIBLING
    [ 38.851151] groups: 0 8 72
    [ 38.858137] domain 1: span 0,8-15,72-79 level MC
    [ 38.868944] groups: 0,8,72 9,73 10,74 11,75 12,76 13,77 14,78 15,79
    [ 38.881383] ERROR: parent span is not a superset of domain->span
    [ 38.890724] domain 2: span 0-7,64-71 level CPU
    [ 38.899237] ERROR: domain->groups does not contain CPU0
    [ 38.909229] groups: 8-15,72-79
    [ 38.912547] ERROR: groups don't span domain->span
    [ 38.919665] domain 3: span 0-127 level NODE
    [ 38.930739] groups: 0-7,64-71 8-15,72-79 16-23,80-87 24-31,88-95 32-39,96-103 40-47,104-111 48-55,112-119 56-63,120-127

    It turns out we cannot use current_cpu_data in phys_pkg_id
    for x2apic.

    identify_boot_cpu() is called by check_bugs() before
    smp_prepare_cpus() and till smp_prepare_cpus() current_cpu_data
    for bsp is assigned with boot_cpu_data.

    Just align phys_pkg_id for x2apic with the xapic version.

    Signed-off-by: Yinghai Lu
    Acked-by: Suresh Siddha
    Cc: Andrew Morton
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Yinghai Lu
     

12 Apr, 2009

1 commit


18 Mar, 2009

1 commit

  • Impact: optimize APIC IPI related barriers

    Uncached MMIO accesses for xapic are inherently serializing and hence
    we don't need explicit barriers for xapic IPI paths.

    x2apic MSR writes/reads don't have serializing semantics and hence need
    a serializing instruction or mfence, to make all the previous memory
    stores globally visible before the x2apic msr write for IPI.

    Add x2apic_wrmsr_fence() in flush tlb path to x2apic specific paths.

    Signed-off-by: Suresh Siddha
    Cc: Peter Zijlstra
    Cc: Oleg Nesterov
    Cc: Jens Axboe
    Cc: Linus Torvalds
    Cc: "Paul E. McKenney"
    Cc: Rusty Russell
    Cc: Steven Rostedt
    Cc: "steiner@sgi.com"
    Cc: Nick Piggin
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Suresh Siddha
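
The ordering requirement above (publish the data, fence, then write the x2apic MSR) can be sketched in user-space C. Everything here is a stand-in: fake_wrmsr_icr() only records a value instead of executing WRMSR, wrmsr_fence() uses the GCC/Clang builtin __atomic_thread_fence() where the commit message calls for a serializing instruction or mfence, and the variable names are hypothetical.

```c
#include <stdint.h>

static uint32_t shared_request;   /* data the IPI target is expected to read */
static uint32_t fake_icr;         /* stand-in for the x2APIC ICR MSR */

/* Stand-in for the MSR write; a real kernel would execute WRMSR here. */
static void fake_wrmsr_icr(uint32_t val)
{
    fake_icr = val;
}

/* Assumption: a full fence stands in for x2apic_wrmsr_fence(). */
static void wrmsr_fence(void)
{
    __atomic_thread_fence(__ATOMIC_SEQ_CST);
}

/* Publish the request, fence, and only then trigger the (fake) IPI. */
void send_ipi_after_publish(uint32_t request, uint32_t dest)
{
    shared_request = request;   /* plain memory store */
    wrmsr_fence();              /* make the store visible before the IPI */
    fake_wrmsr_icr(dest);       /* non-serializing write in the real case */
}
```

In the xapic case the uncached MMIO write already serializes, so the fence step would be unnecessary there; the sketch only models the x2apic path.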
     

26 Feb, 2009

2 commits

  • Impact: cleanup

    - rename apic->wakeup_cpu to apic->wakeup_secondary_cpu, to
    make it apparent that this is an SMP-only method

    - handle NULL ->wakeup_secondary_cpu to mean the default INIT
    wakeup sequence - this allows simplification of the APIC
    driver templates.

    Cc: Yinghai Lu
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • Impact: cleanup

    x86_quirks->update_apic() calling looks crazy. So try to remove it:

    1. every apic takes wakeup_cpu member directly
    2. separate es7000_apic to es7000_apic_cluster
    3. use uv_wakeup_cpu directly

    Signed-off-by: Yinghai Lu
    Signed-off-by: Ingo Molnar

    Yinghai Lu
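
The "NULL ->wakeup_secondary_cpu means the default INIT wakeup sequence" convention from the cleanups above can be sketched as an optional callback. This is an illustration under assumptions: struct apic_ops, the wake_path enum, the two example drivers and wakeup_cpu() are hypothetical names, and the wakeup routines only report which path was taken.

```c
#include <stddef.h>

enum wake_path { WAKE_DEFAULT_INIT = 1, WAKE_CUSTOM = 2 };

/* Hypothetical, trimmed-down driver template with one optional hook. */
struct apic_ops {
    const char *name;
    int (*wakeup_secondary_cpu)(int apicid, unsigned long start_eip);
};

/* Stand-in for the default INIT wakeup sequence. */
static int default_init_wakeup(int apicid, unsigned long start_eip)
{
    (void)apicid;
    (void)start_eip;
    return WAKE_DEFAULT_INIT;
}

/* Stand-in for a platform-specific wakeup routine. */
static int custom_wakeup(int apicid, unsigned long start_eip)
{
    (void)apicid;
    (void)start_eip;
    return WAKE_CUSTOM;
}

/* A driver that leaves the hook NULL gets the default sequence. */
static const struct apic_ops plain_apic  = { "plain",  NULL };
static const struct apic_ops custom_apic = { "custom", custom_wakeup };

int wakeup_cpu(const struct apic_ops *apic, int apicid,
               unsigned long start_eip)
{
    if (apic->wakeup_secondary_cpu)
        return apic->wakeup_secondary_cpu(apicid, start_eip);
    return default_init_wakeup(apicid, start_eip);
}
```

Drivers that need nothing special simply omit the member, which is what allows the driver templates to shrink.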
     

23 Feb, 2009

1 commit

  • If BIOS hands over the control to OS in legacy xapic mode, select
    legacy xapic related ops in the early apic probe and shift to x2apic
    ops later in the boot sequence, only after enabling x2apic mode.

    If BIOS hands over the control in x2apic mode, select x2apic related
    ops in the early apic probe.

    This fixes the early boot panic, where we were selecting x2apic ops,
    while the cpu is still in legacy xapic mode.

    Signed-off-by: Suresh Siddha
    Cc: Yinghai Lu
    Cc: Andrew Morton
    Signed-off-by: Ingo Molnar

    Suresh Siddha
     

18 Feb, 2009

1 commit

  • arch/x86/kernel/ is getting a bit crowded, and the APIC
    drivers are scattered into various different files.

    Move them to arch/x86/kernel/apic/*, and also remove
    the 'gen' prefix from those which had it.

    Also move APIC related functionality: the IO-APIC driver,
    the NMI and the IPI code.

    Signed-off-by: Ingo Molnar

    Ingo Molnar