06 Feb, 2006

2 commits

  • percpu_data blindly allocates bootmem memory to store NR_CPUS instances of
    cpudata, instead of allocating memory only for possible cpus.

    As a preparation for changing that, we need to convert various 0 -> NR_CPUS
    loops to use for_each_cpu().

    (The above only applies to users of asm-generic/percpu.h. powerpc has gone it
    alone and is presently only allocating memory for present CPUs, so it's
    currently corrupting memory).

    Signed-off-by: Eric Dumazet
    Cc: "David S. Miller"
    Cc: James Bottomley
    Acked-by: Ingo Molnar
    Cc: Jens Axboe
    Cc: Anton Blanchard
    Acked-by: William Irwin
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
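
The loop conversion this commit prepares for can be sketched in userspace roughly as follows. This is an illustration only, not the kernel's implementation: `cpu_possible_mask`, `counters`, the two `init_*` helpers and the value of `NR_CPUS` are assumptions made for the example, and the `for_each_cpu()` macro here is a simplified imitation of the kernel macro of the same name.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 8                 /* compile-time maximum; value chosen for illustration */

/* Hypothetical stand-in for the possible-CPU map: bit n set means CPU n may exist. */
static uint32_t cpu_possible_mask = 0x0b;   /* CPUs 0, 1 and 3 */

static int cpu_possible(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    return (int)((cpu_possible_mask >> cpu) & 1u);
}

/* Simplified imitation of for_each_cpu(): run the body only for possible CPUs. */
#define for_each_cpu(cpu) \
    for ((cpu) = 0; (cpu) < NR_CPUS; (cpu)++) \
        if (cpu_possible(cpu))

static int counters[NR_CPUS];     /* pretend per-cpu data */

/* Old style: walks 0 -> NR_CPUS and touches every slot. */
static int init_all_slots(void)
{
    int cpu, touched = 0;

    for (cpu = 0; cpu < NR_CPUS; cpu++) {
        counters[cpu] = 0;
        touched++;
    }
    return touched;
}

/* Converted style: touches only slots that belong to possible CPUs. */
static int init_possible_slots(void)
{
    int cpu, touched = 0;

    for_each_cpu(cpu) {
        counters[cpu] = 0;
        touched++;
    }
    return touched;
}
```

With the mask above, the old loop visits all 8 slots while the converted loop visits 3, which is what makes it safe to stop allocating storage for impossible CPUs.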
     
  • This reverts commit 10f4dc8b27ac42f930ac55adb8c521264dc997f8.

    Quoth Andi Kleen:
    "Kiran decided that it makes the problem worse than it was before.
    Fixing it fully requires more work which is too much for 2.6.16. So
    please revert that commit for now."

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

05 Feb, 2006

28 commits

  • This patch contains a printk reorder to remove the current problem of
    displaying "PCI-DMA: Disabling IOMMU." and then "PCI-DMA: using GART
    IOMMU" 20 lines later in dmesg.

    It also contains a printk reorder in swiotlb to state swiotlb
    enablement prior to describing the location of the bounce buffers, and a
    printk reorder to state gart enablement prior to describing the
    aperture.

    Also contains a whitespace cleanup in arch/x86_64/kernel/setup.c

    Tested (along with patch 2/2) on dual opteron with gart enabled,
    iommu=soft, and iommu=off.

    Signed-off-by: Jon Mason
    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Jon Mason
     
  • Hack for 2.6.16. In 2.6.17 all code that uses NR_CPUS should
    be audited and changed to only touch possible CPUs.

    Don't mark the reference per-cpu data as init data (so it stays
    around after boot) and point all impossible CPUs to it. This way
    they reference some valid, although shared, memory. Usually
    this is only initialization like INIT_LIST_HEADs and there
    won't be races because these CPUs never run. Still somewhat hackish.

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
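
The idea of pointing impossible CPUs at a shared reference copy can be sketched as below. This is a minimal userspace illustration under stated assumptions, not the kernel code: `struct cpu_data`, `reference_copy`, `storage`, `cpu_data_ptr`, `setup_cpu_data()` and `init_all()` are all hypothetical names, and the sizes are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS     8   /* compile-time maximum; illustrative value */
#define NR_POSSIBLE 2   /* hypothetical number of CPUs that can really exist */

struct cpu_data {
    int initialized;
};

static struct cpu_data reference_copy;          /* kept after boot, shared by impossible CPUs */
static struct cpu_data storage[NR_POSSIBLE];    /* private copies for possible CPUs */
static struct cpu_data *cpu_data_ptr[NR_CPUS];

/* Give each possible CPU its own copy; point every impossible CPU at the shared one. */
static void setup_cpu_data(uint32_t possible_mask)
{
    int next = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if ((possible_mask >> cpu) & 1u) {
            assert(next < NR_POSSIBLE);
            cpu_data_ptr[cpu] = &storage[next++];
        } else {
            cpu_data_ptr[cpu] = &reference_copy;
        }
    }
}

/* Unaudited code that still walks 0 -> NR_CPUS now always lands on valid memory. */
static void init_all(void)
{
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        cpu_data_ptr[cpu]->initialized = 1;
}
```

Writes for impossible CPUs all hit the same shared object, which is harmless as long as those writes are plain initialization, as the commit message argues.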
     
  • It's bad juju to touch the APIC when it hasn't been enabled.
    I also moved ack_bad_irq for x86-64 out of line following i386.

    Signed-off-by: Andi Kleen
    Acked-by: Ingo Molnar
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • Some broken BIOSes list disabled processors with the
    same APIC ID as a valid processor. This causes
    acpi_processor_start() to think the disabled
    CPU is OK, and croak. So we don't record bad
    APIC IDs anymore.

    http://bugzilla.kernel.org/show_bug.cgi?id=5930

    Signed-off-by: Ashok Raj
    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Ashok Raj
     
  • Checking of the validity of pointers should be consistently done before
    dereferencing the pointer.

    Signed-off-by: Jan Beulich
    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Jan Beulich
     
  • Conditionalize two unwind directives to match other similarly
    conditional code.

    Signed-off-by: Jan Beulich
    Cc: Jim Houston
    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Jan Beulich
     
  • On some broken motherboards (at least one NForce3 based AMD64 laptop)
    the PIT timer runs at an incorrect frequency. This patch adds a new
    option "apicpmtimer" that allows using the APIC timer and calibrating it
    with the PMTimer. It requires the earlier patch that allows running the
    main timer from the APIC.

    Specifying apicpmtimer implies apicmaintimer.

    The option defaults to off for now.

    I tested it on a few systems and the resulting APIC timer frequencies
    were usually a bit off, but always
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • kprobes cannot deal with the funny calling conventions when it
    runs on a different stack when it returns. If someone wants
    to instrument context switch they can add a probe to schedule()
    instead.

    Cc: jkenisto@us.ibm.com, prasanna@in.ibm.com

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • Align the start of the per-cpu section to the configured number of bytes in a
    cache line. This stops a BUG_ON() from triggering in load_module() when
    DEFINE_PER_CPU() is used in a module and the section isn't cacheline-aligned.
    Rusty also found this and sent a patch in a while ago
    (http://lkml.org/lkml/2004/10/19/17), I don't know what came of that.

    Signed-off-by: Zach Brown
    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Zach Brown
     
  • [ AK: I redid Kevin's fix to be simpler, but the idea and original
    analysis of the problem is from Kevin]

    This avoids allocation failures on some SATA systems like Nvidia CK8
    when the IOMMU gets fragmented. Modern SATA devices have quite large queues
    (128 entries) and the FS with ext2/3 is good enough now that it often
    passes whole 128 page sg lists down to the driver. These require
    512K of contiguous free space in the IOMMU aperture to map when merged.
    When the IOMMU is fragmented this could lead to spurious IO errors
    due to failing mappings.

    Short term fix is to just try to map the SG list again unmerged
    page by page - this way fragmentation doesn't matter anymore.
    The code for that was already there, but it just wasn't enabled for the
    merge case.

    According to Kevin at least the Nvidia device doesn't seem to benefit
    from merging much anyways, so the only slowdown is from trying
    to do an unnecessary merge attempt.

    Kevin plans to implement better fragmentation avoidance in the future,
    but that wouldn't be 2.6.16 material.

    TBD: should add some statistic counters to count how often that really
    happens.

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Kevin VanMaren
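
The fallback described in this commit (try one merged mapping, then retry page by page if the aperture is too fragmented) can be sketched with a toy allocator. Everything here is an assumption made for illustration: `struct aperture`, `alloc_contiguous()`, `count_free()`, `map_sg()` and `PAGE_BYTES` are hypothetical and do not correspond to the kernel's GART code, and a 4K page size is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_BYTES     4096u   /* assumed page size */
#define APERTURE_PAGES 256u    /* toy aperture size */

/* Toy aperture: one flag per page, 1 = already mapped. */
struct aperture {
    unsigned char used[APERTURE_PAGES];
    size_t npages;
};

/* Claim `count` contiguous free pages; return the start index or -1. */
static long alloc_contiguous(struct aperture *ap, size_t count)
{
    size_t run = 0;

    assert(count > 0 && ap->npages <= APERTURE_PAGES);
    for (size_t i = 0; i < ap->npages; i++) {
        run = ap->used[i] ? 0 : run + 1;
        if (run == count) {
            size_t start = i + 1 - count;
            for (size_t j = start; j <= i; j++)
                ap->used[j] = 1;
            return (long)start;
        }
    }
    return -1;
}

static size_t count_free(const struct aperture *ap)
{
    size_t n = 0;

    for (size_t i = 0; i < ap->npages; i++)
        n += !ap->used[i];
    return n;
}

/* Map an sg list of `nents` pages; return the number of mappings made, 0 on failure. */
static size_t map_sg(struct aperture *ap, size_t nents)
{
    /* First attempt: a single merged mapping needing nents contiguous pages. */
    if (alloc_contiguous(ap, nents) >= 0)
        return 1;

    /* Fallback: unmerged, page by page, so fragmentation no longer matters. */
    if (count_free(ap) < nents)
        return 0;
    for (size_t i = 0; i < nents; i++) {
        long page = alloc_contiguous(ap, 1);
        assert(page >= 0);      /* guaranteed by the count_free() check above */
    }
    return nents;
}
```

A 128-entry list of 4K pages is 128 * 4096 = 512K, matching the figure in the commit message. In a toy aperture where every other page is taken, the merged attempt fails but the page-by-page retry still succeeds.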
     
  • I broke this earlier when moving the patch from i386 to x86-64.
    Need to return the virtual address here, not the physical address.
    This fixes some boot time crashes on x86-64.

    Cc: gregkh@suse.de

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • - Check if the processor/memory affinity entries are long enough
    according to the ACPI 3.0 spec.
    - Ignore memory affinity entries that define a zero length region.

    All based on BIOS issues found in the field @)

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • The attached patch covers 2 more cases I found by running the
    reference_init.pl script. These were easy to spot just knowing the file
    names. There is another one in init/main.c that I can't exactly zero in on
    (partly because I don't know how to interpret the data that's spewed out of
    the tool).

    Signed-off-by: Ashok Raj
    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Ashok Raj
     
  • Signed-off-by: Shaohua Li
    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Shaohua Li
     
  • Might fix boot failures on systems with empty PXMs in SRAT

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • > mm/mempolicy.c: In function `huge_zonelist':
    > mm/mempolicy.c:1045: error: `HPAGE_SHIFT' undeclared (first use in this function)
    > mm/mempolicy.c:1045: error: (Each undeclared identifier is reported only once
    > mm/mempolicy.c:1045: error: for each function it appears in.)
    > make[1]: *** [mm/mempolicy.o] Error 1

    Need to wrap huge_zonelist function with CONFIG_HUGETLBFS.

    Signed-off-by: Ken Chen
    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Chen, Kenneth W
     
  • Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • It has been enabled by default for some time now and is cheap enough
    so it doesn't matter anyways.

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • Currently, x86_64 and ia64 arches do not clear the corresponding bits
    in the node's cpumask when a cpu goes down or cpu bring up is cancelled.
    This is buggy since there are pieces of common code where the cpumask is
    checked in the cpu down code path to decide on things (like in the slab
    down path). PPC does the right thing, but x86_64 and ia64 don't (This
    was the reason Sonny hit upon a slab bug during cpu offline on ppc and
    could not reproduce on other arches). This patch fixes it for x86_64.
    I won't attempt ia64 as I cannot test it.

    Credit for spotting this should go to Alok.

    Signed-off-by: Alok N Kataria
    Signed-off-by: Ravikiran Thirumalai
    Signed-off-by: Shai Fultheim
    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Ravikiran G Thirumalai
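
The bookkeeping this fix is about can be sketched as follows: when a CPU goes down, its bit must also be cleared from its node's cpumask, so that common code checking the mask sees the right state. This is a minimal userspace sketch; `node_cpumask`, `cpu_node`, `numa_add_cpu()`, `numa_remove_cpu()` and `node_has_cpus()` are hypothetical names for illustration and are not claimed to match the kernel's symbols.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS   32    /* illustrative limits */
#define MAX_NODES 4

static uint32_t node_cpumask[MAX_NODES];   /* bit n set: CPU n is online on this node */
static int cpu_node[NR_CPUS];              /* which node each CPU belongs to */

/* CPU bring-up: record the node and set the CPU's bit in that node's mask. */
static void numa_add_cpu(int cpu, int node)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    assert(node >= 0 && node < MAX_NODES);
    cpu_node[cpu] = node;
    node_cpumask[node] |= (uint32_t)1 << cpu;
}

/* CPU down (or cancelled bring-up): the step the commit says was missing. */
static void numa_remove_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    node_cpumask[cpu_node[cpu]] &= ~((uint32_t)1 << cpu);
}

/* The kind of check common code makes in the CPU-down path. */
static int node_has_cpus(int node)
{
    assert(node >= 0 && node < MAX_NODES);
    return node_cpumask[node] != 0;
}
```

Without the call to `numa_remove_cpu()`, `node_has_cpus()` would keep reporting CPUs on a node after the last one had gone offline.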
     
  • They cause quite bad performance regressions on Netburst.
    This is temporary until we can get new optimized functions
    for these CPUs.

    This undoes changes that were done in 2.6.15 and in 2.6.16-rc1,
    essentially bringing the code back to 2.6.14 level. Only change
    is I renamed the X86_FEATURE_K8_C flag to X86_FEATURE_REP_GOOD
    and fixed the check for the flag and also fixed some comments.

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • This avoids BUG_ONs in the low level allocator when an illegal
    GFP mask is added.

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • At resume time, the TSC's value or something similar might have changed a
    lot compared to suspend time. This could make the system see a very large
    number of lost ticks.
    See http://bugzilla.kernel.org/show_bug.cgi?id=5825

    Signed-off-by: Shaohua Li
    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Shaohua Li
     
  • They all have problems with IRQ 0 routing, so just use the APIC on them.

    Can be overridden with "noapicmaintimer".

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • Another piece from the no-idle-tick patch.

    This can be enabled with the "apicmaintimer" option.

    This is mainly useful when the PIT/HPET interrupt is unreliable.
    Note there are some systems that are known to stop the APIC
    timer in C3. For those it will never work, but this case
    should be automatically detected.

    It also only works with PM timer right now. When HPET is used
    the way the main timer handler computes the delay doesn't work.

    It should be a bit more efficient because there is one less
    regular interrupt to process on the boot processor.

    Requires earlier bugfix from Venkatesh

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • Bug in apic timer removal on C3 patch. We should switch to IPI from APIC timer
    only when C3 state is valid.

    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Venkatesh Pallipadi
     
  • Avoids some ifdef mess later.

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • A kprobe executes IRET early and that could cause NMI recursion
    and stack corruption.

    Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
     
  • Signed-off-by: Andi Kleen
    Signed-off-by: Linus Torvalds

    Andi Kleen
     

04 Feb, 2006

10 commits