09 May, 2007

13 commits

  • i386:

    Rearrange the cmpxchg code to allow atomic.h to get it without needing to
    include system.h. This kills warnings in the UML build from atomic.h about
    implicit declarations of cmpxchg symbols. The i386 build presumably isn't
    seeing this because a separate inclusion of system.h is covering it over.

    The cmpxchg stuff is moved to asm-i386/cmpxchg.h, with an include left in
    system.h for the benefit of generic code which expects cmpxchg there.

    Meanwhile, atomic.h includes cmpxchg.h.

    This causes no noticeable damage to the i386 build.

    x86_64:

    Move cmpxchg into its own header. atomic.h already included system.h, so
    this is changed to include cmpxchg.h.

    This is purely cleanup - it's not fixing any warnings - so if the x86_64
    system.h isn't considered as cleanup-worthy as i386, then this can be
    dropped.

    It causes no noticeable damage to the x86_64 build.

    uml:

    The i386 and x86_64 cmpxchg patches require an asm-um/cmpxchg.h for the
    UML build.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • tas() has no users, so get rid of it.

    Signed-off-by: Jeff Dike
    Cc: Paolo 'Blaisorblade' Giarrusso
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • Signed-off-by: Mathieu Desnoyers
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mathieu Desnoyers
     
  • atomic_add_unless as inline. Remove system.h atomic.h circular dependency.
    I agree (with Andi Kleen) this typeof is not needed and more error
    prone. All the original atomic.h code that uses cmpxchg (which includes
    the atomic_add_unless) uses defines instead of inline functions,
    probably to circumvent a circular dependency between system.h and
    atomic.h on powerpc (which my patch addresses). Therefore, it makes
    sense to use inline functions that will provide type checking.

    atomic_add_unless as inline. Remove system.h atomic.h circular dependency.
    Digging into the FRV architecture shows me that it is also affected by
    such a circular dependency. Here is the diff applying this against the
    rest of my atomic.h patches.

    It applies over the atomic.h standardization patches.
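
    As a rough user-space illustration of the idea (not the kernel's code), an
    add-unless operation can be written as a type-checked inline function around
    a compare-and-exchange loop. The sketch below swaps in C11 <stdatomic.h> for
    the kernel's cmpxchg; demo_atomic_t and demo_atomic_add_unless are
    hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for the kernel's atomic_t. */
typedef struct { atomic_int counter; } demo_atomic_t;

/*
 * Add `a` to `v` unless `v` currently equals `u`.
 * Returns true if the addition happened, false if `v` was equal to `u`.
 * Being a real function, the argument types are checked by the compiler.
 */
static inline bool demo_atomic_add_unless(demo_atomic_t *v, int a, int u)
{
    int old = atomic_load(&v->counter);

    while (old != u) {
        /* On failure, `old` is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&v->counter, &old, old + a))
            return true;
    }
    return false;
}
```

    Passing anything other than a demo_atomic_t pointer is rejected at compile
    time, which is the type-checking benefit described above.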

    Signed-off-by: Mathieu Desnoyers
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mathieu Desnoyers
     
  • Signed-off-by: Mathieu Desnoyers
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mathieu Desnoyers
     
  • Implement utimensat(2), which extends futimesat(2) in that it

    a) supports nanosecond resolution for the timestamps
    b) allows selectively ignoring the atime/mtime value
    c) allows selectively using the current time for either atime or mtime
    d) supports changing the atime/mtime of a symlink itself, along the lines
    of the BSD lutimes(3) function

    For this change the internally used do_utimes() function was changed to
    accept a timespec time value and an additional flags parameter.

    Additionally, the sys_utime function was changed to match compat_sys_utime,
    which already uses do_utimes instead of duplicating the work.

    Also, the completely missing futimensat() functionality is added. We have
    such a function in glibc but we have to resort to using /proc/self/fd/* which
    not everybody likes (chroot etc).

    Test application (the syscall number will need per-arch editing):

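    #define _GNU_SOURCE   /* for struct stat64, fstat64(), lstat64() */
    #include <errno.h>
    #include <error.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <sys/syscall.h>
    #include <sys/time.h>
    #include <time.h>
    #include <unistd.h>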

    #define __NR_utimensat 280

    #define UTIME_NOW ((1l << 30) - 1l)
    #define UTIME_OMIT ((1l << 30) - 2l)

    int
    main(void)
    {
    int status = 0;

    int fd = open("ttt", O_RDWR|O_CREAT|O_EXCL, 0666);
    if (fd == -1)
    error (1, errno, "failed to create test file \"ttt\"");

    struct stat64 st1;
    if (fstat64 (fd, &st1) != 0)
    error (1, errno, "fstat failed");

    struct timespec t[2];
    t[0].tv_sec = 0;
    t[0].tv_nsec = 0;
    t[1].tv_sec = 0;
    t[1].tv_nsec = 0;
    if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

    struct stat64 st2;
    if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

    if (st2.st_atim.tv_sec != 0 || st2.st_atim.tv_nsec != 0)
    {
    puts ("atim not reset to zero");
    status = 1;
    }
    if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
    {
    puts ("mtim not reset to zero");
    status = 1;
    }
    if (status != 0)
    goto out;

    t[0] = st1.st_atim;
    t[1].tv_sec = 0;
    t[1].tv_nsec = UTIME_OMIT;
    if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

    if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

    if (st2.st_atim.tv_sec != st1.st_atim.tv_sec
    || st2.st_atim.tv_nsec != st1.st_atim.tv_nsec)
    {
    puts ("atim not set");
    status = 1;
    }
    if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
    {
    puts ("mtim changed from zero");
    status = 1;
    }
    if (status != 0)
    goto out;

    t[0].tv_sec = 0;
    t[0].tv_nsec = UTIME_OMIT;
    t[1] = st1.st_mtim;
    if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

    if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

    if (st2.st_atim.tv_sec != st1.st_atim.tv_sec
    || st2.st_atim.tv_nsec != st1.st_atim.tv_nsec)
    {
    puts ("atim changed from original time");
    status = 1;
    }
    if (st2.st_mtim.tv_sec != st1.st_mtim.tv_sec
    || st2.st_mtim.tv_nsec != st1.st_mtim.tv_nsec)
    {
    puts ("mtim not set");
    status = 1;
    }
    if (status != 0)
    goto out;

    sleep (2);

    t[0].tv_sec = 0;
    t[0].tv_nsec = UTIME_NOW;
    t[1].tv_sec = 0;
    t[1].tv_nsec = UTIME_NOW;
    if (syscall(__NR_utimensat, AT_FDCWD, "ttt", t, 0) != 0)
    error (1, errno, "utimensat failed");

    if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

    struct timeval tv;
    gettimeofday(&tv,NULL);

    if (st2.st_atim.tv_sec <= st1.st_atim.tv_sec || st2.st_atim.tv_sec > tv.tv_sec)
    {
    puts ("atim not set to NOW");
    status = 1;
    }
    if (st2.st_mtim.tv_sec <= st1.st_mtim.tv_sec || st2.st_mtim.tv_sec > tv.tv_sec)
    {
    puts ("mtim not set to NOW");
    status = 1;
    }

    if (symlink ("ttt", "tttsym") != 0)
    error (1, errno, "cannot create symlink");

    t[0].tv_sec = 0;
    t[0].tv_nsec = 0;
    t[1].tv_sec = 0;
    t[1].tv_nsec = 0;
    if (syscall(__NR_utimensat, AT_FDCWD, "tttsym", t, AT_SYMLINK_NOFOLLOW) != 0)
    error (1, errno, "utimensat failed");

    if (lstat64 ("tttsym", &st2) != 0)
    error (1, errno, "lstat failed");

    if (st2.st_atim.tv_sec != 0 || st2.st_atim.tv_nsec != 0)
    {
    puts ("symlink atim not reset to zero");
    status = 1;
    }
    if (st2.st_mtim.tv_sec != 0 || st2.st_mtim.tv_nsec != 0)
    {
    puts ("symlink mtim not reset to zero");
    status = 1;
    }
    if (status != 0)
    goto out;

    t[0].tv_sec = 1;
    t[0].tv_nsec = 0;
    t[1].tv_sec = 1;
    t[1].tv_nsec = 0;
    if (syscall(__NR_utimensat, fd, NULL, t, 0) != 0)
    error (1, errno, "utimensat failed");

    if (fstat64 (fd, &st2) != 0)
    error (1, errno, "fstat failed");

    if (st2.st_atim.tv_sec != 1 || st2.st_atim.tv_nsec != 0)
    {
    puts ("atim not reset to one");
    status = 1;
    }
    if (st2.st_mtim.tv_sec != 1 || st2.st_mtim.tv_nsec != 0)
    {
    puts ("mtim not reset to one");
    status = 1;
    }

    if (status == 0)
    puts ("all OK");

    out:
    close (fd);
    unlink ("ttt");
    unlink ("tttsym");

    return status;
    }
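
    For comparison, on a C library that exposes a utimensat() wrapper the raw
    syscall() invocation and the hand-written UTIME_* defines are not needed.
    A minimal sketch under that assumption; demo_set_mtime_only is a
    hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>      /* AT_FDCWD */
#include <sys/stat.h>   /* utimensat(), UTIME_OMIT */
#include <time.h>       /* struct timespec, time_t */

/*
 * Set only the mtime of `path` to `sec` seconds since the epoch and
 * leave the atime untouched. Returns 0 on success, -1 on error.
 */
static int demo_set_mtime_only(const char *path, time_t sec)
{
    struct timespec ts[2];

    ts[0].tv_sec = 0;
    ts[0].tv_nsec = UTIME_OMIT;   /* atime: leave as it is */
    ts[1].tv_sec = sec;
    ts[1].tv_nsec = 0;            /* mtime: explicit value */

    return utimensat(AT_FDCWD, path, ts, 0);
}
```

    Passing AT_SYMLINK_NOFOLLOW as the last argument instead of 0 would act on
    a symlink itself, matching item d) of the description.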

    [akpm@linux-foundation.org: add missing i386 syscall table entry]
    Signed-off-by: Ulrich Drepper
    Cc: Alexey Dobriyan
    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ulrich Drepper
     
  • Eliminate 19439 (!!) sparse warnings like:
    include/linux/mm.h:321:22: warning: constant 0xffff810000000000 is so big it is unsigned long

    Eliminate 56 sparse warnings like:
    arch/x86_64/kernel/setup.c:248:16: warning: constant 0xffffffff80000000 is so big it is unsigned long

    Eliminate 5 sparse warnings like:
    arch/x86_64/kernel/module.c:49:13: warning: constant 0xfffffffffff00000 is so big it is unsigned long

    Eliminate 23 sparse warnings like:
    arch/x86_64/mm/init.c:551:37: warning: constant 0xffffc20000000000 is so big it is unsigned long

    Eliminate 6 sparse warnings like:
    arch/x86_64/kernel/module.c:49:13: warning: constant 0xffffffff88000000 is so big it is unsigned long

    Eliminate 23 sparse warnings like:
    arch/x86_64/mm/init.c:552:6: warning: constant 0xffffe1ffffffffff is so big it is unsigned long

    Eliminate 3 sparse warnings like:
    arch/x86_64/kernel/e820.c:186:17: warning: constant 0x3fffffffffff is so big it is long

    Signed-off-by: Randy Dunlap
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Make a global linux/const.h header file instead of having multiple,
    per-arch files, and convert current users of asm/const.h to use
    linux/const.h.

    Built on x86_64 and sparc64.
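
    A shared constants header is useful because the same address constant is
    often needed in both C and assembly: C wants a type suffix such as UL (which
    also silences "constant is so big it is unsigned long" style warnings),
    while the assembler does not understand the suffix. A minimal sketch of
    that pattern follows; the DEMO_* names are hypothetical and only illustrate
    the technique.

```c
/*
 * DEMO_AC(value, suffix): expands to the bare value when included from
 * assembly, and to the value with its type suffix pasted on in C.
 */
#ifdef __ASSEMBLY__
#define DEMO_AC(X, Y)        X
#else
#define DEMO_AC_PASTE(X, Y)  (X##Y)
#define DEMO_AC(X, Y)        DEMO_AC_PASTE(X, Y)
#endif

/* Explicitly unsigned long in C, a plain number for the assembler. */
#define DEMO_BIG_CONSTANT    DEMO_AC(0xffff810000000000, UL)
```

    The two-level macro lets the arguments be macro-expanded before the suffix
    is pasted on.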

    [akpm@linux-foundation.org: fix include/asm-x86_64/Kbuild]
    Signed-off-by: Randy Dunlap
    Signed-off-by: David S. Miller
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     
  • Most architectures defined three macros, MK_IOSPACE_PFN(), GET_IOSPACE()
    and GET_PFN() in pgtable.h. However, the only callers of any of these
    macros are in Sparc specific code, either in arch/sparc, arch/sparc64 or
    drivers/sbus.

    This patch removes the redundant macros from all architectures except
    sparc and sparc64.

    Signed-off-by: David Gibson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Gibson
     
  • Currently the size of the per-cpu region reserved to save crash notes is
    set by the per-architecture value MAX_NOTE_BYTES, which in turn is
    currently set to 1024 on all supported architectures.

    While testing ia64 I recently discovered that this value is in fact too
    small. The particular setup I was using actually needs 1172 bytes. This
    led to a very tedious failure mode where the tail of one ELF note would
    overwrite the head of another if they ended up being allocated sequentially
    by kmalloc, which was often the case.

    It seems to me that a far better approach is to calculate the size that
    the area needs to be. This patch does just that.

    If a simpler stop-gap patch for ia64 to be squeezed into 2.6.21(.X) is
    needed then this should be as easy as making MAX_NOTE_BYTES larger in
    arch/asm-ia64/kexec.h. Perhaps 2048 would be a good choice. However, I
    think that the approach in this patch is a much more robust idea.
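
    To illustrate what calculating the size can look like, the sketch below
    adds up the pieces of an ELF note (header, name and descriptor, each padded
    to 4 bytes) plus an empty terminating note header. It is an assumption
    about the general shape of such a calculation, not the patch's code; all
    demo_* names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout of an ELF note header: three 32-bit words. */
struct demo_elf_note {
    uint32_t n_namesz;
    uint32_t n_descsz;
    uint32_t n_type;
};

/* Round a size up to the next multiple of 4 bytes. */
#define DEMO_ALIGN4(x) (((x) + 3u) & ~(size_t)3u)

/* Bytes occupied by one note: header + padded name + padded descriptor. */
static size_t demo_note_bytes(size_t name_len_with_nul, size_t desc_len)
{
    return DEMO_ALIGN4(sizeof(struct demo_elf_note))
         + DEMO_ALIGN4(name_len_with_nul)
         + DEMO_ALIGN4(desc_len);
}

/* One "CORE" note plus an empty header that terminates the note list. */
static size_t demo_crash_note_area(size_t desc_len)
{
    return demo_note_bytes(sizeof("CORE"), desc_len)
         + DEMO_ALIGN4(sizeof(struct demo_elf_note));
}
```

    Because the result follows the descriptor size, an architecture with a
    larger register dump gets a larger area without anyone updating a constant.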

    Acked-by: Vivek Goyal
    Signed-off-by: Simon Horman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Simon Horman
     
  • This patch moves the die notifier handling to common code. Previously,
    various architectures had exactly the same code for it. Note that the new
    code is compiled unconditionally; this should be understood as an appeal to
    the other architecture maintainers to implement support for it as well (aka
    sprinkling a notify_die or two in the proper place).

    arm had a notify_die that did something totally different; I renamed it to
    arm_notify_die as part of the patch and made it static to the file it's
    declared and used in. avr32 used to pass slightly less information through
    this interface and I brought it into line with the other architectures.

    [akpm@linux-foundation.org: build fix]
    [akpm@linux-foundation.org: fix vmalloc_sync_all bustage]
    [bryan.wu@analog.com: fix vmalloc_sync_all in nommu]
    Signed-off-by: Christoph Hellwig
    Cc: Russell King
    Signed-off-by: Bryan Wu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • Adds the needed TCGETS2/TCSETS2 ioctl calls, structures, defines and the like.
    Tested against the test suite and passes. Other platforms should need
    roughly the same change.

    Signed-off-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Cox
     
  • Convert over to the new NMI handling for getting IPMI watchdog timeouts via an
    NMI. This adds config options to know if there is the ability to receive NMIs
    and if it has an NMI post-processing call. Then it modifies the IPMI watchdog
    to take advantage of this so that it can know if an NMI comes in.

    It also adds testing that the IPMI NMI watchdog works.

    Signed-off-by: Corey Minyard
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Corey Minyard
     

07 May, 2007

1 commit

  • This was broken. It adds complexity, for no good reason. Rather than
    separate __pa() and __pa_symbol(), we should deprecate __pa_symbol(),
    and preferably __pa() too - and just use "virt_to_phys()" instead, which
    is more readable and has nicer semantics.

    However, right now, just undo the separation, and make __pa_symbol() be
    the exact same as __pa(). That fixes the bugs this patch introduced,
    and we can do the fairly obvious cleanups later.

    Do the new __phys_addr() function (which is now the actual workhorse for
    the unified __pa()/__pa_symbol()) as a real external function, that way
    all the potential issues with compile/link-time optimizations of
    constant symbol addresses go away, and we can also, if we choose to, add
    more sanity-checking of the argument.

    Cc: Eric W. Biederman
    Cc: Vivek Goyal
    Cc: Andi Kleen
    Cc: Andrew Morton
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

06 May, 2007

1 commit

  • * 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (231 commits)
    [PATCH] i386: Don't delete cpu_devs data to identify different x86 types in late_initcall
    [PATCH] i386: type may be unused
    [PATCH] i386: Some additional chipset register values validation.
    [PATCH] i386: Add missing !X86_PAE dependincy to the 2G/2G split.
    [PATCH] x86-64: Don't exclude asm-offsets.c in Documentation/dontdiff
    [PATCH] i386: avoid redundant preempt_disable in __unlazy_fpu
    [PATCH] i386: white space fixes in i387.h
    [PATCH] i386: Drop noisy e820 debugging printks
    [PATCH] x86-64: Fix allnoconfig error in genapic_flat.c
    [PATCH] x86-64: Shut up warnings for vfat compat ioctls on other file systems
    [PATCH] x86-64: Share identical video.S between i386 and x86-64
    [PATCH] x86-64: Remove CONFIG_REORDER
    [PATCH] x86-64: Print type and size correctly for unknown compat ioctls
    [PATCH] i386: Remove copy_*_user BUG_ONs for (size < 0)
    [PATCH] i386: Little cleanups in smpboot.c
    [PATCH] x86-64: Don't enable NUMA for a single node in K8 NUMA scanning
    [PATCH] x86: Use RDTSCP for synchronous get_cycles if possible
    [PATCH] i386: Add X86_FEATURE_RDTSCP
    [PATCH] i386: Implement X86_FEATURE_SYNC_RDTSC on i386
    [PATCH] i386: Implement alternative_io for i386
    ...

    Fix up trivial conflict in include/linux/highmem.h manually.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

05 May, 2007

2 commits

  • * master.kernel.org:/pub/scm/linux/kernel/git/gregkh/pci-2.6: (59 commits)
    PCI: Free resource files in error path of pci_create_sysfs_dev_files()
    pci-quirks: disable MSI on RS400-200 and RS480
    PCI hotplug: Use menuconfig objects
    PCI: ZT5550 CPCI Hotplug driver fix
    PCI: rpaphp: Remove semaphores
    PCI: rpaphp: Ensure more pcibios_add/pcibios_remove symmetry
    PCI: rpaphp: Use pcibios_remove_pci_devices() symmetrically
    PCI: rpaphp: Document is_php_dn()
    PCI: rpaphp: Document find_php_slot()
    PCI: rpaphp: Rename rpaphp_register_pci_slot() to rpaphp_enable_slot()
    PCI: rpaphp: refactor tail call to rpaphp_register_slot()
    PCI: rpaphp: remove rpaphp_set_attention_status()
    PCI: rpaphp: remove print_slot_pci_funcs()
    PCI: rpaphp: Remove setup_pci_slot()
    PCI: rpaphp: remove a call that does nothing but a pointer lookup
    PCI: rpaphp: Remove another wrappered function
    PCI: rpaphp: Remve another call that is a wrapper
    PCI: rpaphp: remove a function that does nothing but wrap debug printks
    PCI: rpaphp: Remove un-needed goto
    PCI: rpaphp: Fix a memleak; slot->location string was never freed
    ...

    Linus Torvalds
     
  • * master.kernel.org:/pub/scm/linux/kernel/git/davej/agpgart:
    [AGPGART] sworks-agp: Switch to PCI ref counting APIs
    [AGPGART] Nvidia AGP: Use refcount aware PCI interfaces
    [AGPGART] Fix sparse warning in sgi-agp.c
    [AGPGART] Intel-agp adjustments
    [AGPGART] Move [un]map_page_into_agp into asm/agp.h
    [AGPGART] Add missing calls to global_flush_tlb() to ali-agp
    [AGPGART] prevent probe collision of sis-agp and amd64_agp

    Linus Torvalds
     

03 May, 2007

23 commits

  • Most architectures' scatterlist.h use the type dma_addr_t, but fail to
    include the header which defines it. This could lead to build failures,
    so let's add the missing includes.

    Signed-off-by: Jean Delvare
    Signed-off-by: Andrew Morton
    Signed-off-by: Greg Kroah-Hartman

    Jean Delvare
     
  • This mainly removes a lot of code, replacing it with calls into the new 32bit
    perfctr-watchdog.c

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • No need to maintain it anymore

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • Use safe_apic_wait_icr_idle to check ICR idle bit if the vector is
    NMI_VECTOR to avoid potential hangups in the event of crash when kdump
    tries to stop the other CPUs.

    Signed-off-by: Fernando Luis Vazquez Cao
    Signed-off-by: Andi Kleen

    Fernando Luis Vázquez Cao
     
  • Implement __send_IPI_dest_field which can be used to send IPIs when the
    "destination shorthand" field of the ICR is set to 00 (destination
    field). Use it whenever possible.

    Signed-off-by: Fernando Luis Vazquez Cao
    Signed-off-by: Andi Kleen

    Fernando Luis Vázquez Cao
     
  • apic_wait_icr_idle looks like this:

    static __inline__ void apic_wait_icr_idle(void)
    {
            while (apic_read(APIC_ICR) & APIC_ICR_BUSY)
                    cpu_relax();
    }

    The busy loop in this function would not be problematic if the
    corresponding status bit in the ICR were always updated, but that does
    not seem to be the case under certain crash scenarios. Kdump uses an IPI
    to stop the other CPUs in the event of a crash, but when any of the
    other CPUs are locked-up inside the NMI handler the CPU that sends the
    IPI will end up looping forever in the ICR check, effectively
    hard-locking the whole system.

    Quoting from Intel's "MultiProcessor Specification" (Version 1.4), B-3:

    "A local APIC unit indicates successful dispatch of an IPI by
    resetting the Delivery Status bit in the Interrupt Command
    Register (ICR). The operating system polls the delivery status
    bit after sending an INIT or STARTUP IPI until the command has
    been dispatched.

    A period of 20 microseconds should be sufficient for IPI dispatch
    to complete under normal operating conditions. If the IPI is not
    successfully dispatched, the operating system can abort the
    command. Alternatively, the operating system can retry the IPI by
    writing the lower 32-bit double word of the ICR. This “time-out”
    mechanism can be implemented through an external interrupt, if
    interrupts are enabled on the processor, or through execution of
    an instruction or time-stamp counter spin loop."

    Intel's documentation suggests the implementation of a time-out
    mechanism, which, by the way, is already being open-coded in some parts
    of the kernel that tinker with ICR.

    Create an apic_wait_icr_idle replacement that implements the time-out
    mechanism and that can be used to solve the aforementioned problem.

    AK: moved both functions out of line
    AK: Added improved loop from Keith Owens
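
    The time-out idea can be sketched in user space as a bounded polling loop.
    This is an illustration only: the register is simulated through a callback,
    and the busy-bit mask, poll budget and all demo_* names are assumptions
    made for the example.

```c
#include <stdint.h>

#define DEMO_ICR_BUSY   0x1000u   /* assumed busy-bit mask for this sketch */
#define DEMO_MAX_POLLS  1000      /* assumed upper bound on polls */

/* Reads the (simulated) ICR register. */
typedef uint32_t (*demo_read_icr_fn)(void *ctx);

/*
 * Poll until the busy bit clears or the poll budget is used up.
 * Returns 0 when the ICR went idle, or the still-set busy bit on
 * time-out, so the caller can decide to abort or retry.
 */
static uint32_t demo_safe_wait_icr_idle(demo_read_icr_fn read_icr, void *ctx)
{
    uint32_t busy;
    int polls = 0;

    do {
        busy = read_icr(ctx) & DEMO_ICR_BUSY;
        if (!busy)
            break;
        /* A real implementation would delay briefly between polls. */
    } while (++polls < DEMO_MAX_POLLS);

    return busy;
}

/* Simulated register: reports busy for the next *ctx reads, then idle. */
static uint32_t demo_fake_icr(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return DEMO_ICR_BUSY;
    }
    return 0;
}
```

    Unlike the unbounded loop quoted above, a stuck busy bit makes this
    function return a non-zero status instead of spinning forever.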

    Signed-off-by: Fernando Luis Vazquez Cao
    Signed-off-by: Andi Kleen

    Fernando Luis Vazquez Cao
     
  • Applied fix by Andrew Morton:
    http://lkml.org/lkml/2007/4/8/88 - Fix `make headers_check'.

    AMD and Intel x86 CPU manuals state that it is the responsibility of
    system software to initialize and maintain MTRR consistency across
    all processors in Multi-Processing Environments.

    Quote from page 188 of the AMD64 System Programming manual (Volume 2):

    7.6.5 MTRRs in Multi-Processing Environments

    "In multi-processing environments, the MTRRs located in all processors must
    characterize memory in the same way. Generally, this means that identical
    values are written to the MTRRs used by the processors." (short omission here)
    "Failure to do so may result in coherency violations or loss of atomicity.
    Processor implementations do not check the MTRR settings in other processors
    to ensure consistency. It is the responsibility of system software to
    initialize and maintain MTRR consistency across all processors."

    Current Linux MTRR code already implements the above in the case that the
    BIOS does not properly initialize MTRRs on the secondary processors,
    but the case where the fixed-range MTRRs of the boot processor are changed
    after Linux started to boot, before the initialisation of a secondary
    processor, is not handled yet.

    In this case, secondary processors are currently initialized by Linux
    with the MTRRs which the boot processor had very early, when mtrr_bp_init()
    ran, but not with the MTRRs which the boot processor uses at the
    time when that secondary processor is actually booted,
    causing differing MTRR contents on the secondary processors.

    Such a situation happens on Acer Ferrari 1000 and 5000 notebooks where the
    BIOS enables and sets AMD-specific IORR bits in the fixed-range MTRRs
    of the boot processor when it transitions the system into ACPI mode.
    The SMI handler of the BIOS does this in SMM, entered while Linux ACPI
    code runs acpi_enable().

    Other occasions where the SMI handler of the BIOS may change bits in
    the MTRRs could occur as well. To initialize newly booted secondary
    processors with the fixed-range MTRRs which the boot processor uses
    at that time, this patch saves the fixed-range MTRRs of the boot
    processor before new secondary processors are started. When the
    secondary processors run their Linux initialisation code, their
    fixed-range MTRRs will be updated with the saved fixed-range MTRRs.

    If CONFIG_MTRR is not set, we define mtrr_save_state
    as an empty statement because there is nothing to do.

    Possible TODOs:

    *) CPU-hotplugging outside of SMP suspend/resume is not yet tested
    with this patch.

    *) If, even in this case, an AP never runs i386/do_boot_cpu or x86_64/cpu_up,
    then the calls to mtrr_save_state() could be replaced by calls to
    mtrr_save_fixed_ranges(NULL) and mtrr_save_state() would not be
    needed.

    That would need either verification of the CPU-hotplug code or
    at least a test on a >2 CPU machine.

    *) The MTRRs of other running processors are not yet checked at this
    time but it might be interesting to synchronize the MTRRs of all
    processors before booting. That would be an incremental patch,
    but of rather low priority since there is no machine known so
    far which would require this.

    AK: moved prototypes on x86-64 around to fix warnings

    Signed-off-by: Bernhard Kaindl
    Signed-off-by: Andrew Morton
    Signed-off-by: Andi Kleen
    Cc: Andi Kleen
    Cc: Dave Jones

    Bernhard Kaindl
     
  • In the current implementation, which is used in other patches,
    mtrr_save_fixed_ranges() accepts a dummy void pointer because
    one of these patches may call this function from
    smp_call_function_single(), which requires that the function
    take a void pointer argument.

    This function calls get_fixed_ranges(), passing mtrr_state.fixed_ranges
    which is the element of the static struct which stores our current
    backup of the fixed-range MTRR values which all CPUs shall be
    using.

    Because mtrr_save_fixed_ranges calls get_fixed_ranges after
    kernel initialisation time, __init needs to be removed from
    the declaration of get_fixed_ranges().

    If CONFIG_MTRR is not set, we define mtrr_save_fixed_ranges
    as an empty statement because there is nothing to do.
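
    The dummy-argument convention can be shown with a small user-space sketch:
    a function that ignores its void pointer still matches a generic callback
    type and can be handed to a "run this function" helper. The helper here
    simply calls the function locally; every demo_* name is hypothetical.

```c
#include <stddef.h>

static int demo_saved;   /* counts how many snapshots were taken */

/* Takes an unused void * purely so that it fits the callback type below. */
static void demo_save_fixed_ranges(void *unused)
{
    (void)unused;
    demo_saved++;        /* stand-in for copying the register values */
}

/* Generic callback type: one opaque argument, no return value. */
typedef void (*demo_cpu_fn)(void *info);

/* Hypothetical stand-in for a cross-CPU call helper: runs fn locally. */
static void demo_call_on_cpu(demo_cpu_fn fn, void *info)
{
    fn(info);
}
```

    The same function can then be called directly with NULL or passed to the
    helper without a wrapper.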

    AK: Moved prototypes for x86-64 around to fix warnings

    Signed-off-by: Bernhard Kaindl
    Signed-off-by: Andi Kleen
    Cc: Andrew Morton
    Cc: Andi Kleen
    Cc: Dave Jones

    Bernhard Kaindl
     
  • Signed-off-by: Andi Kleen

    Andi Kleen
     
  • The other symbols used to delineate the alt-instructions sections have the
    form __foo/__foo_end. Rename parainstructions to match.

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Cc: Andi Kleen
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton

    Jeremy Fitzhardinge
     
  • Add hooks to allow a paravirt implementation to track the lifetime of
    an mm. Paravirtualization requires three hooks, but only two are
    needed in common code. They are:

    arch_dup_mmap, which is called when a new mmap is created at fork

    arch_exit_mmap, which is called when the last process reference to an
    mm is dropped, which typically happens on exit and exec.

    The third hook is activate_mm, which is called from the arch-specific
    activate_mm() macro/function, and so doesn't need stub versions for
    other architectures. It's called when an mm is first used.

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Cc: linux-arch@vger.kernel.org
    Cc: James Bottomley
    Acked-by: Ingo Molnar

    Jeremy Fitzhardinge
     
  • This patch is based on Rusty's recent cleanup of the EFLAGS-related
    macros; it extends the same kind of cleanup to control registers and
    MSRs.

    It also unifies these between i386 and x86-64; at least with regards
    to MSRs, the two had definitely gotten out of sync.

    Signed-off-by: H. Peter Anvin
    Signed-off-by: Andi Kleen

    H. Peter Anvin
     
  • It doesn't put the CPU into deeper sleep states, so it's better to use the standard
    idle loop to save power. But allow re-enabling it anyway for benchmarking.

    I also removed the obsolete idle=halt on i386

    Cc: andreas.herrmann@amd.com

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • Most of asm-x86_64/bugs.h is code which should be in a C file, so put it there.

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Cc: Andi Kleen
    Cc: Linus Torvalds

    Jeremy Fitzhardinge
     
  • The xmm space on x86_64 is 256 bytes.

    Signed-off-by: Avi Kivity
    Signed-off-by: Andi Kleen

    Avi Kivity
     
  • As per i386 patch: move X86_EFLAGS_IF et al out to a new header:
    processor-flags.h, so we can include it from irqflags.h and use it in
    raw_irqs_disabled_flags().

    As a side-effect, we could now use these flags in .S files.

    Signed-off-by: Rusty Russell
    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • Rather than using a single constant PERCPU_ENOUGH_ROOM, compute it as
    the sum of kernel_percpu + PERCPU_MODULE_RESERVE. This is now common
    to all architectures; if an architecture wants to set
    PERCPU_ENOUGH_ROOM to something special, then it may do so (ia64 is
    the only one which does).

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Cc: Rusty Russell
    Cc: Eric W. Biederman
    Cc: Andi Kleen

    Jeremy Fitzhardinge
     
  • - there's no reason for duplicating the prototype from
    include/linux/syscalls.h in include/asm-x86_64/unistd.h
    - every file should #include the headers containing the prototypes for
    its global functions

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andi Kleen

    Adrian Bunk
     
  • x86_64 currently simulates a list using the index and private fields of the
    page struct. Seems that the code was inherited from i386. But x86_64 does
    not use the slab to allocate pgds and pmds etc. So the lru field is not
    used by the slab and therefore available.

    This patch uses standard list operations on page->lru to realize pgd
    tracking.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andi Kleen
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton

    Christoph Lameter
     
  • GCC (4.1 at least) unrolls it anyway, but I can't believe this code
    was ever justifiable. (I've also submitted a patch which cleans up
    i386, which is even uglier).

    Signed-off-by: Rusty Russell
    Signed-off-by: Andi Kleen
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton

    Rusty Russell
     
  • Extends the numa=fake x86_64 command-line option to allow for configurable
    node sizes. These nodes can be used in conjunction with cpusets for coarse
    memory resource management.

    The old command-line option is still supported:
    numa=fake=32 gives 32 fake NUMA nodes, ignoring the NUMA setup of the
    actual machine.

    But now you may configure your system for the node sizes of your choice:
    numa=fake=2*512,1024,2*256
    gives two 512M nodes, one 1024M node, two 256M nodes, and
    the rest of system memory to a sixth node.
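
    As an illustration of the size-list syntax only, the sketch below expands a
    spec such as 2*512,1024,2*256 into per-node sizes in megabytes. It is not
    the kernel's parser: it does not handle the plain numa=fake=32 form or the
    implicit final node, and demo_parse_fake_nodes is a hypothetical name.

```c
#include <stdlib.h>

/*
 * Expand a spec such as "2*512,1024,2*256" into per-node sizes (in MB).
 * Returns the number of nodes written to `sizes`, or -1 if the spec is
 * malformed or asks for more than `max` nodes.
 */
static int demo_parse_fake_nodes(const char *spec, unsigned long *sizes, int max)
{
    int n = 0;

    while (*spec) {
        char *end;
        unsigned long count = 1;
        unsigned long val = strtoul(spec, &end, 10);

        if (end == spec)
            return -1;              /* expected a number */
        if (*end == '*') {          /* "count*size" form */
            count = val;
            spec = end + 1;
            val = strtoul(spec, &end, 10);
            if (end == spec)
                return -1;          /* missing size after '*' */
        }
        while (count--) {
            if (n >= max)
                return -1;          /* too many nodes requested */
            sizes[n++] = val;
        }
        if (*end == ',')
            end++;                  /* move on to the next entry */
        else if (*end != '\0')
            return -1;              /* unexpected character */
        spec = end;
    }
    return n;
}
```

    For the example spec this yields five explicit sizes; the sixth node holding
    the rest of system memory would be added by the caller.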

    The existing hash function is maintained to support the various node sizes
    that are possible with this implementation.

    Each node of the same size receives roughly the same number of available
    pages, regardless of any reserved memory within its address range. The total
    number of available pages on the system is calculated and divided by the
    number of equal nodes to allocate. These nodes are then dynamically allocated
    and their borders extended until such time as their number of available pages
    reaches the required size.

    Configurable node sizes are recommended when used in conjunction with cpusets
    for memory control because it eliminates the overhead associated with scanning
    the zonelists of many smaller full nodes on page_alloc().

    Cc: Andi Kleen
    Signed-off-by: David Rientjes
    Signed-off-by: Andi Kleen
    Cc: Paul Jackson
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton

    David Rientjes
     
  • Change mark_tsc_unstable() so it takes a string argument, which holds the
    reason the TSC was marked unstable.

    This is then displayed the first time mark_tsc_unstable is called.

    This should help us better debug why the TSC was marked unstable on certain
    systems and allow us to make sure we're not being overly paranoid when
    throwing out this troublesome clocksource.
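
    The "report the reason once" behaviour can be sketched as follows. This is
    a user-space illustration: the message wording, the int return value (added
    here so the behaviour is observable) and the demo_* names are assumptions.

```c
#include <stdio.h>

static int demo_tsc_unstable;   /* set once the clock has been distrusted */

/*
 * Mark the (simulated) TSC as unstable. The reason is printed only on
 * the first call; later calls are silent.
 * Returns 1 if this call printed the reason, 0 otherwise.
 */
static int demo_mark_tsc_unstable(const char *reason)
{
    if (demo_tsc_unstable)
        return 0;               /* already marked: stay quiet */

    demo_tsc_unstable = 1;
    printf("Marking TSC unstable due to: %s\n", reason);
    return 1;
}
```

    Each call site passes its own short reason string, so the first line in the
    log identifies which check gave up on the clock.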

    Cc: Ingo Molnar
    Cc: Thomas Gleixner
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Andi Kleen

    john stultz
     
  • o X86_64 kernel should run from 2MB aligned address for two reasons.
    - Performance.
    - For relocatable kernels, page tables are updated based on difference
    between compile time address and load time physical address.
    This difference should be a multiple of 2MB as kernel text and data
    is mapped using 2MB pages and PMD should be pointing to a 2MB
    aligned address. Life is simpler if both compile time and load time
    kernel addresses are 2MB aligned.

    o Flag the error at compile time if one is trying to build a kernel which
    does not meet alignment restrictions.
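
    A compile-time alignment check of this kind can be sketched with a C11
    static assertion. The technique is swapped in for illustration and is not
    taken from the patch; DEMO_PHYSICAL_START is a hypothetical load address.

```c
#define DEMO_ALIGN_2MB       0x200000UL
#define DEMO_PHYSICAL_START  0x1000000UL   /* hypothetical: 16MB */

/* True when addr is a multiple of 2MB. */
#define DEMO_IS_2MB_ALIGNED(addr) (((addr) & (DEMO_ALIGN_2MB - 1)) == 0)

/* Fails the build, rather than the boot, if the address is misaligned. */
_Static_assert(DEMO_IS_2MB_ALIGNED(DEMO_PHYSICAL_START),
               "start address must be 2MB aligned");
```

    Changing DEMO_PHYSICAL_START to a value such as 0x100000 (1MB) makes the
    compiler reject the file with the message above.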

    Signed-off-by: Vivek Goyal
    Signed-off-by: Andi Kleen
    Cc: "Eric W. Biederman"
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton

    Vivek Goyal