27 Sep, 2006

4 commits

  • Consistently use MAX_ERRNO when checking for errors in __syscall_return().

    [ralf@linux-mips.org: build fix]
    Signed-off-by: Randy Dunlap
    Signed-off-by: Ralf Baechle
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
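
    The range check this commit refers to can be sketched in user space as
    follows. This is an illustration only, not the kernel macro: the value
    4095 for MAX_ERRNO and the names syscall_return_sketch and sketch_errno
    are assumptions made for the example.

```c
/* Assumption: 4095 mirrors the kernel's MAX_ERRNO; not stated in this log. */
#define MAX_ERRNO 4095

/* Hypothetical stand-in for errno so the sketch can be checked in isolation. */
static int sketch_errno;

/*
 * Convert a raw syscall return value. Seen as unsigned, the values
 * -MAX_ERRNO..-1 are the top MAX_ERRNO values of the range, so a single
 * unsigned comparison separates errors from valid results.
 */
static long syscall_return_sketch(unsigned long res)
{
    if (res >= (unsigned long)-MAX_ERRNO) {
        sketch_errno = (int)-(long)res;   /* store the positive error code */
        return -1;
    }
    return (long)res;                     /* anything else is a valid result */
}
```

    Using one shared constant keeps every architecture's check on the same
    boundary; a value just below the error range is passed through unchanged.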
     
  • Size zones and holes in an architecture independent manner for x86_64.

    Signed-off-by: Mel Gorman
    Cc: Dave Hansen
    Cc: Andy Whitcroft
    Cc: Andi Kleen
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: "Keith Mannthey"
    Cc: "Luck, Tony"
    Cc: KAMEZAWA Hiroyuki
    Cc: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • We need processor.h for cpu_relax().

    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • * 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (225 commits)
    [PATCH] Don't set calgary iommu as default y
    [PATCH] i386/x86-64: New Intel feature flags
    [PATCH] x86: Add a cumulative thermal throttle event counter.
    [PATCH] i386: Make the jiffies compares use the 64bit safe macros.
    [PATCH] x86: Refactor thermal throttle processing
    [PATCH] Add 64bit jiffies compares (for use with get_jiffies_64)
    [PATCH] Fix unwinder warning in traps.c
    [PATCH] x86: Allow disabling early pci scans with pci=noearly or disallowing conf1
    [PATCH] x86: Move direct PCI scanning functions out of line
    [PATCH] i386/x86-64: Make all early PCI scans dependent on CONFIG_PCI
    [PATCH] Don't leak NT bit into next task
    [PATCH] i386/x86-64: Work around gcc bug with noreturn functions in unwinder
    [PATCH] Fix some broken white space in ia32_signal.c
    [PATCH] Initialize argument registers for 32bit signal handlers.
    [PATCH] Remove all traces of signal number conversion
    [PATCH] Don't synchronize time reading on single core AMD systems
    [PATCH] Remove outdated comment in x86-64 mmconfig code
    [PATCH] Use string instructions for Core2 copy/clear
    [PATCH] x86: - restore i8259A eoi status on resume
    [PATCH] i386: Split multi-line printk in oops output.
    ...

    Linus Torvalds
     

26 Sep, 2006

36 commits

  • The use of SEGMENT_RPL_MASK in the i386 ptrace.h introduced by
    x86-allow-a-kernel-to-not-be-in-ring-0.patch broke the UML build, as UML
    includes the underlying architecture's ptrace.h, but has no easy access to the
    x86 segment definitions.

    Rather than kludging around this, as in the past, this patch splits the
    userspace-usable parts, which are the bits that UML needs, of ptrace.h into
    ptrace-abi.h, which is included back into ptrace.h. Thus, there is no net
    effect on i386.

    As a side-effect, this creates a ptrace header which is close to being usable
    in /usr/include.

    x86_64 is also treated in this way for consistency. There was some trailing
    whitespace there, which is cleaned up.

    Signed-off-by: Jeff Dike
    Cc: David Woodhouse
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Dike
     
  • On x86_64 machines with more than 2 GB of RAM there are large memory gaps
    (with no corresponding kernel virtual addresses) and reserved memory
    regions between areas of usable physical RAM. Moreover, if CONFIG_FLATMEM
    is set, they appear within the normal zone. swsusp should not try to save
    them, so the corresponding page structs have to be marked as 'nosave'.

    Signed-off-by: Rafael J. Wysocki
    Cc: Mel Gorman
    Acked-by: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • If we're going to implement smp_call_function_single() on three architectures
    with the same prototype then it should have a declaration in a
    non-arch-specific header file.

    Move it into .

    Cc: Stephane Eranian
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • One of the changes necessary for shared page tables is to standardize the
    pxx_page macros. pte_page and pmd_page have always returned the struct
    page associated with their entry, while pte_page_kernel and pmd_page_kernel
    have returned the kernel virtual address. pud_page and pgd_page, on the
    other hand, return the kernel virtual address.

    Shared page tables needs pud_page and pgd_page to return the actual page
    structures. There are very few actual users of these functions, so it is
    simple to standardize their usage.

    Since this is basic cleanup, I am submitting these changes as a standalone
    patch. Per Hugh Dickins' comments about it, I am also changing the
    pxx_page_kernel macros to pxx_page_vaddr to clarify their meaning.

    Signed-off-by: Dave McCracken
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave McCracken
     
  • get_cpu_var()/per_cpu()/__get_cpu_var() arguments must be simple
    identifiers. Otherwise the arch dependent implementations might break.

    This patch enforces the correct usage of the macros by producing a syntax
    error if the variable is not a simple identifier.

    Signed-off-by: Jan Blunck
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Blunck
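
    One way to get such a compile-time check is to paste the macro argument
    into a declaration, which is only valid C when the argument is a single
    identifier. The sketch below is a user-space approximation: the
    array-backed storage, NR_CPUS value, and bump_counter helper are
    assumptions, and it relies on GNU C statement expressions.

```c
#define NR_CPUS 4

/* Hypothetical array-backed per-CPU storage for a user-space sketch. */
#define DEFINE_PER_CPU(type, name) type per_cpu__##name[NR_CPUS]

/*
 * Pasting `var` into a function declaration compiles only when `var` is a
 * simple identifier. per_cpu(arr[0], cpu), for example, would expand to
 * `extern int simple_identifier_arr[0](void);`, which is a syntax error.
 * The declared function is never defined or called.
 */
#define per_cpu(var, cpu) (*({                          \
        extern int simple_identifier_##var(void);       \
        &per_cpu__##var[(cpu)];                         \
}))

static DEFINE_PER_CPU(int, counter);

/* Increment the given CPU's private counter and return the new value. */
static int bump_counter(int cpu)
{
    return ++per_cpu(counter, cpu);
}
```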
     
  • Refactor the event processing (syslog messaging and rate limiting)
    into separate file therm_throt.c. This allows consistent reporting
    of CPU thermal throttle events.

    After ACK'ing the interrupt, if the event is current, the user
    (p4.c/mce_intel.c) calls therm_throt_process to log (and rate limit)
    the event. If that function returns 1, the user has the option to log
    things further (such as to mce_log in x86_64).

    AK: minor cleanup

    Signed-off-by: Dmitriy Zavin
    Signed-off-by: Andi Kleen

    Dmitriy Zavin
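
    The log-and-rate-limit contract described above can be sketched roughly
    as below. The struct, the tick-based clock passed in as `now`, and the
    CHECK_INTERVAL value are assumptions for illustration; the real code
    keeps per-CPU state and uses the kernel's own time source.

```c
#include <stdint.h>

/* Assumption: minimum distance between two log messages, in arbitrary ticks. */
#define CHECK_INTERVAL 300u

struct throttle_state {
    uint64_t next_check;    /* earliest tick at which logging is allowed again */
    unsigned long count;    /* cumulative number of throttle events seen */
    unsigned long logged;   /* number of messages actually emitted */
};

/*
 * Returns 1 when an active throttle event was logged, so the caller may
 * log it further, and 0 when the message was suppressed by rate limiting
 * or the event is no longer current.
 */
static int therm_throt_process_sketch(struct throttle_state *s, int curr,
                                      uint64_t now)
{
    if (curr)
        s->count++;                 /* count every event, logged or not */
    if (now < s->next_check)
        return 0;                   /* rate limited */
    s->next_check = now + CHECK_INTERVAL;
    s->logged++;                    /* stands in for the syslog message */
    return curr ? 1 : 0;
}
```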
     
  • Fix

    linux/arch/x86_64/kernel/traps.c: In function 'dump_trace':
    linux/arch/x86_64/kernel/traps.c:275: warning: cast to pointer from integer of different size

    with allnoconfig

    Cc: jbeulich@novell.com
    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • Some buggy systems can machine check when config space accesses
    happen for some non existent devices. i386/x86-64 do some early
    device scans that might trigger this. Allow pci=noearly to disable
    this. Also, when type 1 is disabled, don't do any early
    accesses, which are always type 1.

    This moves the pci= configuration parsing to be an early parameter.
    I don't think this can break anything because it only changes
    a single global that is only used by PCI.

    Cc: gregkh@suse.de
    Cc: Trammell Hudson

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • Saves about 200 bytes of code space.

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • SYSENTER can cause a NT to be set which might cause crashes on the IRET
    in the next task.

    Following similar i386 patch from Linus.

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • Current gcc generates calls not jumps to noreturn functions. When that happens the
    return address can point to the next function, which confuses the unwinder.

    This patch works around it by marking asynchronous exception
    frames, in contrast to normal call frames, in the unwind information, and
    then teaching the unwinder to decode this.

    For normal call frames the unwinder now subtracts one from the address which avoids
    this problem. The standard libgcc unwinder uses the same trick.

    It doesn't include adjustment of the printed address (i.e. for the original
    example, it'd still be kernel_math_error+0 that gets displayed), but the
    unwinder wouldn't get confused anymore.

    This only works with binutils 2.17+ and some versions of H.J. Lu's 2.16
    releases, unfortunately, because earlier binutils don't support
    .cfi_signal_frame.

    [AK: added automatic detection of the new binutils and wrote description]

    Signed-off-by: Jan Beulich
    Signed-off-by: Andi Kleen

    Jan Beulich
     
  • Previously exit_idle would be called more often than enter_idle

    Now instead of using complicated tests just keep track of it
    using the per CPU variable as a flip flop. I moved the idle state into the
    PDA to make the access more efficient.

    Original bug report and an initial patch from Stephane Eranian,
    but redone by AK.

    Cc: Stephane Eranian

    Signed-off-by: Andi Kleen

    Andi Kleen
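
    The flip-flop idea can be sketched like this. The plain array stands in
    for the per-CPU PDA field, and the event counters stand in for whatever
    work is done on idle entry and exit; all names here are assumptions.

```c
#define NR_CPUS 4

static int is_idle[NR_CPUS];    /* stand-in for the per-CPU idle state */
static int idle_start_events;   /* times idle-entry work was performed */
static int idle_end_events;     /* times idle-exit work was performed */

/* Called when a CPU enters its idle loop. */
static void enter_idle(int cpu)
{
    is_idle[cpu] = 1;
    idle_start_events++;
}

/*
 * May be called from any interrupt path. It only acts when the CPU was
 * actually marked idle, so exits can never outnumber entries.
 */
static void exit_idle(int cpu)
{
    if (!is_idle[cpu])
        return;
    is_idle[cpu] = 0;
    idle_end_events++;
}
```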
     
  • Signed-off-by: Andi Kleen

    Andi Kleen
     
  • This quietens some warnings about uninitialized use of the return
    value of the pda read operations.
    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • Reindent the macros in x86-64 pda.h, making them much more readable.
    Follows Jeremy's i386 version of this.

    No functional changes

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • - Replace some broken white space.
    - Replace __ keywords with standard names

    No functional changes.

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • Following i386.

    And also fix the two occurrences that caused warnings in arch/x86_64/*

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • - Don't zero for __copy_from_user_inatomic, following i386.
    This will prevent spurious zeros for parallel file system writers when
    one takes an exception.
    - The string instruction version didn't zero the output on
    exception. Oops.

    Also I cleaned up the code a bit while I was at it and added a minor
    optimization to the string instruction path.

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • I just added type checking for assignments to the PDA in the i386 PDA code.
    Here's the x86-64 equivalent. (Obviously this doesn't contain the latest
    x86-64 PDA change.)

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen

    Jeremy Fitzhardinge
     
  • Apparently that is the more official way to get numbers without $ in inline
    assembly

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • This patch adds the per thread cookie field to the task struct and the PDA.
    Also it makes sure that the PDA value gets the new cookie value at context
    switch, and that a new task gets a new cookie at task creation time.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Ingo Molnar
    Signed-off-by: Andi Kleen
    CC: Andi Kleen

    Arjan van de Ven
     
  • Change the comments in the pda structure to make the first fields to have
    their offset documented and to have the comments aligned.
    The stack protector series needs a field at offset 40 (gcc ABI); annotate
    up to 40 for that reason.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Ingo Molnar
    Signed-off-by: Andi Kleen
    CC: Andi Kleen

    Arjan van de Ven
     
  • kexec: Avoid overwriting the current pgd (V4, x86_64)

    This patch upgrades the x86_64-specific kexec code to avoid overwriting the
    current pgd. Overwriting the current pgd is bad when CONFIG_CRASH_DUMP is used
    to start a secondary kernel that dumps the memory of the previous kernel.

    The code introduces a new set of page tables. These tables are used to provide
    an executable identity mapping without overwriting the current pgd.

    Signed-off-by: Magnus Damm
    Signed-off-by: Andi Kleen

    Magnus Damm
     
  • Remove most of the special cases for the debug IST stack. This is a
    follow on clean up patch, it requires the bug fix patch that adds
    orig_ist.

    Signed-off-by: Keith Owens
    Signed-off-by: Andi Kleen

    Keith Owens
     
  • Based on an idea by Jeremy Fitzhardinge:

    Replace the volatiles and memory clobbers in the PDA access with
    telling gcc about access to a proxy PDA structure that doesn't
    actually exist. But the dummy accesses give a defined ordering for
    read/write accesses.

    Also add some memory barriers to the early GS initialization to
    make sure no PDA access is moved before it.

    Advantage is some .text savings (probably most from better
    code for accessing "current"):

       text    data     bss     dec     hex filename
    4845647 1223688  615864 6685199  66020f vmlinux
    4837780 1223688  615864 6677332  65e354 vmlinux-pda

    About 0.16% smaller code (7867 bytes of .text)

    Cc: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • They cannot be actually freed because the FACS table has a
    shared-with-the-BIOS lock.

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • Based on a patch from David Rientjes, but
    changed by AK.

    Optimizes the 64-bit hamming weight for x86_64 processors assuming they
    have fast multiplication. Uses five fewer bitops than the generic
    hweight64. Benchmark on one EM64T showed ~25% speedup with 2^24
    consecutive calls.

    Define a new ARCH_HAS_FAST_MULTIPLIER that can be set by other
    architectures that can also multiply fast.

    Signed-off-by: Andi Kleen

    Andi Kleen
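
    A multiply-based 64-bit population count of the kind described above is
    commonly written as below. This is the well-known parallel bit-count
    with a final multiplication, offered as a sketch; it is not claimed to
    be the exact code of this commit, and the name hweight64_mul is an
    assumption.

```c
#include <stdint.h>

/* Count the set bits in w, using one multiply for the final summation. */
static unsigned int hweight64_mul(uint64_t w)
{
    /* Each 2-bit field now holds the number of bits set in that field. */
    uint64_t res = w - ((w >> 1) & 0x5555555555555555ULL);

    /* Sum adjacent 2-bit fields into 4-bit fields. */
    res = (res & 0x3333333333333333ULL) + ((res >> 2) & 0x3333333333333333ULL);

    /* Sum adjacent 4-bit fields into bytes (each byte holds at most 8). */
    res = (res + (res >> 4)) & 0x0F0F0F0F0F0F0F0FULL;

    /* The multiply adds all eight bytes into the top byte. */
    return (unsigned int)((res * 0x0101010101010101ULL) >> 56);
}
```

    The multiplication replaces the last few shift-and-add steps of the
    generic version, which only pays off where multiplies are cheap.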
     
  • Drop support for non e820 BIOS calls to get the memory map.

    The boot assembler code still has some support, but not the C code now.

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • - Remove a define that was used only once
    - Remove the too large APIC ID check because we always support
    the full 8bit range of APICs.
    - Restructure code a bit to be simpler.

    Cc: len.brown@intel.com

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • Use normal pte accessors in change_page_attr() to access the PSE
    bits.

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • Fix the pte_exec/mkexec page table accessor functions to really
    use the NX bit. Previously they only checked the USER bit, but
    weren't actually used for anything.

    Then use them in change_page_attr() to manipulate the NX bit
    properly.

    Signed-off-by: Andi Kleen

    Andi Kleen
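
    Accessors that key off the NX bit can be sketched as below. The bit
    positions (USER at bit 2, NX at bit 63) are those of the x86-64 page
    table format; the pte_t wrapper and the pte_exprotect helper are
    simplified for illustration and are not taken from this commit.

```c
#include <stdint.h>

typedef struct { uint64_t pte; } pte_t;   /* simplified page table entry */

#define _PAGE_USER (1ULL << 2)    /* user-accessible page */
#define _PAGE_NX   (1ULL << 63)   /* no-execute */

/* A page is executable exactly when its NX bit is clear. */
static int pte_exec(pte_t p)
{
    return !(p.pte & _PAGE_NX);
}

/* Make the page executable by clearing NX; other bits are untouched. */
static pte_t pte_mkexec(pte_t p)
{
    p.pte &= ~_PAGE_NX;
    return p;
}

/* Make the page non-executable by setting NX. */
static pte_t pte_exprotect(pte_t p)
{
    p.pte |= _PAGE_NX;
    return p;
}
```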
     
  • And replace all users with ordinary smp_processor_id. The function
    was originally added to get some basic oops information out even
    if the GS register was corrupted. However, that hasn't worked for some
    time, because printk is needed to print the oops
    and it uses smp_processor_id() already. Also, GS register corruptions
    are not particularly common anymore.

    This also helps the Xen port which would otherwise need to
    do this in a special way because it can't access the local APIC.

    Cc: Chris Wright

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • This is now automatically included by kbuild.

    Signed-off-by: Dave Jones
    Signed-off-by: Andi Kleen

    Dave Jones
     
  • Right now the kernel on x86-64 has a 100% lazy fpu behavior: after *every*
    context switch a trap is taken for the first FPU use to restore the FPU
    context lazily. This is of course great for applications that have very
    sporadic or no FPU use (since then you avoid doing the expensive
    save/restore all the time). However for very frequent FPU users... you
    take an extra trap every context switch.

    The patch below adds a simple heuristic to this code: After 5 consecutive
    context switches of FPU use, the lazy behavior is disabled and the context
    gets restored every context switch. If the app indeed uses the FPU, the
    trap is avoided. (The chance of the 6th time slice using the FPU after the
    previous 5 have done so is obviously quite high.)

    After 256 switches, this is reset and lazy behavior is returned (until
    there are 5 consecutive ones again). The reason for this is to give apps
    that do longer bursts of FPU use still the lazy behavior back after some
    time.

    [akpm@osdl.org: place new task_struct field next to jit_keyring to save space]
    Signed-off-by: Arjan van de Ven
    Signed-off-by: Andi Kleen
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton

    Arjan van de Ven
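
    The heuristic can be modelled in user space roughly as follows. This is
    a simulation under stated assumptions, not the kernel code: the struct,
    the run_slice helper, and the exact threshold comparison are
    hypothetical, and the reset after 256 switches is modelled by letting an
    8-bit counter wrap around.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: eager restore starts once the counter exceeds this value. */
#define FPU_EAGER_THRESHOLD 5

struct task_sketch {
    uint8_t fpu_counter;    /* consecutive slices with FPU state loaded */
};

/*
 * Simulate one time slice of a task. Returns true if the task had to take
 * the lazy-restore trap on its first FPU use in this slice.
 */
static bool run_slice(struct task_sketch *t, bool uses_fpu)
{
    bool loaded = false;
    bool trapped = false;

    if (t->fpu_counter > FPU_EAGER_THRESHOLD) {
        t->fpu_counter++;       /* eager restore; wraps from 255 to 0 */
        loaded = true;
    }
    if (uses_fpu && !loaded) {
        trapped = true;         /* lazy path: first FPU use traps */
        t->fpu_counter++;
        loaded = true;
    }
    if (!loaded)
        t->fpu_counter = 0;     /* slice without FPU state: streak ends */
    return trapped;
}
```

    In this model a task that uses the FPU in every slice traps while the
    counter builds up, then runs trap-free until the counter wraps and the
    lazy behavior resumes.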
     
  • Now for a completely different but trivial approach.
    I just boot tested it with 255 CPUS and everything worked.

    Currently everything (except module data) we place in
    the per cpu area we know about at compile time. So
    instead of allocating a fixed size for the per_cpu area,
    allocate the number of bytes we need plus a fixed constant
    to be used for modules.

    It isn't perfect but it is much less of a pain to
    work with than what we are doing now.

    AK: fixed warning

    Signed-off-by: Eric W. Biederman
    Signed-off-by: Andi Kleen

    Eric W. Biederman
     
  • This unifies the standard backtracer and the new in-memory stacktrace
    backtracer. The standard one is converted to use callbacks,
    and stacktrace is then reimplemented using the new callbacks.

    The main advantage is that stacktrace can now use the new dwarf2 unwinder
    and avoid false positives in many cases.

    I kept it simple to make sure the standard backtracer stays reliable.

    Cc: mingo@elte.hu

    Signed-off-by: Andi Kleen

    Andi Kleen
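
    The callback arrangement can be sketched as one walker with pluggable
    consumers. Everything below is hypothetical (names, the single-callback
    ops struct, and the array of fake frame addresses standing in for a
    real stack walk); it only illustrates how a printing backtracer and an
    in-memory stacktrace can share the same walker.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical callback table: one hook per address found by the walker. */
struct trace_ops_sketch {
    void (*address)(void *data, unsigned long addr);
};

/* Shared walker: hands every frame address to the consumer's callback. */
static void dump_trace_sketch(const unsigned long *frames, size_t n,
                              const struct trace_ops_sketch *ops, void *data)
{
    for (size_t i = 0; i < n; i++)
        ops->address(data, frames[i]);
}

/* Consumer 1: print each address, like a classic backtrace. */
static void print_address(void *data, unsigned long addr)
{
    (void)data;
    printf(" [<%016lx>]\n", addr);
}

/* Consumer 2: store addresses in memory, like a saved stacktrace. */
#define MAX_ENTRIES 8
struct saved_trace {
    unsigned long entries[MAX_ENTRIES];
    size_t nr;
};

static void save_address(void *data, unsigned long addr)
{
    struct saved_trace *t = data;

    if (t->nr < MAX_ENTRIES)        /* silently drop frames that don't fit */
        t->entries[t->nr++] = addr;
}

static const struct trace_ops_sketch print_ops = { .address = print_address };
static const struct trace_ops_sketch save_ops  = { .address = save_address };
```

    Because both consumers go through the same walker, any improvement to
    the walk itself benefits printing and saving alike.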