27 Jan, 2007

1 commit

  • I wouldn't mind if CONFIG_COMPAT_VDSO went away entirely. But if it's there,
    it should work properly. Currently it's quite haphazard: both real vma and
    fixmap are mapped, both are put in the two different AT_* slots, sysenter
    returns to the vma address rather than the fixmap address, and core dumps are
    yet another story.

    This patch makes CONFIG_COMPAT_VDSO disable the real vma and use the fixmap
    area consistently. This makes it actually compatible with what the old vdso
    implementation did.

    Signed-off-by: Roland McGrath
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roland McGrath
     

23 Jan, 2007

1 commit

  • The current PDA code, which went in after 2.6.19, has a flaw in that it
    doesn't correctly cycle the GDT and %GS segment through the boot PDA,
    the CPU PDA and finally the per-cpu PDA.

    The bug generally doesn't show up if the boot CPU id is zero, but
    everything falls apart for a non-zero boot CPU id. This basically kills
    voyager, which is perfectly capable of doing non-zero CPU id boots, so
    voyager currently won't boot without this.

    The fix is to be careful and actually do the GDT setups correctly.

    Signed-off-by: James Bottomley
    Cc: Andi Kleen
    Cc: Jeremy Fitzhardinge
    Cc: Andrew Morton
    Signed-off-by: Linus Torvalds

    James Bottomley
     

06 Jan, 2007

1 commit

  • o Relocatable bzImage support removed the CONFIG_PHYSICAL_START option
    on the assumption that it was no longer required: people can build a
    second kernel as relocatable and load it anywhere, so there was no need
    to compile the kernel for a custom address. But Magnus uses vmlinux
    images for the second kernel in a Xen environment and wants to continue
    to do so.

    o Restore the CONFIG_PHYSICAL_START option for the time being. I think
    down the line we can get rid of it.

    Signed-off-by: Vivek Goyal
    Cc: "Eric W. Biederman"
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vivek Goyal
     

23 Dec, 2006

2 commits

  • * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (68 commits)
    ACPI: replace kmalloc+memset with kzalloc
    ACPI: Add support for acpi_load_table/acpi_unload_table_id
    fbdev: update after backlight argument change
    ACPI: video: Add dev argument for backlight_device_register
    ACPI: Implement acpi_video_get_next_level()
    ACPI: Kconfig - depend on PM rather than selecting it
    ACPI: fix NULL check in drivers/acpi/osl.c
    ACPI: make drivers/acpi/ec.c:ec_ecdt static
    ACPI: prevent processor module from loading on failures
    ACPI: fix single linked list manipulation
    ACPI: ibm_acpi: allow clean removal
    ACPI: fix git automerge failure
    ACPI: ibm_acpi: respond to workqueue update
    ACPI: dock: add uevent to indicate change in device status
    ACPI: ec: Lindent once again
    ACPI: ec: Change #define to enums there possible.
    ACPI: ec: Style changes.
    ACPI: ec: Acquire Global Lock under EC mutex.
    ACPI: ec: Drop udelay() from poll mode. Loop by reading status field instead.
    ACPI: ec: Rename gpe_bit to gpe
    ...

    Linus Torvalds
     
  • register_memory() ends up with two definitions in 2.6.20-rc1. In 2.6.19 it
    was a static function in arch/i386/kernel/setup.c, but in 2.6.20-rc1 it
    moved to arch/i386/kernel/e820.c. A function of the same name is also
    defined in drivers/base/memory.c, so the build fails with a duplicate
    definition error if the memory hotplug option is on.

    Signed-off-by: Yasunori Goto
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yasunori Goto
     

20 Dec, 2006

1 commit


16 Dec, 2006

2 commits


14 Dec, 2006

2 commits

  • Virtually indexed, physically tagged cache architectures can get away
    without cache flushing when forking. This patch adds a new cache
    flushing function flush_cache_dup_mm(struct mm_struct *) which for the
    moment I've implemented to do the same thing on all architectures
    except on MIPS where it's a no-op.

    Signed-off-by: Ralf Baechle
    Signed-off-by: Linus Torvalds

    Ralf Baechle
     
  • Currently, to tell a task that it should go to the refrigerator, we set the
    PF_FREEZE flag for it and send a fake signal to it. Unfortunately there
    are two SMP-related problems with this approach. First, a task running on
    another CPU may be updating its flags while the freezer attempts to set
    PF_FREEZE for it and this may leave the task's flags in an inconsistent
    state. Second, there is a potential race between freeze_process() and
    refrigerator() in which freeze_process() running on one CPU is reading a
    task's PF_FREEZE flag while refrigerator() running on another CPU has just
    set PF_FROZEN for the same task and attempts to reset PF_FREEZE for it. If
    the refrigerator wins the race, freeze_process() will state that PF_FREEZE
    hasn't been set for the task and will set it unnecessarily, so the task
    will go to the refrigerator once again after it's been thawed.

    To solve the first of these problems we need to stop using PF_FREEZE to tell
    tasks that they should go to the refrigerator. Instead, we can introduce a
    special TIF_*** flag and use it for this purpose, since changing other
    tasks' TIF_*** flags is allowed and there are special calls for it.

    To avoid the freeze_process()-refrigerator() race we can make
    freeze_process() always check the task's PF_FROZEN flag after it has read
    the task's "freeze" flag. We should also make sure that refrigerator()
    always resets the task's "freeze" flag after it has set PF_FROZEN for it.

    Signed-off-by: Rafael J. Wysocki
    Acked-by: Pavel Machek
    Cc: Russell King
    Cc: David Howells
    Cc: Andi Kleen
    Cc: "Luck, Tony"
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Paul Mundt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
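
The scheme described in the entry above (a separate flag that other tasks may set, plus a PF_FROZEN check after reading that flag) can be modelled in user space. This is a minimal single-threaded sketch, not the kernel code: every `model_` name, the flag values, and the use of C11 atomics in place of the kernel's thread-flag helpers are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_PF_FROZEN  0x1u /* hypothetical value: task is in the refrigerator */
#define MODEL_TIF_FREEZE 0x1u /* hypothetical value: task was asked to freeze */

struct model_task {
	atomic_uint flags;        /* stands in for the PF_* flags */
	atomic_uint thread_flags; /* stands in for the TIF_* flags */
};

/* Freezer side: ask the task to freeze unless that would be redundant. */
static bool model_freeze_process(struct model_task *t)
{
	/* Read the "freeze" flag first ... */
	if (atomic_load(&t->thread_flags) & MODEL_TIF_FREEZE)
		return false; /* already asked */
	/* ... then always check PF_FROZEN afterwards. */
	if (atomic_load(&t->flags) & MODEL_PF_FROZEN)
		return false; /* already frozen, do not ask again */
	atomic_fetch_or(&t->thread_flags, MODEL_TIF_FREEZE);
	return true;
}

/* Task side: set PF_FROZEN first, then reset the "freeze" flag. */
static void model_refrigerator_enter(struct model_task *t)
{
	atomic_fetch_or(&t->flags, MODEL_PF_FROZEN);
	atomic_fetch_and(&t->thread_flags, ~MODEL_TIF_FREEZE);
}

/* Task side: leave the refrigerator after thawing. */
static void model_refrigerator_leave(struct model_task *t)
{
	atomic_fetch_and(&t->flags, ~MODEL_PF_FROZEN);
}

/* Usage: walk one task through a freeze/thaw cycle. */
static void model_freezer_demo(void)
{
	struct model_task t = { 0, 0 };

	assert(model_freeze_process(&t));  /* first request is delivered */
	assert(!model_freeze_process(&t)); /* no duplicate request */
	model_refrigerator_enter(&t);
	assert(!model_freeze_process(&t)); /* frozen task is not asked again */
	model_refrigerator_leave(&t);
	assert(model_freeze_process(&t));  /* a later freeze works normally */
}
```

The sketch only shows the order of the checks and updates; it makes no claim about how the kernel helpers are named or implemented.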
     

13 Dec, 2006

1 commit


11 Dec, 2006

1 commit

  • Large sched domains can be very expensive to scan. Add an option SD_SERIALIZE
    to the sched domain flags. If that flag is set then we make sure that no
    other such domain is being balanced.

    [akpm@osdl.org: build fix]
    Signed-off-by: Christoph Lameter
    Cc: Peter Williams
    Cc: Nick Piggin
    Cc: Christoph Lameter
    Cc: "Siddha, Suresh B"
    Cc: "Chen, Kenneth W"
    Acked-by: Ingo Molnar
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
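
The effect of SD_SERIALIZE described above can be sketched as a try-lock shared by all flagged domains. This is a user-space model under stated assumptions: the flag value, the `model_` names, and the use of a C11 `atomic_flag` as the serializing lock are illustrative choices, not taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define MODEL_SD_SERIALIZE 0x1u /* hypothetical flag value */

struct model_sched_domain {
	unsigned int flags;
	unsigned int balance_runs; /* how often this domain was balanced */
};

/* One lock shared by every domain that carries the flag. */
static atomic_flag model_balancing = ATOMIC_FLAG_INIT;

/* Returns false if a serialized domain must be skipped this time. */
static bool model_balance_begin(struct model_sched_domain *sd)
{
	if ((sd->flags & MODEL_SD_SERIALIZE) &&
	    atomic_flag_test_and_set(&model_balancing))
		return false; /* another serialized domain is being balanced */
	sd->balance_runs++;
	return true;
}

static void model_balance_end(struct model_sched_domain *sd)
{
	if (sd->flags & MODEL_SD_SERIALIZE)
		atomic_flag_clear(&model_balancing);
}

/* Usage: two serialized domains exclude each other, a plain one does not. */
static void model_serialize_demo(void)
{
	struct model_sched_domain a = { MODEL_SD_SERIALIZE, 0 };
	struct model_sched_domain b = { MODEL_SD_SERIALIZE, 0 };
	struct model_sched_domain c = { 0, 0 };

	assert(model_balance_begin(&a));
	assert(!model_balance_begin(&b)); /* skipped while a is in progress */
	assert(model_balance_begin(&c));  /* unflagged domains are unaffected */
	model_balance_end(&c);
	model_balance_end(&a);
	assert(model_balance_begin(&b));  /* b can run once a is done */
	model_balance_end(&b);
}
```

A try-lock is used here so that a skipped domain simply waits for its next balancing interval instead of spinning.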
     

09 Dec, 2006

3 commits

  • This completes IDE except for one use which requires a new core PCI function
    and will be polished up at the end

    Signed-off-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Cox
     
  • In order to sort out our struct termios and add proper speed control we need
    to separate the kernel and user termios structures. Glibc is fine but the
    other libraries rely on the kernel exported struct termios and we need to
    extend this without breaking the ABI/API

    To do so we add a struct ktermios which is the kernel view of a termios
    structure and overlaps the struct termios with extra fields on the end for
    now. (That limitation will go away in later patches). Some platforms (eg
    alpha) planned ahead and thus use the same struct for both, others did not.

    This just adds the structures but does not use them; it seems a sensible
    splitting point for bisect if there are compile failures (not that I expect
    them).

    Signed-off-by: Alan Cox
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Cox
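
The layout idea in the entry above (a kernel structure that overlaps the user structure and only adds fields at the end) can be illustrated with two stand-in structures. This is a sketch under assumptions: the `model_` names, the array size, and the two extra speed fields are hypothetical and are not copied from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_NCCS 19 /* hypothetical size of the control-character array */

/* User-visible layout: must not change, or the ABI breaks. */
struct model_termios {
	unsigned int  c_iflag, c_oflag, c_cflag, c_lflag;
	unsigned char c_line;
	unsigned char c_cc[MODEL_NCCS];
};

/* Kernel view: same leading fields, extra fields only at the end. */
struct model_ktermios {
	unsigned int  c_iflag, c_oflag, c_cflag, c_lflag;
	unsigned char c_line;
	unsigned char c_cc[MODEL_NCCS];
	unsigned int  c_ispeed; /* hypothetical extra field */
	unsigned int  c_ospeed; /* hypothetical extra field */
};

/* The extra fields start exactly where the user structure ends. */
_Static_assert(offsetof(struct model_ktermios, c_ispeed) ==
	       sizeof(struct model_termios), "layouts must overlap");

/* Copy the user-visible part and leave the kernel-only fields alone. */
static void model_user_to_kernel(struct model_ktermios *k,
				 const struct model_termios *u)
{
	memcpy(k, u, sizeof(*u));
}

/* Usage: user settings arrive, kernel-only speeds survive the copy. */
static void model_termios_demo(void)
{
	struct model_termios u = { .c_cflag = 0xbd };
	struct model_ktermios k = { .c_ispeed = 9600, .c_ospeed = 9600 };

	model_user_to_kernel(&k, &u);
	assert(k.c_cflag == 0xbd);
	assert(k.c_ispeed == 9600 && k.c_ospeed == 9600);
}
```

The compile-time check is what makes the prefix copy safe in this model; if the two layouts ever diverged, the build would fail rather than silently corrupt the extra fields.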
     
  • This makes i386 use the generic BUG machinery. There are no functional
    changes from the old i386 implementation.

    The main advantage in using the generic BUG machinery for i386 is that the
    inlined overhead of BUG is just the ud2a instruction; the file+line(+function)
    information are no longer inlined into the instruction stream. This reduces
    cache pollution, and makes disassembly work properly.

    Signed-off-by: Jeremy Fitzhardinge
    Cc: Andi Kleen
    Cc: Hugh Dickins
    Cc: Michael Ellerman
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeremy Fitzhardinge
     

08 Dec, 2006

10 commits

  • * 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (156 commits)
    [PATCH] x86-64: Export smp_call_function_single
    [PATCH] i386: Clean up smp_tune_scheduling()
    [PATCH] unwinder: move .eh_frame to RODATA
    [PATCH] unwinder: fully support linker generated .eh_frame_hdr section
    [PATCH] x86-64: don't use set_irq_regs()
    [PATCH] x86-64: check vector in setup_ioapic_dest to verify if need setup_IO_APIC_irq
    [PATCH] x86-64: Make ix86 default to HIGHMEM4G instead of NOHIGHMEM
    [PATCH] i386: replace kmalloc+memset with kzalloc
    [PATCH] x86-64: remove remaining pc98 code
    [PATCH] x86-64: remove unused variable
    [PATCH] x86-64: Fix constraints in atomic_add_return()
    [PATCH] x86-64: fix asm constraints in i386 atomic_add_return
    [PATCH] x86-64: Correct documentation for bzImage protocol v2.05
    [PATCH] x86-64: replace kmalloc+memset with kzalloc in MTRR code
    [PATCH] x86-64: Fix numaq build error
    [PATCH] x86-64: include/asm-x86_64/cpufeature.h isn't a userspace header
    [PATCH] unwinder: Add debugging output to the Dwarf2 unwinder
    [PATCH] x86-64: Clarify error message in GART code
    [PATCH] x86-64: Fix interrupt race in idle callback (3rd try)
    [PATCH] x86-64: Remove unwind stack pointer alignment forcing again
    ...

    Fixed conflict in include/linux/uaccess.h manually

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Make the contents of the userspace asm/setup.h header consistent on all
    architectures:

    - export setup.h to userspace on all architectures
    - export only COMMAND_LINE_SIZE to userspace
    - frv: move COMMAND_LINE_SIZE from param.h
    - i386: remove duplicate COMMAND_LINE_SIZE from param.h
    - arm:
      - export ATAGs to userspace
      - change u8/u16/u32 to __u8/__u16/__u32

    Signed-off-by: Adrian Bunk
    Acked-by: Russell King
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • Pass struct dev pointer to dma_cache_sync()

    dma_cache_sync() is ill-designed in that it does not have a struct device
    pointer argument which makes proper support for systems that consist of a
    mix of coherent and non-coherent DMA devices hard. Change dma_cache_sync
    to take a struct device pointer as first argument and fix all its callers
    to pass it.

    Signed-off-by: Ralf Baechle
    Cc: James Bottomley
    Cc: "David S. Miller"
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ralf Baechle
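
Why the device pointer matters on a system with mixed coherent and non-coherent DMA devices can be shown with a small model. This is a sketch under assumptions: `struct model_device`, its `dma_coherent` field, the flush counter, and the simplified argument list are all hypothetical and do not reproduce the kernel's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_device {
	bool dma_coherent; /* hypothetical per-device property */
};

static unsigned int model_flushes; /* counts simulated cache flushes */

/* With the device as first argument, the decision can be made per device. */
static void model_dma_cache_sync(struct model_device *dev,
				 void *vaddr, size_t size)
{
	(void)vaddr;
	(void)size;
	if (dev->dma_coherent)
		return; /* nothing to do for a coherent device */
	model_flushes++; /* stands in for the real cache maintenance */
}

/* Usage: the same buffer is handled differently for two devices. */
static void model_dma_demo(void)
{
	struct model_device coherent = { true };
	struct model_device noncoherent = { false };
	char buf[64];
	unsigned int before = model_flushes;

	model_dma_cache_sync(&coherent, buf, sizeof(buf));
	assert(model_flushes == before);
	model_dma_cache_sync(&noncoherent, buf, sizeof(buf));
	assert(model_flushes == before + 1);
}
```

Without the device argument, a function of this kind can only give one answer for the whole system, which is the limitation the entry describes.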
     
  • dma_is_consistent() is ill-designed in that it does not have a struct
    device pointer argument which makes proper support for systems that consist
    of a mix of coherent and non-coherent DMA devices hard. Change
    dma_is_consistent to take a struct device pointer as first argument and fix
    the sole caller to pass it.

    Signed-off-by: Ralf Baechle
    Cc: James Bottomley
    Cc: "David S. Miller"
    Cc: Greg KH
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ralf Baechle
     
  • The last thing we agreed on was to remove the macros entirely for 2.6.19,
    on all architectures. Unfortunately, I think nobody actually _did_ that,
    so they are still there.

    [akpm@osdl.org: x86_64 fix]
    Cc: David Woodhouse
    Cc: Greg Schafer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
     
  • Name some of the remaining 'old_style_spin_init' locks

    Signed-off-by: Peter Zijlstra
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     
  • Make swsusp support i386 systems with PAE or without PSE.

    This is done by creating temporary page tables located in resume-safe page
    frames before the suspend image is restored in the same way as x86_64 does
    it.

    Signed-off-by: Rafael J. Wysocki
    Cc: Andi Kleen
    Cc: Dave Jones
    Cc: Nigel Cunningham
    Cc: Pavel Machek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rafael J. Wysocki
     
  • Replace all uses of kmem_cache_t with struct kmem_cache.

    The patch was generated using the following script:

    #!/bin/sh
    #
    # Replace one string by another in all the kernel sources.
    #

    set -e

    for file in `find * -name "*.c" -o -name "*.h"|xargs grep -l $1`; do
        quilt add $file
        sed -e "1,\$s/$1/$2/g" $file >/tmp/$$
        mv /tmp/$$ $file
        quilt refresh
    done

    The script was run like this

    sh replace kmem_cache_t "struct kmem_cache"

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • x86 NUMA systems only define bootmem for node 0. alloc_bootmem_node() and
    friends therefore ignore the passed pgdat and use NODE_DATA(0) in all
    cases. This leads to the following warnings as we are not using the passed
    parameter:

    .../mm/page_alloc.c: In function 'zone_wait_table_init':
    .../mm/page_alloc.c:2259: warning: unused variable 'pgdat'

    One option would be to define all variables used with these macros
    __attribute__ ((unused)), but this would leave us exposed should these
    become genuinely unused.

    The key here is that we _are_ using the value, we ignore it but that is a
    deliberate action. This patch adds a nested local variable within the
    alloc_bootmem_node helper to which the pgdat parameter is assigned making
    it 'used'. The nested local is marked __attribute__ ((unused)) to silence
    this same warning for it.

    Signed-off-by: Andy Whitcroft
    Cc: Christoph Lameter
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Whitcroft
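
The nested-local trick described above can be reproduced in a few lines. This is a sketch under assumptions: the `model_` names and the calloc-based allocator are stand-ins, and the macro relies on statement expressions and `__attribute__((unused))`, both GNU C extensions supported by GCC and Clang.

```c
#include <assert.h>
#include <stdlib.h>

struct model_pgdat {
	int node_id;
};

/* Stand-in allocator with a single pool, so the node argument is ignored. */
static void *model_alloc_node0(size_t size)
{
	return calloc(1, size);
}

/*
 * Assign the ignored argument to a nested local so the caller's variable
 * counts as used; the local itself is marked unused to silence the same
 * warning for it.
 */
#define model_alloc_bootmem_node(pgdat, size)				\
({									\
	struct model_pgdat *model_unused_pgdat				\
		__attribute__((unused)) = (pgdat);			\
	model_alloc_node0(size);					\
})

/* Usage: 'pgdat' is only ever passed to the macro. */
static void model_bootmem_demo(void)
{
	struct model_pgdat node1 = { 1 };
	struct model_pgdat *pgdat = &node1;
	void *p = model_alloc_bootmem_node(pgdat, 64);

	assert(p != NULL);
	free(p);
}
```

If the caller's variable later becomes genuinely unused (never passed to the macro at all), the compiler still warns about it, which is the property the entry wants to keep.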
     
  • Introduce pagefault_{disable,enable}() and use these where previously we did
    manual preempt increments/decrements to make the pagefault handler do the
    atomic thing.

    Currently they still rely on the increased preempt count, but do not rely on
    preemption being disabled; this might go away in the future.

    (NOTE: the extra barrier() in pagefault_disable might fix some holes on
    machines which have too many registers for their own good)

    [heiko.carstens@de.ibm.com: s390 fix]
    Signed-off-by: Peter Zijlstra
    Acked-by: Nick Piggin
    Cc: Martin Schwidefsky
    Signed-off-by: Heiko Carstens
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
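
The pair of helpers described above can be modelled as a counter plus compiler barriers. This is a user-space sketch under assumptions: the `model_` names, the global counter standing in for the preempt count, and the toy fault handler are hypothetical; the barrier uses GCC/Clang inline assembly.

```c
#include <assert.h>
#include <errno.h>

static int model_preempt_count; /* stands in for the preempt count */

/* Compiler barrier only: emits no instruction. */
#define model_barrier() __asm__ __volatile__("" ::: "memory")

static inline void model_pagefault_disable(void)
{
	model_preempt_count++;
	/* Keep later accesses from being moved above the increment. */
	model_barrier();
}

static inline void model_pagefault_enable(void)
{
	/* Keep earlier accesses from being moved below the decrement. */
	model_barrier();
	model_preempt_count--;
}

/* Toy fault handler: in atomic context it must fail instead of sleeping. */
static int model_handle_fault(void)
{
	if (model_preempt_count != 0)
		return -EFAULT;
	return 0; /* would be allowed to sleep and resolve the fault */
}

/* Usage: faults fail fast only inside the disabled section. */
static void model_pagefault_demo(void)
{
	assert(model_handle_fault() == 0);
	model_pagefault_disable();
	assert(model_handle_fault() == -EFAULT);
	model_pagefault_enable();
	assert(model_handle_fault() == 0);
}
```

Wrapping the counter manipulation in named helpers also gives a single place to change if the mechanism stops depending on the preempt count.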
     

07 Dec, 2006

15 commits

  • Signed-off-by: Andrew Morton
    Signed-off-by: Andi Kleen

    Burman Yan
     
  • Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Andi Kleen

    Adrian Bunk
     
  • Since v->counter is both read and written, it should be an output as well
    as an input for the asm. The current code only gets away with this because
    counter is volatile. Also, according to Documentation/atomic_ops.txt,
    atomic_add_return should provide a memory barrier, in particular a compiler
    barrier, so the asm should be marked as clobbering memory.

    Test case:

    #include <stdio.h>

    typedef struct { int counter; } atomic_t; /* NB: no "volatile" */

    #define ATOMIC_INIT(i) { (i) }

    #define atomic_read(v) ((v)->counter)

    static __inline__ int atomic_add_return(int i, atomic_t *v)
    {
            int __i = i;

            /* Current constraints: v->counter is input-only, no "memory" clobber. */
            __asm__ __volatile__(
                    "lock; xaddl %0, %1;"
                    :"=r"(i)
                    :"m"(v->counter), "0"(i));
            /* Fixed constraints:
            __asm__ __volatile__(
                    "lock; xaddl %0, %1"
                    :"+r" (i), "+m" (v->counter)
                    : : "memory"); */
            return i + __i;
    }

    int main (void) {
            atomic_t a = ATOMIC_INIT(0);
            int x;

            x = atomic_add_return (1, &a);
            if ((x!=1) || (atomic_read(&a)!=1))
                    printf("fail: %i, %i\n", x, atomic_read(&a));
    }

    Signed-off-by: Duncan Sands
    Signed-off-by: Andi Kleen
    Cc: Andi Kleen
    Acked-by: David Howells
    Signed-off-by: Andrew Morton

    Duncan Sands
     
  • Tighten the requirements on both input to and output from the Dwarf2
    unwinder.

    Signed-off-by: Jan Beulich
    Signed-off-by: Andi Kleen

    Jan Beulich
     
  • -mregparm=3 has been enabled by default for some time on i386, and AFAIK
    there aren't any problems with it left.

    This patch removes the REGPARM config option and sets -mregparm=3
    unconditionally.

    Signed-off-by: Adrian Bunk
    Signed-off-by: Andi Kleen

    Adrian Bunk
     
  • Here is a small patch for i386 which adds a cpufeature flag and
    detection code for Intel's Branch Trace Store (BTS) feature. This
    feature can be found on Intel P4 and Core 2 processors among others.
    It can also be used by perfmon.

    changelog:
    - add CPU_FEATURE_BTS
    - add Branch Trace Store detection

    signed-off-by: stephane eranian

    Signed-off-by: Andi Kleen

    Stephane Eranian
     
  • Move the irqbalance quirks for E7320/E7520/E7525 (Errata 23 in
    http://download.intel.com/design/chipsets/specupdt/30304203.pdf) to early
    quirks.

    Also add a PCI quirk for these platforms to check (which happens very late
    during the boot) whether the APIC routing is indeed set to default flat mode.

    This fixes the breakage (in x86_64) of this quirk due to cpu hotplug, which
    selects physical mode instead of the logical flat mode (as needed for this
    errata workaround).

    Signed-off-by: Suresh Siddha
    Signed-off-by: Andi Kleen
    Cc: Andi Kleen
    Cc: "Li, Shaohua"
    Signed-off-by: Andrew Morton

    Siddha, Suresh B
     
  • Add an 'enable_cpu_hotplug' flag. When it is cleared, the hotplug control
    file ("online") will not be added under /sys/devices/system/cpu/cpuX/.

    The next patch, which does PCI quirks, will use this.

    Signed-off-by: Suresh Siddha
    Signed-off-by: Andi Kleen
    Cc: Andi Kleen
    Cc: "Li, Shaohua"
    Signed-off-by: Andrew Morton

    Siddha, Suresh B
     
  • gcc doesn't support -mtune=core2 yet, but will soon. Use -mtune=generic or
    -mtune=i686 as a fallback.

    TBD: needs benchmarking for INTEL_USERCOPY etc. So far I used the same
    defaults as MPENTIUMM.

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • The function ptep_get_and_clear uses an atomic instruction sequence to get and
    clear an active pte. Rather than add such an atomic operator to all virtual
    machine implementations in paravirt-ops, it is easier to support the raw
    atomic sequence and use either a trapping writable pagetable approach, or a
    post-update notification. For the post update notification, we require the
    pte_update function to be called after the access. Combine the 2-level and
    3-level paging operators into one common function which does the post-update
    notification, and rename the actual atomic sequences to raw_ptep_xxx
    operators.

    Signed-off-by: Zachary Amsden
    Signed-off-by: Andi Kleen
    Cc: Andi Kleen
    Cc: Jeremy Fitzhardinge
    Cc: Chris Wright
    Signed-off-by: Andrew Morton

    Zachary Amsden
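
The split described above (a raw atomic sequence, wrapped by a common function that issues the post-update notification) can be sketched as follows. This is a user-space model under assumptions: the `model_` names, the single-word pte, the C11 atomic exchange standing in for the raw sequence, and the counter standing in for the notification hook are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

typedef struct {
	_Atomic unsigned long pte_low; /* single-word pte for this model */
} model_pte_t;

static unsigned int model_pte_updates; /* notifications delivered so far */

/* Post-update notification: a hypervisor port would hook in here. */
static void model_pte_update(model_pte_t *ptep)
{
	(void)ptep;
	model_pte_updates++;
}

/* Raw operation: fetch the old value and clear the entry atomically. */
static unsigned long model_raw_ptep_get_and_clear(model_pte_t *ptep)
{
	return atomic_exchange(&ptep->pte_low, 0UL);
}

/* Common wrapper: raw sequence first, notification after the access. */
static unsigned long model_ptep_get_and_clear(model_pte_t *ptep)
{
	unsigned long old = model_raw_ptep_get_and_clear(ptep);

	model_pte_update(ptep);
	return old;
}

/* Usage: the old value is returned and exactly one notification is sent. */
static void model_pte_demo(void)
{
	model_pte_t pte = { 0x1067UL };
	unsigned int before = model_pte_updates;
	unsigned long old = model_ptep_get_and_clear(&pte);

	assert(old == 0x1067UL);
	assert(atomic_load(&pte.pte_low) == 0);
	assert(model_pte_updates == before + 1);
}
```

Keeping the raw sequence separate means the wrapper is the only place that needs to know about notification, regardless of how many paging levels the raw variants cover.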
     
  • Make parameter names match function argument names for the yet to be defined
    pte_update_defer accessor.

    Signed-off-by: Zachary Amsden
    Signed-off-by: Andi Kleen
    Cc: Andi Kleen
    Cc: Jeremy Fitzhardinge
    Cc: Chris Wright
    Signed-off-by: Andrew Morton

    Zachary Amsden
     
  • Move header includes for the nopud / nopmd types to the location of the actual
    pte / pgd type definitions. This allows generic 4-level page type code to be
    written before the split 2/3 level page table headers are included.

    Signed-off-by: Zachary Amsden
    Signed-off-by: Andi Kleen
    Cc: Andi Kleen
    Cc: Jeremy Fitzhardinge
    Cc: Chris Wright
    Signed-off-by: Andrew Morton

    Zachary Amsden
     
  • Add the three bare TLB accessor functions to paravirt-ops. Most amusingly,
    flush_tlb is redefined on SMP, so I can't call the paravirt op flush_tlb.
    Instead, I chose to indicate the actual flush type, kernel (global) vs. user
    (non-global). Global in this sense means using the global bit in the page
    table entry, which makes TLB entries persistent across CR3 reloads, not
    global as in the SMP sense of invoking remote shootdowns, so the term is
    confusingly overloaded.

    AK: folded in fix from Zach for PAE compilation

    Signed-off-by: Zachary Amsden
    Signed-off-by: Chris Wright
    Signed-off-by: Andi Kleen
    Cc: Rusty Russell
    Cc: Jeremy Fitzhardinge
    Signed-off-by: Andrew Morton

    Rusty Russell
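
Naming the operations after the flush type rather than `flush_tlb` can be pictured as a table of function pointers. This is a sketch under assumptions: the structure, the member names, the counting "native" stubs, and the wrapper functions are hypothetical illustrations of the idea, not the actual paravirt-ops definitions.

```c
#include <assert.h>

/* Hypothetical operations table with one entry per flush type. */
struct model_paravirt_ops {
	void (*flush_tlb_user)(void);   /* non-global entries only */
	void (*flush_tlb_kernel)(void); /* global entries as well */
	void (*flush_tlb_single)(unsigned long addr);
};

static unsigned int model_user_flushes;
static unsigned int model_kernel_flushes;
static unsigned int model_single_flushes;

/* "Native" stubs: they only count calls in this model. */
static void model_native_flush_tlb_user(void)
{
	model_user_flushes++;
}

static void model_native_flush_tlb_kernel(void)
{
	model_kernel_flushes++;
}

static void model_native_flush_tlb_single(unsigned long addr)
{
	(void)addr;
	model_single_flushes++;
}

static struct model_paravirt_ops model_ops = {
	.flush_tlb_user   = model_native_flush_tlb_user,
	.flush_tlb_kernel = model_native_flush_tlb_kernel,
	.flush_tlb_single = model_native_flush_tlb_single,
};

/* Callers go through the table, so a hypervisor port can replace entries. */
static void model_flush_user(void)
{
	model_ops.flush_tlb_user();
}

static void model_flush_kernel(void)
{
	model_ops.flush_tlb_kernel();
}

static void model_flush_single(unsigned long addr)
{
	model_ops.flush_tlb_single(addr);
}

/* Usage: each wrapper reaches the matching table entry. */
static void model_tlb_demo(void)
{
	unsigned int u = model_user_flushes;
	unsigned int k = model_kernel_flushes;
	unsigned int s = model_single_flushes;

	model_flush_user();
	model_flush_kernel();
	model_flush_single(0x1000UL);
	assert(model_user_flushes == u + 1);
	assert(model_kernel_flushes == k + 1);
	assert(model_single_flushes == s + 1);
}
```

In this model "kernel" versus "user" refers to whether global page-table entries are flushed, matching the sense of "global" the entry spells out, and has nothing to do with cross-CPU shootdowns.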
     
  • Add APIC accessors to paravirt-ops. Unfortunately, we need two write
    functions, as some older broken hardware requires workarounds for
    Pentium APIC errata - this is the purpose of apic_write_atomic.

    AK: replaced __inline with inline

    Signed-off-by: Zachary Amsden
    Signed-off-by: Chris Wright
    Signed-off-by: Andi Kleen
    Cc: Rusty Russell
    Cc: Jeremy Fitzhardinge
    Signed-off-by: Andrew Morton

    Rusty Russell
     
  • Allow selected bug checks to be skipped by paravirt kernels. The two most
    important are the F00F workaround (which is either done by the hypervisor,
    or not required), and the 'hlt' instruction check, which can break under
    some hypervisors.

    Signed-off-by: Zachary Amsden
    Signed-off-by: Chris Wright
    Signed-off-by: Andi Kleen
    Cc: Rusty Russell
    Cc: Jeremy Fitzhardinge
    Signed-off-by: Andrew Morton

    Rusty Russell