09 Jan, 2007

1 commit

  • This reverts commit b026872601976f666bae77b609dc490d1834bf77, which has
    been linked to several problem reports with IO-APIC and the timer.
    Machines either don't boot because the timer doesn't happen, or we get
    double timer interrupts because we end up double-routing the timer irq
    through multiple interfaces.

    See for example

    http://lkml.org/lkml/2006/12/16/101
    http://lkml.org/lkml/2007/1/3/9
    http://bugzilla.kernel.org/show_bug.cgi?id=7789

    for some of the discussion.

    Patches to fix this cleanup exist (and have been confirmed to work fine
    at least for some of the affected cases) and we'll revisit it for
    2.6.21, but this late in the -rc series we're better off just reverting
    the incomplete commit that caused the problems.

    Suggested-by: Adrian Bunk
    Cc: Eric W. Biederman
    Cc: Yinghai Lu
    Cc: Andrew Morton
    Cc: Andi Kleen
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

04 Jan, 2007

2 commits

  • * master.kernel.org:/pub/scm/linux/kernel/git/davej/cpufreq:
    [CPUFREQ] longhaul: Kill off warnings introduced by recent changes.
    [CPUFREQ] Uninitialized use of cmd.val in arch/i386/kernel/cpu/cpufreq/acpi-cpufreq.c:acpi_cpufreq_target()
    [CPUFREQ] Longhaul - Always guess FSB
    [CPUFREQ] Longhaul - Fix up powersaver assumptions.
    [CPUFREQ] longhaul: Fix up unreachable code.
    [CPUFREQ] speedstep-centrino: missing space and bracket
    [CPUFREQ] Bug fix for acpi-cpufreq and cpufreq_stats oops on frequency change notification
    [CPUFREQ] select consistently

    Linus Torvalds
     
  • If caller passed the tsk, we should use it to validate a stack ptr.
    Otherwise, sysrq-t and other debugging stuff doesn't work.

    Signed-off-by: OGAWA Hirofumi
    Signed-off-by: Linus Torvalds

    OGAWA Hirofumi
     

02 Jan, 2007

1 commit


31 Dec, 2006

1 commit

  • The apple fn keys don't work anymore with 2.6.20-rc1.

    The reason is that USB_HID_POWERBOOK appears in several files although
    USB_HIDINPUT_POWERBOOK is the thing to be used.

    The patch fixes this.

    Cc: Greg KH
    Cc: Dmitry Torokhov
    Cc: Benjamin Herrenschmidt
    Cc: Jiri Kosina
    Cc: Marcel Holtmann
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Soeren Sonnenburg
     

23 Dec, 2006

2 commits

  • Make x86_64 ACPI_CPU_FREQ select CPU_FREQ_TABLE like other methods do.
    (although we should still eliminate as much use of 'select' as possible)

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Dave Jones

    Randy Dunlap
     
  • Fernando Lopez-Lezcano reported frequent scheduling latencies and audio
    xruns starting at the 2.6.18-rt kernel, and those problems persisted all
    until current -rt kernels. The latencies were serious and unjustified by
    system load, often in the milliseconds range.

    After a patient and heroic multi-month effort of Fernando, where he
    tested dozens of kernels, tried various configs, boot options,
    test-patches of mine and provided latency traces of those incidents, the
    following 'smoking gun' trace was captured by him:

                     _------=> CPU#
                    / _-----=> irqs-off
                   | / _----=> need-resched
                   || / _---=> hardirq/softirq
                   ||| / _--=> preempt-depth
                   |||| /
                   |||||     delay
       cmd     pid ||||| time  |   caller
          \   /    |||||   \   |   /
    IRQ_19-1479    1D..1    0us : __trace_start_sched_wakeup (try_to_wake_up)
    IRQ_19-1479    1D..1    0us : __trace_start_sched_wakeup <-5856> (37 0)
    IRQ_19-1479    1D..1    0us : __trace_start_sched_wakeup (c01262ba 0 0)
    IRQ_19-1479    1D..1    0us : resched_task (try_to_wake_up)
    IRQ_19-1479    1D..1    0us : __spin_unlock_irqrestore (try_to_wake_up)
    ...
    -0             1...1   11us!: default_idle (cpu_idle)
    ...
    -0             0Dn.1  602us : smp_apic_timer_interrupt (c0103baf 1 0)
    ...
    -5856          0D..2  618us : __switch_to (__schedule)
    -5856          0D..2  618us : __schedule <-0> (20 162)
    -5856          0D..2  619us : __spin_unlock_irq (__schedule)
    -5856          0...1  619us : trace_stop_sched_switched (__schedule)
    -5856          0D..1  619us : trace_stop_sched_switched <-5856> (37 0)

    What is visible in this trace is that CPU#1 ran try_to_wake_up() for
    PID:5856, placed PID:5856 on CPU#0's runqueue and ran resched_task()
    for CPU#0. But it decided not to send an IPI to that CPU, due to
    TS_POLLING. CPU#0 never woke up after its NEED_RESCHED bit was set,
    and only rescheduled to PID:5856 upon the next lapic timer IRQ. The
    result was a 600+ usecs latency and a missed wakeup!

    the bug turned out to be an idle-wakeup bug introduced into the mainline
    kernel this summer via an optimization in the x86_64 tree:

    commit 495ab9c045e1b0e5c82951b762257fe1c9d81564
    Author: Andi Kleen
    Date: Mon Jun 26 13:59:11 2006 +0200

    [PATCH] i386/x86-64/ia64: Move polling flag into thread_info_status

    During some profiling I noticed that default_idle causes a lot of
    memory traffic. I think that is caused by the atomic operations
    to clear/set the polling flag in thread_info. There is actually
    no reason to make this atomic - only the idle thread does it
    to itself, other CPUs only read it. So I moved it into ti->status.

    the problem is this type of change:

            if (!hlt_counter && boot_cpu_data.hlt_works_ok) {
    -               clear_thread_flag(TIF_POLLING_NRFLAG);
    +               current_thread_info()->status &= ~TS_POLLING;
                    smp_mb__after_clear_bit();
                    while (!need_resched()) {
                            local_irq_disable();

    this changes clear_thread_flag() to an explicit clearing of TS_POLLING.
    clear_thread_flag() is defined as:

    clear_bit(flag, &ti->flags);

    and clear_bit() is a LOCK-ed atomic instruction on all x86 platforms:

    static inline void clear_bit(int nr, volatile unsigned long * addr)
    {
            __asm__ __volatile__( LOCK_PREFIX
                    "btrl %1,%0"

    hence smp_mb__after_clear_bit() is defined as a simple compile barrier:

    #define smp_mb__after_clear_bit() barrier()

    but the explicit TS_POLLING clearing introduced by the patch:

    + current_thread_info()->status &= ~TS_POLLING;

    is not an atomic op! So the clearing of the TS_POLLING bit is freely
    reorderable with the reading of the NEED_RESCHED bit - and both now
    reside in different memory addresses.

    CPU idle wakeup very much depends on ordered memory ops, the clearing of
    the TS_POLLING flag must always be done before we test need_resched()
    and hit the idle instruction(s). [Symmetrically, the wakeup code needs
    to set NEED_RESCHED before it tests the TS_POLLING flag, so memory
    ordering is paramount.]

    Fernando's dual-core Athlon64 system has a sufficiently advanced memory
    ordering model so that it triggered this scenario very often.

    ( And it also turned out that the reason why these latencies never
    triggered on my testsystems is that i routinely use idle=poll, which
    was the only idle variant not affected by this bug. )

    The fix is to change the smp_mb__after_clear_bit() to an smp_mb(), to
    act as an absolute barrier between the TS_POLLING write and the
    NEED_RESCHED read. This affects almost all idling methods (default,
    ACPI, APM), on all three affected architectures: i386, x86_64 and ia64.

    Signed-off-by: Ingo Molnar
    Tested-by: Fernando Lopez-Lezcano
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
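
The ordering requirement described in the commit above (write TS_POLLING, full barrier, then read NEED_RESCHED, and the mirror image on the waker side) can be sketched in portable C11. This is only an illustration under stated assumptions: `ts_polling`, `need_resched_flag`, `idle_may_halt()` and `waker_needs_ipi()` are hypothetical stand-ins, not kernel symbols, and `atomic_thread_fence(memory_order_seq_cst)` plays the role that smp_mb() plays in the fix.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for ti->status (TS_POLLING) and ti->flags (NEED_RESCHED). */
static atomic_bool ts_polling = true;
static atomic_bool need_resched_flag = false;

/* Idle side: stop polling, then decide whether it is safe to halt. */
bool idle_may_halt(void)
{
	/* Plain (relaxed) store, like "status &= ~TS_POLLING". */
	atomic_store_explicit(&ts_polling, false, memory_order_relaxed);
	/* Full barrier: the store above is ordered before the load below. */
	atomic_thread_fence(memory_order_seq_cst);
	return !atomic_load_explicit(&need_resched_flag, memory_order_relaxed);
}

/* Waker side: request a reschedule, then decide whether an IPI is needed. */
bool waker_needs_ipi(void)
{
	atomic_store_explicit(&need_resched_flag, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	/* A polling CPU notices the flag by itself; a halted one needs an IPI. */
	return !atomic_load_explicit(&ts_polling, memory_order_relaxed);
}

/* Single-threaded walk-through of the two possible orders. */
void demo(void)
{
	/* Waker first, CPU still polling: no IPI, and idle must not halt. */
	atomic_store(&ts_polling, true);
	atomic_store(&need_resched_flag, false);
	assert(!waker_needs_ipi());
	assert(!idle_may_halt());

	/* Idle first: it may halt, so the waker has to send an IPI. */
	atomic_store(&ts_polling, true);
	atomic_store(&need_resched_flag, false);
	assert(idle_may_halt());
	assert(waker_needs_ipi());
}
```

With both fences in place, at least one side always observes the other's store, so the "idle halts and the waker skips the IPI" outcome seen in the trace cannot occur. Without the fences, a store followed by a load from a different address may be reordered even on x86.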
     

21 Dec, 2006

2 commits

  • if CONFIG_CALGARY_IOMMU is built into the kernel via
    CONFIG_CALGARY_IOMMU_ENABLED_BY_DEFAULT, or is enabled via the
    iommu=calgary boot option, then the detect_calgary() function runs to
    detect the presence of a Calgary IOMMU.

    detect_calgary() first searches the BIOS EBDA area for a "rio_table_hdr"
    BIOS table. It has this parsing algorithm for the EBDA:

            while (offset) {
                    ...
                    /* The next offset is stored in the 1st word. 0 means no more */
                    offset = *((unsigned short *)(ptr + offset));
            }

    got that? Let's repeat it slowly: we've got a BIOS-supplied data
    structure, plus Linux kernel code that will only break out of an
    infinite parsing loop once the BIOS gives a zero offset. Ok?

    Translation: what an excellent opportunity for BIOS writers to lock up
    the Linux boot process in an utterly hard to debug place! Indeed the
    BIOS jumped on that opportunity on my box, which has the following EBDA
    chaining layout:

    384, 65282, 65535, 65535, 65535, 65535, 65535, 65535 ...

    see the pattern? So my definitely non-Calgary system happily locks up
    in detect_calgary()!

    the patch below fixes the boot hang by trusting the BIOS-supplied data
    structure a bit less: the parser always has to make forward progress,
    and if it doesn't, we break out of the loop and i get the expected kernel
    message:

    Calgary: Unable to locate Rio Grande Table in EBDA - bailing!

    Signed-off-by: Ingo Molnar
    Acked-by: Muli Ben-Yehuda
    Signed-off-by: Linus Torvalds

    Ingo Molnar
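
The "parser always has to make forward progress" rule from the commit above can be sketched as follows. The patch itself is not shown in this log, so the exact check is an assumption: here the next offset must be strictly larger than the previous one and must stay inside the buffer. `walk_chain()`, `read_word()`, `write_word()` and `demo()` are hypothetical names, and the buffer is a zero-filled stand-in for the EBDA that reproduces the chaining layout quoted in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read/write a 16-bit word at 'offset'; memcpy avoids unaligned access. */
uint16_t read_word(const unsigned char *ptr, size_t offset)
{
	uint16_t w;

	memcpy(&w, ptr + offset, sizeof(w));
	return w;
}

void write_word(unsigned char *ptr, size_t offset, uint16_t value)
{
	memcpy(ptr + offset, &value, sizeof(value));
}

/*
 * Walk a chain of 16-bit "next" offsets starting at 'offset'.
 * Returns the number of entries visited. *clean_end is set to 1 if the
 * chain ended with a zero offset, and to 0 if we bailed out because the
 * next offset did not move forward or pointed outside the buffer.
 */
int walk_chain(const unsigned char *ptr, size_t size, uint16_t offset,
	       int *clean_end)
{
	uint16_t prev_offset = 0;
	int visited = 0;

	*clean_end = 0;
	while (offset) {
		if (offset <= prev_offset ||
		    (size_t)offset + sizeof(uint16_t) > size)
			return visited;	/* no forward progress: distrust the table */
		visited++;
		prev_offset = offset;
		offset = read_word(ptr, offset);
	}
	*clean_end = 1;
	return visited;
}

/* Reproduce the chain from the commit message: 384, 65282, 65535, 65535 ... */
void demo(void)
{
	static unsigned char ebda[65538];	/* zero-filled stand-in buffer */
	int clean;

	write_word(ebda, 384, 65282);
	write_word(ebda, 65282, 65535);
	write_word(ebda, 65535, 65535);		/* points at itself */

	assert(walk_chain(ebda, sizeof(ebda), 384, &clean) == 3);
	assert(clean == 0);
}
```

A caller would treat `clean == 0` the same way as "table not found" and print the bail-out message instead of hanging.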
     
  • one of my boxes didn't boot the 2.6.20-rc1-rt0 kernel rpm, it hung during
    early bootup. After an hour or two of happy debugging i narrowed it down
    to the CALGARY_IOMMU_ENABLED_BY_DEFAULT option, which was freshly added
    to 2.6.20 via the x86_64 tree and /enabled by default/.

    commit bff6547bb6a4e82c399d74e7fba78b12d2f162ed claims:

    [PATCH] Calgary: allow compiling Calgary in but not using it by default

    This patch makes it possible to compile Calgary in but not use it by
    default. In this mode, use 'iommu=calgary' to activate it.

    but the change does not actually practice it:

    config CALGARY_IOMMU_ENABLED_BY_DEFAULT
            bool "Should Calgary be enabled by default?"
            default y
            depends on CALGARY_IOMMU
            help
              Should Calgary be enabled by default? if you choose 'y', Calgary
              will be used (if it exists). If you choose 'n', Calgary will not be
              used even if it exists. If you choose 'n' and would like to use
              Calgary anyway, pass 'iommu=calgary' on the kernel command line.
              If unsure, say Y.

    it's both 'default y', and says "If unsure, say Y". Clearly not a typo.

    disabling this option makes my box boot again. The patch below fixes the
    Kconfig entry. Grumble.

    Signed-off-by: Ingo Molnar
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

16 Dec, 2006

1 commit

  • It has caused more problems than it ever really solved, and is
    apparently not getting cleaned up and fixed. We can put it back when
    it's stable and isn't likely to make warning or bug events worse.

    In the meantime, enable frame pointers for more readable stack traces.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

13 Dec, 2006

1 commit


12 Dec, 2006

1 commit


11 Dec, 2006

1 commit


10 Dec, 2006

3 commits

  • The new PDA code uses a dummy _proxy_pda variable to describe
    memory references to the PDA. It is never referenced
    in inline assembly, but exists as input/output arguments.
    gcc 4.2 in some cases can CSE references to this which causes
    unresolved symbols. Define it to zero to avoid this.

    Signed-off-by: Andi Kleen

    Andi Kleen
     
  • 2.6.19 stopped booting on our x86_64 systems (or booted only with certain
    builds/configs) due to a bug introduced in 2.6.19. check_nmi_watchdog()
    schedules an IPI on all CPUs to busy-wait on a flag, but fails to set the
    busy-wait flag if NMI functionality is disabled. This causes the secondary
    CPUs to spin in an endless loop, which hangs the kernel boot. Because the
    flag was a stack variable, on certain builds it happened to get overwritten,
    which released the spinning CPUs and let the kernel boot. The following
    patch fixes the bug by setting the busy-wait flag before returning from
    check_nmi_watchdog(). I guess using a stack variable is not good here, as
    the calling function could potentially return while the busy-wait loop is
    still spinning on the flag.

    AK: I redid the patch significantly to be cleaner

    Signed-off-by: Ravikiran Thirumalai
    Signed-off-by: Shai Fultheim
    Signed-off-by: Andi Kleen

    Ravikiran G Thirumalai
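
A minimal sketch of the two points made in the commit above: the flag the secondary CPUs spin on should outlive the function that arms it, and it has to be set on every return path. The final patch is not shown in this log, so all names here (`endflag`, `spinner_may_exit()`, `check_watchdog()`) are hypothetical and the spinning CPUs are only modelled, not started.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* File-scope flag: it outlives check_watchdog(), unlike a stack variable. */
static atomic_int endflag;

/* What a secondary CPU would poll: true once it may stop spinning. */
bool spinner_may_exit(void)
{
	return atomic_load(&endflag) != 0;
}

/* Hypothetical check routine; 'nmi_works' models whether the NMI test passes. */
int check_watchdog(bool nmi_works)
{
	int ret = 0;

	atomic_store(&endflag, 0);
	/* ...secondary CPUs would be sent off to spin on endflag here... */

	if (!nmi_works)
		ret = -1;	/* failure path: fall through, no early return */

	atomic_store(&endflag, 1);	/* release the spinners on every path */
	return ret;
}

void demo(void)
{
	assert(check_watchdog(false) == -1);
	assert(spinner_may_exit());	/* spinners released even on failure */
	assert(check_watchdog(true) == 0);
	assert(spinner_may_exit());
}
```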
     
  • Signed-off-by: Andi Kleen

    Andi Kleen
     

09 Dec, 2006

3 commits

  • This facility provides three entry points:

    ilog2()         Log base 2 of unsigned long
    ilog2_u32()     Log base 2 of u32
    ilog2_u64()     Log base 2 of u64

    These facilities can either be used inside functions on dynamic data:

    int do_something(long q)
    {
            ...;
            y = ilog2(q);
            ...;
    }

    Or can be used to statically initialise global variables with constant values:

    unsigned n = ilog2(27);

    When performing static initialisation, the compiler will report "error:
    initializer element is not constant" if asked to take a log of zero or of
    something not reducible to a constant. They treat negative numbers as
    unsigned.

    When not dealing with a constant, they fall back to using fls() which permits
    them to use arch-specific log calculation instructions - such as BSR on
    x86/x86_64 or SCAN on FRV - if available.

    [akpm@osdl.org: MMC fix]
    Signed-off-by: David Howells
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Herbert Xu
    Cc: David Howells
    Cc: Wojtek Kaniewski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
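
The split described in the commit above, constant folding for constants and an instruction-backed fallback for dynamic data, can be illustrated in plain C. This is a simplified sketch, not the kernel's macro: `CONST_ILOG2_U8()` is a hypothetical helper that only covers 8-bit values and yields -1 for zero instead of a build error, and `runtime_ilog2_u32()` uses a portable loop where the kernel would use fls().

```c
#include <assert.h>
#include <stdint.h>

/* Constant-foldable log2 for 8-bit values only (hypothetical helper). */
#define CONST_ILOG2_U8(n) \
	((n) & 0x80 ? 7 : (n) & 0x40 ? 6 : (n) & 0x20 ? 5 : (n) & 0x10 ? 4 : \
	 (n) & 0x08 ? 3 : (n) & 0x04 ? 2 : (n) & 0x02 ? 1 : (n) & 0x01 ? 0 : -1)

/* Static initialisation works: the macro is an integer constant expression. */
static const int log_of_27 = CONST_ILOG2_U8(27);

/* Runtime fallback: index of the highest set bit; x must be non-zero. */
int runtime_ilog2_u32(uint32_t x)
{
	int log = -1;

	assert(x != 0);
	while (x) {
		x >>= 1;
		log++;
	}
	return log;
}

void demo(void)
{
	assert(log_of_27 == 4);			/* 16 <= 27 < 32 */
	assert(runtime_ilog2_u32(27) == log_of_27);
}
```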
     
  • Change all the uses of f_{dentry,vfsmnt} to f_path.{dentry,mnt} in the x86_64
    arch code.

    Signed-off-by: Josef "Jeff" Sipek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Josef "Jeff" Sipek
     
  • This makes x86-64 use the generic BUG machinery.

    The main advantage in using the generic BUG machinery for x86-64 is that
    the inlined overhead of BUG is just the ud2a instruction; the file+line
    information are no longer inlined into the instruction stream. This
    reduces cache pollution.

    Signed-off-by: Jeremy Fitzhardinge
    Cc: Andi Kleen
    Cc: Hugh Dickins
    Cc: Michael Ellerman
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeremy Fitzhardinge
     

08 Dec, 2006

9 commits

  • * 'for-linus' of git://one.firstfloor.org/home/andi/git/linux-2.6: (156 commits)
    [PATCH] x86-64: Export smp_call_function_single
    [PATCH] i386: Clean up smp_tune_scheduling()
    [PATCH] unwinder: move .eh_frame to RODATA
    [PATCH] unwinder: fully support linker generated .eh_frame_hdr section
    [PATCH] x86-64: don't use set_irq_regs()
    [PATCH] x86-64: check vector in setup_ioapic_dest to verify if need setup_IO_APIC_irq
    [PATCH] x86-64: Make ix86 default to HIGHMEM4G instead of NOHIGHMEM
    [PATCH] i386: replace kmalloc+memset with kzalloc
    [PATCH] x86-64: remove remaining pc98 code
    [PATCH] x86-64: remove unused variable
    [PATCH] x86-64: Fix constraints in atomic_add_return()
    [PATCH] x86-64: fix asm constraints in i386 atomic_add_return
    [PATCH] x86-64: Correct documentation for bzImage protocol v2.05
    [PATCH] x86-64: replace kmalloc+memset with kzalloc in MTRR code
    [PATCH] x86-64: Fix numaq build error
    [PATCH] x86-64: include/asm-x86_64/cpufeature.h isn't a userspace header
    [PATCH] unwinder: Add debugging output to the Dwarf2 unwinder
    [PATCH] x86-64: Clarify error message in GART code
    [PATCH] x86-64: Fix interrupt race in idle callback (3rd try)
    [PATCH] x86-64: Remove unwind stack pointer alignment forcing again
    ...

    Fixed conflict in include/linux/uaccess.h manually

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • The elf note saving code is currently duplicated over several
    architectures. This cleanup patch simply adds code to a common file and
    then replaces the arch-specific code with calls to the newly added code.

    The only drawback with this approach is that s390 doesn't fully support
    kexec-on-panic which for that arch leads to introduction of unused code.

    Signed-off-by: Magnus Damm
    Cc: Vivek Goyal
    Cc: Andi Kleen
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Magnus Damm
     
  • There was lots of #ifdef noise in the kernel due to hotcpu_notifier(fn,
    prio) not correctly marking 'fn' as used in the !HOTPLUG_CPU case, and thus
    generating compiler warnings of unused symbols, hence forcing people to add
    #ifdefs.

    the compiler can skip truly unused functions just fine:

       text    data     bss     dec     hex filename
    1624412  728710 3674856 6027978  5bfaca vmlinux.before
    1624412  728710 3674856 6027978  5bfaca vmlinux.after

    [akpm@osdl.org: topology.c fix]
    Signed-off-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
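
The idea in the commit above, making the disabled variant of the macro still reference 'fn' so callers need no #ifdef, can be sketched like this. The names are hypothetical (`SKETCH_HOTPLUG_CPU` stands in for the config option and is left undefined here), and the exact macro body in the patch is not shown in this log.

```c
#include <assert.h>

#ifdef SKETCH_HOTPLUG_CPU
/* Enabled case: really register the callback (not implemented in this sketch). */
int register_sketch_notifier(int (*fn)(int cpu), int prio);
#define sketch_hotcpu_notifier(fn, prio) register_sketch_notifier((fn), (prio))
#else
/* Disabled case: evaluate 'fn' as a discarded expression so it counts as used. */
#define sketch_hotcpu_notifier(fn, prio) do { (void)(fn); } while (0)
#endif

/* Callback that would otherwise trigger "defined but not used". */
static int my_cpu_callback(int cpu)
{
	return cpu + 1;
}

int setup(void)
{
	sketch_hotcpu_notifier(my_cpu_callback, 0);
	return 0;
}

void demo(void)
{
	assert(setup() == 0);
}
```

Because the reference is a cast-to-void expression with no side effects, the compiler can still discard the callback when nothing else uses it, which matches the identical before/after sizes quoted in the message.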
     
  • smp_call_function_single() can deadlock if the caller disabled local
    interrupts (the target CPU could be spinning on call_lock). Check for that.

    Why on earth do these functions use spin_lock_bh()??

    Cc: "Randy.Dunlap"
    Cc: Andi Kleen
    Cc: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • When we are unregistering a kprobe-booster, we can't release its
    instruction buffer immediately on a preemptible kernel, because some
    processes might have been preempted while executing in the buffer. The
    freeze_processes() and thaw_processes() functions can get most processes
    out of the buffer. There are still some non-frozen threads which have the
    PF_NOFREEZE flag. If those threads are sleeping (not preempted) at a known
    place outside the buffer, we can ensure that freeing is safe.

    However, this check routine takes a long time. So this patch introduces a
    garbage collection mechanism for insn_slot. It also adds a "dirty" flag to
    free_insn_slot for efficiency.

    The "clean" instruction slots (dirty flag is cleared) are released
    immediately. But the "dirty" slots, which are used by boosted kprobes, are
    marked as garbage. collect_garbage_slots() will be invoked to release
    "dirty" slots if there are more than INSNS_PER_PAGE garbage slots or if
    there are no unused slots.

    Cc: "Keshavamurthy, Anil S"
    Cc: Ananth N Mavinakayanahalli
    Cc: "bibo,mao"
    Cc: Prasanna S Panchamukhi
    Cc: Yumiko Sugita
    Cc: Satoshi Oshima
    Cc: Hideo Aoki
    Signed-off-by: Masami Hiramatsu
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Masami Hiramatsu
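
The clean/dirty slot policy from the commit above can be modelled with a small fixed pool. This is a toy sketch under stated assumptions: a single array replaces the kernel's pages of slots, `GARBAGE_LIMIT` plays the role of INSNS_PER_PAGE, and the expensive "is any task still running in the buffer" check that must precede real collection is reduced to a comment. Function names follow the message, but the bodies are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>

#define NSLOTS		4	/* tiny pool for illustration */
#define GARBAGE_LIMIT	2	/* plays the role of INSNS_PER_PAGE */

enum slot_state { SLOT_FREE, SLOT_USED, SLOT_GARBAGE };

static enum slot_state slots[NSLOTS];
static int ngarbage;

/*
 * Release every slot marked as garbage. The real code first has to make
 * sure no task can still be executing from them. Returns slots reclaimed.
 */
int collect_garbage_slots(void)
{
	int i, reclaimed = 0;

	for (i = 0; i < NSLOTS; i++) {
		if (slots[i] == SLOT_GARBAGE) {
			slots[i] = SLOT_FREE;
			reclaimed++;
		}
	}
	ngarbage = 0;
	return reclaimed;
}

/* Return a free slot index, collecting garbage if none is free; -1 if full. */
int get_insn_slot(void)
{
	int pass, i;

	for (pass = 0; pass < 2; pass++) {
		for (i = 0; i < NSLOTS; i++) {
			if (slots[i] == SLOT_FREE) {
				slots[i] = SLOT_USED;
				return i;
			}
		}
		if (ngarbage == 0)
			break;
		collect_garbage_slots();	/* no unused slots left */
	}
	return -1;
}

/* Clean slots are freed at once; dirty (boosted) slots are only marked. */
void free_insn_slot(int idx, bool dirty)
{
	assert(idx >= 0 && idx < NSLOTS && slots[idx] == SLOT_USED);
	if (!dirty) {
		slots[idx] = SLOT_FREE;
		return;
	}
	slots[idx] = SLOT_GARBAGE;
	if (++ngarbage > GARBAGE_LIMIT)
		collect_garbage_slots();
}
```

Deferring the release this way means the costly safety check runs once per batch of dirty slots rather than once per unregistered probe.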
     
  • Define elf_addr_t in linux/elf.h. The size of the type is determined using
    ELF_CLASS. This allows us to remove the defines that today are spread all
    over .c and .h files.

    Signed-off-by: Magnus Damm
    Cc: Daniel Jacobowitz
    Cc: Roland McGrath
    Cc: Jakub Jelinek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Magnus Damm
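
Selecting the address type from ELF_CLASS, as the commit above describes, might look like the sketch below. It uses fixed-width integer types rather than the kernel's own ELF typedefs, and the fallback `ELF_CLASS` definition is an assumption made so that the fragment builds on its own; `entry_size()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

#define ELFCLASS32 1	/* values from the ELF specification */
#define ELFCLASS64 2

/* Normally provided per architecture; assumed here for the sketch. */
#ifndef ELF_CLASS
#define ELF_CLASS ELFCLASS64
#endif

#if ELF_CLASS == ELFCLASS32
typedef uint32_t elf_addr_t;
#else
typedef uint64_t elf_addr_t;
#endif

/* One central definition, checked at build time. */
static_assert(sizeof(elf_addr_t) == (ELF_CLASS == ELFCLASS32 ? 4 : 8),
	      "elf_addr_t must match ELF_CLASS");

/* Bytes taken by one (id, value) pair of elf_addr_t entries. */
unsigned long entry_size(void)
{
	return 2 * sizeof(elf_addr_t);
}
```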
     
  • After LOADER_TYPE && INITRD_START are true, the short if-condition
    for INITRD_START can never be false.

    Remove unused code from the else condition.

    Signed-off-by: Henry Nestler
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Henry Nestler
     
  • The last thing we agreed on was to remove the macros entirely for 2.6.19,
    on all architectures. Unfortunately, I think nobody actually _did_ that,
    so they are still there.

    [akpm@osdl.org: x86_64 fix]
    Cc: David Woodhouse
    Cc: Greg Schafer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arnd Bergmann
     
  • SLAB_KERNEL is an alias of GFP_KERNEL.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

07 Dec, 2006

12 commits