08 May, 2009

5 commits

  • Lockdep reports the warning below when Li tries to offline one cpu:

    [ 110.835487] =================================
    [ 110.835616] [ INFO: inconsistent lock state ]
    [ 110.835688] 2.6.30-rc4-00336-g8c9ed89 #52
    [ 110.835757] ---------------------------------
    [ 110.835828] inconsistent {HARDIRQ-ON-W} -> {IN-HARDIRQ-W} usage.
    [ 110.835908] swapper/0 [HC1[1]:SC0[0]:HE0:SE1] takes:
    [ 110.835982] (cmci_discover_lock){?.+...}, at: [] cmci_clear+0x30/0x9b

    cmci_clear() can be called via smp_call_function_single().

    It is better to disable interrupts while holding cmci_discover_lock,
    to turn it into an irq-safe lock - we can deadlock otherwise.

    [ Impact: fix possible deadlock in the MCE code ]

    Reported-by: Shaohua Li
    Signed-off-by: Hidetoshi Seto
    Cc: Andi Kleen
    Cc: Andrew Morton
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Hidetoshi Seto
     
  • The Xen pagetables are no longer implicitly reserved as part of the other
    i386_start_kernel reservations, so make sure we explicitly reserve them.
    This prevents them from being released into the general kernel free page
    pool and reused.

    [ Impact: fix Xen guest crash ]

    Also-Bisected-by: Bryan Donlan
    Signed-off-by: Jeremy Fitzhardinge
    Cc: Xen-devel
    Cc: Linus Torvalds
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Jeremy Fitzhardinge
     
  • Tim Starling reported that crashdump will panic with a kernel compiled
    with CONFIG_KEXEC_JUMP due to a null pointer dereference in
    machine_kexec_32.c: machine_kexec(), when dereferencing
    kexec_image. Referring to:

    http://bugzilla.kernel.org/show_bug.cgi?id=13265

    This patch fixes the bug by replacing the reference to the global
    variable kexec_image in machine_kexec() with a reference to the local
    variable image, which is more appropriate and will not be null.

    The same bug exists in machine_kexec_64.c, so it is fixed there in the
    same way.

    [ Impact: fix crash on kexec ]

    Reported-by: Tim Starling
    Signed-off-by: Huang Ying
    LKML-Reference:
    Signed-off-by: H. Peter Anvin

    Huang Ying
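
A minimal userspace sketch of the idea in this fix, not the kernel code itself: the function reads the image through the pointer its caller passes in, so it does not depend on a global that may be unset. All names here (image_sketch, global_image_sketch, entry_point_sketch) are hypothetical stand-ins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's image descriptor. */
struct image_sketch {
    unsigned long start;   /* entry point of the loaded image */
};

/* Hypothetical global; left NULL to model the failing path. */
struct image_sketch *global_image_sketch = NULL;

/* Uses only the pointer handed in by the caller, never the global. */
unsigned long entry_point_sketch(const struct image_sketch *image)
{
    assert(image != NULL);   /* callers pass the image they are running */
    return image->start;
}
```

Calling entry_point_sketch() with a valid image works whether or not the global has been set.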
     
  • With the introduction of the .brk section, special care must be taken
    that no unused page table entries remain if _brk_end and _end are
    separated by a 2M page boundary. cleanup_highmap() runs very early and
    cannot take care of that, so any entries past _brk_end that need to be
    removed must be cleared once the brk allocator has done its job.

    [ Impact: avoids undesirable TLB aliases ]

    Signed-off-by: Jan Beulich
    Signed-off-by: H. Peter Anvin

    Jan Beulich
     
  • If the first non-reserved (sub-)range doesn't fit the size requested,
    an endless loop will be entered. If a range returned from
    find_e820_area_size() turns out insufficient in size, the range must
    be skipped before calling the function again.

    [ Impact: fixes boot hang on some platforms ]

    Signed-off-by: Jan Beulich
    Signed-off-by: H. Peter Anvin

    Jan Beulich
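
The loop described above can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel routine: free_ranges_sketch is a made-up table standing in for the e820 map, and find_area_size_sketch() only imitates the role of find_e820_area_size().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NOT_FOUND_SKETCH UINT64_MAX

struct range_sketch { uint64_t addr; uint64_t size; };

/* Hypothetical free ranges, sorted by address. */
static const struct range_sketch free_ranges_sketch[] = {
    { 0x1000, 0x1000 },
    { 0x8000, 0x4000 },
};

/* Returns the first free address at or after 'start' and stores the
 * size available from there, or NOT_FOUND_SKETCH if nothing is left. */
uint64_t find_area_size_sketch(uint64_t start, uint64_t *size)
{
    size_t n = sizeof(free_ranges_sketch) / sizeof(free_ranges_sketch[0]);

    assert(size != NULL);
    for (size_t i = 0; i < n; i++) {
        uint64_t end = free_ranges_sketch[i].addr + free_ranges_sketch[i].size;
        uint64_t addr;

        if (end <= start)
            continue;
        addr = free_ranges_sketch[i].addr > start ? free_ranges_sketch[i].addr : start;
        *size = end - addr;
        return addr;
    }
    return NOT_FOUND_SKETCH;
}

/* Finds a range of at least 'want' bytes. */
uint64_t alloc_sketch(uint64_t want)
{
    uint64_t start = 0;

    for (;;) {
        uint64_t size = 0;
        uint64_t addr = find_area_size_sketch(start, &size);

        if (addr == NOT_FOUND_SKETCH)
            return NOT_FOUND_SKETCH;
        if (size >= want)
            return addr;
        /* Skip the too-small range; without this step the same
         * range would be returned on every iteration. */
        start = addr + size;
    }
}
```

With the table above, a request for 0x2000 bytes skips the first range and lands in the second.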
     

06 May, 2009

4 commits


05 May, 2009

1 commit

  • Commit 7ad728f98162cb1af06a85b2a5fc422dddd4fb78
    (cpumask: x86: convert cpu_sibling_map/cpu_core_map to cpumask_var_t)
    changed the output of /proc/cpuinfo for siblings:

    Example on an AMD Phenom:

    physical id : 0
    siblings : 1
    core id : 3
    cpu cores : 4

    Before that commit it was:

    physical id : 0
    siblings : 4
    core id : 3
    cpu cores : 4

    Instead of cpu_core_mask it now uses cpu_sibling_mask to count siblings.
    This is due to the following hunk of the above commit:

    | --- a/arch/x86/kernel/cpu/proc.c
    | +++ b/arch/x86/kernel/cpu/proc.c
    | @@ -14,7 +14,7 @@ static void show_cpuinfo_core(struct seq_file *m, struct cpuinf
    |      if (c->x86_max_cores * smp_num_siblings > 1) {
    |          seq_printf(m, "physical id\t: %d\n", c->phys_proc_id);
    |          seq_printf(m, "siblings\t: %d\n",
    | -                   cpus_weight(per_cpu(cpu_core_map, cpu)));
    | +                   cpumask_weight(cpu_sibling_mask(cpu)));
    |          seq_printf(m, "core id\t\t: %d\n", c->cpu_core_id);
    |          seq_printf(m, "cpu cores\t: %d\n", c->booted_cores);
    |          seq_printf(m, "apicid\t\t: %d\n", c->apicid);

    This change was unintended: the impact line of that commit shows that
    this side effect was not anticipated:

    Impact: reduce per-cpu size for CONFIG_CPUMASK_OFFSTACK=y

    So revert the respective hunk to restore the old behavior.

    [ Impact: fix sibling-info regression in /proc/cpuinfo ]

    Signed-off-by: Andreas Herrmann
    Cc: Rusty Russell
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Andreas Herrmann
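
To see why the two masks give different counts, here is a small userspace sketch. The mask values are assumptions chosen to match the Phenom example (four cores in one package, no hardware threads, looking at CPU 3), and mask_weight_sketch() is a hypothetical stand-in for the bit-counting done by cpumask_weight().

```c
#include <assert.h>
#include <stdint.h>

/* Assumed masks for CPU 3 of a four-core package without hardware threads. */
const uint32_t core_mask_sketch    = 0x0Fu;   /* CPUs 0-3 share the package */
const uint32_t sibling_mask_sketch = 0x08u;   /* CPU 3 has no thread siblings */

/* Counts the set bits in a mask. */
unsigned int mask_weight_sketch(uint32_t mask)
{
    unsigned int n = 0;

    while (mask) {
        mask &= mask - 1;   /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Reproduces the two "siblings" values quoted in the commit message. */
void phenom_example_check_sketch(void)
{
    assert(mask_weight_sketch(core_mask_sketch) == 4);      /* old output */
    assert(mask_weight_sketch(sibling_mask_sketch) == 1);   /* regressed output */
}
```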
     

04 May, 2009

1 commit


03 May, 2009

1 commit


02 May, 2009

1 commit

  • Commit db949bba3c7cf2e664ac12e237c6d4c914f0c69d (x86-32: use non-lazy
    io bitmap context switching) broke ioperm for 32-bit because it removed
    the lazy initialization of io_bitmap_base and did not set it to the
    real bitmap offset.

    [ Impact: fix non-working sys_ioperm() on 32-bit kernels ]

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

30 Apr, 2009

1 commit

  • According to the gettimeofday(2) manual:

    If either tv or tz is NULL, the corresponding structure is not
    set or returned.

    Since it is legal to give NULL as the tv argument, the code should make
    sure tv is not NULL before trying to dereference it.

    This issue manifests itself on x86_64 when vdso=0 is not on the kernel
    command-line and libc uses the vDSO for gettimeofday() (e.g. glibc >=
    2.7). A simple reproducer:

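    #include <stddef.h>     /* NULL */
    #include <sys/time.h>   /* gettimeofday(), struct timezone */

    int main(void)
    {
        struct timezone tz;

        /* tv is NULL on purpose: only the timezone is requested. */
        gettimeofday(NULL, &tz);

        return 0;
    }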

    See http://bugs.debian.org/466491 for more details.

    [ Impact: fix gettimeofday(NULL, &tz) segfault ]

    Signed-off-by: John Wright
    Cc: Andi Kleen
    Cc: John Wright
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    John Wright
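
The guard this fix calls for can be sketched in plain C. This is not the vDSO code: tv_sketch, tz_sketch and gettimeofday_sketch() are hypothetical, and the current time is passed in as parameters so the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct timeval and struct timezone. */
struct tv_sketch { long sec; long usec; };
struct tz_sketch { int minuteswest; int dsttime; };

/* Fills in only the structures the caller actually supplied. */
int gettimeofday_sketch(struct tv_sketch *tv, struct tz_sketch *tz,
                        long now_sec, long now_usec)
{
    assert(now_usec >= 0 && now_usec < 1000000L);

    if (tv != NULL) {          /* tv may legally be NULL */
        tv->sec = now_sec;
        tv->usec = now_usec;
    }
    if (tz != NULL) {          /* tz may legally be NULL */
        tz->minuteswest = 0;
        tz->dsttime = 0;
    }
    return 0;
}
```

Either pointer can be NULL and the call still succeeds, which is the behaviour the manual page describes.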
     

29 Apr, 2009

1 commit

  • Matching on (addr == (p->addr + p->len)) causes problems when mappings
    are adjacent.

    [ Impact: fix mmiotrace confusion on adjacent iomaps ]

    Signed-off-by: Stuart Bennett
    Acked-by: Pekka Paalanen
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Stuart Bennett
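
A short sketch of why an inclusive end test is ambiguous for adjacent mappings. The struct and function names are hypothetical; the half-open test is shown as one way to avoid the double match, assuming each mapping covers the bytes from addr up to but not including addr + len.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one traced mapping. */
struct probe_sketch { unsigned long addr; unsigned long len; };

/* Inclusive end: also matches the first byte of the next mapping. */
int covers_inclusive_sketch(const struct probe_sketch *p, unsigned long addr)
{
    assert(p != NULL);
    return addr >= p->addr && addr <= p->addr + p->len;
}

/* Half-open range [addr, addr + len): each address matches one mapping. */
int covers_half_open_sketch(const struct probe_sketch *p, unsigned long addr)
{
    assert(p != NULL);
    return addr >= p->addr && addr < p->addr + p->len;
}
```

For two mappings at 0x1000 and 0x2000, each 0x1000 bytes long, the inclusive test claims address 0x2000 for both, while the half-open test assigns it only to the second.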
     

28 Apr, 2009

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
    PCI: only save/restore existent registers in the PCIe capability
    x86/PCI: don't bother with root quirks if _CRS is used
    docbooks: add/fix PCI kernel-doc
    PCI: cleanup debug output resources
    x86/PCI: set_pci_bus_resources_arch_default cleanups
    x86/PCI: Move set_pci_bus_resources_arch_default into arch/x86
    x86/PCI: don't call e820_all_mapped with -1 in the mmconfig case
    PCI quirk: disable MSI on VIA VT3364 chipsets

    Linus Torvalds
     

27 Apr, 2009

3 commits

  • …git/tip/linux-2.6-tip

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, hpet: Stop soliciting hpet=force users on ICH4M
    x86: check boundary in setup_node_bootmem()
    uv_time: add parameter to uv_read_rtc()
    x86: hpet: fix periodic mode programming on AMD 81xx
    x86: more than 8 32-bit CPUs requires X86_BIGSMP
    x86: avoid theoretical spurious NMI backtraces with CONFIG_CPUMASK_OFFSTACK=y
    x86: fix boot crash in NMI watchdog with CONFIG_CPUMASK_OFFSTACK=y and flat APIC
    x86-64: fix FPU corruption with signals and preemption
    x86/uv: fix for no memory at paddr 0
    docs, x86: add nox2apic back to kernel-parameters.txt
    x86: mm/numa_32.c calculate_numa_remap_pages should use __init
    x86, kbuild: make "make install" not depend on vmlinux
    x86/uv: fix init of cpu-less nodes
    x86/uv: fix init of memory-less nodes

    Linus Torvalds
     
  • …git/tip/linux-2.6-tip

    * 'irq-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86/irq: mark NUMA_MIGRATE_IRQ_DESC broken
    x86, irq: Remove IRQ_DISABLED check in process context IRQ move

    Linus Torvalds
     
  • …/git/tip/linux-2.6-tip

    * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    locking: clarify kernel-taint warning message
    lockdep, x86: account for irqs enabled in paranoid_exit
    lockdep: more robust lockdep_map init sequence

    Linus Torvalds
     

24 Apr, 2009

3 commits

  • * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6: (34 commits)
    ACPI, i915: Register ACPI video even when not modesetting
    Revert "ACPICA: delete check for AML access to port 0x81-83"
    I/O port protection: update for windows compatibility.
    sony-laptop: always try to unblock rfkill on load
    sony-laptop: fix bogus error message display on resume
    ACPI: EC: Fix ACPI EC resume non-query interrupt message
    sony-laptop: SNC input event 38 fix
    sony-laptop: SNC 127 Initialization Fix
    sony-laptop: Duplicate SNC 127 Event Fix
    ACPI: prevent processor.max_cstate=0 boot crash
    ACPI/hpet: prevent boot hang when hpet=force used on ICH-4M
    ACPI: delete obsolete "bus master activity" proc field
    ACPI: idle: mark_tsc_unstable() at init-time, not run-time
    ACPI: add /sys/firmware/acpi/interrupts/sci_not counter
    ACPI video: fix an error when the brightness levels on AC and on Battery are same
    acpi-cpufreq: Do not let get_measured perf depend on internal variable
    acpi-cpufreq: style-only: add parens to math expression
    acpi-cpufreq: Cleanup: Use printk_once
    x86, acpi_cpufreq: Fix the NULL pointer dereference in get_measured_perf
    thinkpad-acpi: bump up version to 0.23
    ...

    Linus Torvalds
     
  • The HPET in the ICH4M is not documented in the data sheet
    because it was not officially validated.

    While it is fine for hackers to continue to use "hpet=force"
    to enable the hardware that they have, it is not prudent to
    solicit additional "hpet=force" users on this hardware.

    [ Impact: remove hpet=force syslog message on old-ICH systems ]

    Signed-off-by: Len Brown
    Acked-by: Venkatesh Pallipadi
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Len Brown
     
  • Len Brown
     

23 Apr, 2009

7 commits

  • Commit dc09855 ("x86/uv: fix init of memory-less nodes") causes a
    two-socket system (where node-1 doesn't have RAM installed) to crash.

    That commit makes node_possible include cpu nodes that do not have memory.
    So check boundary in setup_node_bootmem().

    [ Impact: fix boot crash on RAM-less NUMA node system ]

    Signed-off-by: Yinghai Lu
    Cc: Jack Steiner
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Yinghai Lu
     
  • It will be overwritten later if _CRS is used, so don't bother to set it.

    Signed-off-by: Yinghai Lu
    Signed-off-by: Jesse Barnes

    Yinghai Lu
     
  • Rename set_pci_bus_resources_arch_default to x86_pci_root_bus_res_quirks, move
    the weak version from common.c to i386.c, and before calling, make sure it's a
    root bus.

    Reviewed-by: Matthew Wilcox
    Signed-off-by: Yinghai Lu
    Signed-off-by: Jesse Barnes

    Yinghai Lu
     
  • Commit 30a18d6c3f1e774de656ebd8ff219d53e2ba4029 introduced a new
    function to set the PCI bus resources. Unfortunately, neither the
    author, nor the committers seemed to know that we already have somewhere
    to do that -- pcibios_fixup_bus(). This patch moves the hook (used only
    by the K8 code) into x86-specific code where it should have been in the
    first place.

    Cc: Yinghai Lu
    Signed-off-by: Matthew Wilcox
    Acked-by: Ingo Molnar
    Signed-off-by: Jesse Barnes

    Matthew Wilcox
     
  • e820_all_mapped() needs its end argument to be (addr + size), not
    (addr + size - 1).

    Cc: stable@kernel.org
    Acked-by: Ingo Molnar
    Signed-off-by: Yinghai Lu
    Signed-off-by: Jesse Barnes

    Yinghai Lu
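
The off-by-one can be illustrated with a sketch that treats the end argument as exclusive. The region bounds and the name all_mapped_sketch() are assumptions for illustration; the region deliberately stops one byte short of 0x2000.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mapped region [start, end), with an exclusive end. */
static const uint64_t region_start_sketch = 0x1000;
static const uint64_t region_end_sketch   = 0x1fff;   /* byte 0x1fff is NOT mapped */

/* Returns 1 if every byte of [start, end) lies inside the region. */
int all_mapped_sketch(uint64_t start, uint64_t end)
{
    assert(start <= end);
    return start >= region_start_sketch && end <= region_end_sketch;
}
```

For addr = 0x1000 and size = 0x1000, passing addr + size correctly reports that the range is not fully mapped, whereas passing addr + size - 1 leaves the last byte unchecked and reports success.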
     
  • The earlier patch to change the poller to a separate function subtly
    broke the boot logging logic. This could lead to machine checks
    getting logged at boot even when disabled or defaulting to off
    on some systems. Fix that.

    [ Impact: bug fix - avoid spurious MCE in log ]

    Signed-off-by: Andi Kleen
    Reviewed-by: Hidetoshi Seto
    Signed-off-by: H. Peter Anvin

    Andi Kleen
     
  • The polling timer, while running per CPU, still uses a global next_interval
    variable, which led to some CPUs polling either too fast or too slow.
    This was not a serious problem because all errors get picked up eventually,
    but it's still better to avoid it. Turn next_interval into a per cpu variable.

    v2: Fix check_interval == 0 case (Hidetoshi Seto)

    [ Impact: minor bug fix ]

    Signed-off-by: Andi Kleen
    Reviewed-by: Hidetoshi Seto
    Signed-off-by: H. Peter Anvin

    Andi Kleen
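
A userspace sketch of keeping the interval per CPU instead of in one shared variable. The array models per-CPU storage, and the adjustment policy (halve after an error, double otherwise, clamped to a range) is an assumption made for illustration; all names are hypothetical.

```c
#include <assert.h>

#define NR_CPUS_SKETCH 4

/* One polling interval per CPU, modelling a per-CPU variable. */
static unsigned int next_interval_sketch[NR_CPUS_SKETCH];

void interval_init_sketch(unsigned int initial)
{
    for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
        next_interval_sketch[cpu] = initial;
}

unsigned int interval_get_sketch(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    return next_interval_sketch[cpu];
}

/* Assumed policy: poll more often after an error, less often otherwise. */
unsigned int interval_update_sketch(int cpu, int found_error,
                                    unsigned int min, unsigned int max)
{
    unsigned int v;

    assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
    v = next_interval_sketch[cpu];
    v = found_error ? v / 2 : v * 2;
    if (v < min)
        v = min;
    if (v > max)
        v = max;
    next_interval_sketch[cpu] = v;   /* only this CPU's value changes */
    return v;
}
```

An error seen on CPU 0 shortens only CPU 0's interval; the other CPUs keep their own pace.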
     

22 Apr, 2009

9 commits

  • uv_read_rtc() is referenced by the read member of struct clocksource
    clocksource_uv. In include/linux/clocksource.h, the read member of struct
    clocksource is declared as:
    cycle_t (*read)(struct clocksource *cs)

    This got introduced recently in:

    8e19608: clocksource: pass clocksource to read() callback

    But arch/x86/kernel/uv_time.c was not properly converted by that patch.

    This patch adds a dummy parameter (struct clocksource type) to uv_read_rtc() to
    fix the incompatible reference in clocksource_uv, and adds a NULL parameter in
    all places where uv_read_rtc() gets called.

    [ Impact: cleanup, address compiler warning ]

    Signed-off-by: Coly Li
    Cc: Dimitri Sivanich
    Cc: Magnus Damm
    Cc: Andrew Morton
    Cc: Hugh Dickins
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Coly Li
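
The shape of this fix can be sketched outside the kernel. The types below are simplified, hypothetical stand-ins for struct clocksource and cycle_t, and the counter value is fake; the point is only that the reader function accepts an argument it ignores so that it matches the callback type.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long long cycle_sketch_t;

/* Simplified stand-in for struct clocksource. */
struct clocksource_sketch {
    const char *name;
    cycle_sketch_t (*read)(struct clocksource_sketch *cs);
};

/* Fake hardware counter so the sketch is self-contained. */
static cycle_sketch_t fake_counter_sketch = 42;

/* The parameter exists only to match the callback type; it is unused. */
cycle_sketch_t read_rtc_sketch(struct clocksource_sketch *cs)
{
    (void)cs;
    return fake_counter_sketch;
}

struct clocksource_sketch clocksource_rtc_sketch = {
    "rtc-sketch",
    read_rtc_sketch,   /* types now agree, so no incompatible-pointer warning */
};

/* Direct callers pass NULL; the callback path passes the clocksource. */
void read_rtc_check_sketch(void)
{
    assert(read_rtc_sketch(NULL) == 42);
    assert(clocksource_rtc_sketch.read(&clocksource_rtc_sketch) == 42);
}
```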
     
  • (See http://bugzilla.kernel.org/show_bug.cgi?id=12961)

    It partially reverts commit c23e253e67c9d8a91a0ffa33c1f571a17f0a2403
    (x86: hpet: stop HPET_COUNTER when programming periodic mode)

    HPET on AMD 81xx chipset needs a second write (with HPET_TN_SETVAL
    cleared) to T0_CMP register to set the period in periodic mode.

    With this patch HPET_COUNTER is still stopped but not reset when HPET
    is programmed in periodic mode. This should help to avoid races when
    HPET is programmed in periodic mode and fixes a boot time hang that
    I've observed on a machine when using 1000HZ.

    [ Impact: fix boot time hang on machines with AMD 81xx chipset ]

    Reported-by: Jeff Mahoney
    Signed-off-by: Andreas Herrmann
    Tested-by: Jeff Mahoney
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Andreas Herrmann
     
  • Properly unregister the cpufreq notifier on unload if it was registered
    during init.

    Signed-off-by: Jan Kiszka
    Signed-off-by: Avi Kivity

    Jan Kiszka
     
  • Not releasing the time_page causes a leak of that page or the compound
    page it is situated in.

    Cc: stable@kernel.org
    Signed-off-by: Joerg Roedel
    Signed-off-by: Avi Kivity

    Joerg Roedel
     
  • The complexity of fixing it is not worth the gains, as discussed
    in http://article.gmane.org/gmane.comp.emulators.kvm.devel/28649.

    Cc: stable@kernel.org
    Signed-off-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Marcelo Tosatti
     
  • Merge reason: hpet.c changed upstream, make sure we test against that

    Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • In non-SMP mode, the variable section attribute specified by DECLARE_PER_CPU()
    does not agree with that specified by DEFINE_PER_CPU(). This means that
    architectures that have a small data section, referenced relative to a base
    register, may throw up linkage errors because the displacement between
    where the base register points and the per-CPU variable is too great.

    On FRV, the .h declaration says that the variable is in the .sdata section, but
    the .c definition says it's actually in the .data section. The linker throws
    up the following errors:

    kernel/built-in.o: In function `release_task':
    kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o
    kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o

    To fix this, DECLARE_PER_CPU() should simply apply the same section attribute
    as does DEFINE_PER_CPU(). However, this is made slightly more complex by
    virtue of the fact that there are several variants on DEFINE, so these need to
    be matched by variants on DECLARE.

    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    David Howells
     
  • $ cat x86-more-than-8-cpus-requires-bigsmp.patch

    Enforce NR_CPUS
    LKML-Reference:
    Signed-off-by: H. Peter Anvin

    Michael K. Johnson
     
  • Pass clocksource pointer to the read() callback for clocksources. This
    allows us to share the callback between multiple instances.

    [hugh@veritas.com: fix powerpc build of clocksource pass clocksource mods]
    [akpm@linux-foundation.org: cleanup]
    Signed-off-by: Magnus Damm
    Acked-by: John Stultz
    Cc: Thomas Gleixner
    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Magnus Damm
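
How one callback can serve several instances once it receives the clocksource pointer, sketched with hypothetical types. The pointer arithmetic with offsetof() mirrors the usual container-of idiom; struct timer_dev_sketch and its counter field are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long long cycle_sketch_t;

/* Simplified stand-in for struct clocksource. */
struct clocksource_sketch {
    cycle_sketch_t (*read)(struct clocksource_sketch *cs);
};

/* Hypothetical device that embeds a clocksource. */
struct timer_dev_sketch {
    cycle_sketch_t counter;
    struct clocksource_sketch cs;
};

/* Shared callback: recovers the owning device from the pointer it is given. */
cycle_sketch_t timer_read_sketch(struct clocksource_sketch *cs)
{
    struct timer_dev_sketch *dev;

    assert(cs != NULL);
    dev = (struct timer_dev_sketch *)
          ((char *)cs - offsetof(struct timer_dev_sketch, cs));
    return dev->counter;
}
```

Two devices can both point their read member at timer_read_sketch() and each call returns that device's own counter.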
     

21 Apr, 2009

2 commits

  • In theory (though not shown in practice) alloc_cpumask_var() doesn't zero
    memory, so CPUs might print an "NMI backtrace for cpu %d" once on boot.

    (Bug introduced in fcef8576d8a64fc603e719c97d423f9f6d4e0e8b).

    [ Impact: avoid theoretical syslog noise in rare configs ]

    Signed-off-by: Rusty Russell
    Cc: Steven Rostedt
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Rusty Russell
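
The hazard can be sketched in userspace with malloc(), which, like the allocator named above, does not promise zeroed memory. The helper names are hypothetical; the explicit memset() is the point of the example.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define BITS_PER_WORD_SKETCH (sizeof(unsigned long) * CHAR_BIT)

/* Allocates a bitmask and clears it explicitly, since the allocator
 * itself makes no promise about the initial contents. */
unsigned long *alloc_mask_sketch(size_t words)
{
    unsigned long *mask;

    assert(words > 0);
    mask = malloc(words * sizeof(*mask));
    if (mask != NULL)
        memset(mask, 0, words * sizeof(*mask));
    return mask;
}

/* Returns 1 if the given bit is set in the mask, 0 otherwise. */
int mask_test_sketch(const unsigned long *mask, size_t bit)
{
    return (int)((mask[bit / BITS_PER_WORD_SKETCH] >>
                  (bit % BITS_PER_WORD_SKETCH)) & 1UL);
}
```

After alloc_mask_sketch() no bit reads as set, so no CPU would see a stale request on its first check.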
     
  • fcef8576d8a64fc603e719c97d423f9f6d4e0e8b converted backtrace_mask to a
    cpumask_var_t, and assumed check_nmi_watchdog was called before
    nmi_watchdog_tick was ever called. Steven's oops shows I was wrong.

    This is something of a bandaid: I'm not sure we *should* be calling
    nmi_watchdog_tick before check_nmi_watchdog. Note that gcc eliminates
    this test for the CONFIG_CPUMASK_OFFSTACK=n case.

    [ Impact: fix boot crash in rare configs ]

    Reported-by: Steven Rostedt
    Signed-off-by: Rusty Russell
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Rusty Russell
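
A sketch of the ordering problem and the guard, using hypothetical names throughout. The mask pointer stays NULL until the init routine runs, so the tick function checks it before touching any bits; a single word is assumed to be enough for all CPUs in this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define MAX_CPUS_SKETCH (sizeof(unsigned long) * CHAR_BIT)

/* Not allocated until watchdog_init_sketch() has run. */
static unsigned long *backtrace_mask_sketch = NULL;

int watchdog_init_sketch(void)
{
    if (backtrace_mask_sketch == NULL)
        backtrace_mask_sketch = calloc(1, sizeof(*backtrace_mask_sketch));
    return backtrace_mask_sketch != NULL ? 0 : -1;
}

void request_backtrace_sketch(unsigned int cpu)
{
    assert(cpu < MAX_CPUS_SKETCH && backtrace_mask_sketch != NULL);
    *backtrace_mask_sketch |= 1UL << cpu;
}

/* Returns 1 if a backtrace was pending for this CPU, 0 otherwise. */
int watchdog_tick_sketch(unsigned int cpu)
{
    assert(cpu < MAX_CPUS_SKETCH);
    /* The tick may arrive before init, so test the pointer first. */
    if (backtrace_mask_sketch != NULL &&
        (*backtrace_mask_sketch & (1UL << cpu))) {
        *backtrace_mask_sketch &= ~(1UL << cpu);
        return 1;
    }
    return 0;
}
```

A tick that arrives before initialisation simply reports nothing pending instead of dereferencing a NULL pointer.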