12 Mar, 2009

1 commit


11 Mar, 2009

1 commit


10 Mar, 2009

2 commits

  • * 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/davej/cpufreq:
    [CPUFREQ] Add p4-clockmod sysfs-ui removal to feature-removal schedule.
    Revert "[CPUFREQ] Disable sysfs ui for p4-clockmod."

    Linus Torvalds
     
  • This reverts commit e088e4c9cdb618675874becb91b2fd581ee707e6.

    Removing the sysfs interface for p4-clockmod was flagged as a
    regression in bug 12826.

    Course of action:
    - Find out the remaining causes of overheating, and fix them
    if possible. ACPI should be doing the right thing automatically.
    If it isn't, we need to fix that.
    - mark p4-clockmod ui as deprecated
    - try again with the removal in six months.

    It's not really feasible to printk about the deprecation, because
    it needs to happen at all the sysfs entry points, which means adding
    a lot of strcmp("p4-clockmod".. calls to the core, which.. bleuch.

    Signed-off-by: Dave Jones

    Dave Jones
     

09 Mar, 2009

3 commits

  • Impact: remove lots of lguest boot WARN_ON() when CONFIG_SPARSE_IRQ=y

    We now need to call irq_to_desc_alloc_cpu() before
    set_irq_chip_and_handler_name(), but we can't do that from init_IRQ (no
    kmalloc available).

    So do it as we use interrupts instead. Also means we only alloc for
    irqs we use, which was the intent of CONFIG_SPARSE_IRQ anyway.

    Signed-off-by: Rusty Russell
    Cc: Ingo Molnar

    Rusty Russell
     
  • Impact: fix lguest boot crash on modern Intel machines

    The code in early_init_intel does:

    if (c->x86 > 6 || (c->x86 == 6 && c->x86_model >= 0xd)) {
    u64 misc_enable;

    rdmsrl(MSR_IA32_MISC_ENABLE, misc_enable);

    And that rdmsr faults (not allowed from non-0 PL). We can get around
    this by mugging the family ID part of the cpuid. 5 seems like a good
    number.

    Of course, this is a hack (how very lguest!). We could just indicate
    that we don't support MSRs, or implement lguest_rdmst.

    Reported-by: Patrick McHardy
    Signed-off-by: Rusty Russell
    Tested-by: Patrick McHardy

    Rusty Russell
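
    The family-ID trick described above can be pictured with a small
    sketch. In CPUID leaf 1, bits 8-11 of EAX hold the base family. The
    helper names cpuid_family_sketch() and mug_family_sketch() are made up
    for this illustration and are not the lguest functions.

```c
#include <stdint.h>

/* The base family lives in bits 8..11 of CPUID.1:EAX. */
#define FAMILY_SHIFT 8
#define FAMILY_MASK  (0xfu << FAMILY_SHIFT)

/* Hypothetical helper: extract the base family field. */
static unsigned int cpuid_family_sketch(uint32_t eax)
{
    return (eax & FAMILY_MASK) >> FAMILY_SHIFT;
}

/* Hypothetical helper: overwrite the base family with `family`,
 * leaving model, stepping and the other fields untouched. */
static uint32_t mug_family_sketch(uint32_t eax, unsigned int family)
{
    return (eax & ~FAMILY_MASK) | ((uint32_t)(family & 0xfu) << FAMILY_SHIFT);
}
```

    With the family forced to 5, a guest that tests for family 6 or
    above would skip the rdmsrl() path quoted in the message.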
     
  • Impact: fix race+crash in mmiotrace

    The list manipulation in remove_kmmio_fault_pages() was broken. If more
    than one consecutive kmmio_fault_page was re-added during the grace
    period between unregister_kmmio_probe() and remove_kmmio_fault_pages(),
    the list manipulation failed to remove pages from the release list.

    After a second grace period the pages get into rcu_free_kmmio_fault_pages()
    and raise a BUG_ON() kernel crash.

    The list manipulation is fixed to properly remove pages from the release
    list.

    This bug has been present from the very beginning of mmiotrace in the
    mainline kernel. It was introduced in 0fd0e3da ("x86: mmiotrace full
    patch, preview 1").

    An urgent fix for Linus. Tested by Stuart (on 32-bit) and Pekka
    (on amd and intel 64-bit systems, nouveau and nvidia proprietary).

    Signed-off-by: Stuart Bennett
    Signed-off-by: Pekka Paalanen
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Stuart Bennett
     

06 Mar, 2009

2 commits

  • ds_write_config() can write the BTS as well as the PEBS part of
    the DS config. ds_request_pebs() passes the wrong qualifier, which
    results in the wrong configuration being written.

    Reported-by: Stephane Eranian
    Signed-off-by: Markus Metzger
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Markus Metzger
     
  • In case a ptraced task is reaped (while the tracer is still attached),
    ds_exit_thread() is called before ptrace_exit(). The latter will
    release the bts_tracer and remove the thread's ds_ctx.
    The former will WARN() if the context is not NULL.

    Oleg Nesterov submitted patches that move ptrace_exit() before
    exit_thread() and thus reverse the order of the above calls.

    Remove the bad warning. I will add it again when Oleg's changes are in.

    Signed-off-by: Markus Metzger
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Markus Metzger
     

05 Mar, 2009

4 commits

  • Dell XPS710 will hang on reboot. This is resolved by adding a quirk to
    set bios reboot.

    Signed-off-by: Leann Ogasawara
    Signed-off-by: Tim Gardner
    Cc: "manoj.iyer"
    Cc:
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Leann Ogasawara
     
  • Impact: fix math-emu related crash while using GDB/ptrace

    init_fpu() calls finit to initialize a task's xstate, while finit always
    works on the current task. If we use PTRACE_GETFPREGS on another
    process and both processes did not already use floating point, we get
    a null pointer exception in finit.

    This patch creates a new function finit_task that takes a task_struct
    parameter. finit becomes a wrapper that simply calls finit_task with
    current. On the plus side this avoids many calls to get_current which
    would each resolve to an inline assembler mov instruction.

    An empty finit_task has been added to i387.h to avoid linker errors in
    case the compiler still emits the call in init_fpu when
    CONFIG_MATH_EMULATION is not defined.

    The declaration of finit in i387.h has been removed as the remaining
    code using this function gets its prototype from fpu_proto.h.

    Signed-off-by: Daniel Glöckner
    Cc: Suresh Siddha
    Cc: "Pallipadi Venkatesh"
    Cc: Arjan van de Ven
    Cc: Bill Metzenthen
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Daniel Glöckner
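
    The finit/finit_task split described above follows a common pattern:
    the real work moves into a function that takes the task explicitly,
    and the old entry point becomes a thin wrapper. A minimal user-space
    sketch, in which struct task_sketch, current_sketch and both function
    names are stand-ins and not the kernel's definitions:

```c
/* Hypothetical stand-in for the kernel's task_struct. */
struct task_sketch {
    unsigned int fpu_control_word;
    int fpu_initialized;
};

/* Hypothetical stand-in for the kernel's `current` pointer. */
static struct task_sketch *current_sketch;

/* Initializes the FPU state of an explicitly named task. */
static void finit_task_sketch(struct task_sketch *tsk)
{
    tsk->fpu_control_word = 0x037f; /* x87 control word after FNINIT */
    tsk->fpu_initialized = 1;
}

/* The old entry point becomes a thin wrapper around the new one. */
static void finit_sketch(void)
{
    finit_task_sketch(current_sketch);
}
```

    A caller that inspects another process, as PTRACE_GETFPREGS does,
    would use the explicit variant and never touch the current task.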
     
  • Impact: Fix boot failure on EFI system with large runtime memory range

    Brian Maly reported that some EFI systems with a large runtime memory
    range cannot boot, because the FIX_MAP area used to map the runtime
    memory range is smaller than the range itself.

    This patch fixes the issue by re-implementing efi_ioremap() with
    init_memory_mapping().

    Reported-and-tested-by: Brian Maly
    Signed-off-by: Huang Ying
    Cc: Brian Maly
    Cc: Yinghai Lu
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Huang Ying
     
  • Impact: reactivate DMI quirks on EFI hardware

    DMI tables are loaded by EFI, so the dmi calls must happen after
    efi_init() and not before.

    Currently Apple hardware uses DMI to determine the framebuffer mappings
    for efifb. Without DMI working you also have no video on MacBook Pro.

    This patch resolves the DMI issue for EFI hardware (DMI is now properly
    detected at boot), and additionally efifb now loads on Apple hardware
    (i.e. video works).

    Signed-off-by: Brian Maly
    Acked-by: Yinghai Lu
    Cc: ying.huang@intel.com
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    arch/x86/kernel/setup.c | 5 +++--
    1 file changed, 3 insertions(+), 2 deletions(-)

    Brian Maly
     

04 Mar, 2009

2 commits


03 Mar, 2009

5 commits

  • Impact: fix stuck NMIs and non-working oprofile on certain CPUs

    Resetting the counter width of the performance counters on Intel's
    Core2 CPUs breaks the delivery of NMIs when running in x86_64 mode.

    This should fix bug #12395:

    http://bugzilla.kernel.org/show_bug.cgi?id=12395

    Signed-off-by: Tim Blechmann
    Signed-off-by: Robert Richter
    LKML-Reference:
    Cc:
    Signed-off-by: Ingo Molnar

    Tim Blechmann
     
  • Impact: fix failed EFI bootup in certain circumstances

    Ying Huang found that init_memory_mapping() has a problem with small
    ranges of less than 2M when he tried to direct-map the EFI runtime code
    out of max_low_pfn_mapped.

    It turns out we never considered that case and didn't check the range...

    Reported-by: Ying Huang
    Signed-off-by: Yinghai Lu
    Cc: Brian Maly
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Yinghai Lu
     
  • …git/tip/linux-2.6-tip

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    fix warning in io_mapping_map_wc()
    x86: i915 needs pgprot_writecombine() and is_io_mapping_possible()

    Linus Torvalds
     
  • On x86-64, a 32-bit process (TIF_IA32) can switch to 64-bit mode with
    ljmp, and then use the "syscall" instruction to make a 64-bit system
    call. A 64-bit process can make a 32-bit system call with int $0x80.

    In both these cases under CONFIG_SECCOMP=y, secure_computing() will use
    the wrong system call number table. The fix is simple: test TS_COMPAT
    instead of TIF_IA32. Here is an example exploit:

    /* test case for seccomp circumvention on x86-64

    There are two failure modes: compile with -m64 or compile with -m32.

    The -m64 case is the worst one, because it does "chmod 777 ." (could
    be any chmod call). The -m32 case demonstrates it was able to do
    stat(), which can glean information but not harm anything directly.

    A buggy kernel will let the test do something, print, and exit 1; a
    fixed kernel will make it exit with SIGKILL before it does anything.
    */

    #define _GNU_SOURCE
    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/prctl.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    int
    main (int argc, char **argv)
    {
    char buf[100];
    static const char dot[] = ".";
    long ret;
    unsigned st[24];

    if (prctl (PR_SET_SECCOMP, 1, 0, 0, 0) != 0)
    perror ("prctl(PR_SET_SECCOMP) -- not compiled into kernel?");

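    #ifdef __x86_64__
    /* 64-bit build: issue the 32-bit chmod syscall (15) through int $0x80 */
    assert ((uintptr_t) dot < (1UL << 32));
    asm ("int $0x80 # %0 <- %1(%2 %3)"
         : "=a" (ret) : "0" (15), "b" (dot), "c" (0777));
    ret = snprintf (buf, sizeof buf,
                    "result %ld (check mode on .!)\n", ret);
    #elif defined __i386__
    /* 32-bit build: ljmp to 64-bit mode, issue the 64-bit stat syscall (4) */
    asm (".code32\n"
         "pushl %%cs\n"
         "pushl $2f\n"
         "ljmpl $0x33, $1f\n"
         ".code64\n"
         "1: syscall # %0 <- %1(%2 %3)\n"
         "lretl\n"
         ".code32\n"
         "2:"
         : "=a" (ret) : "0" (4), "D" (dot), "S" (&st));
    if (ret == 0)
      ret = snprintf (buf, sizeof buf,
                      "stat . -> st_uid=%u\n", st[7]);
    else
      ret = snprintf (buf, sizeof buf, "result %ld\n", ret);
    #else
    # error "not this one"
    #endif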

    write (1, buf, ret);

    syscall (__NR_exit, 1);
    return 2;
    }

    Signed-off-by: Roland McGrath
    [ I don't know if anybody actually uses seccomp, but it's enabled in
    at least both Fedora and SuSE kernels, so maybe somebody is. - Linus ]
    Signed-off-by: Linus Torvalds

    Roland McGrath
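
    The core of the fix, choosing the syscall table from how this
    particular syscall was entered and not from the process personality,
    can be sketched in user-space C. The struct and function names are
    hypothetical; only the roles of TIF_IA32 and TS_COMPAT are taken
    from the message above.

```c
#include <stdbool.h>

/* Hypothetical per-thread state; the field names are illustrative. */
struct thread_sketch {
    bool task_is_32bit;   /* process personality, in the role of TIF_IA32 */
    bool in_compat_call;  /* entry path of this syscall, in the role of TS_COMPAT */
};

enum table_sketch { TABLE_64, TABLE_32 };

/* Buggy choice: trusts the process personality. */
static enum table_sketch pick_by_personality(const struct thread_sketch *t)
{
    return t->task_is_32bit ? TABLE_32 : TABLE_64;
}

/* Fixed choice: looks at how this particular syscall was entered. */
static enum table_sketch pick_by_entry(const struct thread_sketch *t)
{
    return t->in_compat_call ? TABLE_32 : TABLE_64;
}
```

    For a 64-bit process that issues int $0x80, the first function picks
    the 64-bit table while the second picks the 32-bit one, which is the
    table the syscall number actually refers to.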
     
  • On x86-64, a 32-bit process (TIF_IA32) can switch to 64-bit mode with
    ljmp, and then use the "syscall" instruction to make a 64-bit system
    call. A 64-bit process can make a 32-bit system call with int $0x80.

    In both these cases, audit_syscall_entry() will use the wrong system
    call number table and the wrong system call argument registers. This
    could be used to circumvent a syscall audit configuration that filters
    based on the syscall numbers or argument details.

    Signed-off-by: Roland McGrath
    Signed-off-by: Linus Torvalds

    Roland McGrath
     

02 Mar, 2009

7 commits

  • There was a theoretical possibility of a race between arming a page in
    post_kmmio_handler() and disarming the page in
    release_kmmio_fault_page():

    cpu0                             cpu1
    ------------------------------------------------------------------
    mmiotrace shutdown
    enter release_kmmio_fault_page
                                     fault on the page
                                     disarm the page
    disarm the page
                                     handle the MMIO access
                                     re-arm the page
    put the page on release list
    remove_kmmio_fault_pages()
                                     fault on the page
                                     page not known to mmiotrace
                                     fall back to do_page_fault()
                                     *KABOOM*

    (This scenario also shows the double disarm case which is allowed.)

    Fixed by acquiring kmmio_lock in post_kmmio_handler() and checking
    if the page is being released from mmiotrace.

    Signed-off-by: Pekka Paalanen
    Cc: Stuart Bennett
    Cc: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
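
    The shape of the fix, taking the lock in the post-fault path and
    refusing to re-arm a page that is already on its way out, can be
    sketched as follows. This is a simplified user-space model: the toy
    spinlock only stands in for kmmio_lock, and the struct and function
    names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy spinlock standing in for kmmio_lock. */
static atomic_flag lock_sketch = ATOMIC_FLAG_INIT;

static void sketch_lock(void)
{
    while (atomic_flag_test_and_set(&lock_sketch)) { /* spin */ }
}

static void sketch_unlock(void)
{
    atomic_flag_clear(&lock_sketch);
}

/* Hypothetical, simplified fault page. */
struct fault_page_sketch {
    bool armed;
    bool scheduled_for_release;
};

/* Shutdown path: disarm the page and mark it for release. */
static void release_page_sketch(struct fault_page_sketch *p)
{
    sketch_lock();
    p->armed = false;
    p->scheduled_for_release = true;
    sketch_unlock();
}

/* Post-fault path: re-arm only if the page is not being released. */
static void post_handler_sketch(struct fault_page_sketch *p)
{
    sketch_lock();
    if (!p->scheduled_for_release)
        p->armed = true;
    sketch_unlock();
}
```

    Because both paths hold the same lock, the post-fault path sees the
    release mark either fully set or not set at all, so a released page
    is never left armed.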
     
  • Upgrade some kmmio.c debug messages to warnings.
    Allow secondary faults on probed pages to fall through, and only log
    secondary faults that are not due to non-present pages.

    Patch edited by Pekka Paalanen.

    Signed-off-by: Stuart Bennett
    Signed-off-by: Pekka Paalanen
    Cc: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Stuart Bennett
     
  • From 36772dcb6ffbbb68254cbfc379a103acd2fbfefc Mon Sep 17 00:00:00 2001
    From: Pekka Paalanen
    Date: Sat, 28 Feb 2009 21:34:59 +0200

    Split set_page_presence() in kmmio.c into two more functions set_pmd_presence()
    and set_pte_presence(). Purely code reorganization, no functional changes.

    Signed-off-by: Pekka Paalanen
    Cc: Stuart Bennett
    Cc: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
     
  • From baa99e2b32449ec7bf147c234adfa444caecac8a Mon Sep 17 00:00:00 2001
    From: Pekka Paalanen
    Date: Sun, 22 Feb 2009 20:02:43 +0200

    Blindly setting _PAGE_PRESENT in disarm_kmmio_fault_page() overlooks the
    possibility that the page was not present when it was armed.

    Make arm_kmmio_fault_page() store the previous page presence in struct
    kmmio_fault_page and use it on disarm.

    This patch was originally written by Stuart Bennett, but Pekka Paalanen
    rewrote it a little differently.

    Signed-off-by: Pekka Paalanen
    Cc: Stuart Bennett
    Cc: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
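
    The save-and-restore idea described above can be sketched in a few
    lines. The types and names below are hypothetical simplifications;
    the real code works on page table entries, not on a bool.

```c
#include <stdbool.h>

/* Hypothetical model of a page whose presence bit can be toggled. */
struct page_sketch {
    bool present;
};

/* Hypothetical model of the tracking structure for one probed page. */
struct kfp_sketch {
    struct page_sketch *page;
    bool old_presence;  /* presence recorded at arm time */
    bool armed;
};

static void arm_sketch(struct kfp_sketch *f)
{
    f->old_presence = f->page->present;  /* remember the prior state */
    f->page->present = false;            /* force a fault on access */
    f->armed = true;
}

static void disarm_sketch(struct kfp_sketch *f)
{
    f->page->present = f->old_presence;  /* restore, do not assume present */
    f->armed = false;
}
```

    A page that was not present before arming stays not present after
    disarming, which blindly setting the bit would get wrong.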
     
  • Print a full warning once, if arming or disarming a page fails.

    Also, if initial arming fails, do not handle the page further. This
    avoids the possibility of a page failing to arm and then later claiming
    to have handled any fault on that page.

    WARN_ONCE added by Pekka Paalanen.

    Signed-off-by: Stuart Bennett
    Signed-off-by: Pekka Paalanen
    Cc: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Stuart Bennett
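
    A rough user-space sketch of the two ideas above: warn only once,
    and report the arming failure so the caller can stop handling the
    page. The kernel's WARN_ONCE keeps one flag per call site; this
    sketch uses a single flag, and all names in it are hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

/* Counts how many warnings were actually printed. */
static int warnings_printed_sketch;

/* Prints the message only the first time it is reached. */
static void warn_once_sketch(const char *msg)
{
    static bool warned;

    if (!warned) {
        warned = true;
        warnings_printed_sketch++;
        fprintf(stderr, "WARNING: %s\n", msg);
    }
}

/* Hypothetical arm routine: returns 0 on success, -1 on failure.
 * A caller that sees -1 should not track the page any further. */
static int arm_page_sketch(bool pte_found)
{
    if (!pte_found) {
        warn_once_sketch("failed to arm page");
        return -1;
    }
    return 0;
}
```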
     
  • Apparently pages far into an ioremapped region might not actually be
    mapped during ioremap(). Add an optional read test to try to trigger a
    multiply faulting MMIO access. Also add more messages to the kernel log
    to help debugging.

    This patch is based on a patch suggested by
    Stuart Bennett
    who discovered bugs in mmiotrace related to normal kernel space faults.

    Signed-off-by: Pekka Paalanen
    Cc: Stuart Bennett
    Cc: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
     
  • Check the read values against the written values in the MMIO read/write
    test. This test shows if the given MMIO test area really works as
    memory, which is a prerequisite for a successful mmiotrace test.

    Signed-off-by: Pekka Paalanen
    Cc: Stuart Bennett
    Cc: Steven Rostedt
    Signed-off-by: Ingo Molnar

    Pekka Paalanen
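
    A write-then-verify pass of the kind described above might look like
    the sketch below. It runs over ordinary memory through a volatile
    pointer; the byte pattern and the function name are assumptions made
    for illustration, not the values used by the mmiotrace test module.

```c
#include <stddef.h>
#include <stdint.h>

/* Writes a position-dependent pattern, then reads it back.
 * Returns the number of mismatching bytes. */
static size_t rw_test_sketch(volatile uint8_t *area, size_t len)
{
    size_t i, errs = 0;

    for (i = 0; i < len; i++)
        area[i] = (uint8_t)(i * 7 + 1);

    for (i = 0; i < len; i++)
        if (area[i] != (uint8_t)(i * 7 + 1))
            errs++;

    return errs;
}
```

    A nonzero return value would indicate that the test area does not
    behave like memory, so a trace taken over it would not be meaningful.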
     

28 Feb, 2009

1 commit


27 Feb, 2009

1 commit

  • Now that the obvious bugs have been worked out, specifically
    the iwlagn issue and the write buffer errata, DMAR should be safe
    to turn back on by default. We've had it on since those patches were
    first written a few weeks ago, without any noticeable bug reports;
    most reports have been due to the dma-api debug patchset.

    Signed-off-by: Kyle McMartin
    Acked-by: David Woodhouse
    Signed-off-by: Ingo Molnar

    Kyle McMartin
     

26 Feb, 2009

1 commit


25 Feb, 2009

2 commits

  • io_mapping_create_wc should take a resource_size_t parameter in place of
    unsigned long. With unsigned long, there is no way to map addresses above
    4GB on 32-bit i386.

    On x86, addresses above 4GB cannot be mapped on i386 without PAE. Return
    an error in that case.

    The patch also adds a structure for io_mapping that saves the base, size
    and type on HAVE_ATOMIC_IOMAP archs, which can be used to verify the
    offset on io_mapping_map calls.

    Signed-off-by: Venkatesh Pallipadi
    Signed-off-by: Suresh Siddha
    Cc: Dave Airlie
    Cc: Jesse Barnes
    Cc: Eric Anholt
    Cc: Keith Packard
    Signed-off-by: Ingo Molnar

    Venkatesh Pallipadi
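
    The range check implied above can be sketched as follows. Since a
    test machine may well have a 64-bit unsigned long, the sketch models
    the 32-bit case with uint32_t; the typedefs and the function name are
    hypothetical and only illustrate the "return an error" behaviour.

```c
#include <stdint.h>

typedef uint64_t resource_size_sketch_t;  /* models a 64-bit resource_size_t */
typedef uint32_t ulong32_sketch_t;        /* models a 32-bit unsigned long */

/* Returns 0 and stores the base if the whole range fits in 32 bits,
 * -1 otherwise. */
static int check_range_sketch(resource_size_sketch_t base,
                              resource_size_sketch_t size,
                              ulong32_sketch_t *out)
{
    if (size == 0 || base > UINT32_MAX || size - 1 > UINT32_MAX - base)
        return -1;

    *out = (ulong32_sketch_t)base;
    return 0;
}
```

    Taking the wider type at the interface lets the callee see the full
    address and reject it, where an unsigned long parameter would already
    have truncated it silently.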
     
  • This was changed to a physmap_t giving a clashing symbol redefinition,
    but actually using a physmap_t consumes rather a lot of space on x86,
    so stick with a private copy renamed with a voyager_ prefix and made
    static. Nothing outside of the Voyager code uses it, anyway.

    Signed-off-by: James Bottomley
    Signed-off-by: H. Peter Anvin

    James Bottomley
     

23 Feb, 2009

2 commits


22 Feb, 2009

2 commits

  • As acpi_enter_sleep_state can fail, take this into account in
    do_suspend_lowlevel and don't return to do_suspend_lowlevel's
    caller. Returning would currently break the fpu status and preempt
    count.

    Technically, this means using `call' instead of `jmp', followed by a
    `jmp' to the `resume_point' after the `call' (which is reached only if
    acpi_enter_sleep_state returns, i.e. fails). `resume_point' will handle
    the restore of fpu and preempt count gracefully.

    Signed-off-by: Jiri Slaby
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Len Brown

    Jiri Slaby
     
  • - remove %ds re-set, it's already set in wakeup_long64
    - remove double labels and alignment (ENTRY already adds both)
    - use meaningful resume point labelname
    - skip alignment while jumping from wakeup_long64 to the resume point
    - remove .size, .type and unused labels
    [v2]
    - added ENDPROCs

    Signed-off-by: Jiri Slaby
    Acked-by: Cyrill Gorcunov
    Acked-by: Pavel Machek
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Len Brown

    Jiri Slaby
     

21 Feb, 2009

1 commit

  • Impact: Bug fix on UP

    Checkin 6ec68bff3c81e776a455f6aca95c8c5f1d630198:
    x86, mce: reinitialize per cpu features on resume

    introduced a call to mce_cpu_features() in the resume path, in order
    for the MCE machinery to get properly reinitialized after a resume.
    However, this function (and its successors) was flagged __cpuinit,
    which becomes __init on UP configurations (on SMP suspend/resume
    requires CPU hotplug and so this would not be seen.)

    Remove the offending __cpuinit annotations for mce_cpu_features() and
    its successor functions.

    Cc: Andi Kleen
    Cc: Linus Torvalds
    Signed-off-by: H. Peter Anvin

    H. Peter Anvin
     

20 Feb, 2009

3 commits

  • Steven Rostedt found a bug where, in his modified kernel,
    ftrace was unable to modify the kernel text, due to the PMD
    itself having been marked read-only as well in
    split_large_page().

    The fix, suggested by Linus, is to not try to 'clone' the
    reference protection of a huge-page, but to use the standard
    (and permissive) page protection bits of KERNPG_TABLE.

    The 'cloning' makes sense for the ptes but it's a confused and
    incorrect concept at the page table level - because the
    pagetable entry is a set of all ptes and hence cannot
    'clone' any single protection attribute - the ptes can be any
    mixture of protections.

    With the permissive KERNPG_TABLE, even if the pte protections
    get changed after this point (due to ftrace doing code-patching
    or other similar activities like kprobes), the resulting combined
    protections will still be correct and the pte's restrictive
    (or permissive) protections will control it.

    Also update the comment.

    This bug was there for a long time but has not caused visible
    problems before as it needs a rather large read-only area to
    trigger. Steve possibly hacked his kernel with some really
    large arrays or so. Anyway, the bug is definitely worth fixing.

    [ Huang Ying also experienced problems in this area when writing
    the EFI code, but the real bug in split_large_page() was not
    realized back then. ]

    Reported-by: Steven Rostedt
    Reported-by: Huang Ying
    Acked-by: Linus Torvalds
    Signed-off-by: Ingo Molnar

    Ingo Molnar
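
    The reasoning above rests on how x86 combines permissions during a
    page walk: a write is allowed only if every level allows it. A tiny
    model with a hypothetical struct and function name:

```c
#include <stdbool.h>

/* Hypothetical model of one paging-structure entry. */
struct entry_sketch {
    bool writable;
};

/* A write is allowed only if every level of the walk allows it. */
static bool can_write_sketch(struct entry_sketch pmd, struct entry_sketch pte)
{
    return pmd.writable && pte.writable;
}
```

    With a read-only upper-level entry, making an individual pte writable
    has no effect, which matches the ftrace symptom. With a permissive
    upper-level entry, the pte alone decides.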
     
  • Impact: fix time warps under vmware

    Similar to the check for TSC going backwards in the TSC clocksource,
    we also need this check for VMI clocksource.

    Signed-off-by: Alok N Kataria
    Cc: Zachary Amsden
    Signed-off-by: Ingo Molnar
    Cc: stable@kernel.org

    Alok N Kataria
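
    The backwards check mentioned above amounts to never returning a
    value lower than the last one seen. A simplified sketch with
    hypothetical names; in the kernel the comparison is made against the
    clocksource's recorded last cycle value, which the timekeeping code
    updates, and not against a local static.

```c
#include <stdint.h>

/* Last cycle value handed out by this sketch. */
static uint64_t last_cycles_sketch;

/* Returns `raw`, unless it is lower than the last value handed out,
 * in which case the last value is returned again. */
static uint64_t read_clamped_sketch(uint64_t raw)
{
    if (raw < last_cycles_sketch)
        return last_cycles_sketch;

    last_cycles_sketch = raw;
    return raw;
}
```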
     
  • …git/tip/linux-2.6-tip

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, mce: fix ifdef for 64bit thermal apic vector clear on shutdown
    x86, mce: use force_sig_info to kill process in machine check
    x86, mce: reinitialize per cpu features on resume
    x86, rcu: fix strange load average and ksoftirqd behavior

    Linus Torvalds