31 Aug, 2009

9 commits


27 Aug, 2009

31 commits

  • Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • Move smp_read_mpc_oem from quirks to x86_init.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • The mpc_apic_id setup is handled by a x86_quirk. Make it a
    x86_init_ops function with a default implementation.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • 32bit and also the numaq code have special requirements on the
    ioapic_id setup. Convert it to a x86_init_ops function and get rid
    of the quirks and #ifdefs

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • The x86 quirkification introduced an extra ugly hackery with a
    variable pointer in the mpparse code. If the pointer is initialized
    then it is dereferenced and the variable set to 0 or incremented.

    Create a x86_init_ops function and let the affected numaq code
    hold the function. Default init is a setup noop.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • memory_setup is overridden by x86_quirks and by paravirts with weak
    functions and quirks. Unify the whole mess and make it an
    unconditional x86_init_ops function which defaults to the standard
    function and can be overridden by the early platform code.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • reserve_ebda_region needs to be called before start_kernel. Moorestown
    needs to override it. Make it a x86_init_ops function and initialize
    it with the default reserve_ebda_region.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • The 32bit and the 64bit code are slightly different in the reservation
    of standard resources. Also the upcoming Moorestown support needs its
    own version of that.

    Add it to x86_init_ops and initialize it with the 64bit default. 32bit
    overrides it in early boot. Now Moorestown can add its own override
    w/o sprinkling the code with more #ifdefs

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • probe_roms is only used on 32bit. Add it to the x86_init ops and
    remove the #ifdefs.

    Default initializer is x86_init_noop() which is overridden in
    the 32bit boot code.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • The upcoming Moorestown support brings the embedded world to x86. The
    setup code of x86 has already a couple of hooks which are either
    x86_quirks or paravirt ops. Some of those setup hooks are pretty
    convoluted like the timer setup and the tsc calibration code. But
    there are other places which could do with a cleanup.

    Instead of having inline functions/macros which are modified at
    compile time I decided to introduce x86_init ops which are
    unconditional in the code and make it clear that they can be changed
    either during compile time or in the early boot process. The function
    pointers are initialized by default functions which can be noops so
    that the pointer can be called unconditionally in most cases. This
    also allows us to remove 32bit/64bit, paravirt and other #ifdeffery.

    paravirt guests are just a hardware platform in the setup code, so we
    should treat them as such and not hide all behind multiple layers of
    indirection and compile time dependencies.

    It's more obvious that x86_init.timers.timer_init() is a function
    pointer than the late_time_init = choose_time_init() obscurity. It's
    also way simpler to grep for x86_init.timers.timer_init and find all
    the places which modify that function pointer instead of analyzing
    weak functions, macros and paravirt indirections.

    Note. This is not a general paravirt_ops replacement. It just will
    move setup related hooks which are potentially useful for other
    platform setup purposes as well out of the paravirt domain.

    Add the base infrastructure without any functionality.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
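
The pattern described in this commit message (a table of function pointers with default or no-op implementations, overridden by early platform code) can be sketched in user-space C. This is only an illustration under stated assumptions: `struct init_ops`, `init_table`, `platform_early_setup()` and the counters are hypothetical names, not the kernel's actual x86_init layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of an init-ops table; not the kernel's struct. */
struct init_timers_ops {
    void (*timer_init)(void);
};

struct init_ops {
    struct init_timers_ops timers;
    void (*probe_roms)(void);
};

static int timer_init_calls;   /* calls to the default timer hook */
static int platform_calls;     /* calls to the platform override */

/* Default hook that does nothing, so callers never need a NULL check. */
static void init_noop(void) { }

static void default_timer_init(void) { timer_init_calls++; }
static void platform_timer_init(void) { platform_calls++; }

/* The table is filled with defaults at compile time. */
static struct init_ops init_table = {
    .timers = { .timer_init = default_timer_init },
    .probe_roms = init_noop,
};

/* Early platform code replaces only the hooks it cares about. */
static void platform_early_setup(struct init_ops *ops)
{
    assert(ops != NULL);
    ops->timers.timer_init = platform_timer_init;
}

/* Generic setup code calls every hook unconditionally, without #ifdefs. */
static void generic_setup(const struct init_ops *ops)
{
    ops->probe_roms();
    ops->timers.timer_init();
}
```

Because every override is a plain assignment to a named member, searching for that member name finds all places that change the hook.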
     
  • Reason: The tsc init cleanup depends on sched_clock_init moving past
    late_time_init.

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • Reason: The setup cleanups conflict with the paravirt cleanups. Avoid
    a rather large merge conflict

    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • Some architectures initialize clocks and timers in late_time_init and
    x86 wants to do the same to avoid FIXMAP hackery for calibrating the
    TSC. That would result in undefined sched_clock readout and wreckaged
    printk timestamps again. We probably have those already on archs which
    do all their time/clock setup in late_time_init.

    There is no harm in moving that after late_time_init except that a few
    more boot timestamps are stale. The scheduler is not active at that
    point so no real wreckage is expected.

    Signed-off-by: Thomas Gleixner
    Cc: linux-arch@vger.kernel.org

    Thomas Gleixner
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
    virtio: net refill on out-of-memory
    smc91x: fix compilation on SMP

    Linus Torvalds
     
  • * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
    powerpc/ps3: Update ps3_defconfig
    powerpc/ps3: Add missing check for PS3 to rtc-ps3 platform device registration

    Linus Torvalds
     
  • Update ps3_defconfig.

    o Refresh for 2.6.31.
    o Remove MTD support.
    o Add more HID drivers.

    Signed-off-by: Geoff Levand
    Signed-off-by: Benjamin Herrenschmidt

    Geoff Levand
     
  • On non-PS3, we get:

    | kernel BUG at drivers/rtc/rtc-ps3.c:36!

    because the rtc-ps3 platform device is registered unconditionally in a kernel
    with builtin support for PS3.

    Reported-by: Sachin Sant
    Signed-off-by: Geert Uytterhoeven
    Acked-by: Geoff Levand
    Signed-off-by: Benjamin Herrenschmidt

    Geert Uytterhoeven
     
  • …s/security-testing-2.6

    * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/security-testing-2.6:
    IMA: iint put in ima_counts_get and put

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
    m68k,m68knommu: Wire up rt_tgsigqueueinfo and perf_counter_open
    m68k: Fix redefinition of pgprot_noncached
    arch/m68k/include/asm/motorola_pgalloc.h: fix kunmap arg
    m68k: cnt reaches -1, not 0
    m68k: count can reach 51, not 50

    Linus Torvalds
     
  • If we change the inverted attribute to another value, the LED will not be
    inverted until we change the GPIO state.

    Signed-off-by: Thadeu Lima de Souza Cascardo
    Cc: Samuel R. C. Vale
    Cc: Richard Purdie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thadeu Lima de Souza Cascardo
     
  • When setting the same GPIO number, multiple IRQ shared requests will be
    done without freeing the previous request. It will also try to free a
    failed request or an already freed IRQ if 0 was written to the gpio file.

    All these oopses and leaks were fixed with the following solution: keep
    the previously allocated GPIO (if any) still allocated in case the new
    request fails. The alternative solution would deallocate the previously
    allocated GPIO and set gpio to 0.

    Signed-off-by: Thadeu Lima de Souza Cascardo
    Signed-off-by: Samuel R. C. Vale
    Cc: Richard Purdie
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Thadeu Lima de Souza Cascardo
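
A minimal sketch of the "keep the previous allocation if the new request fails" approach described above. The names `struct trigger`, `fake_request()` and `fake_free()` are hypothetical stand-ins for the real GPIO and IRQ calls, and the rule that numbers above 100 fail exists only to exercise the error path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical resource model; request/free stand in for GPIO and IRQ calls. */
struct trigger {
    unsigned int gpio;   /* 0 means "nothing allocated" */
    int live_requests;   /* outstanding requests, to expose leaks */
};

/* Assumption for the demo: numbers above 100 cannot be requested. */
static int fake_request(struct trigger *t, unsigned int gpio)
{
    if (gpio > 100)
        return -1;
    t->live_requests++;
    return 0;
}

static void fake_free(struct trigger *t)
{
    t->live_requests--;
}

static int trigger_set_gpio(struct trigger *t, unsigned int gpio)
{
    assert(t != NULL);
    if (gpio == t->gpio)
        return 0;          /* same number: no second request */
    if (gpio != 0 && fake_request(t, gpio) != 0)
        return -1;         /* failure: the previous allocation is kept */
    if (t->gpio != 0)
        fake_free(t);      /* release the old one only after success */
    t->gpio = gpio;
    return 0;
}
```

Writing 0 when nothing is allocated takes the early return, so nothing is freed twice.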
     
  • This failure is very common on many platforms. Handling it in the ACPI
    processor driver is enough, and we don't need a warning message unless
    CONFIG_ACPI_DEBUG is set.

    Based on a patch from Zhang Rui.

    Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13389

    Signed-off-by: Frans Pop
    Acked-by: Zhang Rui
    Cc: Len Brown
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Frans Pop
     
  • If the BIOS reports an invalid throttling state (which seems to be
    fairly common after system boot), a reset is done to state T0.
    Because of a check in acpi_processor_get_throttling_ptc(), the reset
    never actually gets executed, which results in the error reoccurring
    on every access of for example /proc/acpi/processor/CPU0/throttling.

    Add a 'force' option to acpi_processor_set_throttling() to ensure
    the reset really takes effect.

    Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13389

    This patch, together with the next one, fixes a regression introduced in
    2.6.30, listed on the regression list. They have been available for 2.5
    months now in bugzilla, but have not been picked up, despite various
    reminders and without any reason given.

    Google shows that numerous people are hitting this issue. The issue is in
    itself relatively minor, but the bug in the code is clear.

    The patches have been in all my kernels and today testing has shown that
    throttling works correctly with the patches applied when the system
    overheats (http://bugzilla.kernel.org/show_bug.cgi?id=13918#c14).

    Signed-off-by: Frans Pop
    Acked-by: Zhang Rui
    Cc: Len Brown
    Cc: "Rafael J. Wysocki"
    Cc: Rusty Russell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Frans Pop
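
The effect of a 'force' option can be illustrated with a small model. This is a sketch under assumptions: `struct throttling`, `set_throttling()` and `get_throttling()` are hypothetical, and where the skipped check sits in the real driver is not modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a processor's throttling state; not the driver's types. */
struct throttling {
    int state;       /* state the driver believes is active */
    int hw_writes;   /* number of times the hardware was programmed */
};

/*
 * Program a throttling state. Without 'force', a request for the state the
 * driver already records is skipped; with 'force', it is always written.
 */
static int set_throttling(struct throttling *t, int state, bool force)
{
    assert(t != NULL);
    if (!force && state == t->state)
        return 0;
    t->hw_writes++;
    t->state = state;
    return 0;
}

/* Recover from an invalid value reported by firmware by resetting to T0. */
static int get_throttling(struct throttling *t, int reported, int nr_states)
{
    if (reported < 0 || reported >= nr_states) {
        t->state = 0;
        /* Without force, the "already in this state" shortcut would skip the write. */
        return set_throttling(t, 0, true);
    }
    t->state = reported;
    return 0;
}
```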
     
  • Summary:
    A kernel panic arises when stack protection is enabled, because strncat
    adds a terminating null byte '\0'. So in functions
    like this one (wmi_query_block):
    char wc[4]="WC";
    ....
    strncat(method, block->object_id, 2);
    ...
    the length of wc should be n+1 (wc[5]) or a stack protection
    fault will arise. This is not noticeable when stack protection is
    disabled, but it isn't good either.
    Config used: [CONFIG_CC_STACKPROTECTOR_ALL=y,
    CONFIG_CC_STACKPROTECTOR=y]

    Panic Trace
    ------------
    .... stack-protector: kernel stack corrupted in : fa7b182c
    2.6.30-rc8-obelisco-generic
    call_trace:
    [] ? panic+0x45/0xd9
    [] ? __stack_chk_fail+0x1c/0x40
    [] ? wmi_query_block+0x15a/0x162 [wmi]
    [] ? wmi_query_block+0x15a/0x162 [wmi]
    [] ? acer_wmi_init+0x00/0x61a [acer_wmi]
    [] ? acer_wmi_init+0x135/0x61a [acer_wmi]
    [] ? do_one_initcall+0x50+0x126

    Addresses http://bugzilla.kernel.org/show_bug.cgi?id=13514

    Signed-off-by: Costantino Leandro
    Signed-off-by: Carlos Corbacho
    Cc: Len Brown
    Cc: Bjorn Helgaas
    Cc: "Rafael J. Wysocki"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Costantino Leandro
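
The sizing rule behind this fix is that strncat(dst, src, n) may write up to n bytes plus one terminating NUL after the existing string. A minimal sketch of a bounds-checked append follows; `append_bounded()` is a hypothetical helper for illustration, not part of the wmi driver.

```c
#include <assert.h>
#include <string.h>

/*
 * Append up to 'n' bytes of 'suffix' to 'dst', refusing if the result plus
 * the terminating NUL would not fit in 'dst_size' bytes.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int append_bounded(char *dst, size_t dst_size, const char *suffix, size_t n)
{
    size_t have;
    size_t add = 0;

    assert(dst != NULL && suffix != NULL);
    have = strlen(dst);

    /* Count how many bytes strncat would actually copy. */
    while (add < n && suffix[add] != '\0')
        add++;

    /* strncat always writes one extra byte for the NUL terminator. */
    if (have + add + 1 > dst_size)
        return -1;

    strncat(dst, suffix, n);
    return 0;
}
```

With a 4-byte buffer holding "WC", appending two bytes needs 5 bytes in total, so the helper refuses; a 5-byte buffer succeeds.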
     
  • Jens reported early_ioremap messages with old ASUS board...

    > [ 1.507461] pci 0000:00:09.0: Firmware left e100 interrupts enabled; disabling
    > [ 1.532778] early_ioremap(3fffd080, 0000005c) [0] => Pid: 1, comm: swapper Not tainted 2.6.31-rc4 #36
    > [ 1.561007] Call Trace:
    > [ 1.568638] [] ? printk+0x18/0x1d
    > [ 1.581734] [] __early_ioremap+0x74/0x1e9
    > [ 1.596898] [] early_ioremap+0x1a/0x1c
    > [ 1.611270] [] __acpi_map_table+0x18/0x1a
    > [ 1.626451] [] acpi_os_map_memory+0x1d/0x25
    > [ 1.642129] [] acpi_tb_verify_table+0x20/0x49
    > [ 1.658321] [] acpi_get_table_with_size+0x53/0xa1
    > [ 1.675553] [] acpi_get_table+0x10/0x15
    > [ 1.690192] [] acpi_processor_init+0x23/0xab
    > [ 1.706126] [] do_one_initcall+0x33/0x180
    > [ 1.721279] [] ? acpi_processor_init+0x0/0xab
    > [ 1.737479] [] ? register_irq_proc+0xaa/0xc0
    > [ 1.753411] [] ? init_irq_proc+0x67/0x80
    > [ 1.768316] [] kernel_init+0x120/0x176
    > [ 1.782678] [] ? kernel_init+0x0/0x176
    > [ 1.797062] [] kernel_thread_helper+0x7/0x10
    > [ 1.812984] 00000080 + ffe00000

    that is rather later.
    acpi_gbl_permanent_mmap should be set in acpi_early_init()
    if acpi is not disabled

    and we have
    > [ 0.000000] ASUS P2B-DS detected: force use of acpi=ht

    just don't load acpi_processor_init...

    Reported-and-tested-by: Jens Rosenboom
    Signed-off-by: Yinghai Lu
    Acked-by: Ingo Molnar
    Cc: Len Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yinghai Lu
     
  • The return value of the get_temp function is not checked when doing a
    thermal zone update. This may lead to a critical shutdown if get_temp
    fails and the content of the temp variable is incorrectly set higher than
    the critical trip point.

    This has been observed on a system with incorrect ACPI implementation
    where the corresponding methods were not serialized and therefore
    sometimes triggered ACPI errors (AE_ALREADY_EXISTS). The following
    critical shutdowns indicated a temperature of 2097 C, which was obviously
    wrong.

    The patch adds a return value check that jumps over all trip point
    evaluations printing a warning if get_temp fails. The trip points are
    evaluated again on the next polling interval with successful get_temp
    execution.

    Signed-off-by: Michael Brunner
    Acked-by: Zhang Rui
    Cc: Len Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michael Brunner
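
The return value check described above can be sketched as follows. `struct zone`, `zone_update()` and the two callbacks are hypothetical names chosen for illustration; the real thermal code has more trip point types and a different structure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical thermal zone model; names are illustrative, not the kernel API. */
struct zone {
    int (*get_temp)(struct zone *z, long *temp); /* 0 on success, negative on error */
    long critical;   /* critical trip point */
    int shutdowns;   /* critical shutdowns triggered so far */
    int warnings;    /* failed temperature reads so far */
};

static void zone_update(struct zone *z)
{
    long temp;   /* not written when get_temp fails */

    assert(z != NULL && z->get_temp != NULL);
    if (z->get_temp(z, &temp) != 0) {
        z->warnings++;
        return;  /* skip all trip point evaluation until the next poll */
    }
    if (temp >= z->critical)
        z->shutdowns++;
}

/* Callback that fails and leaves *temp untouched. */
static int failing_get_temp(struct zone *z, long *temp)
{
    (void)z;
    (void)temp;
    return -1;
}

/* Callback that reports a value above the critical trip point. */
static int hot_get_temp(struct zone *z, long *temp)
{
    (void)z;
    *temp = 120000;
    return 0;
}
```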
     
  • Spotted by Hiroshi Shimamoto who also provided the test-case below.

    copy_process() uses signal->count as a reference counter, but it is not.
    This test case

    #include <stdio.h>
    #include <unistd.h>
    #include <pthread.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <errno.h>

    /* Thread that sleeps forever. */
    void *null_thread(void *p)
    {
        for (;;)
            sleep(1);

        return NULL;
    }

    /* Thread that calls exec while its siblings are still being created. */
    void *exec_thread(void *p)
    {
        execl("/bin/true", "/bin/true", NULL);

        return null_thread(p);
    }

    int main(int argc, char **argv)
    {
        for (;;) {
            pid_t pid;
            int ret, status;

            pid = fork();
            if (pid < 0)
                break;

            if (!pid) {
                pthread_t tid;

                pthread_create(&tid, NULL, exec_thread, NULL);
                for (;;)
                    pthread_create(&tid, NULL, null_thread, NULL);
            }

            do {
                ret = waitpid(pid, &status, 0);
            } while (ret == -1 && errno == EINTR);
        }

        return 0;
    }

    quickly creates an unkillable task.

    If copy_process(CLONE_THREAD) races with de_thread()
    copy_signal()->atomic(signal->count) breaks the signal->notify_count
    logic, and the execing thread can hang forever in kernel space.

    Change copy_process() to increment count/live only when we know for sure
    we can't fail. In this case the forked thread will take care of its
    reference to signal correctly.

    If copy_process() fails, check CLONE_THREAD flag. If it is set, do
    nothing, the counters were not changed and current belongs to the same
    thread group. If it is not set, ->signal must be released in any case
    (and ->count must be == 1), the forked child is the only thread in the
    thread group.

    We need more cleanups here, in particular signal->count should not be used
    by de_thread/__exit_signal at all. This patch only fixes the bug.

    Reported-by: Hiroshi Shimamoto
    Tested-by: Hiroshi Shimamoto
    Signed-off-by: Oleg Nesterov
    Acked-by: Roland McGrath
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
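
The ordering rule from this fix (touch the shared counters only once nothing can fail any more) can be shown with a single-threaded model. `struct group`, `fallible_setup()` and `add_thread()` are hypothetical; the sketch does not reproduce the race with de_thread(), only the "no rollback needed" property.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared per-thread-group counters. */
struct group {
    int count;
    int live;
};

/* Hypothetical stand-in for all the steps of a fork that may fail. */
static int fallible_setup(bool should_fail)
{
    return should_fail ? -1 : 0;
}

static int add_thread(struct group *g, bool should_fail)
{
    assert(g != NULL);
    if (fallible_setup(should_fail) != 0)
        return -1;   /* counters untouched, nothing to undo */

    /* Past the point of no return: only now publish the new thread. */
    g->count++;
    g->live++;
    return 0;
}
```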
     
  • An mlocked page might lose the isolation race. This causes the page to
    clear PG_mlocked while it remains in a VM_LOCKED vma. This means it can
    be put onto the [in]active list. We can rescue it by using try_to_unmap()
    in shrink_page_list().

    But now, as Wu Fengguang pointed out, vmscan has a bug. If the page has
    PG_referenced, it can't reach try_to_unmap() in shrink_page_list() but is
    put into the active list. If the page is referenced repeatedly, it can
    remain on the [in]active list without being moved to the unevictable
    list.

    This patch fixes it.

    Reported-by: Wu Fengguang
    Signed-off-by: Minchan Kim
    Reviewed-by: KOSAKI Motohiro
    Cc: Lee Schermerhorn
    Acked-by: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
  • It's problematic to allow signed element_nr's or total's to be passed as
    part of the flex array API.

    flex_array_alloc() allows total_nr_elements to be set to a negative
    quantity, which is obviously erroneous.

    flex_array_get() and flex_array_put() allow negative array indices in
    dereferencing an array part, which could address memory mapped before
    struct flex_array.

    The fix is to convert all existing element_nr formals to be qualified as
    unsigned. Existing checks to compare it to total_nr_elements or the max
    array size based on element_size need not be changed.

    Signed-off-by: David Rientjes
    Cc: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
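
Why an unsigned index makes the existing upper-bound check sufficient can be shown in a few lines. `accepts_signed()` and `get_element()` are hypothetical helpers, not the flex array API: a negative int passes a signed "less than total" test, while the same value converted to unsigned becomes very large and is rejected.

```c
#include <assert.h>
#include <stddef.h>

/* Signed check: returns 1 if the index would be accepted, 0 otherwise. */
static int accepts_signed(int element_nr, int total)
{
    return element_nr < total;   /* -1 < total is true, so -1 slips through */
}

/* Unsigned index: one upper-bound comparison also rejects "negative" values. */
static int *get_element(int *storage, unsigned int element_nr, unsigned int total)
{
    assert(storage != NULL);
    if (element_nr >= total)
        return NULL;
    return &storage[element_nr];
}
```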
     
  • The `parts' member of struct flex_array should evaluate to an incomplete
    type so that sizeof() cannot be used and C99 does not require the
    zero-length specification.

    Signed-off-by: David Rientjes
    Acked-by: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
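
A short sketch of a C99 flexible array member, the construct this change relies on. `struct header`, `struct part` and `header_alloc()` are hypothetical and do not mirror the real struct flex_array layout; the point is that the trailing member has an incomplete type, so sizeof cannot be applied to it and the allocation size must be computed explicitly.

```c
#include <assert.h>
#include <stdlib.h>

struct part {
    int data[4];
};

/* Hypothetical header with a C99 flexible array member as its last field. */
struct header {
    unsigned int nr_parts;
    struct part *parts[];   /* incomplete type: sizeof(h->parts) does not compile */
};

/* Allocate the header together with room for 'nr_parts' pointers. */
static struct header *header_alloc(unsigned int nr_parts)
{
    struct header *h;

    assert(nr_parts > 0);
    h = calloc(1, sizeof(*h) + nr_parts * sizeof(h->parts[0]));
    if (h != NULL)
        h->nr_parts = nr_parts;
    return h;
}
```

The caller releases the whole object with a single free().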