21 Jun, 2008

21 commits

  • With the SCSI disk driver built in, the final link fails with the
    following error:
    `.exit.text' referenced in section `.rodata' of drivers/built-in.o:
    defined in discarded section `.exit.text' of drivers/built-in.o

    This happens with -Os (CONFIG_CC_OPTIMIZE_FOR_SIZE=y) with all gcc-4
    versions, and also with -O2 and gcc-4.3.

    The problem is in sd.c:sd_major() being inlined into __exit function
    exit_sd(), and the compiler generating a jump table in .rodata section
    for the 'switch' statement in sd_major(). So we have references to
    a discarded section.

    Fixed with a big hammer in the form of -fno-jump-tables.

    Note that jump tables vs. discarded sections is a generic problem,
    other architectures are just lucky not to suffer from it. But with
    a slightly more complex switch/case statement it can be reproduced
    on x86 as well. So maybe at some point we should consider
    -fno-jump-tables as a generic compile option...

    Signed-off-by: Ivan Kokshaysky
    Signed-off-by: Linus Torvalds

    Ivan Kokshaysky
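
    As an illustration of the jump-table issue described above, here is a
    minimal sketch of a dense 'switch' next to an equivalent range check.
    The function names and the reduced set of cases are assumptions made
    for this example; this is not the code from sd.c. The values 8 and 65
    stand in for the first two SCSI disk major numbers.

```c
/* Hypothetical stand-ins for SCSI disk major numbers. */
enum { DISK0_MAJOR = 8, DISK1_MAJOR = 65 };

/*
 * Dense switch: with enough consecutive cases, a compiler is free to
 * lower this to an indexed jump (or load) through a table in .rodata.
 * If the function is inlined into an __exit function, that table can
 * end up referring to code in a section the linker discards.
 */
int major_switch(int idx)
{
	switch (idx) {
	case 0: return DISK0_MAJOR;
	case 1: return DISK1_MAJOR;
	case 2: return DISK1_MAJOR + 1;
	case 3: return DISK1_MAJOR + 2;
	case 4: return DISK1_MAJOR + 3;
	case 5: return DISK1_MAJOR + 4;
	default: return -1;
	}
}

/* Same mapping written as range checks: nothing here needs a table. */
int major_ranges(int idx)
{
	if (idx == 0)
		return DISK0_MAJOR;
	if (idx >= 1 && idx <= 5)
		return DISK1_MAJOR + idx - 1;
	return -1;
}
```

    Whether a table is actually emitted depends on the compiler version,
    target and optimization level; -fno-jump-tables tells GCC not to
    generate one regardless of how the switch is written.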
     
  • To calculate addresses of locally defined variables, GCC uses a 32-bit
    displacement from the GP. This doesn't work for per cpu variables in
    modules, as the offset to the kernel per cpu area is way above 4G.

    The workaround is to force allocation of a GOT entry for each per cpu
    variable using an ldq instruction with a 'literal' relocation.
    I had to use a custom asm/percpu.h, as the required argument magic doesn't
    work with the asm-generic/percpu.h macros.

    Signed-off-by: Ivan Kokshaysky
    Signed-off-by: Linus Torvalds

    Ivan Kokshaysky
     
  • Linus Torvalds
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
    BAST: Remove old IDE driver
    pcmcia ide kingston compactflash's have a new manufacturer id
    pcmcia: add another pata/ide ID
    pcmcia: add an pata/ide ID
    ide: increase timeout in wait_drive_not_busy()
    palm_bk3710: fix resource management

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394-2.6:
    ieee1394: Kconfig menu touch-up
    firewire: Kconfig menu touch-up
    firewire: deadline for PHY config transmission
    firewire: fw-ohci: unify printk prefixes
    firewire: fill_bus_reset_event needs lock protection
    firewire: fw-ohci: write selfIDBufferPtr before LinkControl.rcvSelfID
    firewire: fw-ohci: disable PHY packet reception into AR context
    firewire: fw-ohci: use of uninitialized data in AR handler
    firewire: don't panic on invalid AR request buffer

    Linus Torvalds
     
  • * 'release' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux-acpi-2.6:
    ACPI: no AC status notification
    ACPI Exception (video-1721): UNKNOWN_STATUS_CODE, Cant attach device

    Linus Torvalds
     
  • * 'drm-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6: (21 commits)
    drm: only trust core drm ioctls - driver ioctls are a mess.
    drm/i915: add support for Intel series 4 chipsets.
    drm/radeon: add hier-z registers for r300 and r500 chipsets
    drm/radeon: use DSTCACHE_CTLSTAT rather than RB2D_DSTCACHE_CTLSTAT
    drm/radeon: switch IGP gart to use radeon_write_agp_base()
    drm/radeon: Restore sw interrupt on resume
    drm/r500: add support for AGP based cards.
    drm/radeon: fix texture uploads with large 3d textures (bug 13980)
    drm/radeon: add initial r500 support.
    drm/radeon: init pipe setup in kernel code.
    drm/radeon: fixup radeon_do_engine_reset
    drm/radeon: fix pixcache and purge/cache flushing registers
    drm/radeon: write AGP_BASE_2 on chips that support it.
    drm/radeon: merge IGP chip setup and fixup RS400 vs RS480 support
    drm/radeon: IGP clean up register and magic numbers.
    drm/rs690: set base 2 to 0.
    drm/rs690: set all of gart base address.
    radeon: add production microcode from AMD
    drm: pcigart use proper pci map interfaces.
    drm: the sg alloc ioctl should write back the handle to userspace
    ...

    Linus Torvalds
     
  • * 'agp-patches' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/agp-2.6:
    [agp]: fixup chipset flush for new Intel G4x.
    agp: brown paper bag patch - put back the two lines it took out.

    Linus Torvalds
     
  • …/git/tip/linux-2.6-tip

    * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    softlockup: fix NMI hangs due to lock race - 2.6.26-rc regression
    rcupreempt: remove export of rcu_batches_completed_bh
    cpuset: limit the input of cpuset.sched_relax_domain_level

    Linus Torvalds
     
  • …l/git/tip/linux-2.6-tip

    * 'sched-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    sched, delay accounting: fix incorrect delay time when constantly waiting on runqueue
    sched: CPU hotplug events must not destroy scheduler domains created by the cpusets
    sched: rt-group: fix RR buglet
    sched: rt-group: heirarchy aware throttle
    sched: rt-group: fix hierarchy
    sched: NULL pointer dereference while setting sched_rt_period_us
    sched: fix defined-but-unused warning

    Linus Torvalds
     
  • …git/tip/linux-2.6-tip

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86, geode: add a VSA2 ID for General Software
    x86: use BOOTMEM_EXCLUSIVE on 32-bit
    x86, 32-bit: fix boot failure on TSC-less processors
    x86: fix NULL pointer deref in __switch_to
    x86: set PAE PHYSICAL_MASK_SHIFT to 44 bits.

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/cooloney/blackfin-2.6:
    Blackfin Serial Driver: Use timer to poll CTS PIN instead of workqueue.
    Blackfin arch: fix typo error in bf548 serial header file

    Linus Torvalds
     
  • * 'upstream-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev:
    ahci: sis can't do PMP
    ata_piix: add TECRA M4 to broken suspend list
    LIBATA: Add HAVE_PATA_PLATFORM to select PATA_PLATFORM driver
    sata_mv: warn on PIO with multiple DRQs
    sata_mv: enable async_notify for 60x1 Rev.C0 and higher
    libata: don't check whether to use DMA or not for no data commands
    ahci: jmb361 has only one port

    Linus Torvalds
     
  • The inline assembly in drivers/watchdog/hpwdt.c was incredibly broken,
    and included all the function prologue and epilogue stuff, even though
    it was itself then inside a C function where the compiler would add its
    own prologue and epilogue on top of it all.

    This then just _happened_ to work if you had exactly the right compiler
    version and exactly the right compiler flags, so that gcc just happened
    to not create any prologue at all (the gcc-generated epilogue wouldn't
    matter, since it would never be reached).

    But the more proper way to fix it is to simply not do this. Move the
    inline asm to the top level, with no surrounding function at all (the
    better alternative would be to remove the prologue and make it actually
    use proper description of the arguments to the inline asm, but that's a
    bigger change than the one I'm willing to make right now).

    Tested-by: S.Çağlar Onur
    Acked-by: Thomas Mingarelli
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Remove the old BAST IDE driver, as we are now using the platform-pata
    support.

    Signed-off-by: Ben Dooks
    Cc: Jeff Garzik
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Ben Dooks
     
  • Up to now, Kingston compactflash cards (ab)used the Toshiba Manufacturer's ID.
    In their new CF cards, they use a new one. Let the ide subsystem
    recognize CF cards with the new id.

    Signed-off-by: Christophe Niclaes
    Acked-by: Philippe De Muyter
    Cc: Alan Cox
    Cc: Dominik Brodowski
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Christophe Niclaes
     
  • Addition of Transcend 1GB 45x id so that it is properly detected.

    [bart: fix typo in ide-cs's ID spotted by Alan Cox]

    Signed-off-by: William Peters
    Signed-off-by: Kristoffer Ericson
    CC: Alan Cox
    CC: linux-ide@vger.kernel.org
    Signed-off-by: Dominik Brodowski
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Kristoffer Ericson
     
  • Add an id for:

    product info: "M-Systems", "CF300", ""
    manfid: 0x000a, 0x0000
    function: 4 (fixed disk)

    Signed-off-by: Matt Reimer
    CC: Alan Cox
    CC: linux-ide@vger.kernel.org
    Signed-off-by: Dominik Brodowski
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Matt Reimer
     
  • Some ATAPI devices take longer than the current max timeout value to
    become ready (e.g. TEAC DV-W28ECW takes 6 ms), so increase the timeout
    value to 10 ms.

    This fixes kernel.org bugzilla bug #10887:
    http://bugzilla.kernel.org/show_bug.cgi?id=10887

    Reported-by: Masanari Iida
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • The driver expected a *virtual* address in the IDE platform device's memory
    resource and didn't request the memory region for the register block. Fix this
    taking into account the fact that DaVinci SoC devices are fixed-mapped to the
    virtual memory early and we can get their virtual addresses using IO_ADDRESS()
    macro, not having to call ioremap()...

    While at it, also do some cosmetic changes...

    Signed-off-by: Sergei Shtylyov
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Sergei Shtylyov
     
  • KAMEZAWA Hiroyuki and Oleg Nesterov point out that since the commit
    557ed1fa2620dc119adb86b34c614e152a629a80 ("remove ZERO_PAGE") removed
    the ZERO_PAGE from the VM mappings, any users of get_user_pages() will
    generally now populate the VM with real empty pages needlessly.

    We used to get the ZERO_PAGE when we did the "handle_mm_fault()", but
    since fault handling no longer uses ZERO_PAGE for new anonymous pages,
    we now need to handle that special case in follow_page() instead.

    In particular, the removal of ZERO_PAGE effectively removed the core
    file writing optimization where we would skip writing pages that had not
    been populated at all, and increased memory pressure a lot by allocating
    all those useless newly zeroed pages.

    This reinstates the optimization by making the unmapped PTE case the
    same as for a non-existent page table, which already did this correctly.

    While at it, this also fixes the XIP case for follow_page(), where the
    caller could not differentiate between the case of a page that simply
    could not be used (because it had no "struct page" associated with it)
    and a page that just wasn't mapped.

    We do that by simply returning an error pointer for pages that could not
    be turned into a "struct page *". The error is arbitrarily picked to be
    EFAULT, since that was what get_user_pages() already used for the
    equivalent IO-mapped page case.

    [ Also removed an impossible test for pte_offset_map_lock() failing:
    that's not how that function works ]

    Acked-by: Oleg Nesterov
    Acked-by: Nick Piggin
    Cc: KAMEZAWA Hiroyuki
    Cc: Hugh Dickins
    Cc: Andrew Morton
    Cc: Ingo Molnar
    Cc: Roland McGrath
    Signed-off-by: Linus Torvalds

    Linus Torvalds
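
    The three outcomes described above (page not mapped, page that cannot
    be turned into a "struct page *", normal page) can be sketched in user
    space as follows. This is a simplified model, not the kernel code: the
    helpers err_ptr(), ptr_err() and is_err() imitate the kernel's error
    pointer convention, and follow_page_sketch() with its pte_state enum
    is hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer, and decode it again. */
static inline void *err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long ptr_err(const void *p) { return (long)(intptr_t)p; }

/* Error pointers occupy the top MAX_ERRNO values of the address space. */
static inline int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct page { int dummy; };

/* Hypothetical classification of what the page table walk found. */
enum pte_state { PTE_UNMAPPED, PTE_NO_STRUCT_PAGE, PTE_NORMAL };

static struct page the_page;

struct page *follow_page_sketch(enum pte_state st)
{
	switch (st) {
	case PTE_UNMAPPED:
		return NULL;              /* nothing there: caller may skip it */
	case PTE_NO_STRUCT_PAGE:
		return err_ptr(-EFAULT);  /* mapped, but unusable as a page */
	default:
		return &the_page;
	}
}
```

    A caller can then tell the cases apart: NULL means "not mapped", a
    pointer for which is_err() is true carries the errno, and anything
    else is a usable page.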
     

20 Jun, 2008

5 commits


19 Jun, 2008

14 commits

  • General Software writes their own VSA2 module for their version
    of the Geode BIOS, which returns a different ID than the standard
    VSA2. This was causing the framebuffer driver to break for most
    GSW boards.

    Signed-off-by: Jordan Crouse
    Cc: tglx@linutronix.de
    Cc: linux-geode@lists.infradead.org
    Signed-off-by: Ingo Molnar

    Jordan Crouse
     
  • This patch corrects the incorrect value of per process run-queue wait
    time reported by delay statistics. The anomaly was due to the following
    reason. When a process leaves the CPU and immediately starts waiting for
    CPU on the runqueue (which means it remains in the TASK_RUNNABLE state),
    the time of re-entry into the run-queue is never recorded. Due to this,
    the waiting time on the runqueue from this point of re-entry up to the
    next time it hits the CPU is not accounted for. This is solved by
    recording the time of re-entry of a process leaving the CPU in the
    sched_info_depart() function IF the process will go back to waiting on
    the run-queue. This IF condition is verified by checking whether the
    process is still in the TASK_RUNNABLE state.

    The patch was tested on 2.6.26-rc6 using two simple CPU hog programs.
    The values noted prior to the fix did not account for the time spent on
    the runqueue waiting. After the fix, the correct values were reported
    back to user space.

    Signed-off-by: Bharath Ravi
    Signed-off-by: Madhava K R
    Cc: dhaval@linux.vnet.ibm.com
    Cc: vatsa@in.ibm.com
    Cc: balbir@in.ibm.com
    Acked-by: Peter Zijlstra
    Signed-off-by: Ingo Molnar

    Bharath Ravi
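
    A minimal sketch of the accounting rule described above, under the
    assumption that time is a plain counter and that a zero last_queued
    means "not waiting". The struct and function names are invented for
    this example and only model the idea, not the scheduler code itself.

```c
struct sched_info_sketch {
	unsigned long long last_queued; /* 0 = not waiting on a runqueue */
	unsigned long long run_delay;   /* total time spent waiting */
};

/* The task is given the CPU: account the time it spent waiting. */
void info_arrive(struct sched_info_sketch *si, unsigned long long now)
{
	if (si->last_queued)
		si->run_delay += now - si->last_queued;
	si->last_queued = 0;
}

/*
 * The task leaves the CPU. If it is still runnable it goes straight
 * back to waiting, so the time of re-entry must be recorded here.
 */
void info_depart(struct sched_info_sketch *si, unsigned long long now,
		 int still_runnable)
{
	if (still_runnable)
		si->last_queued = now;
}
```

    Without the still_runnable branch, a task that is preempted and keeps
    waiting would arrive back on the CPU with last_queued unset, and the
    wait would never be added to run_delay.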
     
  • This allows other threads to run when the serial driver polls the CTS
    PIN in a loop.

    Signed-off-by: Sonic Zhang
    Signed-off-by: Bryan Wu

    Sonic Zhang
     
  • Signed-off-by: Sonic Zhang
    Signed-off-by: Bryan Wu

    Sonic Zhang
     
  • This patch uses BOOTMEM_EXCLUSIVE for the crashkernel reservation on
    i386 as well, and prints an error message on failure.

    The patch is still for 2.6.26 since it is only bug fixing. The unification
    of reserve_crashkernel() between i386 and x86_64 should be done for 2.6.27.

    Signed-off-by: Bernhard Walle
    Signed-off-by: Ingo Molnar

    Bernhard Walle
     
  • Booting 2.6.26-rc6 on my 486 DX/4 fails with a "BUG: Int 6"
    (invalid opcode) and a kernel halt immediately after the
    kernel has been uncompressed. The BUG shows EIP pointing
    to an rdtsc instruction in native_read_tsc(), invoked from
    native_sched_clock().

    (This error occurs so early that not even the serial console
    can capture it.)

    A bisection showed that this bug first occurs in 2.6.26-rc3-git7,
    via commit 9ccc906c97e34fd91dc6aaf5b69b52d824386910:

    >x86: distangle user disabled TSC from unstable
    >
    >tsc_enabled is set to 0 from the command line switch "notsc" and from
    >the mark_tsc_unstable code. Seperate those functionalities and replace
    >tsc_enable with tsc_disable. This makes also the native_sched_clock()
    >decision when to use TSC understandable.
    >
    >Preparatory patch to solve the sched_clock() issue on 32 bit.
    >
    >Signed-off-by: Thomas Gleixner

    The core reason for this bug is that native_sched_clock() gets
    called before tsc_init().

    Before the commit above, tsc_32.c used a "tsc_enabled" variable
    which defaulted to 0 == disabled, and which only got enabled late
    in tsc_init(). Thus early calls to native_sched_clock() would skip
    the TSC and use jiffies instead.

    After the commit above, tsc_32.c uses a "tsc_disabled" variable
    which defaults to 0, meaning that the TSC is Ok to use. Early calls
    to native_sched_clock() now erroneously try to use the TSC on
    !cpu_has_tsc processors, leading to invalid opcode exceptions.

    My proposed fix is to initialise tsc_disabled to a "soft disabled"
    state distinct from the hard disabled state set up by the "notsc"
    kernel option. This fixes the native_sched_clock() problem. It also
    allows tsc_init() to be simplified: instead of setting tsc_disabled = 1
    on every error return, we just set tsc_disabled = 0 once when all
    checks have succeeded.

    I've verified that this lets my 486 boot again. I've also verified
    that a Core2 machine still uses the TSC as clocksource after the patch.

    Signed-off-by: Mikael Pettersson
    Signed-off-by: Ingo Molnar

    Mikael Pettersson
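
    The "soft disabled" idea above can be modelled with a three-state
    variable. This is a sketch only: the -1/0/1 encoding, the helper
    names, and the fake clock values are assumptions for illustration,
    not the contents of tsc_32.c.

```c
#include <stdint.h>

/* -1: soft disabled (default), 0: usable, 1: hard disabled by "notsc" */
static int tsc_disabled = -1;

static int cpu_has_tsc_flag;   /* hypothetical stand-in for cpu_has_tsc */
static uint64_t jiffies_ns;    /* hypothetical jiffies-based fallback */
static uint64_t fake_tsc_ns;   /* hypothetical TSC-based clock value */

/* Models the "notsc" command line option. */
void notsc_setup_sketch(void)
{
	tsc_disabled = 1;
}

/* Enable the TSC exactly once, and only after all checks have passed. */
void tsc_init_sketch(void)
{
	if (!cpu_has_tsc_flag || tsc_disabled > 0)
		return;
	tsc_disabled = 0;
}

/* Early callers see a non-zero tsc_disabled and use the fallback. */
uint64_t sched_clock_sketch(void)
{
	if (tsc_disabled)
		return jiffies_ns;
	return fake_tsc_ns;
}
```

    Because the default is non-zero, any call made before
    tsc_init_sketch() never touches the TSC path, which is the property
    a TSC-less processor needs.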
     
  • Patrick McHardy reported a crash:

    > > I get this oops once a day, its apparently triggered by something
    > > run by cron, but the process is a different one each time.
    > >
    > > Kernel is -git from yesterday shortly before the -rc6 release
    > > (last commit is the usb-2.6 merge, the x86 patches are missing),
    > > .config is attached.
    > >
    > > I'll retry with current -git, but the patches that have gone in
    > > since I last updated don't look related.
    > >
    > > [62060.043009] BUG: unable to handle kernel NULL pointer dereference at
    > > 000001ff
    > > [62060.043009] IP: [] __switch_to+0x2f/0x118
    > > [62060.043009] *pde = 00000000
    > > [62060.043009] Oops: 0002 [#1] PREEMPT

    Vegard Nossum analyzed it:

    > This decodes to
    >
    > 0: 0f ae 00 fxsave (%eax)
    >
    > so it's related to the floating-point context. This is the exact
    > location of the crash:
    >
    > $ addr2line -e arch/x86/kernel/process_32.o -i ab0
    > include/asm/i387.h:232
    > include/asm/i387.h:262
    > arch/x86/kernel/process_32.c:595
    >
    > ...so it looks like prev_task->thread.xstate->fxsave has become NULL.
    > Or maybe it never had any other value.

    Somehow (as described below) TS_USEDFPU is set but the fpu is not
    allocated or freed.

    Another possible FPU pre-emption issue with the sleazy FPU optimization
    which was benign before but not so anymore, with the dynamic FPU allocation
    patch.

    New task is getting exec'd and it is preempted at the below point.

    flush_thread() {
    ...
    /*
    * Forget coprocessor state..
    */
    clear_fpu(tsk);
    <----- Preemption point
    clear_used_math();
    ...
    }

    Now when it context switches in again, as the used_math() is still set
    and fpu_counter can be > 5, we will do a math_state_restore() which sets
    the task's TS_USEDFPU. After it continues from the above preemption point
    it does clear_used_math() and much later free_thread_xstate().

    Now, at the next context switch, it is quite possible that xstate is
    null, used_math() is not set and TS_USEDFPU is still set. This will
    trigger unlazy_fpu() causing kernel oops.

    Fix this by clearing tsk's fpu_counter before clearing task's fpu.

    Reported-by: Patrick McHardy
    Signed-off-by: Suresh Siddha
    Signed-off-by: Ingo Molnar

    Suresh Siddha
     
  • When a 64-bit x86 processor runs in 32-bit PAE mode, a pte can
    potentially have the same number of physical address bits as the
    64-bit host ("Enhanced Legacy PAE Paging"). This means, in theory,
    we could have up to 52 bits of physical address in a pte.

    The 32-bit kernel uses a 32-bit unsigned long to represent a pfn.
    This means that it can only represent physical addresses up to 32+12=44
    bits wide. Rather than widening pfns everywhere, just set 2^44 as the
    Linux x86_32-PAE architectural limit for physical address size.

    This is a bugfix for two cases:
    1. running a 32-bit PAE kernel on a machine with
    more than 64GB RAM.
    2. running a 32-bit PAE Xen guest on a host machine with
    more than 64GB RAM

    In both cases, a pte could need to have more than 36 bits of physical
    address, and masking it to 36 bits will cause fairly severe havoc.

    Signed-off-by: Jeremy Fitzhardinge
    Cc: Jan Beulich
    Signed-off-by: Ingo Molnar

    Jeremy Fitzhardinge
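
    The arithmetic above can be checked with a few lines of C. The helper
    names are made up for this sketch; the only inputs are the 32-bit pfn
    and the 12-bit page offset mentioned in the message.

```c
#include <stdint.h>

/* Widest physical address a pfn of pfn_bits can describe. */
static inline unsigned phys_shift_for_pfn(unsigned pfn_bits,
					  unsigned page_shift)
{
	return pfn_bits + page_shift;
}

/* Mask covering the low 'shift' bits of a physical address. */
static inline uint64_t phys_mask(unsigned shift)
{
	return (UINT64_C(1) << shift) - 1;
}
```

    For a physical address just above 64GB, such as (1 << 36) | 0x1000,
    masking with phys_mask(36) silently drops the top bit and yields
    0x1000, while phys_mask(44) leaves the address intact.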
     
  • The touch_nmi_watchdog() routine on x86 ultimately calls
    touch_softlockup_watchdog(). The problem is that to touch the
    softlockup watchdog, the cpu_clock code has to be called which could
    involve multiple cpu locks and can lead to a hard hang if one of the
    locks is held by a processor that is not going to return anytime soon
    (such as could be the case with kgdb or perhaps even with some other
    kind of exception).

    This patch causes the public version of the
    touch_softlockup_watchdog() to defer the cpu clock access to a later
    point.

    The test case for this problem is to use the following kernel config
    options:

    CONFIG_KGDB_TESTS=y
    CONFIG_KGDB_TESTS_ON_BOOT=y
    CONFIG_KGDB_TESTS_BOOT_STRING="V1F100I100000"

    It should be noted that kgdb test suite and these options were not
    available until 2.6.26-rc2, so it was necessary to patch the kgdb
    test suite during the bisection.

    I would consider this patch a regression fix because the problem first
    appeared in commit 27ec4407790d075c325e1f4da0a19c56953cce23 when some
    logic was added to try to periodically sync the clocks. It was
    possible to work around this particular problem by simply not
    performing the sync anytime the system was in a critical context.
    This was ok until commit 3e51f33fcc7f55e6df25d15b55ed10c8b4da84cd,
    which added config option CONFIG_HAVE_UNSTABLE_SCHED_CLOCK and some
    multi-cpu locks to sync the clocks. It became clear that accessing
    this code from an nmi was the source of the lockups. Avoiding the
    access to the low level clock code from code inside the NMI
    processing also fixed the problem with the 27ec44... commit.

    Signed-off-by: Jason Wessel
    Signed-off-by: Ingo Molnar

    Jason Wessel
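
    One way to picture the deferral described above: the public touch
    routine only marks the timestamp as stale, and the clock is read
    later from the regular tick. This single-CPU sketch uses invented
    names and a fake clock; it is an assumption about the shape of the
    fix, not a copy of the softlockup code.

```c
static unsigned long touch_timestamp;  /* per-CPU in a real kernel */
static unsigned long fake_clock = 1;   /* hypothetical clock source */

static unsigned long get_timestamp_sketch(void)
{
	return fake_clock;
}

/* Safe from NMI context: no clock access, no locks, just a marker. */
void touch_watchdog_sketch(void)
{
	touch_timestamp = 0;
}

/*
 * Called from the timer tick. Returns 1 if a lockup should be
 * reported. A zero timestamp means "touched since the last tick",
 * so the clock is read here, outside of any critical context.
 */
int watchdog_tick_sketch(unsigned long threshold)
{
	if (touch_timestamp == 0) {
		touch_timestamp = get_timestamp_sketch();
		return 0;
	}
	return get_timestamp_sketch() - touch_timestamp > threshold;
}
```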
     
  • In rcupreempt, rcu_batches_completed_bh is defined as a static inline in
    the header file. This does not need to be exported, and not only that,
    this breaks my PPC build.

    Signed-off-by: Steven Rostedt
    Cc: "Paul E. McKenney"
    Cc: paulus@samba.org
    Cc: linuxppc-dev@ozlabs.org
    Signed-off-by: Thomas Gleixner

    Steven Rostedt
     
  • We allow the inputs to be [-1 ... SD_LV_MAX), and return -EINVAL
    for inputs outside this range.

    Signed-off-by: Li Zefan
    Acked-by: Paul Menage
    Acked-by: Paul Jackson
    Acked-by: Hidetoshi Seto
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Li Zefan
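
    The half-open range [-1 ... SD_LV_MAX) translates into a two-sided
    check. In this sketch the value of SD_LV_MAX_SKETCH and the function
    name are placeholders chosen for illustration.

```c
#include <errno.h>

enum { SD_LV_MAX_SKETCH = 6 };  /* hypothetical upper bound */

/* Accept -1 up to, but not including, SD_LV_MAX_SKETCH. */
int update_relax_domain_level_sketch(int *level, long val)
{
	if (val < -1 || val >= SD_LV_MAX_SKETCH)
		return -EINVAL;
	*level = (int)val;
	return 0;
}
```

    Rejected inputs leave the stored level unchanged.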
     
  • Ingo Molnar
     
  • The first issue is not related to the cpusets. We're simply leaking doms_cur.
    It's allocated in arch_init_sched_domains(), which is called for every
    hotplug event. So we just keep reallocating doms_cur without freeing it.
    I introduced a free_sched_domains() function that cleans things up.

    The second issue is that sched domains created by the cpusets are
    completely destroyed by the CPU hotplug events. For all CPU hotplug
    events, the scheduler attaches all CPUs to the NULL domain and then puts
    them all into a single domain, thereby destroying the domains created
    by the cpusets (partition_sched_domains).
    The solution is simple: when cpusets are enabled, the scheduler should not
    create the default domain and should instead let cpusets do that, which is
    exactly what the patch does.

    Signed-off-by: Max Krasnyansky
    Cc: pj@sgi.com
    Cc: menage@google.com
    Cc: rostedt@goodmis.org
    Acked-by: Peter Zijlstra
    Signed-off-by: Thomas Gleixner

    Max Krasnyansky
     
  • In tick_task_rt() we first call update_curr_rt(), which can dequeue a runqueue
    due to it running out of runtime, and then we try to requeue it if it has also
    exhausted its RR quota. Obviously, requeueing something that is no longer
    on the runqueue will not have the expected result.

    Signed-off-by: Peter Zijlstra
    Tested-by: Daniel K.
    Signed-off-by: Ingo Molnar

    Peter Zijlstra