08 Jul, 2008

8 commits

  • 1. add reserve_bootmem_generic for 32bit
    2. change len to unsigned long
    3. make early_res_to_bootmem to use it

    Signed-off-by: Yinghai Lu
    Signed-off-by: Ingo Molnar

    Yinghai Lu
     
  • We check the mptable early for numaq, so there is no need to call
    reserve_bootmem for it; bootmem is not set up yet at that point.

    Do the same thing as 64-bit.

    Found on a system with more than 64 GB of memory when kexec-ing from a
    64-bit kernel to a 32-bit kernel with numaq support.

    Signed-off-by: Yinghai Lu
    Signed-off-by: Ingo Molnar

    Yinghai Lu
     
  • fix typo in bigsmp switching.

    Signed-off-by: Yinghai Lu
    Signed-off-by: Ingo Molnar

    Yinghai Lu
     
  • since we now have 32-bit support for e820_register_active_regions(),
    we can merge the parsing of the mem=/memmap= boot parameters.

    Signed-off-by: Yinghai Lu
    Signed-off-by: Ingo Molnar

    Yinghai Lu
     
  • Signed-off-by: Ingo Molnar

    Ingo Molnar
     
  • This patch uses reserve_bootmem_generic() instead of reserve_bootmem()
    to reserve the crashkernel memory on x86_64. That's necessary for NUMA
    machines, see 00212fef814612245ed0261cbac8426d0c9a31a5:

    [PATCH] Fix kdump Crash Kernel boot memory reservation for NUMA machines

    This patch will fix a boot memory reservation bug that trashes memory on
    the ES7000 when loading the kdump crash kernel.

    The code in arch/x86_64/kernel/setup.c to reserve boot memory for the crash
    kernel uses the non-numa aware "reserve_bootmem" function instead of the
    NUMA aware "reserve_bootmem_generic". I checked to make sure that no other
    function was using "reserve_bootmem" and found none, except the ones that
    had NUMA ifdef'ed out.

    I have tested this patch only on an ES7000 with NUMA on and off (numa=off)
    in a single (non-NUMA) and multi-cell (NUMA) configurations.

    Signed-off-by: Amul Shah
    Looks-good-to: Vivek Goyal
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    The switch-back to reserve_bootmem() was accidentally introduced in
    5c3391f9f749023a49c64d607da4fb49263690eb when adding the BOOTMEM_EXCLUSIVE
    parameter.

    Signed-off-by: Bernhard Walle
    Signed-off-by: Ingo Molnar

    Bernhard Walle
     
  • This patch adds a 'flags' parameter to reserve_bootmem_generic() like it
    already has been added in reserve_bootmem() with commit
    72a7fe3967dbf86cb34e24fbf1d957fe24d2f246.

    It also changes all users to use BOOTMEM_DEFAULT, which effectively leaves
    the behaviour unchanged. Since the change is x86-specific, I don't think it's
    necessary to add a new API for migration. There are only 4 users of that
    function.

    The change is necessary for the next patch, using reserve_bootmem_generic()
    for crashkernel reservation.

    Signed-off-by: Bernhard Walle
    Signed-off-by: Ingo Molnar

    Bernhard Walle
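
A minimal sketch of what a 'flags' parameter on a reservation function can look like, assuming a toy page bitmap. `toy_reserve()` and `TOY_PAGES` are hypothetical names for illustration, not the kernel's reserve_bootmem_generic(); the two flag values are assumptions modelled on the BOOTMEM_DEFAULT and BOOTMEM_EXCLUSIVE names mentioned in these commit messages.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed flag values for this sketch. */
#define BOOTMEM_DEFAULT   0
#define BOOTMEM_EXCLUSIVE (1 << 0)

#define TOY_PAGES 64

/* One byte per page: non-zero means "already reserved". */
static unsigned char toy_reserved[TOY_PAGES];

/*
 * Toy stand-in for a reservation call with a flags argument.
 * With BOOTMEM_EXCLUSIVE the call fails with -EBUSY if any page in
 * the range is already reserved; with BOOTMEM_DEFAULT overlapping
 * reservations are silently accepted.
 */
static int toy_reserve(size_t start, size_t len, int flags)
{
    size_t i;

    if (len > TOY_PAGES || start > TOY_PAGES - len)
        return -EINVAL;

    if (flags & BOOTMEM_EXCLUSIVE) {
        for (i = start; i < start + len; i++)
            if (toy_reserved[i])
                return -EBUSY;
    }

    for (i = start; i < start + len; i++)
        toy_reserved[i] = 1;

    return 0;
}
```

Callers that pass BOOTMEM_DEFAULT see the same behaviour as a function without the parameter, which is why converting existing users is a mechanical change.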
     
  • Ingo Molnar
     

07 Jul, 2008

3 commits

  • * 'kvm-updates-2.6.26' of git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm:
    KVM: IOAPIC: Fix level-triggered irq injection hang
    x86: KVM guest: Add memory clobber to hypercalls

    Linus Torvalds
     
  • The pxa27x DMA controller defaults to 64-bit alignment. This caused
    the SCR reads to fail (and, depending on card type, error out) when
    card->raw_scr was not aligned on an 8-byte boundary.

    For performance reasons all scatter-gather addresses passed to
    pxamci_request should be aligned on 8-byte boundaries, but if
    this can't be guaranteed, byte aligned DMA transfers have to be
    enabled in the controller to get correct behaviour.

    Signed-off-by: Philipp Zabel
    Signed-off-by: Pierre Ossman
    Signed-off-by: Linus Torvalds

    Philipp Zabel
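
The alignment condition described above can be sketched as a plain address check. Both helper names are hypothetical and this is not the driver's code; it only illustrates what "aligned on an 8-byte boundary" means for a buffer address.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if addr is a multiple of alignment (alignment must be a power of two). */
static bool is_dma_aligned(const void *addr, uintptr_t alignment)
{
    return ((uintptr_t)addr & (alignment - 1)) == 0;
}

/*
 * Hypothetical decision helper: a buffer that is not 8-byte aligned
 * would need the controller's byte-aligned transfer mode.
 */
static bool needs_byte_aligned_dma(const void *buf)
{
    return !is_dma_aligned(buf, 8);
}
```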
     
  • This reverts commit e872154921a6b5256a3c412dd69158ac0b135176.

    Andrey Borzenkov reports that it resulted in a totally hung machine for
    him when loading the OHCI driver. Extensive netconsole capture with
    SysRq output shows that modprobe gets stuck in ohci_hub_status_data()
    when probing and enabling the OHCI controller, see for example

    http://lkml.org/lkml/2008/7/5/236

    for an analysis.

    The problem appears to be an interrupt flood triggered by the commit
    that gets reverted, and Andrey confirmed that the revert makes things
    work for him again.

    Reported-and-tested-by: Andrey Borzenkov
    Acked-by: Alan Stern
    Acked-by: David Brownell
    Cc: Greg Kroah-Hartman
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

06 Jul, 2008

13 commits

  • The "remote_irr" variable is used to indicate an interrupt
    which has been received by the LAPIC, but not acked.

    In our EOI handler, we unset remote_irr and re-inject the
    interrupt if the interrupt line is still asserted.

    However, we do not set remote_irr here, leading to a
    situation where if kvm_ioapic_set_irq() is called, then we go
    ahead and call ioapic_service(). This means that IRR is
    re-asserted even though the interrupt is currently in service
    (i.e. LAPIC IRR is cleared and ISR/TMR set)

    The issue with this is that when the currently executing
    interrupt handler finishes and writes LAPIC EOI, then TMR is
    unset and EOI sent to the IOAPIC. Since IRR is now asserted,
    but TMR is not, then when the second interrupt is handled,
    no EOI is sent and if there is any pending interrupt, it is
    not re-injected.

    This fixes a hang only seen while running mke2fs -j on an
    8GB virtio disk backed by a fully sparse raw file, with
    aliguori's "avoid fragmented virtio-blk transfers by copying"
    changes.

    Signed-off-by: Mark McLoughlin
    Acked-by: Marcelo Tosatti
    Signed-off-by: Avi Kivity

    Mark McLoughlin
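
The remote_irr bookkeeping described above can be modelled with a small state machine. This is a toy sketch under simplifying assumptions (one level-triggered pin, no LAPIC state), and every name in it is hypothetical; it is not the KVM code. The point it illustrates is that the EOI path marks the interrupt as outstanding again when it re-injects, so a later set_irq call does not deliver a second copy.

```c
#include <stdbool.h>

/* Toy model of one level-triggered IOAPIC pin. */
struct toy_ioapic_pin {
    bool line_asserted; /* current level of the interrupt line */
    bool remote_irr;    /* delivered to the LAPIC but not yet EOI'd */
    int  injections;    /* how many times the interrupt was delivered */
};

/* Deliver the interrupt and remember that an EOI is outstanding. */
static void toy_service(struct toy_ioapic_pin *pin)
{
    pin->injections++;
    pin->remote_irr = true;
}

/* Line level change: only deliver if nothing is outstanding. */
static void toy_set_irq(struct toy_ioapic_pin *pin, bool level)
{
    pin->line_asserted = level;
    if (level && !pin->remote_irr)
        toy_service(pin);
}

/* EOI: clear remote_irr, then re-inject if the line is still high. */
static void toy_eoi(struct toy_ioapic_pin *pin)
{
    pin->remote_irr = false;
    if (pin->line_asserted)
        toy_service(pin); /* sets remote_irr again */
}
```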
     
  • Hypercalls can modify arbitrary regions of memory. Make sure to indicate this
    in the clobber list. This fixes a hang when using KVM_GUEST kernel built with
    GCC 4.3.0.

    This was originally spotted and analyzed by Marcelo.

    Signed-off-by: Anthony Liguori
    Signed-off-by: Avi Kivity

    Anthony Liguori
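
A minimal sketch of a "memory" clobber, assuming GCC-style extended inline assembly (GCC or Clang). `toy_hypercall()` is a hypothetical name and the empty assembly template stands in for the real trap instruction, so nothing is actually called; the sketch only shows where the clobber goes.

```c
/*
 * Toy stand-in for a hypercall wrapper. The "memory" clobber tells the
 * compiler that the statement may read or write arbitrary memory, so it
 * must not keep values cached in registers across it or reorder memory
 * accesses around it.
 */
static inline long toy_hypercall(long nr, long *shared)
{
    long ret = nr;

    __asm__ __volatile__(""            /* real code would trap to the host here */
                         : "+r"(ret)   /* result register, read and written */
                         : "r"(shared) /* pointer the callee may write through */
                         : "memory");
    return ret;
}
```

Without the clobber, a compiler is free to assume that the data behind `shared` is unchanged after the statement, which is the kind of miscompilation the message attributes to GCC 4.3.0.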
     
  • Linus Torvalds
     
  • Fix some issues in pagemap_read noted by Alexey:

    - initialize pagemap_walk.mm to "mm", so the code starts working as
    advertised

    - initialize ->private to "&pm" so it wouldn't immediately oops in
    pagemap_pte_hole()

    - unstatic struct pagemap_walk, so two threads won't fsckup each other
    (including those started by root, including flipping ->mm when you don't
    have permissions)

    - pagemap_read() contains two calls to ptrace_may_attach(), second one
    looks unneeded.

    - avoid possible kmalloc(0) and integer wraparound.

    Cc: Alexey Dobriyan
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    [ Personally, I'd just remove the functionality entirely - Linus ]
    Signed-off-by: Linus Torvalds

    Andrew Morton
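
The last item in the list above, avoiding a zero-sized allocation and integer wraparound, can be sketched as a checked size computation. `toy_checked_alloc_size()` is a hypothetical helper, not the code in pagemap_read(); it only shows the general pattern of validating a count before multiplying it.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Compute nr_entries * entry_size into *out.
 * Returns 0 on success, -1 if the result would be zero bytes or
 * would wrap around SIZE_MAX.
 */
static int toy_checked_alloc_size(size_t nr_entries, size_t entry_size,
                                  size_t *out)
{
    if (nr_entries == 0 || entry_size == 0)
        return -1; /* would be a zero-sized allocation */

    if (nr_entries > SIZE_MAX / entry_size)
        return -1; /* multiplication would overflow */

    *out = nr_entries * entry_size;
    return 0;
}
```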
     
  • These two macros are useful beyond lock debugging. Moved definitions from
    include/linux/debug_locks.h to include/linux/kernel.h, so code that needs
    them does not have to include the former, which would have been a less
    intuitive choice of a header.

    Signed-off-by: Eduard - Gabriel Munteanu
    Acked-by: Pekka Enberg
    Signed-off-by: Linus Torvalds

    Eduard - Gabriel Munteanu
     
  • …/git/tip/linux-2.6-tip

    * 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    softlockup: print a module list on being stuck

    Linus Torvalds
     
  • …git/tip/linux-2.6-tip

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    x86 ACPI: fix resume from suspend to RAM on uniprocessor x86-64
    x86 ACPI: normalize segment descriptor register on resume

    Linus Torvalds
     
  • Don't use a static entry, so as to prevent races during concurrent use
    of this function.

    Reported-by: Alexey Dobriyan
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6:
    ide: ide_unregister() locking bugfix
    ide: ide_unregister() warm-plug bugfix
    ide: fix hwif->gendev refcounting

    Linus Torvalds
     
  • Commit ea0c62f7cf70f13a67830471b613337bd0c9a62e tried to clear all
    bits in irq_stat but it didn't actually achieve that, as irq_stat was
    ANDed with port_map right after it was read. This patch makes the ahci
    driver always use the unmasked value to clear irq_status.

    While at it, add explanation on the peculiarities of ahci IRQ
    clearing.

    This was spotted by Linus Torvalds.

    Signed-off-by: Tejun Heo
    Signed-off-by: Linus Torvalds

    Tejun Heo
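
The difference between the masked and unmasked value can be sketched with a toy register model. `struct toy_hba` and `toy_handle_irq()` are hypothetical, and the write-1-to-clear behaviour of the status register is an assumption of this sketch; it is not the ahci driver's code.

```c
#include <stdint.h>

/* Toy host controller: a pending-interrupt register and a port mask. */
struct toy_hba {
    uint32_t irq_stat_reg; /* one pending bit per port */
    uint32_t port_map;     /* ports the driver actually handles */
};

/*
 * Handle interrupts: ports are serviced according to the masked value,
 * but the register is cleared with the raw, unmasked value so that
 * bits for ports outside port_map do not stay set.
 */
static uint32_t toy_handle_irq(struct toy_hba *hba)
{
    uint32_t irq_stat = hba->irq_stat_reg;          /* raw value as read */
    uint32_t irq_masked = irq_stat & hba->port_map; /* ports to service */

    /* ... per-port handling for the bits in irq_masked would go here ... */

    hba->irq_stat_reg &= ~irq_stat; /* models writing irq_stat to a W1C register */
    return irq_masked;
}
```

Clearing with `irq_masked` instead would leave the bits outside `port_map` pending in this model.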
     
  • Holding ide_lock for ide_release_dma_engine() call is unnecessary
    and triggers WARN_ON(irqs_disabled()) in dma_free_coherent().

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • Fix ide_unregister() to work for ports with no devices attached to them.

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • class->dev_release is called by device_release() iff dev->release
    is not present so ide_port_class_release() is never called and the
    last hwif->gendev reference is not dropped.

    Fix it by removing ide_port_class_release() and get_device() call
    from ide_register_port() (device_create_drvdata() takes a hwif->gendev
    reference anyway).

    This patch fixes hang on wait_for_completion(&hwif->gendev_rel_comp)
    in ide_unregister() reported by Pavel Machek.

    Cc: Pavel Machek
    Cc: "Rafael J. Wysocki"
    Cc: Greg KH
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
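
The "iff dev->release is not present" rule in the first paragraph above can be sketched as a dispatch function. All types and names here are hypothetical simplifications, not the driver core's definitions; the sketch only shows why a class-level release callback is skipped when the device has its own.

```c
#include <stddef.h>

struct toy_device;
typedef void (*toy_release_fn)(struct toy_device *dev);

struct toy_class {
    toy_release_fn dev_release;
};

struct toy_device {
    toy_release_fn release;  /* per-device callback, takes priority */
    struct toy_class *class_;
    int released_by;         /* 0 = nobody, 1 = device, 2 = class */
};

static void toy_dev_release(struct toy_device *dev)   { dev->released_by = 1; }
static void toy_class_release(struct toy_device *dev) { dev->released_by = 2; }

/* The class callback runs only when the device has no release of its own. */
static void toy_device_release(struct toy_device *dev)
{
    if (dev->release)
        dev->release(dev);
    else if (dev->class_ && dev->class_->dev_release)
        dev->class_->dev_release(dev);
}
```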
     

05 Jul, 2008

16 commits

  • Most places in the kernel that go BUG: print a module list
    (which is very useful for doing statistics and finding patterns),
    however the softlockup detector does not do this yet.

    This patch adds the one line change to fix this gap.

    Signed-off-by: Arjan van de Ven
    Signed-off-by: Ingo Molnar

    Arjan van de Ven
     
  • Ingo Molnar
     
  • Since the trampoline code is now used for ACPI resume from suspend to RAM,
    the trampoline page tables have to be fixed up during boot not only on SMP
    systems, but also on UP systems that use the trampoline.

    Reference: http://bugzilla.kernel.org/show_bug.cgi?id=10923

    Reported-by: Dionisus Torimens
    Signed-off-by: Rafael J. Wysocki
    Cc: Andi Kleen
    Cc: Andrew Morton
    Cc: pm list
    Signed-off-by: Ingo Molnar

    Rafael J. Wysocki
     
  • Some Dell laptops enter resume with apparent garbage in the segment
    descriptor registers (almost certainly the result of a botched
    transition from protected to real mode.) The only way to clean that
    up is to enter protected mode ourselves and clean out the descriptor
    registers.

    This fixes resume on Dell XPS M1210 and Dell D620.

    Reference: http://bugzilla.kernel.org/show_bug.cgi?id=10927

    Signed-off-by: H. Peter Anvin
    Cc: Andrew Morton
    Cc: Pavel Machek
    Cc: pm list
    Cc: Len Brown
    Signed-off-by: Ingo Molnar
    Tested-by: Kirill A. Shutemov
    Signed-off-by: Rafael J. Wysocki
    Signed-off-by: Ingo Molnar

    H. Peter Anvin
     
  • Flags considered internal to the mempolicy kernel code are stored as part
    of the "flags" member of struct mempolicy.

    Before exposing a policy type to userspace via get_mempolicy(), these
    internal flags must be masked. Flags exposed to userspace, however,
    should still be returned to the user.

    Signed-off-by: David Rientjes
    Signed-off-by: Linus Torvalds

    David Rientjes
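
A minimal sketch of the masking described above. The constant names and values are hypothetical stand-ins chosen for illustration, not the kernel's definitions, and `toy_policy_for_user()` is not the get_mempolicy() implementation.

```c
/* Hypothetical flag layout for this sketch. */
#define TOY_USER_FLAGS    0xC000u /* flags that userspace may see */
#define TOY_INTERNAL_FLAG 0x0001u /* kernel-internal bookkeeping flag */

/*
 * Combine the policy mode with only those flags that are part of the
 * userspace interface; internal flags are masked out.
 */
static unsigned int toy_policy_for_user(unsigned int mode, unsigned int flags)
{
    return mode | (flags & TOY_USER_FLAGS);
}
```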
     
  • …git/tip/linux-2.6-tip

    * 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
    xen: fix address truncation in pte mfn<->pfn conversion
    arch/x86/mm/init_64.c: early_memtest(): fix types
    x86: fix Intel Mac booting with EFI

    Linus Torvalds
     
  • Even the newer ENE controllers have bugs in their DMA engine that make
    it too dangerous to use. Disable it until someone has figured out under
    which conditions it corrupts data.

    This has caused problems at least once, and can be found as bug report
    10925 in the kernel bugzilla.

    Signed-off-by: Pierre Ossman
    Signed-off-by: Linus Torvalds

    Pierre Ossman
     
  • Document the kernel boot parameter: relax_domain_level=.

    Signed-off-by: Paul Jackson
    Cc: Michael Kerrisk
    Reviewed-by: Li Zefan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul Jackson
     
  • # cat /devcg/devices.list
    a *:* rwm
    # echo a > devices.allow
    # cat /devcg/devices.list
    a *:* rwm
    a 0:0 rwm

    This is odd and maybe confusing. With this patch, writing 'a' to
    devices.allow will add 'a *:* rwm' to the whitelist.

    Also a few fixes and updates to the document.

    Signed-off-by: Li Zefan
    Cc: Pavel Emelyanov
    Cc: Serge E. Hallyn
    Cc: Paul Menage
    Cc: Balbir Singh
    Cc: James Morris
    Cc: Chris Wright
    Cc: Stephen Smalley
    Cc: KAMEZAWA Hiroyuki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Li Zefan
     
  • Acked-By: Debora Velarde
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rajiv Andrade
     
  • There is a bug in the output of /sys/devices/system/node/node[n]/meminfo
    where the Active and Inactive values are in pages instead of Kbytes.

    Looks like this occurred back in 2.6.20 when the code was changed
    over to use node_page_state().

    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    John Blackwood
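
The unit mix-up is a missing pages-to-kilobytes conversion. A minimal sketch, assuming 4 KiB pages; `TOY_PAGE_SHIFT` and `toy_pages_to_kb()` are hypothetical names, not the kernel's macros.

```c
/* Assumed page size for this sketch: 4 KiB, i.e. 2^12 bytes. */
#define TOY_PAGE_SHIFT 12

/* Convert a page count to kilobytes (1 kB = 2^10 bytes). */
static unsigned long toy_pages_to_kb(unsigned long pages)
{
    return pages << (TOY_PAGE_SHIFT - 10);
}
```

With 4 KiB pages the reported value is off by a factor of four when the conversion is skipped.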
     
  • In linux-next there is a commit ("x86: Add performance variants of cpumask
    operators") which, as part of the 4096 cpu support work adds some new APIs
    for dealing with cpu masks. Add trivial versions of these now so that
    subsystems can update in a timely manner and avoid conflicts in linux-next
    and the next merge window.

    Cc: Mike Travis
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Signed-off-by: Stephen Rothwell
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Rothwell
     
  • The CaFe chip has a hardware bug that ends up with us getting a timeout
    value that's too small, causing the following sorts of problems:

    [ 60.525138] mmcblk0: error -110 transferring data
    [ 60.531477] end_request: I/O error, dev mmcblk0, sector 1484353
    [ 60.533371] Buffer I/O error on device mmcblk0p2, logical block 181632
    [ 60.533371] lost page write due to I/O error on mmcblk0p2

    Presumably this is an off-by-one error in the hardware. Incrementing
    the timeout count value that we stuff into the TIMEOUT_CONTROL register
    gets us a value that works. This bug was originally discovered by
    Pierre Ossman, I believe.

    [thanks to Robert Millan for proving that this was still a problem]

    Signed-off-by: Andres Salomon
    Cc: Pierre Ossman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andres Salomon
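
The workaround described above amounts to bumping the computed timeout count by one on affected hardware. A sketch under stated assumptions: `toy_timeout_count()` is a hypothetical helper, and the upper limit of 0xE for the count is an assumption of this sketch rather than something stated in the message.

```c
/* Assumed largest valid timeout count in this sketch. */
#define TOY_TIMEOUT_MAX 0xEu

/*
 * Return the value to program into the timeout register.
 * On hardware with the off-by-one quirk the count is incremented,
 * then clamped so it never exceeds the assumed maximum.
 */
static unsigned int toy_timeout_count(unsigned int count, int has_quirk)
{
    if (has_quirk)
        count++;

    if (count > TOY_TIMEOUT_MAX)
        count = TOY_TIMEOUT_MAX;

    return count;
}
```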
     
  • This has been sitting around unloved for way too long..

    The Marvell CaFe chip's SD implementation chokes during card insertion
    if one attempts to set the voltage and power up in the same
    SDHCI_POWER_CONTROL register write. This adds a quirk that does
    that particular dance in two steps.

    It also adds an entry to pci_ids.h for the CaFe chip's SD device.

    Signed-off-by: Andres Salomon
    Cc: Pierre Ossman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andres Salomon
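
The two-step sequence can be sketched with a toy host that records register writes. `struct toy_host`, `toy_writeb()`, `toy_set_power()` and the `TOY_POWER_ON` bit value are hypothetical; this is not the sdhci driver's code, only an illustration of "voltage first, then power".

```c
#include <stdint.h>

#define TOY_POWER_ON   0x01u /* assumed "power on" bit for this sketch */
#define TOY_MAX_WRITES 4

/* Toy host that logs every write to its power control register. */
struct toy_host {
    uint8_t writes[TOY_MAX_WRITES];
    int nwrites;
    int split_power_quirk; /* non-zero: set voltage and power separately */
};

static void toy_writeb(struct toy_host *host, uint8_t value)
{
    if (host->nwrites < TOY_MAX_WRITES)
        host->writes[host->nwrites++] = value;
}

/*
 * Program the power register. With the quirk, the voltage bits are
 * written on their own first and the power-on bit is added in a
 * second write; without it, a single combined write is used.
 */
static void toy_set_power(struct toy_host *host, uint8_t voltage_bits)
{
    if (host->split_power_quirk)
        toy_writeb(host, voltage_bits);

    toy_writeb(host, (uint8_t)(voltage_bits | TOY_POWER_ON));
}
```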
     
  • This patch changes the way we determine the maximum number of outstanding
    commands for each controller.

    Most Smart Array controllers can support up to 1024 commands; the notable
    exceptions are the E200 and E200i.

    The next generation of controllers which were just added support a mode of
    operation called Zero Memory Raid (ZMR). In this mode they only support
    64 outstanding commands. In Full Function Raid (FFR) mode they support
    1024.

    We have been setting the queue depth by arbitrarily assigning some value
    for each controller. We needed a better way to set the queue depth to
    avoid lots of annoying "fifo full" messages. So we made the driver a
    little smarter. We now read the config table and subtract 4 from the
    returned value. The -4 is to allow some room for ioctl calls which are
    not tracked the same way as io commands are tracked.

    Please consider this for inclusion.

    Signed-off-by: Mike Miller
    Cc: Jens Axboe
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Miller
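
The queue depth rule described above is simple arithmetic on the value read from the config table. `toy_queue_depth()` and `TOY_IOCTL_HEADROOM` are hypothetical names; reading the config table itself is not shown.

```c
/* Commands held back for ioctl calls, per the description above. */
#define TOY_IOCTL_HEADROOM 4

/* Queue depth = controller's reported maximum minus the ioctl headroom. */
static int toy_queue_depth(int max_commands_from_config)
{
    return max_commands_from_config - TOY_IOCTL_HEADROOM;
}
```

For the two modes mentioned in the message, this gives 60 for a controller reporting 64 commands and 1020 for one reporting 1024.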
     
  • The old one bounces.

    Signed-off-by: Geert Uytterhoeven
    Cc: Andreas Dilger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Geert Uytterhoeven