21 May, 2009

6 commits

  • * 'upstream' of git://ftp.linux-mips.org/pub/scm/upstream-linus:
    MIPS: 64-bit: Fix system lockup.
    MIPS: IP28: Change to build with -mr10k-cache-barrier=store
    MIPS: IP22: Fix hang in power button interrupt handler
    MIPS: IP32: Fix hang on shutdown in power button interrupt handler.

    Linus Torvalds
     
  • * master.kernel.org:/home/rmk/linux-2.6-arm: (25 commits)
    [ARM] 5519/1: amba probe: pass "struct amba_id *" instead of void *
    [ARM] 5517/1: integrator: don't put clock lookups in __initdata
    [ARM] 5518/1: versatile: don't put clock lookups in __initdata
    [ARM] mach-l7200: fix spelling of SYS_CLOCK_OFF
    [ARM] Double check memmap is actually valid with a memmap has unexpected holes V2
    [ARM] realview: fix broadcast tick support
    [ARM] realview: remove useless smp_cross_call_done()
    [ARM] smp: fix cpumask usage in ARM SMP code
    [ARM] 5513/1: Eurotech VIPER SBC: fix compilation error
    [ARM] 5509/1: ep93xx: clkdev enable UARTS
    ARM: OMAP2/3: Change omapfb to use clkdev for dispc and rfbi, v2
    ARM: OMAP3: Fix HW SAVEANDRESTORE shift define
    ARM: OMAP3: Fix number of GPIO lines for 34xx
    [ARM] S3C: Do not set clk->owner field if unset
    [ARM] S3C2410: mach-bast.c registering i2c data too early
    [ARM] S3C24XX: Fix unused code warning in arch/arm/plat-s3c24xx/dma.c
    [ARM] S3C64XX: fix GPIO debug
    [ARM] S3C64XX: GPIO include cleanup
    [ARM] nwfpe: fix 'floatx80_is_nan' sparse warning
    [ARM] nwfpe: Add decleration for ExtendedCPDO
    ...

    Linus Torvalds
     
  • The address range size calculation inside local_flush_tlb_kernel_range()
    is being truncated by a too small size variable holder on 64-bit systems.
    The truncated size can result in an erroneous tlbsize check that means we
    sit spinning inside a loop trying to flush a huge number of TLB entries.
    This is for all intents and purposes a system hang. Fix by using an
    appropriately sized variable to hold the size.

    [Ralf: Greg's original patch submission identified the issue and fixed one
    instance in tlb-r4k.c but there were several more. For consistency
    I also modified tlb-r3k.c even though that file is only used on 32-bit.]

    Signed-off-by: Greg Ungerer
    Signed-off-by: Ralf Baechle

    Greg Ungerer
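
The truncation described in this entry can be illustrated with a small user-space sketch. It is not the code from tlb-r4k.c: the page shift, the TLB size, and all function names are hypothetical, and the narrow variable is modelled as a uint32_t so that the truncation is well defined in C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT  12   /* assumed 4 KiB pages */
#define SKETCH_TLB_ENTRIES 64   /* assumed TLB size */

/* Number of pages covered by [start, end), rounded up. */
static uint64_t sketch_range_pages(uint64_t start, uint64_t end)
{
    assert(end >= start);
    return (end - start + ((1ULL << SKETCH_PAGE_SHIFT) - 1)) >> SKETCH_PAGE_SHIFT;
}

/* Buggy variant: the page count is stored in a 32-bit holder, so a huge
 * range can look tiny and wrongly select the per-entry flush loop. */
static bool sketch_per_entry_flush_narrow(uint64_t start, uint64_t end)
{
    uint32_t size = (uint32_t)sketch_range_pages(start, end);
    return size <= SKETCH_TLB_ENTRIES / 2;
}

/* Fixed variant: the holder is wide enough for any 64-bit range. */
static bool sketch_per_entry_flush_wide(uint64_t start, uint64_t end)
{
    uint64_t size = sketch_range_pages(start, end);
    return size <= SKETCH_TLB_ENTRIES / 2;
}
```

With a 2^44-byte range the page count is 2^32, which truncates to 0 in the narrow holder, so the narrow check picks the per-entry loop over an enormous range while the wide check does not.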
     
  • Richard Sandiford's new code for inserting the cache-barriers, for GCC
    4.3 and above and already incorporated in the current GCC-release, uses
    a slightly different option-syntax.

    Signed-off-by: peter fuerst
    Signed-off-by: Ralf Baechle

    peter fuerst
     
  • The hang was caused by the use of disable_irq() from the interrupt handler
    itself. Fixed by the use of disable_irq_nosync(). The issue was
    triggered by:

    commit 3aa551c9b4c40018f0e261a178e3d25478dc04a9
    Author: Thomas Gleixner
    Date: Mon Mar 23 18:28:15 2009 +0100

    genirq: add threaded interrupt handler support

    Signed-off-by: Ralf Baechle

    Ralf Baechle
     
  • The hang was caused by the use of disable_irq() from the interrupt handler
    itself. Fixed by the use of disable_irq_nosync(). The issue was
    triggered by:

    commit 3aa551c9b4c40018f0e261a178e3d25478dc04a9
    Author: Thomas Gleixner
    Date: Mon Mar 23 18:28:15 2009 +0100

    genirq: add threaded interrupt handler support

    Signed-off-by: Ralf Baechle

    Andrew Randrianasulu
     

20 May, 2009

1 commit


19 May, 2009

5 commits


18 May, 2009

11 commits

  • + Fix typographic fault.

    Signed-off-by: Michal Simek

    Michal Simek
     
  • Signed-off-by: Michal Simek

    Michal Simek
     
  • Signed-off-by: Pavel Roskin
    Signed-off-by: Andrew Morton
    Signed-off-by: Russell King

    Pavel Roskin
     
  • pfn_valid() is meant to be able to tell if a given PFN has valid memmap
    associated with it or not. In FLATMEM, it is expected that holes always
    have valid memmap as long as there are valid PFNs on either side of the hole.
    In SPARSEMEM, it is assumed that a valid section has a memmap for the
    entire section.

    However, ARM and maybe other embedded architectures in the future free
    memmap backing holes to save memory on the assumption the memmap is never
    used. The page_zone linkages are then broken even though pfn_valid()
    returns true. A walker of the full memmap must then do this additional
    check to ensure the memmap they are looking at is sane by making sure the
    zone and PFN linkages are still valid. This is expensive, but walkers of
    the full memmap are extremely rare.

    This was caught before for FLATMEM and hacked around but it hits again for
    SPARSEMEM because the page_zone linkages can look ok where the PFN linkages
    are totally screwed. This looks like a hatchet job but the reality is that
    any clean solution would end up consuming all the memory saved by punching
    these unexpected holes in the memmap. For example, we tried marking the
    memmap within the section invalid but the section size exceeds the size of
    the hole in most cases so pfn_valid() starts returning false where valid
    memmap exists. Shrinking the size of the section would increase memory
    consumption offsetting the gains.

    This patch identifies when an architecture is punching unexpected holes
    in the memmap that the memory model cannot automatically detect and sets
    ARCH_HAS_HOLES_MEMORYMODEL. At the moment, this is restricted to EP93xx
    which is the model sub-architecture this has been reported on but may expand
    later. When set, walkers of the full memmap must call memmap_valid_within()
    for each PFN, passing in what they expect the page and zone to be for
    that PFN. If it finds the linkages to be broken, it assumes the memmap is
    invalid for that PFN.

    Signed-off-by: Mel Gorman
    Signed-off-by: Russell King

    Mel Gorman
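
The linkage check this entry describes can be sketched in user space as follows. This is a model under stated assumptions, not the kernel implementation: the sketch_* types and helpers are hypothetical, and the pfn and zone fields stand in for what the kernel would derive from a real struct page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_zone { int id; };

struct sketch_page {
    unsigned long pfn;          /* stand-in for the page's PFN linkage */
    struct sketch_zone *zone;   /* stand-in for the page's zone linkage */
};

/* Returns true only if the page links back to the PFN and zone the
 * walker expects; otherwise the memmap entry is treated as invalid. */
static bool sketch_memmap_valid_within(unsigned long pfn,
                                       const struct sketch_page *page,
                                       const struct sketch_zone *zone)
{
    assert(page != NULL);
    if (page->pfn != pfn)
        return false;
    if (page->zone != zone)
        return false;
    return true;
}

/* A full-memmap walker: counts the entries whose linkages are sane. */
static size_t sketch_count_valid(const struct sketch_page *memmap,
                                 unsigned long start_pfn, size_t n,
                                 const struct sketch_zone *zone)
{
    size_t valid = 0;
    for (size_t i = 0; i < n; i++) {
        if (sketch_memmap_valid_within(start_pfn + i, &memmap[i], zone))
            valid++;
    }
    return valid;
}
```

The per-PFN check makes the walk more expensive, which matches the entry's remark that the cost is acceptable only because full-memmap walkers are rare.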
     
  • I don't think anything guarantees that the objects in data.page_aligned
    are a multiple of PAGE_SIZE, thus the section may end on any boundary.

    So the following section, .data.cacheline_aligned needs an explicit
    alignment.

    Signed-off-by: Benjamin Herrenschmidt

    Benjamin Herrenschmidt
     
  • Refresh and set these options:

    CONFIG_SYSFS_DEPRECATED_V2: y -> n
    CONFIG_INPUT_JOYSTICK: y -> n
    CONFIG_HID_SONY: n -> m
    CONFIG_RTC_DRV_PS3: - -> m

    Signed-off-by: Geoff Levand
    Signed-off-by: Benjamin Herrenschmidt

    Geoff Levand
     
  • After upgrading my distcc boxes from gcc 4.2.2 to 4.4.0, the function
    graph tracer broke. This was discovered on my x86 boxes.

    The issue is that gcc used the same register for an output as it did for
    an input in an asm statement. I first thought this was a bug in gcc and
    reported it. I was notified that gcc was correct and that the output had
    to be flagged as an "early clobber".

    I noticed that powerpc had the same issue and this patch fixes it.

    Signed-off-by: Steven Rostedt
    Signed-off-by: Benjamin Herrenschmidt

    Steven Rostedt
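
A minimal sketch of the "early clobber" constraint mentioned in this entry, assuming GCC-style extended asm. It is not the ftrace code: the function is hypothetical, the asm is x86-64 only, and other targets fall back to plain C. Without the `&` modifier the compiler may place the output in the same register as an input, because it assumes all inputs are consumed before any output is written.

```c
#include <assert.h>

/* Writes the output before the second input has been read, so the
 * output operand must be marked early clobber ("=&r"). */
static unsigned long sketch_copy_then_add(unsigned long a, unsigned long b)
{
    unsigned long out;
#if defined(__x86_64__) && defined(__GNUC__)
    __asm__("mov %1, %0\n\t"   /* out = a: output written here ...  */
            "add %2, %0"       /* ... while b is still needed here  */
            : "=&r"(out)
            : "r"(a), "r"(b)
            : "cc");
#else
    out = a + b;               /* portable fallback for other targets */
#endif
    return out;
}

/* Tiny usage check for the sketch above. */
static void sketch_early_clobber_check(void)
{
    assert(sketch_copy_then_add(2UL, 3UL) == 5UL);
}
```

If `"=&r"` were written as `"=r"`, the compiler would be free to give `out` and `b` the same register, and the first instruction would overwrite `b` before it is read.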
     
  • pr_debug() can now result in code being generated even when #DEBUG
    is not defined. That's not really desirable in the ftrace code
    which we want to be snappy.

    With CONFIG_DYNAMIC_DEBUG=y:

    size before:
       text    data     bss     dec     hex filename
       3334     672       4    4010     faa arch/powerpc/kernel/ftrace.o

    size after:
       text    data     bss     dec     hex filename
       2616     360       4    2980     ba4 arch/powerpc/kernel/ftrace.o

    Signed-off-by: Michael Ellerman
    Acked-by: Steven Rostedt
    Signed-off-by: Benjamin Herrenschmidt

    Michael Ellerman
     
  • With CONFIG_DEBUG_VM, an assertion is made when changing the protection
    flags of a PTE that the PTE is locked. Huge pages use a different pagetable
    format and the assertion is bogus and will always trigger with a bug looking
    something like

    Unable to handle kernel paging request for data at address 0xf1a00235800006f8
    Faulting instruction address: 0xc000000000034a80
    Oops: Kernel access of bad area, sig: 11 [#1]
    SMP NR_CPUS=32 NUMA Maple
    Modules linked in: dm_snapshot dm_mirror dm_region_hash
    dm_log dm_mod loop evdev ext3 jbd mbcache sg sd_mod ide_pci_generic
    pata_amd ata_generic ipr libata tg3 libphy scsi_mod windfarm_pid
    windfarm_smu_sat windfarm_max6690_sensor windfarm_lm75_sensor
    windfarm_cpufreq_clamp windfarm_core i2c_powermac
    NIP: c000000000034a80 LR: c000000000034b18 CTR: 0000000000000003
    REGS: c000000003037600 TRAP: 0300 Not tainted (2.6.30-rc3-autokern1)
    MSR: 9000000000009032 CR: 28002484 XER: 200fffff
    DAR: f1a00235800006f8, DSISR: 0000000040010000
    TASK = c0000002e54cc740[2960] 'map_high_trunca' THREAD: c000000003034000 CPU: 2
    GPR00: 4000000000000000 c000000003037880 c000000000895d30 c0000002e5a2e500
    GPR04: 00000000a0000000 c0000002edc40880 0000005700000393 0000000000000001
    GPR08: f000000011ac0000 01a00235800006e8 00000000000000f5 f1a00235800006e8
    GPR12: 0000000028000484 c0000000008dd780 0000000000001000 0000000000000000
    GPR16: fffffffffffff000 0000000000000000 00000000a0000000 c000000003037a20
    GPR20: c0000002e5f4ece8 0000000000001000 c0000002edc40880 0000000000000000
    GPR24: c0000002e5f4ece8 0000000000000000 00000000a0000000 c0000002e5f4ece8
    GPR28: 0000005700000393 c0000002e5a2e500 00000000a0000000 c000000003037880
    NIP [c000000000034a80] .assert_pte_locked+0xa4/0xd0
    LR [c000000000034b18] .ptep_set_access_flags+0x6c/0xb4
    Call Trace:
    [c000000003037880] [c000000003037990] 0xc000000003037990 (unreliable)
    [c000000003037910] [c000000000034b18] .ptep_set_access_flags+0x6c/0xb4
    [c0000000030379b0] [c00000000014bef8] .hugetlb_cow+0x124/0x674
    [c000000003037b00] [c00000000014c930] .hugetlb_fault+0x4e8/0x6f8
    [c000000003037c00] [c00000000013443c] .handle_mm_fault+0xac/0x828
    [c000000003037cf0] [c0000000000340a8] .do_page_fault+0x39c/0x584
    [c000000003037e30] [c0000000000057b0] handle_page_fault+0x20/0x5c
    Instruction dump:
    7d29582a 7d200074 7800d182 0b000000 3c004000 3960ffff 780007c6 796b00c4
    7d290214 7929a302 1d290068 7d6b4a14 7c000074 7800d182 0b000000

    This patch fixes the problem by not asserting that the PTE is locked for VMAs
    backed by huge pages.

    Signed-off-by: Mel Gorman
    Signed-off-by: Benjamin Herrenschmidt

    Mel Gorman
     
  • Russell King
     
    Having discussed broadcast tick support with Thomas Gleixner, the
    broadcast tick devices should be registered with a higher rating
    than the global tick device, and it should have the ONESHOT and
    PERIODIC feature flags set.

    Signed-off-by: Russell King
    Acked-by: Thomas Gleixner

    Russell King
     

17 May, 2009

6 commits


16 May, 2009

4 commits

  • This makes the framebuffer work on omap3.

    Also fix the clk_get usage for checkpatch.pl
    "ERROR: do not use assignment in if condition".

    Cc: Imre Deak
    Cc: linux-fbdev-devel@lists.sourceforge.net
    Acked-by: Krzysztof Helt
    Signed-off-by: Tony Lindgren

    Tony Lindgren
     
  • The OMAP3430ES2_SAVEANDRESTORE_SHIFT macro is used
    by powerdomain code in
    "1 << OMAP3430ES2_SAVEANDRESTORE_SHIFT" manner, but
    the definition was also (1 << 4), meaning we actually
    modified bit 16. So the definition needs to be 4.

    This also fixes a cold reset HW bug in OMAP3430 ES3.x
    where some of the efuse bits are not isolated during
    wake-up from off mode. This can cause randomish
    cold resets with off mode. Enabling the USBTLL hardware
    SAVEANDRESTORE causes the core power up assert to be
    delayed in a way that we will not get faulty values
    when boot ROM is reading the unisolated registers.

    Signed-off-by: Kalle Jokiniemi
    Acked-by: Kevin Hilman
    Acked-by: Paul Walmsley
    Signed-off-by: Tony Lindgren

    Kalle Jokiniemi
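
The double shift described in this entry is easy to reproduce in isolation. The macro names below are hypothetical stand-ins for the old and new definitions; only the arithmetic is taken from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Old definition: already a mask (value 16), yet used as a shift count. */
#define SKETCH_SAVEANDRESTORE_SHIFT_OLD (1 << 4)
/* New definition: a plain bit position. */
#define SKETCH_SAVEANDRESTORE_SHIFT_NEW 4

/* Mirrors the "1 << SHIFT" usage pattern quoted in the entry. */
static uint32_t sketch_mask_from_shift(unsigned int shift)
{
    assert(shift < 32);
    return (uint32_t)1 << shift;
}
```

With the old definition the caller computes 1 << 16 and touches bit 16; with the new one it computes 1 << 4 and touches bit 4 as intended.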
     
  • As per 3430 TRM, there are 6 banks [0 to 191]

    Signed-off-by: Tom Rix
    Signed-off-by: Vikram Pandita
    Signed-off-by: Tony Lindgren

    Vikram Pandita
     
  • Xiaohui Xin and some other folks at Intel have been looking into what's
    behind the performance hit of paravirt_ops when running native.

    It appears that the hit is entirely due to the paravirtualized
    spinlocks introduced by:

    | commit 8efcbab674de2bee45a2e4cdf97de16b8e609ac8
    | Date: Mon Jul 7 12:07:51 2008 -0700
    |
    | paravirt: introduce a "lock-byte" spinlock implementation

    The extra call/return in the spinlock path is somehow
    causing an increase in the cycles/instruction of somewhere around 2-7%
    (seems to vary quite a lot from test to test). The working theory is
    that the CPU's pipeline is getting upset about the
    call->call->locked-op->return->return, and seems to be failing to
    speculate (though I haven't seen anything definitive about the precise
    reasons). This doesn't entirely make sense, because the performance
    hit is also visible on unlock and other operations which don't involve
    locked instructions. But spinlock operations clearly swamp all the
    other pvops operations, even though I can't imagine that they're
    nearly as common (there's only a .05% increase in instructions
    executed).

    If I disable just the pv-spinlock calls, my tests show that pvops is
    identical to non-pvops performance on native (my measurements show that
    it is actually about .1% faster, but Xiaohui shows a .05% slowdown).

    Summary of results, averaging 10 runs of the "mmperf" test, using a
    no-pvops build as baseline:

                     nopv      Pv-nospin  Pv-spin
    CPU cycles       100.00%    99.89%    102.18%
    instructions     100.00%   100.10%    100.15%
    CPI              100.00%    99.79%    102.03%
    cache ref        100.00%   100.84%    100.28%
    cache miss       100.00%    90.47%     88.56%
    cache miss rate  100.00%    89.72%     88.31%
    branches         100.00%    99.93%    100.04%
    branch miss      100.00%   103.66%    107.72%
    branch miss rt   100.00%   103.73%    107.67%
    wallclock        100.00%    99.90%    102.20%

    The clear effect here is that the 2% increase in CPI is
    directly reflected in the final wallclock time.

    (The other interesting effect is that the more ops are
    out of line calls via pvops, the lower the cache access
    and miss rates. Not too surprising, but it suggests that
    the non-pvops kernel is over-inlined. On the flipside,
    the branch misses go up correspondingly...)

    So, what's the fix?

    Paravirt patching turns all the pvops calls into direct calls, so
    _spin_lock etc do end up having direct calls. For example, the compiler
    generated code for paravirtualized _spin_lock is:

    : mov %gs:0xb4c8,%rax
    : incl 0xffffffffffffe044(%rax)
    : callq *0xffffffff805a5b30
    : retq

    The indirect call will get patched to:
    : mov %gs:0xb4c8,%rax
    : incl 0xffffffffffffe044(%rax)
    : callq
    : nop; nop /* or whatever 2-byte nop */
    : retq

    One possibility is to inline _spin_lock, etc, when building an
    optimised kernel (ie, when there's no spinlock/preempt
    instrumentation/debugging enabled). That will remove the outer
    call/return pair, returning the instruction stream to a single
    call/return, which will presumably execute the same as the non-pvops
    case. The downsides are: 1) it will replicate the
    preempt_disable/enable code at each lock/unlock callsite; this code is
    fairly small, but not nothing; and 2) the spinlock definitions are
    already a very heavily tangled mass of #ifdefs and other preprocessor
    magic, and making any changes will be non-trivial.

    The other obvious answer is to disable pv-spinlocks. Making them a
    separate config option is fairly easy, and it would be trivial to
    enable them only when Xen is enabled (as the only non-default user).
    But it doesn't really address the common case of a distro build which
    is going to have Xen support enabled, and leaves the open question of
    whether the native performance cost of pv-spinlocks is worth the
    performance improvement on a loaded Xen system (10% saving of overall
    system CPU when guests block rather than spin). Still it is a
    reasonable short-term workaround.

    [ Impact: fix pvops performance regression when running native ]

    Analysed-by: "Xin Xiaohui"
    Analysed-by: "Li Xin"
    Analysed-by: "Nakajima Jun"
    Signed-off-by: Jeremy Fitzhardinge
    Acked-by: H. Peter Anvin
    Cc: Nick Piggin
    Cc: Xen-devel
    LKML-Reference:
    [ fixed the help text ]
    Signed-off-by: Ingo Molnar

    Jeremy Fitzhardinge
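
The CPI row in the results table of this entry follows from the cycle and instruction rows, since CPI is cycles divided by instructions. The helper below is a hypothetical sketch that redoes that arithmetic on the relative percentages; it is not part of the mmperf test.

```c
#include <assert.h>

/* Relative CPI in percent, given cycles and instructions that are both
 * expressed as a percentage of the no-pvops baseline. */
static double sketch_relative_cpi(double cycles_pct, double instructions_pct)
{
    assert(instructions_pct > 0.0);
    return 100.0 * cycles_pct / instructions_pct;
}
```

For the Pv-spin column, 102.18 / 100.15 gives roughly 102.03%, and for Pv-nospin, 99.89 / 100.10 gives roughly 99.79%, in line with the table. The extra cost of pv-spinlocks therefore shows up as more cycles per instruction rather than as more instructions.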
     

15 May, 2009

7 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound-2.6:
    ASoC: DaVinci EVM board support buildfixes
    ASoC: DaVinci I2S updates
    ASoC: davinci-pcm buildfixes
    ALSA: pcsp: fix printk format warning
    ALSA: riptide: postfix increment and off by one
    pxa2xx-ac97: fix reset gpio mode setting
    ASoC: soc-core: fix crash when removing not instantiated card

    Linus Torvalds
     
  • * 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jwessel/linux-2.6-kgdb:
    kgdb: gdb documentation fix
    kgdb,i386: use address that SP register points to in the exception frame
    sysrq, intel_fb: fix sysrq g collision

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.dk/linux-2.6-block:
    Revert "mm: add /proc controls for pdflush threads"
    viocd: needs to depend on BLOCK
    block: fix the bio_vec array index out-of-bounds test

    Linus Torvalds
     
  • * 'merge' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc:
    powerpc: Fix PCI ROM access
    powerpc/pseries: Really fix the oprofile CPU type on pseries
    serial/nwpserial: Fix wrong register read address and add interrupt acknowledge.
    powerpc/cell: Make ptcal more reliable
    powerpc: Allow mem=x cmdline to work with 4G+
    powerpc/mpic: Fix incorrect allocation of interrupt rev-map
    powerpc: Fix oprofile sampling of marked events on POWER7
    powerpc/iseries: Fix pci breakage due to bad dma_data initialization
    powerpc: Fix mktree build error on Mac OS X host
    powerpc/virtex: Fix duplicate level irq events.
    powerpc/virtex: Add uImage to the default images list
    powerpc/boot: add simpleImage.* to clean-files list
    powerpc/8xx: Update defconfigs
    powerpc/embedded6xx: Update defconfigs
    powerpc/86xx: Update defconfigs
    powerpc/85xx: Update defconfigs
    powerpc/83xx: Update defconfigs
    powerpc/fsl_soc: Remove mpc83xx_wdt_init, again

    Linus Torvalds
     
  • The s3c24xx_register_clock() function has been doing a test
    on clk->owner to see if it is NULL, and then setting itself
    as the owner if clk->owner == NULL.

    This is not needed: arch/arm/plat-s3c/clock.c cannot be
    compiled as a module, and even if it could, it should not be
    playing with this field if it is being registered from somewhere
    else.

    The best course of action is to remove this bit of
    code completely.

    Signed-off-by: Ben Dooks

    Ben Dooks
     
  • The BAST support code is calling s3c_i2c0_set_platdata() from
    the map_io() entry, instead of the bast_init() code. This causes
    the registration to fail due to kmalloc() not being available
    at the time.

    This fixes the following error:
    s3c_i2c0_set_platdata: no memory for platform data

    Signed-off-by: Ben Dooks

    Ben Dooks
     
  • Fix unused code warning in arch/arm/plat-s3c24xx/dma.c if there
    is no PM support enabled. The function to_dma_chan() should
    be marked inline so that the compiler will eliminate it without
    warning if it isn't used.

    arch/arm/plat-s3c24xx/dma.c:1239: warning: 'to_dma_chan' defined but not used

    Signed-off-by: Ben Dooks

    Ben Dooks
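
A small sketch of the pattern behind this fix, assuming GCC's behaviour of warning about unused plain static functions but not about unused static inline ones. The structures and the helper name are hypothetical and only mimic a container lookup of the to_dma_chan() kind; they are not taken from dma.c.

```c
#include <assert.h>
#include <stddef.h>

struct sketch_device { int id; };

struct sketch_dma_chan {
    int number;
    struct sketch_device dev;   /* embedded device, as in a driver struct */
};

/* Recovers the channel that embeds the given device. Marked inline so
 * that a build which never calls it can drop it without a
 * "defined but not used" warning. */
static inline struct sketch_dma_chan *
sketch_to_dma_chan(struct sketch_device *dev)
{
    assert(dev != NULL);
    return (struct sketch_dma_chan *)
        ((char *)dev - offsetof(struct sketch_dma_chan, dev));
}
```

When the only callers sit behind a configuration option such as PM support, the helper simply vanishes from builds where that option is off.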