30 Oct, 2009

1 commit


24 Sep, 2009

1 commit

  • Some System p configurations can already have more than 16 nodes so we
    need to increase NODES_SHIFT. I chose 256 to give us some room to grow in the
    future, although we can look at something smaller if the memory bloat is
    considered too much.

    Unless we clamp MAX_ACTIVE_REGIONS we end up with 300kB of extra bloat in
    early_node_map in mm/page_alloc.c:

    < 6144 early_node_map
    > 307200 early_node_map

    due to:

    #if MAX_NUMNODES >= 32
    /* If there can be many nodes, allow up to 50 holes per node */
    #define MAX_ACTIVE_REGIONS (MAX_NUMNODES*50)
    #else
    /* By default, allow up to 256 distinct regions */
    #define MAX_ACTIVE_REGIONS 256
    #endif

    Since our memory is mostly contiguous it seems reasonable to keep this
    at 256 for now. I also set 32bit to 32 to save space (is there any chance
    a 32bit system will have more than 32 discontiguous memory ranges?).
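The size figures quoted above can be reproduced with a little arithmetic. The sketch below is only an illustration: it assumes the old NODES_SHIFT was 4 (16 nodes), the new one is 8 (256 nodes), and that each early_node_map entry takes 24 bytes (inferred from 6144 / 256). The helper names are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <limits.h>

/* Assumed entry size in bytes, inferred from 6144 / 256 in the text above. */
#define REGION_ENTRY_BYTES 24ul

/* Number of nodes implied by a NODES_SHIFT value. */
static unsigned long max_numnodes(unsigned int nodes_shift)
{
    assert(nodes_shift < sizeof(unsigned long) * CHAR_BIT);
    return 1ul << nodes_shift;
}

/* Mirrors the #if quoted above; a non-zero clamp forces the 256 default. */
static unsigned long max_active_regions(unsigned long nodes, int clamp)
{
    if (!clamp && nodes >= 32)
        return nodes * 50;  /* up to 50 holes per node */
    return 256;             /* default: 256 distinct regions */
}

/* Size in bytes of an early_node_map[] array for the given configuration. */
static unsigned long early_node_map_bytes(unsigned int nodes_shift, int clamp)
{
    unsigned long nodes = max_numnodes(nodes_shift);

    return max_active_regions(nodes, clamp) * REGION_ENTRY_BYTES;
}
```

Under these assumptions, early_node_map_bytes(4, 0) gives the old 6144 bytes, early_node_map_bytes(8, 0) gives the unclamped 307200 bytes, and clamping brings the 256-node case back to 6144.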

    Even with that fixed we have a few data structures that grow:

    < 896 bootmem_node_data
    > 14336 bootmem_node_data

    < 1280 node_devices
    > 20480 node_devices

    < 25088 kmalloc_caches
    > 59648 kmalloc_caches

    < 1632 hstates
    > 21792 hstates

    Signed-off-by: Anton Blanchard
    Signed-off-by: Benjamin Herrenschmidt

    Anton Blanchard
     

21 Sep, 2009

1 commit

  • Bye-bye Performance Counters, welcome Performance Events!

    In the past few months the perfcounters subsystem has outgrown its
    initial role of counting hardware events, and has become (and is
    becoming) a much broader generic event enumeration, reporting, logging,
    monitoring and analysis facility.

    Naming its core object 'perf_counter' and naming the subsystem
    'perfcounters' has become more and more of a misnomer. With pending
    code like hw-breakpoints support the 'counter' name is less and
    less appropriate.

    All in all, we've decided to rename the subsystem to 'performance
    events' and to propagate this rename through all fields, variables
    and API names (in an ABI-compatible fashion).

    The word 'event' is also a bit shorter than 'counter' - which makes
    it slightly more convenient to write/handle as well.

    Thanks go to Stephane Eranian, who first observed this misnomer and
    suggested a rename.

    User-space tooling and ABI compatibility are not affected - this patch
    should be function-invariant. (Also, defconfigs were not touched to
    keep the size down.)

    This patch has been generated via the following script:

    FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')

    sed -i \
    -e 's/PERF_EVENT_/PERF_RECORD_/g' \
    -e 's/PERF_COUNTER/PERF_EVENT/g' \
    -e 's/perf_counter/perf_event/g' \
    -e 's/nb_counters/nb_events/g' \
    -e 's/swcounter/swevent/g' \
    -e 's/tpcounter_event/tp_event/g' \
    $FILES

    for N in $(find . -name perf_counter.[ch]); do
    M=$(echo $N | sed 's/perf_counter/perf_event/g')
    mv $N $M
    done

    FILES=$(find . -name perf_event.*)

    sed -i \
    -e 's/COUNTER_MASK/REG_MASK/g' \
    -e 's/COUNTER/EVENT/g' \
    -e 's/\<event\>/event_id/g' \
    -e 's/counter/event/g' \
    -e 's/Counter/Event/g' \
    $FILES

    ... to keep it as correct as possible. This script can also be
    used by anyone who has pending perfcounters patches - it converts
    a Linux kernel tree over to the new naming. We tried to time this
    change to the point in time where the amount of pending patches
    is the smallest: the end of the merge window.

    Namespace clashes were fixed up in a preparatory patch - and some
    stylistic fallout will be fixed up in a subsequent patch.

    ( NOTE: 'counters' are still the proper terminology when we deal
    with hardware registers - and these sed scripts are a bit
    over-eager in renaming them. I've undone some of that, but
    in case there's something left where 'counter' would be
    better than 'event' we can undo that on an individual basis
    instead of touching an otherwise nicely automated patch. )

    Suggested-by: Stephane Eranian
    Acked-by: Peter Zijlstra
    Acked-by: Paul Mackerras
    Reviewed-by: Arjan van de Ven
    Cc: Mike Galbraith
    Cc: Arnaldo Carvalho de Melo
    Cc: Frederic Weisbecker
    Cc: Steven Rostedt
    Cc: Benjamin Herrenschmidt
    Cc: David Howells
    Cc: Kyle McMartin
    Cc: Martin Schwidefsky
    Cc: "David S. Miller"
    Cc: Thomas Gleixner
    Cc: "H. Peter Anvin"
    Cc:
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Ingo Molnar
     

16 Sep, 2009

1 commit

  • * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (134 commits)
    powerpc/nvram: Enable use Generic NVRAM driver for different size chips
    powerpc/iseries: Fix oops reading from /proc/iSeries/mf/*/cmdline
    powerpc/ps3: Workaround for flash memory I/O error
    powerpc/booke: Don't set DABR on 64-bit BookE, use DAC1 instead
    powerpc/perf_counters: Reduce stack usage of power_check_constraints
    powerpc: Fix bug where perf_counters breaks oprofile
    powerpc/85xx: Fix SMP compile error and allow NULL for smp_ops
    powerpc/irq: Improve nanodoc
    powerpc: Fix some late PowerMac G5 with PCIe ATI graphics
    powerpc/fsl-booke: Use HW PTE format if CONFIG_PTE_64BIT
    powerpc/book3e: Add missing page sizes
    powerpc/pseries: Fix to handle slb resize across migration
    powerpc/powermac: Thermal control turns system off too eagerly
    powerpc/pci: Merge ppc32 and ppc64 versions of phb_scan()
    powerpc/405ex: support cuImage via included dtb
    powerpc/405ex: provide necessary fixup function to support cuImage
    powerpc/40x: Add support for the ESTeem 195E (PPC405EP) SBC
    powerpc/44x: Add Eiger AMCC (AppliedMicro) PPC460SX evaluation board support.
    powerpc/44x: Update Arches defconfig
    powerpc/44x: Update Arches dts
    ...

    Fix up conflicts in drivers/char/agp/uninorth-agp.c

    Linus Torvalds
     

28 Aug, 2009

2 commits


20 Aug, 2009

2 commits

  • This contains all the bits that didn't fit in previous patches :-) This
    includes the actual exception handlers assembly, the changes to the
    kernel entry, other misc bits and wiring it all up in Kconfig.

    Signed-off-by: Benjamin Herrenschmidt

    Benjamin Herrenschmidt
     
  • The current definitions set ranges and defaults for 32 and 64-bit
    only using "PPC_STD_MMU", which means hash-based MMU. This needlessly
    restricts its usefulness for the upcoming 64-bit BookE port, but more
    than that, it's broken on 32-bit since the only 32-bit platform
    supporting multiple page sizes currently is 44x which does -not-
    have PPC_STD_MMU_32 set.

    This fixes it by using PPC64 and PPC32 instead.

    Signed-off-by: Benjamin Herrenschmidt

    Benjamin Herrenschmidt
     

14 Aug, 2009

1 commit

  • Now that percpu allows arbitrary embedding of the first chunk,
    powerpc64 can easily be converted to dynamic percpu allocator.
    Convert it. powerpc supports several large page sizes. Cap atom_size
    at 1M. There isn't much to gain by going above that anyway.

    Signed-off-by: Tejun Heo
    Cc: Benjamin Herrenschmidt

    Tejun Heo
     

04 Jul, 2009

1 commit

  • Pull linus#master to merge PER_CPU_DEF_ATTRIBUTES and alpha build fix
    changes. As alpha in percpu tree uses 'weak' attribute instead of
    inline assembly, there's no need for __used attribute.

    Conflicts:
    arch/alpha/include/asm/percpu.h
    arch/mn10300/kernel/vmlinux.lds.S
    include/linux/percpu-defs.h

    Tejun Heo
     

26 Jun, 2009

1 commit

  • Based on initial work from: Dale Farnsworth

    Add the low level irq tracing hooks for 32-bit powerpc needed
    to enable full lockdep functionality.

    The approach taken to deal with the code in entry_32.S is that
    we don't trace all the transitions of MSR:EE when we just turn
    it off to peek at TI_FLAGS without races. We trace only when we are
    calling into C code or returning from exceptions with a state
    that has changed from what lockdep thinks.

    There's a little bugger though: If we take an exception that
    keeps interrupts enabled (such as an alignment exception) while
    interrupts are enabled, we will call trace_hardirqs_on() on the
    way back spuriously. Not a big deal, but to get rid of it would
    require remembering in pt_regs that the exception was one of the
    type that kept interrupts enabled which we don't know at this
    stage. (Well, we could test all cases for regs->trap but that
    sucks too much).

    Signed-off-by: Benjamin Herrenschmidt
    Tested-by: Kumar Gala

    Benjamin Herrenschmidt
     

24 Jun, 2009

1 commit

  • This patch makes most !CONFIG_HAVE_SETUP_PER_CPU_AREA archs use
    dynamic percpu allocator. The first chunk is allocated using
    embedding helper and 8k is reserved for modules. This ensures that
    the new allocator behaves almost identically to the original allocator
    as far as static percpu variables are concerned, so it shouldn't
    introduce much breakage.

    s390 and alpha use a custom SHIFT_PERCPU_PTR() to work around the addressing
    range limit that their addressing model imposes. Unfortunately, this breaks
    if the address is specified using a variable, so for now, the two
    archs aren't converted.

    The following architectures are affected by this change.

    * sh
    * arm
    * cris
    * mips
    * sparc(32)
    * blackfin
    * avr32
    * parisc (broken, under investigation)
    * m32r
    * powerpc(32)

    As this change makes the dynamic allocator the default one,
    CONFIG_HAVE_DYNAMIC_PER_CPU_AREA is replaced with its inverse -
    CONFIG_HAVE_LEGACY_PER_CPU_AREA, which is added to yet-to-be converted
    archs. These archs implement their own setup_per_cpu_areas() and the
    conversion is not trivial.

    * powerpc(64)
    * sparc(64)
    * ia64
    * alpha
    * s390

    Boot and batch alloc/free tests on x86_32 with debug code (x86_32
    doesn't use default first chunk initialization). Compile tested on
    sparc(32), powerpc(32), arm and alpha.

    Kyle McMartin reported that this change breaks parisc. The problem is
    still under investigation and he is okay with pushing this patch
    forward and fixing parisc later.

    [ Impact: use dynamic allocator for most archs w/o custom percpu setup ]

    Signed-off-by: Tejun Heo
    Acked-by: Rusty Russell
    Acked-by: David S. Miller
    Acked-by: Benjamin Herrenschmidt
    Acked-by: Martin Schwidefsky
    Reviewed-by: Christoph Lameter
    Cc: Paul Mundt
    Cc: Russell King
    Cc: Mikael Starvik
    Cc: Ralf Baechle
    Cc: Bryan Wu
    Cc: Kyle McMartin
    Cc: Matthew Wilcox
    Cc: Grant Grundler
    Cc: Hirokazu Takata
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Heiko Carstens
    Cc: Ingo Molnar

    Tejun Heo
     

18 Jun, 2009

1 commit

  • This enables the perf_counter subsystem on 32-bit powerpc. Since we
    don't have any support for hardware counters on 32-bit powerpc yet,
    only software counters can be used.

    Besides selecting HAVE_PERF_COUNTERS for 32-bit powerpc as well as
    64-bit, the main thing this does is add an implementation of
    set_perf_counter_pending(). This needs to arrange for
    perf_counter_do_pending() to be called when interrupts are enabled.
    Rather than add code to local_irq_restore as 64-bit does, the 32-bit
    set_perf_counter_pending() generates an interrupt by setting the
    decrementer to 1 so that a decrementer interrupt will become pending
    in 1 or 2 timebase ticks (if a decrementer interrupt isn't already
    pending). When interrupts are enabled, timer_interrupt() will be
    called, and some new code in there calls perf_counter_do_pending().
    We use a per-cpu array of flags to indicate whether we need to call
    perf_counter_do_pending() or not.
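The deferral scheme described above can be sketched as a small user-space simulation. This is not the kernel code: the arrays, the CPU count, the re-arm value and the `_sim` function names are all hypothetical stand-ins chosen to show the flag-plus-decrementer idea.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_SIM 4  /* hypothetical CPU count for the simulation */

static bool perf_pending[NR_CPUS_SIM];          /* per-cpu "work pending" flags */
static unsigned int decrementer[NR_CPUS_SIM];   /* simulated decrementer registers */
static unsigned int pending_runs[NR_CPUS_SIM];  /* times the deferred work ran */

/* Stand-in for perf_counter_do_pending(). */
static void do_pending_sim(int cpu)
{
    pending_runs[cpu]++;
}

/* Stand-in for set_perf_counter_pending(): record the request and set the
 * decrementer to 1 so that an interrupt becomes pending almost at once. */
static void set_pending_sim(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SIM);
    perf_pending[cpu] = true;
    decrementer[cpu] = 1;
}

/* Stand-in for the new check in timer_interrupt(), which runs once
 * interrupts are enabled again. */
static void timer_interrupt_sim(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS_SIM);
    if (perf_pending[cpu]) {
        perf_pending[cpu] = false;
        do_pending_sim(cpu);
    }
    decrementer[cpu] = 1000;  /* re-arm for the next tick (arbitrary value) */
}
```

Calling set_pending_sim() and then timer_interrupt_sim() on the same CPU runs the deferred work exactly once; a further timer interrupt without a new request does nothing extra.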

    This introduces a couple of new Kconfig symbols: PPC_HAVE_PMU_SUPPORT,
    which is selected by processor families for which we have hardware PMU
    support (currently only PPC64), and PPC_PERF_CTRS, which enables the
    powerpc-specific perf_counter back-end.

    Signed-off-by: Paul Mackerras
    Cc: Peter Zijlstra
    Cc: linuxppc-dev@ozlabs.org
    Cc: benh@kernel.crashing.org
    LKML-Reference:
    Signed-off-by: Ingo Molnar

    Paul Mackerras
     

15 Jun, 2009

2 commits


09 Jun, 2009

1 commit

  • This patch includes the basic infrastructure to use swiotlb
    bounce buffering on 32-bit powerpc. It is not yet enabled on
    any platforms. Probably the most interesting bit is the
    addition of addr_needs_map to dma_ops - we need this as
    a dma_op because the decision of whether or not an addr
    can be mapped by a device is device-specific.
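To illustrate why the decision belongs in a per-device operations table, here is a minimal sketch. The struct layouts, the `_sim` names and the mask-based policy are assumptions made for the example; the actual powerpc dma_ops signature may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct device_sim;

/* Hypothetical, trimmed-down ops table holding only the hook discussed above. */
struct dma_ops_sim {
    bool (*addr_needs_map)(const struct device_sim *dev,
                           uint64_t addr, size_t size);
};

struct device_sim {
    uint64_t dma_mask;              /* highest bus address the device can reach */
    const struct dma_ops_sim *ops;  /* device-specific DMA operations */
};

/* One possible policy: bounce when the buffer's last byte is above the mask. */
static bool mask_addr_needs_map(const struct device_sim *dev,
                                uint64_t addr, size_t size)
{
    assert(size > 0);
    return addr + size - 1 > dev->dma_mask;
}

static const struct dma_ops_sim mask_ops = {
    .addr_needs_map = mask_addr_needs_map,
};

/* Caller side: ask the device's own ops whether a bounce buffer is needed. */
static bool needs_bounce(const struct device_sim *dev, uint64_t addr, size_t size)
{
    return dev->ops->addr_needs_map(dev, addr, size);
}
```

With a 32-bit mask, a buffer that ends above 4 GB needs bouncing, while the same buffer is directly usable by a device with a 36-bit mask. A different bus could plug in a different addr_needs_map without touching the caller.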

    Signed-off-by: Becky Bruce
    Acked-by: Kumar Gala
    Signed-off-by: Benjamin Herrenschmidt

    Becky Bruce
     

29 May, 2009

1 commit


27 May, 2009

2 commits

  • The implementation we just revived has issues, such as using a
    Kconfig-defined virtual address area in kernel space that nothing
    actually carves out (and thus will overlap whatever is there),
    or having some dependencies on being self contained in a single
    PTE page which adds unnecessary constraints on the kernel virtual
    address space.

    This fixes it by using more classic PTE accessors and automatically
    locating the area for consistent memory, carving an appropriate hole
    in the kernel virtual address space, leaving only the size of that
    area as a Kconfig option. It also brings some dma-mask related fixes
    from the ARM implementation which was almost identical initially but
    grew its own fixes.

    Signed-off-by: Benjamin Herrenschmidt

    Benjamin Herrenschmidt
     
  • This reverts commit 33f00dcedb0e22cdb156a23632814fc580fcfcf8.

    While it was a good idea to try to use the mm/vmalloc.c allocator instead
    of our own (in fact, ours is itself a dup of an old variant of the vmalloc
    one), unfortunately, the approach is terminally busted since
    dma_alloc_coherent() can be called at interrupt time or in atomic contexts
    and there's little chance we'll make the code in mm/vmalloc.c cope with that :-(

    Until we can get the generic code to forbid that idiocy and fix all
    drivers abusing it, we pretty much have no choice but revert to
    our custom virtual space allocator.

    There's also a problem with SMP safety since freeing such a mapping
    would require an IPI, which cannot be done at interrupt time.

    However, right now, I don't think we support any platform that is
    both SMP and has non-coherent DMA (don't laugh, I know such things
    do exist !) so we can sort that out later.

    Signed-off-by: Benjamin Herrenschmidt

    Benjamin Herrenschmidt
     

21 May, 2009

1 commit


03 May, 2009

1 commit

  • The powerpc kernel always requires an Open Firmware like device tree
    to supply device information. On systems without OF, this comes from
    a flattened device tree blob. This blob is usually generated by dtc,
    a tool which compiles a text description of the device tree into the
    flattened format used by the kernel. Sometimes, the bootwrapper makes
    small changes to the pre-compiled device tree blob (e.g. filling in
    the size of RAM). To do this it uses the libfdt library.

    Because these are only used on powerpc, the code for both these tools
    is included under arch/powerpc/boot (these were imported and are
    periodically updated from the upstream dtc tree).

    However, the microblaze architecture, currently being prepared for
    merging to mainline also uses dtc to produce device tree blobs. A few
    other archs have also mentioned some interest in using dtc.
    Therefore, this patch moves dtc and libfdt from arch/powerpc into
    scripts, where it can be used by any architecture.

    The vast bulk of this patch is a literal move, the rest is adjusting
    the various Makefiles to use dtc and libfdt correctly from their new
    locations.

    Signed-off-by: David Gibson
    Signed-off-by: Linus Torvalds

    David Gibson
     

15 Apr, 2009

1 commit


07 Apr, 2009

2 commits


04 Apr, 2009

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (28 commits)
    trivial: Update my email address
    trivial: NULL noise: drivers/mtd/tests/mtd_*test.c
    trivial: NULL noise: drivers/media/dvb/frontends/drx397xD_fw.h
    trivial: Fix misspelling of "Celsius".
    trivial: remove unused variable 'path' in alloc_file()
    trivial: fix a pdlfush -> pdflush typo in comment
    trivial: jbd header comment typo fix for JBD_PARANOID_IOFAIL
    trivial: wusb: Storage class should be before const qualifier
    trivial: drivers/char/bsr.c: Storage class should be before const qualifier
    trivial: h8300: Storage class should be before const qualifier
    trivial: fix where cgroup documentation is not correctly referred to
    trivial: Give the right path in Documentation example
    trivial: MTD: remove EOL from MODULE_DESCRIPTION
    trivial: Fix typo in bio_split()'s documentation
    trivial: PWM: fix of #endif comment
    trivial: fix typos/grammar errors in Kconfig texts
    trivial: Fix misspelling of firmware
    trivial: cgroups: documentation typo and spelling corrections
    trivial: Update contact info for Jochen Hein
    trivial: fix typo "resgister" -> "register"
    ...

    Linus Torvalds
     

01 Apr, 2009

1 commit

  • CONFIG_DEBUG_PAGEALLOC is now supported by x86, powerpc, sparc64, and
    s390. This patch implements it for the rest of the architectures by
    filling the pages with poison byte patterns after free_pages() and
    verifying the poison patterns before alloc_pages().

    This generic one cannot detect invalid page accesses immediately but
    invalid read access may cause invalid dereference by poisoned memory and
    invalid write access can be detected after a long delay.
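The poison-and-verify idea can be sketched in plain C as follows. The page size, the poison byte value and the function names are assumptions for illustration, not the kernel's actual definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_BYTES  4096  /* assumed page size */
#define POISON_BYTE 0xaa  /* assumed poison pattern */

/* Done when a page is freed: overwrite it with the poison pattern. */
static void poison_page(unsigned char *page)
{
    assert(page != NULL);
    memset(page, POISON_BYTE, PAGE_BYTES);
}

/* Done before a page is handed out again: returns true if the page is
 * still fully poisoned, false if something wrote to it while it was free. */
static bool check_poison(const unsigned char *page)
{
    assert(page != NULL);
    for (size_t i = 0; i < PAGE_BYTES; i++) {
        if (page[i] != POISON_BYTE)
            return false;
    }
    return true;
}
```

A stray write to a freed page is only noticed when check_poison() runs at the next allocation, which is the "long delay" mentioned above. A stray read returns poison bytes, which tends to fault later if the value is used as a pointer.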

    Signed-off-by: Akinobu Mita
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Akinobu Mita
     

31 Mar, 2009

1 commit


30 Mar, 2009

1 commit


11 Mar, 2009

1 commit

  • CONFIG_PPC_MULTIPLATFORM is a remnant of the pre-powerpc days and isn't
    really meaningful anymore. It was basically equivalent to PPC64 || 6xx.

    This removes it along with the following changes:

    - 32-bit platforms that relied on PPC32 && PPC_MULTIPLATFORM now rely
    on 6xx which is what they want anyway.

    - A new symbol, PPC_BOOK3S, is defined that represents compliance with
    the "Server" variant of the architecture. This is set when either 6xx
    or PPC64 is set and opens the door for future BOOK3E 64-bit.

    - 64-bit platforms that relied on PPC64 && PPC_MULTIPLATFORM now use
    PPC64 && PPC_BOOK3S

    - A separate and selectable CONFIG_PPC_OF_BOOT_TRAMPOLINE option is now
    used to control the use of prom_init.c

    Signed-off-by: Benjamin Herrenschmidt

    Benjamin Herrenschmidt
     

03 Mar, 2009

1 commit


23 Feb, 2009

5 commits

  • This patch rewrites consistent dma allocations support to use vmalloc
    layer to allocate virtual memory space from vmalloc pool and get rid
    of CONFIG_CONSISTENT_{START,SIZE}.

    This greatly simplifies the code by effectively removing a custom
    allocator we had for virtual space.

    Signed-off-by: Ilya Yanok
    Signed-off-by: Benjamin Herrenschmidt

    Ilya Yanok
     
  • This patch gets function graph tracing working with dynamic function
    tracer on PowerPC32.

    Acked-by: Benjamin Herrenschmidt
    Signed-off-by: Steven Rostedt
    Signed-off-by: Benjamin Herrenschmidt

    Steven Rostedt
     
  • This patch ports the function graph tracer for PowerPC, but only
    for static function tracing.

    Acked-by: Benjamin Herrenschmidt
    Signed-off-by: Steven Rostedt
    Signed-off-by: Benjamin Herrenschmidt

    Steven Rostedt
     
  • This is the port of the function graph tracer to PowerPC with
    dynamic tracing.

    Geoff Levand tested on PS3.

    Tested-by: Geoff Levand
    Acked-by: Benjamin Herrenschmidt
    Signed-off-by: Steven Rostedt
    Signed-off-by: Benjamin Herrenschmidt

    Steven Rostedt
     
  • This is a port of the function graph tracer that was written by
    Frederic Weisbecker for the x86.

    This only works for PPC64 at the moment and only for static tracing.
    PPC32 and dynamic function graph tracing support will come later.

    The trace produces a visual call graph of functions:

    # tracer: function_graph
    #
    # CPU DURATION FUNCTION CALLS
    # | | | | | | |
    0) 2.224 us | }
    0) ! 271.024 us | }
    0) ! 320.080 us | }
    0) ! 324.656 us | }
    0) ! 329.136 us | }
    0) | .put_prev_task_fair() {
    0) | .update_curr() {
    0) 2.240 us | .update_min_vruntime();
    0) 6.512 us | }
    0) 2.528 us | .__enqueue_entity();
    0) + 15.536 us | }
    0) | .pick_next_task_fair() {
    0) 2.032 us | .__pick_next_entity();
    0) 2.064 us | .__clear_buddies();
    0) | .set_next_entity() {
    0) 2.672 us | .__dequeue_entity();
    0) 6.864 us | }

    Geoff Levand tested on PS3.

    Tested-by: Geoff Levand
    Acked-by: Benjamin Herrenschmidt
    Signed-off-by: Steven Rostedt
    Signed-off-by: Benjamin Herrenschmidt

    Steven Rostedt
     

15 Feb, 2009

1 commit

  • This patch adds support for 256KB pages on ppc44x-based boards.

    For simplification of implementation with 256KB pages we still assume
    2-level paging. As a side effect this leads to wasting extra memory space
    reserved for PTE tables: only 1/4 of pages allocated for PTEs are
    actually used. But this may be an acceptable trade-off to achieve the
    high performance we have with big PAGE_SIZEs in some applications (e.g.
    RAID).

    Also with 256KB PAGE_SIZE we increase THREAD_SIZE up to 32KB to minimize
    the risk of stack overflows in the case of on-stack arrays whose size
    depends on the page size (e.g. multipage BIOs, NTFS, etc.).

    With 256KB PAGE_SIZE we need to decrease the PKMAP_ORDER at least down
    to 9, otherwise all high memory (2 ^ 10 * PAGE_SIZE == 256MB) will be
    occupied by PKMAP addresses leaving no place for vmalloc. We do not
    separate PKMAP_ORDER for 256K from 16K/64K PAGE_SIZE here; actually that
    value of 10 in support for 16K/64K had been selected rather intuitively.
    Thus now for all cases of PAGE_SIZE on ppc44x (including the default, 4KB,
    one) we have 512 pages for PKMAP.
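The PKMAP arithmetic above is easy to check with a small helper. The function name is hypothetical; it just computes how much kernel virtual address space a PKMAP window of (2 ^ order) pages takes for a given page size.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of virtual address space covered by the PKMAP window:
 * (2 ^ pkmap_order) pages of (2 ^ page_shift) bytes each. */
static uint64_t pkmap_window_bytes(unsigned int pkmap_order,
                                   unsigned int page_shift)
{
    assert(pkmap_order + page_shift < 64);
    return (uint64_t)1 << (pkmap_order + page_shift);
}
```

With 256KB pages (page_shift 18), an order of 10 yields 256MB and an order of 9 yields 128MB. With the default 4KB pages (page_shift 12), the same order of 9 yields 2MB for the 512 PKMAP pages.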

    Because the ELF standard supports only page sizes up to 64K, you should
    use binutils later than 2.17.50.0.3 with '-zmax-page-size' set to 256K
    for building applications that are to be run with the 256KB-page sized
    kernel. If using older binutils, you should patch them as follows:

    --- binutils/bfd/elf32-ppc.c.orig
    +++ binutils/bfd/elf32-ppc.c

    -#define ELF_MAXPAGESIZE 0x10000
    +#define ELF_MAXPAGESIZE 0x40000

    One more restriction we currently have with 256KB page sizes is the inability
    to use shmem safely, so, for now, the 256KB is available only if you turn
    the CONFIG_SHMEM option off (another variant is to use BROKEN).
    Though, if you need shmem with 256KB pages, you can always remove the !SHMEM
    dependency in 'config PPC_256K_PAGES', and use the workaround available here:
    http://lkml.org/lkml/2008/12/19/20

    Signed-off-by: Yuri Tikhonov
    Signed-off-by: Ilya Yanok
    Signed-off-by: Josh Boyer

    Yuri Tikhonov
     

29 Jan, 2009

3 commits

  • The FSL PCI code depends on PCI quirks being enabled to function
    properly. We can ensure this by doing a select in Kconfig of
    PCI_QUIRKS.

    Signed-off-by: Kumar Gala

    Kumar Gala
     
  • On booke processors, the code that maps low memory only uses up to three
    CAM entries, even though there are sixteen and nothing else uses them.

    Make this number configurable in the advanced options menu along with max
    low memory size. If one wants 1 GB of lowmem, then it's typically
    necessary to have four CAM entries.

    Signed-off-by: Trent Piepho
    Signed-off-by: Kumar Gala

    Trent Piepho
     
  • The code that maps kernel low memory would only use page sizes up to 256
    MB. On E500v2 pages up to 4 GB are supported.

    However, a page must be aligned to a multiple of the page's size. I.e.
    256 MB pages must be aligned to a 256 MB boundary. This was enforced by a
    requirement that the physical and virtual addresses of the start of lowmem
    be aligned to 256 MB. Clearly requiring 1GB or 4GB alignment to allow
    pages of that size isn't acceptable.

    To solve this, I simply have adjust_total_lowmem() take alignment into
    account when it decides what size pages to use. Give it PAGE_OFFSET =
    0x7000_0000, PHYSICAL_START = 0x3000_0000, and 2GB of RAM, and it will map
    pages like this:
    PA 0x3000_0000 VA 0x7000_0000 Size 256 MB
    PA 0x4000_0000 VA 0x8000_0000 Size 1 GB
    PA 0x8000_0000 VA 0xC000_0000 Size 256 MB
    PA 0x9000_0000 VA 0xD000_0000 Size 256 MB
    PA 0xA000_0000 VA 0xE000_0000 Size 256 MB

    Because the lowmem mapping code now takes alignment into account,
    PHYSICAL_ALIGN can be lowered from 256 MB to 64 MB. Even lower might be
    possible. The lowmem code will work down to 4 kB but it's possible some of
    the boot code will fail before then. Poor alignment will force small pages
    to be used, which combined with the limited number of TLB1 pages available,
    will result in very little memory getting mapped. So alignments less than
    64 MB probably aren't very useful anyway.
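The mapping table above can be reproduced with a simple greedy pass. This is a sketch under stated assumptions, not the actual adjust_total_lowmem() code: it assumes page sizes are powers of 4 from 4 kB to 4 GB, and the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_CAMS 16  /* the text above mentions sixteen CAM entries */

struct cam_entry {
    uint64_t pa;    /* physical start address */
    uint64_t va;    /* virtual start address */
    uint64_t size;  /* page size in bytes */
};

/* Greedy sketch: at each step pick the largest power-of-4 page size
 * (4 kB .. 4 GB) that fits in the remaining RAM and to which both the
 * physical and the virtual address are aligned. Returns entries used. */
static int map_lowmem(uint64_t pa, uint64_t va, uint64_t ram,
                      struct cam_entry *out, int max_entries)
{
    int n = 0;

    assert(max_entries >= 0 && max_entries <= MAX_CAMS);
    while (ram > 0 && n < max_entries) {
        uint64_t size = 1ull << 32;  /* start with 4 GB */

        while (size >= 4096 &&
               (size > ram || ((pa | va) & (size - 1)) != 0))
            size >>= 2;              /* next smaller power of 4 */
        if (size < 4096)
            break;                   /* nothing fits or aligns: stop */

        out[n].pa = pa;
        out[n].va = va;
        out[n].size = size;
        n++;
        pa += size;
        va += size;
        ram -= size;
    }
    return n;
}
```

Called with PA 0x3000_0000, VA 0x7000_0000 and 2 GB of RAM, this yields five entries: 256 MB, 1 GB, then three of 256 MB, matching the table. Passing a smaller max_entries shows how a tight CAM budget leaves part of RAM unmapped.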

    Signed-off-by: Trent Piepho
    Signed-off-by: Kumar Gala

    Trent Piepho
     

28 Jan, 2009

1 commit