07 Jan, 2012

1 commit

  • * 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/benh/powerpc: (185 commits)
    powerpc: fix compile error with 85xx/p1010rdb.c
    powerpc: fix compile error with 85xx/p1023_rds.c
    powerpc/fsl: add MSI support for the Freescale hypervisor
    arch/powerpc/sysdev/fsl_rmu.c: introduce missing kfree
    powerpc/fsl: Add support for Integrated Flash Controller
    powerpc/fsl: update compatiable on fsl 16550 uart nodes
    powerpc/85xx: fix PCI and localbus properties in p1022ds.dts
    powerpc/85xx: re-enable ePAPR byte channel driver in corenet32_smp_defconfig
    powerpc/fsl: Update defconfigs to enable some standard FSL HW features
    powerpc: Add TBI PHY node to first MDIO bus
    sbc834x: put full compat string in board match check
    powerpc/fsl-pci: Allow 64-bit PCIe devices to DMA to any memory address
    powerpc: Fix unpaired probe_hcall_entry and probe_hcall_exit
    offb: Fix setting of the pseudo-palette for >8bpp
    offb: Add palette hack for qemu "standard vga" framebuffer
    offb: Fix bug in calculating requested vram size
    powerpc/boot: Change the WARN to INFO for boot wrapper overlap message
    powerpc/44x: Fix build error on currituck platform
    powerpc/boot: Change the load address for the wrapper to fit the kernel
    powerpc/44x: Enable CRASH_DUMP for 440x
    ...

    Fix up a trivial conflict in arch/powerpc/include/asm/cputime.h due to
    the additional sparse-checking code for cputime_t.

    Linus Torvalds
     

09 Dec, 2011

1 commit

  • powerpc doesn't access early_node_map[] directly and enabling
    HAVE_MEMBLOCK_NODE_MAP is trivial - replacing add_active_range() calls
    with memblock_set_node() and selecting HAVE_MEMBLOCK_NODE_MAP is
    enough.

    Signed-off-by: Tejun Heo
    Cc: Benjamin Herrenschmidt
    Cc: Yinghai Lu

    Tejun Heo
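The conversion described above can be sketched with a toy userspace model: memblock_set_node() simply records which NUMA node owns a physical range, replacing the old per-arch add_active_range() bookkeeping. The region array and lookup helper below are illustrative, not the kernel's actual data structures.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model: with HAVE_MEMBLOCK_NODE_MAP each registered region carries a
 * node id, replacing the old per-arch add_active_range() bookkeeping. */
struct mb_region { uint64_t base, size; int nid; };

static struct mb_region mb_regions[16];
static int mb_nregions;

/* record that physical range [base, base+size) belongs to NUMA node nid */
static void memblock_set_node(uint64_t base, uint64_t size, int nid)
{
    struct mb_region r = { base, size, nid };
    mb_regions[mb_nregions++] = r;
}

/* look up the node owning a physical address (-1 if unknown) */
static int mb_node_of(uint64_t addr)
{
    for (int i = 0; i < mb_nregions; i++)
        if (addr >= mb_regions[i].base &&
            addr - mb_regions[i].base < mb_regions[i].size)
            return mb_regions[i].nid;
    return -1;
}
```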
     

08 Dec, 2011

1 commit

  • With CONFIG_STRICT_DEVMEM=y, user space cannot read any part of /dev/mem.
    Since this breaks librtas, punch a hole in /dev/mem to allow access to the
    rmo_buffer that librtas needs.

    Anton Blanchard reported the problem and helped with the fix.

    A quick test for this patch:

    # cat /proc/rtas/rmo_buffer
    000000000f190000 10000

    # python -c "print 0x000000000f190000 / 0x10000"
    3865

    # dd if=/dev/mem of=/tmp/foo count=1 bs=64k skip=3865
    1+0 records in
    1+0 records out
    65536 bytes (66 kB) copied, 0.000205235 s, 319 MB/s

    # dd if=/dev/mem of=/tmp/foo
    dd: reading `/dev/mem': Operation not permitted
    0+0 records in
    0+0 records out
    0 bytes (0 B) copied, 0.00022519 s, 0.0 kB/s

    Signed-off-by: Sukadev Bhattiprolu
    Signed-off-by: Benjamin Herrenschmidt

    sukadev@linux.vnet.ibm.com
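The `skip=3865` value in the test above is just the RMO buffer offset divided by the 64k dd block size; a small sanity check of that arithmetic (the helper name is made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* values from /proc/rtas/rmo_buffer in the transcript above */
static const uint64_t rmo_offset = 0x000000000f190000ull;
static const uint64_t blocksize  = 0x10000;  /* dd bs=64k */

/* dd needs the offset expressed in whole 64k blocks */
static uint64_t rmo_skip_blocks(void)
{
    return rmo_offset / blocksize;
}
```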
     

28 Nov, 2011

1 commit

  • As described in the help text in the patch, this token restricts general
    access to /dev/mem as a way of increasing security. Specifically, access
    to exclusive IOMEM and kernel RAM is denied unless CONFIG_STRICT_DEVMEM is
    set to 'n'.

    Implement the 'devmem_is_allowed()' interface for Powerpc. It will be
    called from range_is_allowed() when userspace attempts to access /dev/mem.

    This patch is based on an earlier patch from Steve Best and with input from
    Paul Mackerras and Scott Wood.

    [BenH] Fixed a typo or two and removed the generic change which should
    be submitted as a separate patch

    Signed-off-by: Sukadev Bhattiprolu
    Signed-off-by: Benjamin Herrenschmidt

    sukadev@linux.vnet.ibm.com
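A minimal userspace model of the policy described above, assuming a toy page_is_ram() with an invented RAM bound (the kernel's real implementation differs):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12

/* stand-in: pretend the first 256 MB of physical memory are system RAM */
static const uint64_t ram_top = 256ull << 20;

static bool page_is_ram(uint64_t pfn)
{
    return (pfn << PAGE_SHIFT) < ram_top;
}

/* STRICT_DEVMEM policy: /dev/mem may only touch pfns that are not kernel RAM */
static bool devmem_is_allowed(uint64_t pfn)
{
    return !page_is_ram(pfn);
}
```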
     

08 Nov, 2011

1 commit

  • We've resisted adding System RAM to /proc/iomem because it is
    the wrong place for it. Unfortunately we continue to find tools
    that rely on this behaviour so give up and add it in.

    Signed-off-by: Anton Blanchard
    Signed-off-by: Benjamin Herrenschmidt

    Anton Blanchard
     

07 Nov, 2011

1 commit

  • * 'modsplit-Oct31_2011' of git://git.kernel.org/pub/scm/linux/kernel/git/paulg/linux: (230 commits)
    Revert "tracing: Include module.h in define_trace.h"
    irq: don't put module.h into irq.h for tracking irqgen modules.
    bluetooth: macroize two small inlines to avoid module.h
    ip_vs.h: fix implicit use of module_get/module_put from module.h
    nf_conntrack.h: fix up fallout from implicit moduleparam.h presence
    include: replace linux/module.h with "struct module" wherever possible
    include: convert various register fcns to macros to avoid include chaining
    crypto.h: remove unused crypto_tfm_alg_modname() inline
    uwb.h: fix implicit use of asm/page.h for PAGE_SIZE
    pm_runtime.h: explicitly requires notifier.h
    linux/dmaengine.h: fix implicit use of bitmap.h and asm/page.h
    miscdevice.h: fix up implicit use of lists and types
    stop_machine.h: fix implicit use of smp.h for smp_processor_id
    of: fix implicit use of errno.h in include/linux/of.h
    of_platform.h: delete needless include
    acpi: remove module.h include from platform/aclinux.h
    miscdevice.h: delete unnecessary inclusion of module.h
    device_cgroup.h: delete needless include
    net: sch_generic remove redundant use of
    net: inet_timewait_sock doesnt need
    ...

    Fix up trivial conflicts (other header files, and removal of the ab3550 mfd driver) in
    - drivers/media/dvb/frontends/dibx000_common.c
    - drivers/media/video/{mt9m111.c,ov6650.c}
    - drivers/mfd/ab3550-core.c
    - include/linux/dmaengine.h

    Linus Torvalds
     

20 Sep, 2011

2 commits

  • If we echo an address the hypervisor doesn't like to
    /sys/devices/system/memory/probe we oops the box:

    # echo 0x10000000000 > /sys/devices/system/memory/probe

    kernel BUG at arch/powerpc/mm/hash_utils_64.c:541!

    The backtrace is:

    create_section_mapping
    arch_add_memory
    add_memory
    memory_probe_store
    sysdev_class_store
    sysfs_write_file
    vfs_write
    SyS_write

    In create_section_mapping we BUG if htab_bolt_mapping returned
    an error. A better approach is to return an error which will
    propagate back to userspace.

    Rerunning the test with this patch applied:

    # echo 0x10000000000 > /sys/devices/system/memory/probe
    -bash: echo: write error: Invalid argument

    Signed-off-by: Anton Blanchard
    Cc: stable@kernel.org
    Signed-off-by: Benjamin Herrenschmidt

    Anton Blanchard
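The shape of the fix can be sketched in userspace with a stub htab_bolt_mapping(); the failure condition below is invented for illustration:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* stub: pretend the hypervisor refuses to bolt mappings above 1 TB
 * (the cutoff is invented for illustration) */
static int htab_bolt_mapping(uint64_t vstart, uint64_t vend)
{
    (void)vstart;
    return (vend > (1ull << 40)) ? -1 : 0;
}

/* instead of BUG()ing when the bolt fails, return an error that can
 * propagate back through arch_add_memory()/add_memory() to the sysfs
 * write, where userspace sees "Invalid argument" */
static int create_section_mapping(uint64_t start, uint64_t end)
{
    if (htab_bolt_mapping(start, end))
        return -EINVAL;
    return 0;
}
```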
     
  • Enable hugepages on Freescale BookE processors. This allows the kernel to
    use huge TLB entries to map pages, which can greatly reduce the number of
    TLB misses and the amount of TLB thrashing experienced by applications with
    large memory footprints. Care should be taken when using this on FSL
    processors, as the number of large TLB entries supported by the core is low
    (16-64) on current processors.

    The supported set of hugepage sizes includes 4m, 16m, 64m, 256m, and 1g.
    Page sizes larger than the max zone size are called "gigantic" pages and
    must be allocated on the command line (and cannot be deallocated).

    This is currently only fully implemented for Freescale 32-bit BookE
    processors, but there is some infrastructure in the code for
    64-bit BookE.

    Signed-off-by: Becky Bruce
    Signed-off-by: David Gibson
    Signed-off-by: Benjamin Herrenschmidt

    Becky Bruce
     

19 Jul, 2011

1 commit

  • On 32-bit platforms that support >= 4GB of memory, total_ram was
    truncated, which produced a confusing printk:
    Top of RAM: 0x100000000, Total RAM: 0x0
    Fix that:
    Top of RAM: 0x100000000, Total RAM: 0x100000000

    Signed-off-by: Tony Breeds
    Acked-by: Josh Boyer
    Signed-off-by: Benjamin Herrenschmidt

    Tony Breeds
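The truncation is ordinary 32-bit wraparound; a minimal demonstration (variable names illustrative):

```c
#include <assert.h>
#include <stdint.h>

/* top of RAM with exactly 4 GB installed: fine as a 64-bit value, but
 * storing the total in a 32-bit variable wraps to zero */
static const uint64_t top_of_ram = 0x100000000ull;

static uint32_t total_ram_as_u32(void) { return (uint32_t)top_of_ram; }
static uint64_t total_ram_as_u64(void) { return top_of_ram; }
```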
     

09 Jun, 2011

1 commit

  • When using 64K pages with a separate cpio rootfs, U-Boot will align
    the rootfs on a 4K page boundary. When the memory is reserved, and a
    subsequent early memblock_alloc is called, it will allocate memory
    between the 64K page alignment and the reserved memory. When the reserved
    memory is subsequently freed, it is freed by pages, causing the
    early memblock_alloc requests to be re-used, which, in my case, caused
    the device-tree to be clobbered.

    This patch forces the reserved memory for initrd to be kernel page
    aligned, and will move the device tree if it overlaps with the range
    extension of initrd. This patch will also consolidate the identical
    function free_initrd_mem() from mm/init_32.c, init_64.c to mm/mem.c,
    and adds the same range extension when freeing initrd. free_initrd_mem()
    is also moved to the __init section.

    Many thanks to Milton Miller for his input on this patch.

    [BenH: Fixed build without CONFIG_BLK_DEV_INITRD]

    Signed-off-by: Dave Carroll
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Benjamin Herrenschmidt

    Benjamin Herrenschmidt
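The alignment part of the fix can be sketched as follows, assuming 64K kernel pages as in the report above (helper names are illustrative):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE 0x10000ull  /* 64K kernel pages, as in the report above */

static uint64_t align_down(uint64_t x) { return x & ~(PAGE_SIZE - 1); }
static uint64_t align_up(uint64_t x)   { return (x + PAGE_SIZE - 1) & ~(PAGE_SIZE - 1); }

/* reserve the initrd rounded out to whole kernel pages, so freeing it
 * page by page later cannot free memory that an early memblock_alloc
 * handed out from the same 64K page */
static void initrd_reserve_range(uint64_t start, uint64_t end,
                                 uint64_t *res_start, uint64_t *res_end)
{
    *res_start = align_down(start);
    *res_end   = align_up(end);
}
```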
     

13 Oct, 2010

1 commit

  • We need to round memory regions correctly -- specifically, we need to
    round reserved region in the more expansive direction (lower limit
    down, upper limit up) whereas usable memory regions need to be rounded
    in the more restrictive direction (lower limit up, upper limit down).

    This introduces two set of inlines:

    memblock_region_memory_base_pfn()
    memblock_region_memory_end_pfn()
    memblock_region_reserved_base_pfn()
    memblock_region_reserved_end_pfn()

    Although they are antisymmetric (and therefore are technically
    duplicates) the use of the different inlines explicitly documents the
    programmer's intention.

    The lack of proper rounding caused a bug on ARM, which was then found
    to also affect other architectures.

    Reported-by: Russell King
    Signed-off-by: Yinghai Lu
    LKML-Reference:
    Cc: Jeremy Fitzhardinge
    Signed-off-by: H. Peter Anvin

    Yinghai Lu
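The intent of the four inlines can be sketched in userspace; PAGE_SHIFT and the region struct here are simplified stand-ins for the kernel's memblock types:

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ull << PAGE_SHIFT)

struct region { uint64_t base, size; };

static uint64_t pfn_up(uint64_t addr)   { return (addr + PAGE_SIZE - 1) >> PAGE_SHIFT; }
static uint64_t pfn_down(uint64_t addr) { return addr >> PAGE_SHIFT; }

/* usable memory: round restrictively (base up, end down) */
static uint64_t memory_base_pfn(struct region r)   { return pfn_up(r.base); }
static uint64_t memory_end_pfn(struct region r)    { return pfn_down(r.base + r.size); }

/* reserved memory: round expansively (base down, end up) */
static uint64_t reserved_base_pfn(struct region r) { return pfn_down(r.base); }
static uint64_t reserved_end_pfn(struct region r)  { return pfn_up(r.base + r.size); }
```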
     

14 Jul, 2010

1 commit

  • via the following scripts

    FILES=$(find * -type f | grep -vE 'oprofile|[^K]config')

    sed -i \
        -e 's/lmb/memblock/g' \
        -e 's/LMB/MEMBLOCK/g' \
        $FILES

    for N in $(find . -name lmb.[ch]); do
        M=$(echo $N | sed 's/lmb/memblock/g')
        mv $N $M
    done

    and revert some wrong changes the scripts made, such as to lmbench
    and dlmb.

    Also move memblock.c from lib/ to mm/.

    Suggested-by: Ingo Molnar
    Acked-by: "H. Peter Anvin"
    Acked-by: Benjamin Herrenschmidt
    Acked-by: Linus Torvalds
    Signed-off-by: Yinghai Lu
    Signed-off-by: Benjamin Herrenschmidt

    Yinghai Lu
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    The percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include
    those headers directly instead of assuming availability. As this
    conversion needs to touch a large number of source files, the following
    script is used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there, i.e. gfp.h if only gfp is
    used, slab.h if slab is used.

    * When the script inserts a new include, it looks at the include
    blocks and tries to place the new include so that its order conforms
    to its surroundings. It is put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have a fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion;
    some needed manual addition, while for others adding it to an
    implementation .h or embedding .c file was more appropriate. This
    step added inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them, as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on the arch to make
    things build (like ipr on powerpc/64, which failed due to missing
    writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from the build
    tests in step 7, I'm fairly confident about the coverage of this
    conversion patch. If there is a breakage, it's likely to be something
    in one of the arch headers which should be easily discoverable on
    most builds of the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

19 Mar, 2010

1 commit

  • powerpc initializes swiotlb before parsing the kernel boot options so
    swiotlb options (e.g. specifying the swiotlb buffer size) are ignored.

    Any time before freeing bootmem works for swiotlb, so this patch moves
    powerpc's swiotlb initialization to mem_init(), after the kernel boot
    options have been parsed (as x86 does).

    Signed-off-by: FUJITA Tomonori
    Tested-by: Becky Bruce
    Tested-by: Albert Herranz
    Signed-off-by: Benjamin Herrenschmidt

    FUJITA Tomonori
     

21 Feb, 2010

1 commit

  • On VIVT ARM, when we have multiple shared mappings of the same file
    in the same MM, we need to ensure that we have coherency across all
    copies. We do this via make_coherent() by making the pages
    uncacheable.

    This used to work fine, until we allowed highmem with highpte - we
    now have a page table which is mapped as required, and is not available
    for modification via update_mmu_cache().

    Ralf Baechle suggested getting rid of the PTE value passed to
    update_mmu_cache():

    On MIPS update_mmu_cache() calls __update_tlb() which walks pagetables
    to construct a pointer to the pte again. Passing a pte_t * is much
    more elegant. Maybe we might even replace the pte argument with the
    pte_t?

    Ben Herrenschmidt would also like the pte pointer for PowerPC:

    Passing the ptep in there is exactly what I want. I want that
    -instead- of the PTE value, because I have issue on some ppc cases,
    for I$/D$ coherency, where set_pte_at() may decide to mask out the
    _PAGE_EXEC.

    So, pass in the mapped page table pointer into update_mmu_cache(), and
    remove the PTE value, updating all implementations and call sites to
    suit.

    Includes a fix from Stephen Rothwell:

    sparc: fix fallout from update_mmu_cache API change

    Signed-off-by: Stephen Rothwell

    Acked-by: Benjamin Herrenschmidt
    Signed-off-by: Russell King

    Russell King
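A toy model of the new signature; the pte_t layout and the exec bit below are invented for illustration, the point being only that the implementation receives the mapped pte pointer and can modify the PTE in place instead of re-walking the page tables:

```c
#include <assert.h>

typedef unsigned long pte_t;
#define PAGE_EXEC 0x1ul   /* illustrative bit, not the real ppc layout */

/* new-style signature: the caller hands over the mapped pte pointer, so
 * an implementation can inspect or adjust the PTE in place (vma and addr
 * are unused in this toy) */
static void update_mmu_cache(void *vma, unsigned long addr, pte_t *ptep)
{
    (void)vma; (void)addr;
    *ptep &= ~PAGE_EXEC;  /* e.g. mask exec until the icache is cleaned */
}
```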
     

30 Oct, 2009

1 commit

  • The hugepage arch code provides a number of hook functions/macros
    which mirror the functionality of various normal page pte access
    functions. Various changes in the normal page accessors (in
    particular BenH's recent changes to the handling of lazy icache
    flushing and PAGE_EXEC) have caused the hugepage versions to get out
    of sync with the originals. In some cases, this is a bug, at least on
    some MMU types.

    One of the reasons that some hooks were not identical to the normal
    page versions is that the fact we're dealing with a hugepage needed
    to be passed down to use the correct dcache-icache flush function.
    This patch makes the main flush_dcache_icache_page() function hugepage
    aware (by checking for the PageCompound flag). That in turn means we
    can make set_huge_pte_at() just a call to set_pte_at() bringing it
    back into sync. As a bonus, this lets us remove the
    hash_huge_page_do_lazy_icache() function, replacing it with a call to
    the hash_page_do_lazy_icache() function it was based on.

    Some other hugepage pte access hooks - huge_ptep_get_and_clear() and
    huge_ptep_clear_flush() - are not so easily unified, but this patch at
    least brings them back into sync with the current versions of the
    corresponding normal page functions.

    Signed-off-by: David Gibson
    Signed-off-by: Benjamin Herrenschmidt

    David Gibson
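The PageCompound-based dispatch can be modeled in a few lines of userspace C (the page struct and flush hook are stand-ins for the kernel's):

```c
#include <assert.h>

/* toy page descriptor: compound_order > 0 marks the head of a hugepage */
struct page { int compound_order; };

static int pages_flushed;
static void flush_one(struct page *pg) { (void)pg; pages_flushed++; }
static int PageCompound(const struct page *pg) { return pg->compound_order > 0; }

/* hugepage-aware flush: a compound head flushes every subpage */
static void flush_dcache_icache_page(struct page *pg)
{
    int n = PageCompound(pg) ? (1 << pg->compound_order) : 1;
    for (int i = 0; i < n; i++)
        flush_one(pg + i);
}
```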
     

23 Sep, 2009

1 commit

  • Originally, walk_memory_resource() was introduced to traverse all memory
    of "System RAM" for detecting memory hotplug/unplug ranges. For doing so,
    flags of IORESOURCE_MEM|IORESOURCE_BUSY were used and this was enough for
    memory hotplug.

    But for other uses, such as /proc/kcore, this may include some firmware
    areas marked as IORESOURCE_BUSY | IORESOURCE_MEM. This patch makes the
    check strict, to find out busy "System RAM".

    Note: PPC64 keeps its own walk_memory_resource(), which walks through
    ppc64's lmb information. Because the old kclist_add() is called per lmb,
    this patch makes no difference in behavior in the end.

    And this patch removes the CONFIG_MEMORY_HOTPLUG check from this function.
    Because pfn_valid() just shows "there is a memmap or not" and cannot be
    used for "there is physical memory or not", this function is useful in
    generic code to scan a physical memory range.

    Signed-off-by: KAMEZAWA Hiroyuki
    Cc: Ralf Baechle
    Cc: Benjamin Herrenschmidt
    Cc: WANG Cong
    Cc: Américo Wang
    Cc: David Rientjes
    Cc: Roland Dreier
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KAMEZAWA Hiroyuki
     

22 Sep, 2009

1 commit

  • Commit 96177299416dbccb73b54e6b344260154a445375 ("Drop free_pages()")
    modified nr_free_pages() to return 'unsigned long' instead of 'unsigned
    int'. This made the casts to 'unsigned long' in most callers superfluous,
    so remove them.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Geert Uytterhoeven
    Reviewed-by: Christoph Lameter
    Acked-by: Ingo Molnar
    Acked-by: Russell King
    Acked-by: David S. Miller
    Acked-by: Kyle McMartin
    Acked-by: WANG Cong
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Haavard Skinnemoen
    Cc: Mikael Starvik
    Cc: "Luck, Tony"
    Cc: Hirokazu Takata
    Cc: Ralf Baechle
    Cc: David Howells
    Acked-by: Benjamin Herrenschmidt
    Cc: Martin Schwidefsky
    Cc: Paul Mundt
    Cc: Chris Zankel
    Cc: Michal Simek
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Geert Uytterhoeven
     

27 May, 2009

2 commits

  • The implementation we just revived has issues, such as using a
    Kconfig-defined virtual address area in kernel space that nothing
    actually carves out (and thus will overlap whatever is there),
    or having some dependencies on being self-contained in a single
    PTE page, which adds unnecessary constraints on the kernel virtual
    address space.

    This fixes it by using more classic PTE accessors and automatically
    locating the area for consistent memory, carving an appropriate hole
    in the kernel virtual address space, leaving only the size of that
    area as a Kconfig option. It also brings some dma-mask related fixes
    from the ARM implementation which was almost identical initially but
    grew its own fixes.

    Signed-off-by: Benjamin Herrenschmidt

    Benjamin Herrenschmidt
     
  • Make FIXADDR_TOP a compile-time constant and clean up a
    couple of definitions relative to the layout of the kernel
    address space on ppc32. We also print out that layout at
    boot time for debugging purposes.

    This is a pre-requisite for properly fixing non-coherent
    DMA allocations.

    Signed-off-by: Benjamin Herrenschmidt

    Benjamin Herrenschmidt
     

11 Feb, 2009

1 commit

  • This patch reworks the way we do I and D cache coherency on PowerPC.

    The "old" way was split in 3 different parts depending on the processor type:

    - Hash with per-page exec support (64-bit and >= POWER4 only) does it
    at hashing time, by preventing exec on unclean pages and cleaning pages
    on exec faults.

    - Everything without per-page exec support (32-bit hash, 8xx, and
    64-bit < POWER4) does it for all pages going to user space in update_mmu_cache().

    - Embedded with per-page exec support does it from do_page_fault() on
    exec faults, in a way similar to what the hash code does.

    That leads to confusion, and bugs. For example, the method using update_mmu_cache()
    is racy on SMP where another processor can see the new PTE and hash it in before
    we have cleaned the cache, and then blow up trying to execute. This is hard to hit
    but I think it has bitten us in the past.

    Also, it's inefficient for embedded where we always end up having to do at least
    one more page fault.

    This reworks the whole thing by moving the cache sync into two main call sites,
    though we keep different behaviours depending on the HW capability. The call
    sites are set_pte_at() which is now made out of line, and ptep_set_access_flags()
    which joins the former in pgtable.c

    The base idea for Embedded with per-page exec support, is that we now do the
    flush at set_pte_at() time when coming from an exec fault, which allows us
    to avoid the double fault problem completely (we can even improve the situation
    more by implementing TLB preload in update_mmu_cache() but that's for later).

    If for some reason we didn't do it there and we try to execute, we'll hit
    the page fault, which will do a minor fault, which will hit ptep_set_access_flags()
    to do things like update _PAGE_ACCESSED or _PAGE_DIRTY if needed; we just make
    this guy also perform the I/D cache sync for exec faults now. This second path
    is the catch-all for things that weren't cleaned at set_pte_at() time.

    For cpus without per-page exec support, we always do the sync at set_pte_at(),
    thus guaranteeing that when the PTE is visible to other processors, the cache
    is clean.

    For the 64-bit hash with per-page exec support case, we keep the old mechanism
    for now. I'll look into changing it later, once I've reworked a bit how we
    use _PAGE_EXEC.

    This is also a first step for adding _PAGE_EXEC support for embedded platforms.

    Signed-off-by: Benjamin Herrenschmidt

    Benjamin Herrenschmidt
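The core idea, syncing the caches when an executable PTE is installed so the PTE is never visible to other processors with a dirty icache, can be modeled like this (the pte layout and flush hook are invented for illustration):

```c
#include <assert.h>

typedef unsigned long pte_t;
#define PAGE_EXEC 0x1ul   /* illustrative */

static int icache_syncs;
static void flush_icache(void) { icache_syncs++; }

/* sketch of the reworked scheme: the I/D sync happens when an executable
 * PTE is installed, so by the time other CPUs can see the PTE the cache
 * is already clean */
static void set_pte_at(pte_t *ptep, pte_t pte)
{
    if (pte & PAGE_EXEC)
        flush_icache();
    *ptep = pte;
}
```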
     

07 Jan, 2009

1 commit

  • Show node to memory section relationship with symlinks in sysfs

    Add /sys/devices/system/node/nodeX/memoryY symlinks for all
    the memory sections located on nodeX. For example:
    /sys/devices/system/node/node1/memory135 -> ../../memory/memory135
    indicates that memory section 135 resides on node1.

    Also revises documentation to cover this change as well as updating
    Documentation/ABI/testing/sysfs-devices-memory to include descriptions
    of memory hotremove files 'phys_device', 'phys_index', and 'state'
    that were previously not described there.

    In addition to it always being a good policy to provide users with
    the maximum possible amount of physical location information for
    resources that can be hot-added and/or hot-removed, the following
    are some (but likely not all) of the user benefits provided by
    this change.
    Immediate:
    - Provides information needed to determine the specific node
    on which a defective DIMM is located. This will reduce system
    downtime when the node or defective DIMM is swapped out.
    - Prevents unintended onlining of a memory section that was
    previously offlined due to a defective DIMM. This could happen
    during node hot-add when the user or node hot-add assist script
    onlines _all_ offlined sections due to user or script inability
    to identify the specific memory sections located on the hot-added
    node. The consequences of reintroducing the defective memory
    could be ugly.
    - Provides information needed to vary the amount and distribution
    of memory on specific nodes for testing or debugging purposes.
    Future:
    - Will provide information needed to identify the memory
    sections that need to be offlined prior to physical removal
    of a specific node.

    Symlink creation during boot was tested on 2-node x86_64, 2-node
    ppc64, and 2-node ia64 systems. Symlink creation during physical
    memory hot-add tested on a 2-node x86_64 system.

    Signed-off-by: Gary Hade
    Signed-off-by: Badari Pulavarty
    Acked-by: Ingo Molnar
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gary Hade
     

21 Dec, 2008

2 commits

  • Currently, we never set _PAGE_COHERENT in the PTEs, we just OR it in
    in the hash code based on some CPU feature bit. We also manipulate
    _PAGE_NO_CACHE and _PAGE_GUARDED by hand in all sorts of places.

    This changes the logic so that instead, the PTE now contains
    _PAGE_COHERENT for all normal RAM pages that have I = 0 on platforms
    that need it. The hash code clears it if the feature bit is not set.

    It also adds some clean accessors to setup various valid combinations
    of access flags and change various bits of code to use them instead.

    This should help ensure that the PTE actually contains the bit
    combinations that we really want.

    I also removed _PAGE_GUARDED from _PAGE_BASE on 44x and instead
    set it explicitly from the TLB miss. I will ultimately remove it
    completely as it appears that it might not be needed after all,
    but in the meantime, having it in the TLB miss makes things a
    lot easier.

    Signed-off-by: Benjamin Herrenschmidt
    Acked-by: Kumar Gala
    Signed-off-by: Paul Mackerras

    Benjamin Herrenschmidt
     
  • This commit moves the whole no-hash TLB handling out of line into a
    new tlb_nohash.c file, and implements some basic SMP support using
    IPIs and/or broadcast tlbivax instructions.

    Note that I'm using local invalidations for D->I cache coherency.

    At worst, if another processor is trying to execute the same and
    has the old entry in its TLB, it will just take a fault and re-do
    the TLB flush locally (it won't re-do the cache flush in any case).

    Signed-off-by: Benjamin Herrenschmidt
    Acked-by: Kumar Gala
    Signed-off-by: Paul Mackerras

    Benjamin Herrenschmidt
     

20 Oct, 2008

1 commit

  • There is nothing architecture specific about remove_memory().
    remove_memory() is common to all architectures that support
    hotplug memory remove. Instead of duplicating it in every architecture,
    collapse them into one arch-neutral function.

    [akpm@linux-foundation.org: fix the export]
    Signed-off-by: Badari Pulavarty
    Cc: Yasunori Goto
    Cc: Gary Hade
    Cc: Mel Gorman
    Cc: "Luck, Tony"
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Heiko Carstens
    Cc: Martin Schwidefsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Badari Pulavarty
     

07 Oct, 2008

1 commit

  • Commit 8b150478 ("ppc: make phys_mem_access_prot() work with pfns
    instead of addresses") fixed page_is_ram() in arch/ppc to avoid overflow
    for addresses above 4G on 32-bit kernels. However arch/powerpc's
    page_is_ram() is missing the same fix -- it computes a physical address
    by doing pfn << PAGE_SHIFT, which overflows if pfn corresponds to a page
    above 4G.

    In particular this causes pages above 4G to be mapped with the wrong
    caching attribute; for example many ppc440-based SoCs have PCI space
    above 4G, and mmap()ing MMIO space may end up with a mapping that has
    caching enabled.

    Fix this by working with the pfn and avoiding the conversion to
    physical address that causes the overflow. This patch compares the
    pfn to max_pfn, which is a semantic change from the old code -- that
    code compared the physical address to high_memory, which corresponds
    to max_low_pfn. However, I think that was another bug, since
    highmem pages are still RAM.

    Reported-by: vb
    Signed-off-by: Roland Dreier
    Acked-by: Benjamin Herrenschmidt
    Signed-off-by: Benjamin Herrenschmidt

    Roland Dreier
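The overflow and its fix can be demonstrated directly with 32-bit arithmetic (the RAM size here is illustrative):

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
static const uint64_t total_ram = 0x80000000ull;            /* 2 GB, illustrative */
static const uint32_t max_pfn   = 0x80000000ull >> PAGE_SHIFT;

/* buggy 32-bit version: pfn << PAGE_SHIFT truncates above 4 GB, so MMIO
 * at physical address 0x100000000 wrongly looks like RAM at address 0 */
static int page_is_ram_buggy(uint32_t pfn)
{
    uint32_t paddr = pfn << PAGE_SHIFT;   /* wraps for pages above 4 GB */
    return paddr < (uint32_t)total_ram;
}

/* fixed: compare pfns directly, with no conversion that can overflow */
static int page_is_ram_fixed(uint32_t pfn)
{
    return pfn < max_pfn;
}
```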
     

27 Jul, 2008

1 commit

  • Remove arch-specific show_mem() in favor of the generic version.

    This also removes the following redundant information display:

    - pages in swapcache, printed by show_swap_cache_info(),

    because show_mem() calls show_free_areas(), which in turn calls
    show_swap_cache_info().

    Signed-off-by: Johannes Weiner
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner