31 Oct, 2005

6 commits


30 Oct, 2005

34 commits

  • Patch from Richard Purdie

    Add spitz irda platform support

    Signed-off-by: Richard Purdie
    Signed-off-by: Russell King

    Richard Purdie
     
  • Patch from Richard Purdie

    Add corgi irda platform support

    Signed-off-by: Richard Purdie
    Signed-off-by: Russell King

    Richard Purdie
     
  • Patch from Richard Purdie

    Add poodle irda platform support

    Signed-off-by: Richard Purdie
    Signed-off-by: Russell King

    Richard Purdie
     
  • Patch from Richard Purdie

    Update the PXA irda driver to match the recent platform device
    suspend/resume level changes.

    Signed-off-by: Richard Purdie
    Signed-off-by: Russell King

    Richard Purdie
     
  • Linus Torvalds
     
  • move EXPORT_SYMBOL(filemap_populate) to the proper place: just after
    function itself: it's easy to miss that function is exported otherwise.

    Signed-off-by: Nikita Danilov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nikita Danilov
     
  • In 'mm' change the explicit use of a for-loop using NR_CPUS into the
    general for_each_cpu() constructs. This widens the scope of potential
    future optimizations of the general constructs, as well as takes advantage
    of the existing optimizations of first_cpu() and next_cpu(), which is
    advantageous when the true CPU count is much smaller than NR_CPUS.

    Signed-off-by: John Hawkes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    John Hawkes
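
    The difference between the two loop styles in the commit above can be
    sketched in userspace C. This is only an illustration: cpumask_sketch_t,
    next_cpu_from() and for_each_cpu_sketch() are hypothetical stand-ins, not
    the kernel's definitions, and __builtin_ctzll is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 64  /* compile-time maximum; assumed value for this sketch */

/* Hypothetical stand-in for a cpumask: bit n set means CPU n exists. */
typedef uint64_t cpumask_sketch_t;

/* Return the lowest CPU number >= start that is set in mask, or NR_CPUS. */
static int next_cpu_from(cpumask_sketch_t mask, int start)
{
    assert(start >= 0);
    if (start >= NR_CPUS)
        return NR_CPUS;
    mask &= ~UINT64_C(0) << start;      /* drop CPUs below start */
    if (mask == 0)
        return NR_CPUS;
    return __builtin_ctzll(mask);       /* jump straight to the next set bit */
}

/* Visit only the CPUs present in mask. */
#define for_each_cpu_sketch(cpu, mask)                               \
    for ((cpu) = next_cpu_from((mask), 0); (cpu) < NR_CPUS;          \
         (cpu) = next_cpu_from((mask), (cpu) + 1))

/* Old style: walk every slot up to NR_CPUS and test each one. */
static long sum_counters_nr_cpus(const long *counters, cpumask_sketch_t mask)
{
    long sum = 0;
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (mask & (UINT64_C(1) << cpu))
            sum += counters[cpu];
    return sum;
}

/* New style: the iteration construct skips absent CPUs itself. */
static long sum_counters_for_each(const long *counters, cpumask_sketch_t mask)
{
    long sum = 0;
    int cpu;
    for_each_cpu_sketch(cpu, mask)
        sum += counters[cpu];
    return sum;
}
```

    Both functions return the same sum; the second one performs one loop
    iteration per existing CPU rather than one per NR_CPUS slot.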
     
  • Policy contextualization is only useful for task based policies and not for
    vma based policies. It may be useful to define allowed nodes that are not
    accessible from this thread because other threads may have access to these
    nodes. Without this patch strange memory policy situations may cause an
    application to fail with out of memory.

    Example:

    Let's say we have two threads A and B that share the same address space and
    a huge computational array X.

    Thread A is restricted by its cpuset to nodes 0 and 1 and thread B is
    restricted by its cpuset to nodes 2 and 3.

    Thread A now wants to restrict allocations to the first node and thus
    applies a BIND policy on X to node 0 and 2. The cpuset limits this to node
    0. Thus pages for X must be allocated on node 0 now.

    Thread B now touches a page that has never been used in X and faults in a
    page. According to the BIND policy of the vma for X the page must be
    allocated on node 0. However, the cpuset of B does not allow allocation on
    nodes 0 and 1. Now the application fails in alloc_pages with out of memory.

    Signed-off-by: Christoph Lameter
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
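
    The thread A / thread B example in the commit message above can be modeled
    with plain bitmasks. This is a toy model under stated assumptions:
    contextualization and the allocation-time cpuset check are both treated as
    a simple intersection, and every name here is hypothetical.

```c
#include <assert.h>

/* Hypothetical node mask: bit n set means node n is allowed. */
typedef unsigned int nodemask_sketch_t;
#define NODE(n) (1u << (n))

/* Contextualization: clip the requested nodes to the caller's cpuset. */
static nodemask_sketch_t contextualize(nodemask_sketch_t requested,
                                       nodemask_sketch_t cpuset)
{
    return requested & cpuset;
}

/* Nodes a faulting thread can actually allocate from for a given vma:
 * the vma's policy nodes that also lie inside that thread's cpuset. */
static nodemask_sketch_t usable_nodes(nodemask_sketch_t vma_policy,
                                      nodemask_sketch_t faulting_cpuset)
{
    return vma_policy & faulting_cpuset;
}

/* An empty result means the allocation has nowhere to go (out of memory). */
static int allocation_fails(nodemask_sketch_t usable)
{
    return usable == 0;
}
```

    With cpuset A = {0,1}, cpuset B = {2,3} and a requested BIND to {0,2}:
    contextualizing the vma policy against A stores {0}, which leaves B with
    no usable node. Storing {0,2} unmodified leaves B with node 2.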
     
  • - Separate the do_xxx and sys_xxx functions. sys_xxx functions
    take variable sized bitmaps from user space as arguments. do_xxx functions
    take fixed sized nodemask_t as arguments and may be used from inside the
    kernel. Doing so simplifies the initialization code. There is no
    fs = kernel_ds assumption anymore.

    - Split up get_nodes into get_nodes (which gets the node list) and
    contextualize_policy which restricts the nodes to those accessible
    to the task and updates cpusets.

    - Add comments explaining limitations of bind policy

    Signed-off-by: Christoph Lameter
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
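
    The do_xxx / sys_xxx split described above follows a common pattern: the
    outer function validates and converts the variable-sized user bitmap, the
    inner one works on a fixed-size mask and can be called directly from kernel
    code. A minimal userspace sketch, with hypothetical names and a made-up
    64-node limit:

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <string.h>

#define MAX_NODES_SKETCH 64   /* assumed node limit for this sketch */
#define BITS_PER_ULONG (sizeof(unsigned long) * CHAR_BIT)
#define NODEMASK_WORDS \
    ((MAX_NODES_SKETCH + BITS_PER_ULONG - 1) / BITS_PER_ULONG)

/* Fixed-size mask, standing in for nodemask_t. */
typedef struct {
    unsigned long bits[NODEMASK_WORDS];
} nodemask_sketch_t;

static nodemask_sketch_t current_policy_nodes;

/* Inner function: fixed-size argument, usable from in-kernel callers. */
static int do_set_policy_sketch(const nodemask_sketch_t *nodes)
{
    current_policy_nodes = *nodes;
    return 0;
}

/* Outer function: takes a bitmap of maxnode bits, as a syscall would. */
static int sys_set_policy_sketch(const unsigned long *umask,
                                 unsigned long maxnode)
{
    nodemask_sketch_t nodes;
    unsigned long nwords = (maxnode + BITS_PER_ULONG - 1) / BITS_PER_ULONG;

    memset(&nodes, 0, sizeof(nodes));
    for (unsigned long i = 0; i < nwords; i++) {
        unsigned long word = umask[i];

        /* Ignore bits at or above maxnode in the final word. */
        if (i == nwords - 1 && maxnode % BITS_PER_ULONG)
            word &= (1UL << (maxnode % BITS_PER_ULONG)) - 1;

        /* Reject nodes that do not fit in the fixed-size mask. */
        if (i >= NODEMASK_WORDS) {
            if (word)
                return -EINVAL;
            continue;
        }
        nodes.bits[i] = word;
    }
    return do_set_policy_sketch(&nodes);
}
```

    An in-kernel caller builds a nodemask_sketch_t and calls the do_ function
    directly, with no need to pretend its buffer came from user space.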
     
  • Here is a set of ppc64 specific patches that at least allow
    compilation/booting with the following configurations:

    FLATMEM
    SPARSEMEM
    SPARSEMEM + MEMORY_HOTPLUG

    Signed-off-by: Mike Kravetz
    Signed-off-by: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • Adds the code necessary for non-NUMA hot-add of highmem to an existing zone
    on i386.

    Signed-off-by: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • From: IWAMOTO Toshihiro
    > I found the tests do not work well with Dave's patchset.
    > I've found the following:
    >
    > - setup_per_zone_pages_min() calls should be added in
    > capture_page_range() and online_pages()
    > - lru_add_drain() should be called before try_to_migrate_pages()

    The following patch deals with the first item.

    Signed-off-by: IWAMOTO Toshihiro
    Signed-off-by: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • This basically keeps us from having to extern __kmalloc_section_memmap().

    The vaddr_in_vmalloc_area() helper could go in a vmalloc header, but that
    header gets hard to work with, because it needs some arch-specific macros.
    Just stick it in here for now, instead of creating another header.

    Signed-off-by: Dave Hansen
    Signed-off-by: Lion Vollnhals
    Signed-off-by: Jiri Slaby
    Signed-off-by: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • This adds generic memory add/remove and supporting functions for memory
    hotplug into a new file as well as a memory hotplug kernel config option.

    Individual architecture patches will follow.

    For now, disable memory hotplug when swsusp is enabled. There's a lot of
    churn there right now. We'll fix it up properly once it calms down.

    Signed-off-by: Matt Tolentino
    Signed-off-by: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • See the "fixup bad_range()" patch for more information, but this actually
    creates the lock that protects code which assumes a zone's size stays
    constant at runtime.

    Signed-off-by: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • pgdat->node_size_lock is basically only needed in one place in the normal
    code: show_mem(), which is the arch-specific sysrq-m printing function.

    Strictly speaking, the architectures not doing memory hotplug do not need this
    locking in show_mem(). However, they are all included for completeness. This
    should also make any future consolidation of all of the implementations a
    little more straightforward.

    This lock is also held in the sparsemem code during a memory removal, as
    sections are invalidated. This is the place where pfn_valid() is made false
    for a memory area that's being removed. The lock is only required when doing
    pfn_valid() operations on memory for which the caller does not already hold
    a reference on the page, such as in show_mem().

    Signed-off-by: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • When doing memory hotplug operations, the size of existing zones can obviously
    change. This means that zone->zone_{start_pfn,spanned_pages} can change.

    There are currently no locks that protect these structure members. However,
    they are rarely accessed at runtime. Outside of swsusp, the only place that I
    can find is bad_range().

    So, split bad_range() up into two pieces: one that needs to be locked and
    another that doesn't.

    Signed-off-by: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
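
    The bad_range() split described above can be illustrated as follows. This
    is a sketch under assumptions: the struct layouts, function names and the
    simple spinlock are hypothetical, and the commit text does not say which
    locking primitive the kernel actually uses for the zone span.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct zone_sketch {
    atomic_flag span_lock;          /* guards the two span fields below */
    unsigned long zone_start_pfn;   /* may change during hotplug */
    unsigned long spanned_pages;    /* may change during hotplug */
    int zone_id;                    /* fixed for the zone's lifetime */
};

struct page_sketch {
    unsigned long pfn;
    int zone_id;
};

static void zone_sketch_init(struct zone_sketch *z, unsigned long start,
                             unsigned long spanned, int id)
{
    atomic_flag_clear(&z->span_lock);
    z->zone_start_pfn = start;
    z->spanned_pages = spanned;
    z->zone_id = id;
}

static void span_lock(struct zone_sketch *z)
{
    while (atomic_flag_test_and_set_explicit(&z->span_lock,
                                             memory_order_acquire))
        ;   /* spin until the holder releases the lock */
}

static void span_unlock(struct zone_sketch *z)
{
    atomic_flag_clear_explicit(&z->span_lock, memory_order_release);
}

/* Hot-add side: resizing the zone happens under the same lock. */
static void zone_sketch_grow(struct zone_sketch *z, unsigned long nr_pages)
{
    span_lock(z);
    z->spanned_pages += nr_pages;
    span_unlock(z);
}

/* Piece 1: reads the resizable span, so it must take the lock. */
static bool page_outside_zone_span(struct zone_sketch *z,
                                   const struct page_sketch *p)
{
    bool outside;

    span_lock(z);
    outside = p->pfn < z->zone_start_pfn ||
              p->pfn >= z->zone_start_pfn + z->spanned_pages;
    span_unlock(z);
    return outside;
}

/* Piece 2: only touches data that never changes, so no lock is needed. */
static bool page_matches_zone(const struct zone_sketch *z,
                              const struct page_sketch *p)
{
    return p->zone_id == z->zone_id;
}

static bool bad_range_sketch(struct zone_sketch *z,
                             const struct page_sketch *p)
{
    return page_outside_zone_span(z, p) || !page_matches_zone(z, p);
}
```

    Keeping the locked piece small limits the cost to the one check that
    really depends on the zone's current size.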
     
  • A little helper that we use in the hotplug code.

    Signed-off-by: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • If a zone is empty at boot-time and then hot-added to later, it needs to run
    the same init code that would have been run on it at boot.

    This patch breaks out zone table and per-cpu-pages functions for use by the
    hotplug code. You can almost see all of the free_area_init_core() function on
    one page now. :)

    Signed-off-by: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • The following series implements memory hot-add for ppc64 and i386. There are
    x86_64 and ia64 implementations that will be submitted shortly as well,
    through the normal maintainers.

    This patch:

    local_mapnr is unused, except for in an alpha header. Keep the alpha one,
    kill the rest.

    Signed-off-by: Dave Hansen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     
  • We had a problem on ppc64 where with more than 4 threads a large system
    wouldn't scale well while faulting in the .text (most of the time was spent
    in the kernel even though it was a compute-intensive userland app). The
    reason is the useless overwrite of the same pte from all CPUs.

    I fixed it this way (verified on an older kernel but the forward port is
    almost identical). This will benefit all archs not just ppc64.

    Signed-off-by: Andrea Arcangeli
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
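
    The idea of avoiding a redundant pte store can be shown in a few lines.
    This is only a sketch of the general technique, not the code from the
    patch; pte_sketch_t and set_pte_if_changed() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical page table entry type. */
typedef uint64_t pte_sketch_t;

/* Store entry only when it differs from what is already there.
 * Returns true if a store was performed. Skipping an identical store
 * keeps the cacheline shared instead of bouncing it between CPUs. */
static bool set_pte_if_changed(pte_sketch_t *ptep, pte_sketch_t entry)
{
    if (*ptep == entry)
        return false;   /* another CPU already installed the same entry */
    *ptep = entry;
    return true;
}
```

    When many threads fault on the same page, only the first call writes;
    the rest just read the entry and return.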
     
  • Basic overcommit checking for hugetlb_file_map() based on an implementation
    used with demand faulting in SLES9.

    Since demand faulting can't guarantee the availability of pages at mmap
    time, this patch implements a basic sanity check to ensure that the number
    of huge pages required to satisfy the mmap are currently available.
    Despite the obvious race, I think it is a good start on doing proper
    accounting. I'd like to work towards an accounting system that mimics the
    semantics of normal pages (especially for the MAP_PRIVATE/COW case). That
    work is underway and builds on what this patch starts.

    Huge page shared memory segments are simpler and still maintain their
    commit on shmget semantics.

    Signed-off-by: Adam Litke
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adam Litke
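
    The sanity check described above amounts to comparing the number of huge
    pages a mapping needs against the number currently free. A minimal sketch,
    assuming a 2 MB huge page size and a hypothetical global counter; nothing
    is reserved, so the race mentioned in the commit message remains:

```c
#include <assert.h>
#include <errno.h>

#define HPAGE_SIZE_SKETCH (2UL * 1024 * 1024)   /* assumed huge page size */

/* Hypothetical counter of huge pages currently free in the pool. */
static unsigned long free_huge_pages_sketch = 8;

/* Return 0 if enough huge pages are free right now to back len bytes,
 * -ENOMEM otherwise. This is a check only: no pages are set aside. */
static int hugetlb_mmap_check_sketch(unsigned long len)
{
    unsigned long needed = len / HPAGE_SIZE_SKETCH +
                           (len % HPAGE_SIZE_SKETCH != 0);

    return needed <= free_huge_pages_sketch ? 0 : -ENOMEM;
}
```

    Because the pages are not reserved, two mappings that each pass the check
    can still exhaust the pool later when they fault.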
     
  • Below is a patch to implement demand faulting for huge pages. The main
    motivation for changing from prefaulting to demand faulting is so that huge
    page memory areas can be allocated according to NUMA policy.

    Thanks to consolidated hugetlb code, switching the behavior requires changing
    only one fault handler. The bulk of the patch just moves the logic from
    hugetlb_prefault() to hugetlb_pte_fault() and find_get_huge_page().

    Signed-off-by: Adam Litke
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adam Litke
     
  • Clean up some repeated code related to HugeTLB. hugetlb_zero_setup would
    have already allocated the file->f_op.

    Signed-off-by: Krishnakumar. R
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Krishnakumar R
     
  • Reformat hugetlbfs_forget_inode and add the missing but harmless
    write_inode_now call. It looks the same as generic_forget_inode now except
    for the call to truncate_hugepages instead of truncate_inode_pages.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • hugetlbfs_do_delete_inode is the same as generic_delete_inode now, so remove
    it in favour of the latter.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • Make hugetlbfs_delete_inode look the same as generic_delete_inode, fixing a
    bunch of missing updates to it at the same time. Rename it to
    hugetlbfs_do_delete_inode and add a real hugetlbfs_delete_inode that
    implements ->delete_inode.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • Move hugetlbfs accounting into ->alloc_inode / ->destroy_inode. This keeps
    the code simpler, fixes a leak where a failing inode allocation wouldn't
    decrement the counter and moves hugetlbfs_delete_inode and
    hugetlbfs_forget_inode closer to their generic counterparts.

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     
  • Updated several references to page_table_lock in common code comments.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • A couple of oddities were guarded by page_table_lock, no longer properly
    guarded when that is split.

    The mm_counters of file_rss and anon_rss: make those an atomic_t, or an
    atomic64_t if the architecture supports it, in such a case. Definitions by
    courtesy of Christoph Lameter: who spent considerable effort on more scalable
    ways of counting, but found insufficient benefit in practice.

    And adding an mm with swap to the mmlist for swapoff: the list is well-
    guarded by its own lock, but the list_empty check now has to be repeated
    inside it.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
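
    The "repeat the list_empty check inside the lock" point above is the usual
    check, lock, re-check pattern. A userspace sketch with hypothetical names;
    the list, the spinlock and add_mm_to_mmlist() are illustrative and not the
    kernel's own code:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal circular list; a node that points at itself is not on any list. */
struct list_node {
    struct list_node *prev;
    struct list_node *_Atomic next;  /* atomic so the unlocked peek is defined */
};

struct mm_sketch {
    struct list_node mmlist;
};

static struct list_node mmlist_head = { &mmlist_head, &mmlist_head };
static atomic_flag mmlist_lock = ATOMIC_FLAG_INIT;

static void node_init(struct list_node *n)
{
    n->prev = n;
    n->next = n;
}

static bool node_unlinked(struct list_node *n)
{
    return n->next == n;
}

/* Insert n right after head; the caller must hold mmlist_lock. */
static void list_add_after(struct list_node *n, struct list_node *head)
{
    struct list_node *first = head->next;

    n->prev = head;
    n->next = first;
    first->prev = n;
    head->next = n;
}

/* Put mm on the global list exactly once. Returns true if this call added it. */
static bool add_mm_to_mmlist(struct mm_sketch *mm)
{
    bool added = false;

    if (!node_unlinked(&mm->mmlist))    /* cheap unlocked hint */
        return false;

    while (atomic_flag_test_and_set_explicit(&mmlist_lock,
                                             memory_order_acquire))
        ;   /* spin */
    if (node_unlinked(&mm->mmlist)) {   /* authoritative re-check under lock */
        list_add_after(&mm->mmlist, &mmlist_head);
        added = true;
    }
    atomic_flag_clear_explicit(&mmlist_lock, memory_order_release);
    return added;
}
```

    Without the second check, two callers could both pass the unlocked test
    and both insert the same node, corrupting the list.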
     
  • Christoph Lameter demonstrated very poor scalability on the SGI 512-way, with
    a many-threaded application which concurrently initializes different parts of
    a large anonymous area.

    This patch corrects that, by using a separate spinlock per page table page, to
    guard the page table entries in that page, instead of using the mm's single
    page_table_lock. (But even then, page_table_lock is still used to guard page
    table allocation, and anon_vma allocation.)

    In this implementation, the spinlock is tucked inside the struct page of the
    page table page: with a BUILD_BUG_ON in case it overflows - which it would in
    the case of 32-bit PA-RISC with spinlock debugging enabled.

    Splitting the lock is not quite for free: another cacheline access. Ideally,
    I suppose we would use split ptlock only for multi-threaded processes on
    multi-cpu machines; but deciding that dynamically would have its own costs.
    So for now enable it by config, at some number of cpus - since the Kconfig
    language doesn't support inequalities, let preprocessor compare that with
    NR_CPUS. But I don't think it's worth being user-configurable: for good
    testing of both split and unsplit configs, split now at 4 cpus, and perhaps
    change that to 8 later.

    There is a benefit even for singly threaded processes: kswapd can be attacking
    one part of the mm while another part is busy faulting.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
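
    The compile-time choice between one lock per mm and one lock per page
    table page, decided by comparing NR_CPUS with a threshold in the
    preprocessor, can be sketched like this. All names and the NR_CPUS value
    are assumptions for illustration; only the threshold of 4 comes from the
    commit text.

```c
#include <assert.h>
#include <stdatomic.h>

#ifndef NR_CPUS
#define NR_CPUS 8                       /* assumed build-time CPU limit */
#endif
#define SPLIT_PTLOCK_CPUS_SKETCH 4      /* threshold named in the commit text */

/* The preprocessor does the comparison that Kconfig cannot express. */
#define USE_SPLIT_PTLOCK_SKETCH (NR_CPUS >= SPLIT_PTLOCK_CPUS_SKETCH)

/* Stands in for the struct page of a page table page. */
struct pt_page_sketch {
#if USE_SPLIT_PTLOCK_SKETCH
    atomic_flag ptl;                    /* per page table page lock */
#endif
    unsigned long ptes[4];
};

struct mm_sketch {
    atomic_flag page_table_lock;        /* still used for table allocation */
    struct pt_page_sketch tables[2];
};

/* Return the lock that guards the entries of page table page pt. */
static atomic_flag *pte_lockptr_sketch(struct mm_sketch *mm,
                                       struct pt_page_sketch *pt)
{
#if USE_SPLIT_PTLOCK_SKETCH
    (void)mm;
    return &pt->ptl;
#else
    (void)pt;
    return &mm->page_table_lock;
#endif
}
```

    Callers always go through the helper, so the same code builds in both
    configurations and only the lock granularity changes.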
     
  • In worrying over the various pte operations in different architectures, I came
    across some unused functions in UML: remove mprotect_kernel_vm,
    protect_vm_page and addr_pte.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • There's usually a good reason when a pte is examined without the lock; but it
    makes me nervous when the pointer is dereferenced more than once.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
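
    The concern above is that two unlocked reads of the same pte may see two
    different values. Taking one snapshot and working on the copy avoids that.
    A sketch with hypothetical names, assuming a word-sized pte, a present bit
    in bit 0 and a 12-bit page shift:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical word-sized page table entry. */
typedef unsigned long pte_sketch_t;

#define PTE_PRESENT_SKETCH 1UL      /* assumed "present" flag */
#define PAGE_SHIFT_SKETCH  12       /* assumed page shift */

/* Read the entry exactly once, then test and decode the local copy.
 * Reading *ptep a second time could observe a value that another CPU
 * changed in between, making the check and the decode disagree. */
static bool pte_present_pfn(const volatile pte_sketch_t *ptep,
                            unsigned long *pfn_out)
{
    pte_sketch_t pte = *ptep;       /* the only read of the shared entry */

    if (!(pte & PTE_PRESENT_SKETCH))
        return false;
    *pfn_out = pte >> PAGE_SHIFT_SKETCH;
    return true;
}
```

    The result may still be stale by the time it is used, but it is at least
    internally consistent.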
     
  • The cris v32 switch_mm guards get_mmu_context with next->page_table_lock: good
    it's not really SMP yet, since get_mmu_context messes with global variables
    affecting other mms. Replace by global mmu_context_lock.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins