26 Sep, 2006

2 commits

  • Remove the atomic counter for slab_reclaim_pages and replace the counter
    and NR_SLAB with two ZVC counters that account for unreclaimable and
    reclaimable slab pages: NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE.

    Change the check in vmscan.c to refer to NR_SLAB_RECLAIMABLE. The
    intent seems to be to check for slab pages that could be freed.
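
    A minimal sketch of the kind of check this enables, assuming the
    zoned-VM-counter API of this era (the helper itself is illustrative):

        /* Only reclaimable slab pages can be shrunk back; the
         * unreclaimable counter is deliberately left out. */
        static int zone_has_freeable_slab(struct zone *zone)
        {
                return zone_page_state(zone, NR_SLAB_RECLAIMABLE) > 0;
        }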

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Tracking of dirty pages in shared writeable mmap()s.

    The idea is simple: write protect clean shared writeable pages, catch the
    write-fault, make writeable and set dirty. On page write-back clean all the
    PTE dirty bits and write protect them once again.

    The implementation is a tad harder, mainly because the default
    backing_dev_info capabilities were too loosely maintained. Hence it is not
    enough to test the backing_dev_info for cap_account_dirty.

    The current heuristic is as follows; a VMA is eligible when (a sketch of
    the combined test follows the list):
    - it's shared writeable
    (vm_flags & (VM_WRITE|VM_SHARED)) == (VM_WRITE|VM_SHARED)
    - it is not a 'special' mapping
    (vm_flags & (VM_PFNMAP|VM_INSERTPAGE)) == 0
    - the backing_dev_info is cap_account_dirty
    mapping_cap_account_dirty(vma->vm_file->f_mapping)
    - f_op->mmap() didn't change the default page protection
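
    A sketch of that combined test, assuming the flag and helper names
    quoted above (the function name itself is illustrative):

        static int vma_wants_writenotify_sketch(struct vm_area_struct *vma)
        {
                unsigned long vm_flags = vma->vm_flags;

                if ((vm_flags & (VM_WRITE|VM_SHARED)) != (VM_WRITE|VM_SHARED))
                        return 0;       /* not shared writeable */
                if (vm_flags & (VM_PFNMAP|VM_INSERTPAGE))
                        return 0;       /* 'special' mapping */
                if (!vma->vm_file ||
                    !mapping_cap_account_dirty(vma->vm_file->f_mapping))
                        return 0;       /* bdi doesn't account dirty pages */
                /* and f_op->mmap() must not have changed vm_page_prot: */
                return pgprot_val(vma->vm_page_prot) ==
                       pgprot_val(protection_map[vm_flags &
                                (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED)]);
        }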

    Pages from remap_pfn_range() are explicitly excluded because their COW
    semantics are already horrid enough (see vm_normal_page() in do_wp_page()) and
    because they don't have a backing store anyway.

    mprotect() is taught about the new behaviour as well. However, it
    overrides the last condition.

    Cleaning the pages on write-back is done with page_mkclean(), a new rmap
    call. It can be called on any page, but is currently only implemented for
    mapped pages; if the page is found to belong to a VMA that accounts dirty
    pages, it will also wrprotect the PTE.
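
    A minimal sketch of the write-back-side use, assuming the return
    convention described here (the surrounding caller is illustrative):

        /* before starting write-out: clean and wrprotect the PTEs, and
         * fold any PTE dirtiness back into the page's dirty state */
        if (page_mapped(page) && mapping_cap_account_dirty(mapping)) {
                if (page_mkclean(page))
                        set_page_dirty(page);
        }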

    Finally, in fs/buffer.c:try_to_free_buffers(), remove clear_page_dirty()
    from under ->private_lock. This seems to be safe, since ->private_lock is
    used to serialize access to the buffers, not the page itself. This is
    needed because clear_page_dirty() will call into page_mkclean() and would
    thereby violate the locking order.

    [dhowells@redhat.com: Provide a page_mkclean() implementation for NOMMU]
    Signed-off-by: Peter Zijlstra
    Cc: Hugh Dickins
    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

23 Sep, 2006

1 commit

  • * master.kernel.org:/pub/scm/linux/kernel/git/davej/agpgart:
    [AGPGART] Rework AGPv3 modesetting fallback.
    [AGPGART] Add suspend callback for i965
    [AGPGART] Fix number of aperture sizes in 830 gart structs.
    [AGPGART] Intel 965 Express support.
    [AGPGART] agp.h: constify struct agp_bridge_data::version
    [AGPGART] const'ify VIA AGP PCI table.
    [AGPGART] CONFIG_PM=n slim: drivers/char/agp/intel-agp.c
    [AGPGART] CONFIG_PM=n slim: drivers/char/agp/efficeon-agp.c
    [AGPGART] Const'ify the agpgart driver version.
    [AGPGART] remove private page protection map

    Linus Torvalds
     

08 Sep, 2006

1 commit

  • This prevents cross-region mappings on IA64 and SPARC, which could lead
    to a system crash. They were correctly trapped for normal mmap() calls,
    but not for the kernel-internal calls generated by executable loading.

    This code just moves the architecture-specific cross-region checks into
    an arch-specific "arch_mmap_check()" macro, and defines that for the
    architectures that needed it (ia64, sparc and sparc64).

    Architectures that don't have any special requirements can just ignore
    the new cross-region check, since the mmap() code will just notice on
    its own when the macro isn't defined.
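
    A sketch of the resulting pattern, using the names given above (the
    common-path call site is illustrative):

        /* architectures without special requirements leave the macro
         * undefined and the check compiles away */
        #ifndef arch_mmap_check
        #define arch_mmap_check(addr, len, flags)       (0)
        #endif

        /* early in the common mmap path: */
        error = arch_mmap_check(addr, len, flags);
        if (error)
                return error;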

    Signed-off-by: Pavel Emelianov
    Signed-off-by: Kirill Korotaev
    Acked-by: David Miller
    Signed-off-by: Greg Kroah-Hartman
    [ Cleaned up to not affect architectures that don't need it ]
    Signed-off-by: Linus Torvalds

    Kirill Korotaev
     

27 Jul, 2006

1 commit


01 Jul, 2006

1 commit

  • Currently a single atomic variable is used to establish the size of the page
    cache in the whole machine. The zoned VM counters have the same method of
    implementation as the nr_pagecache code but also allow the determination of
    the pagecache size per zone.

    Remove the special implementation for nr_pagecache and make it a zoned counter
    named NR_FILE_PAGES.

    Updates of the page cache counters are always performed with interrupts off.
    We can therefore use the __ variant here.
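
    A minimal sketch of the accounting this enables, assuming the
    zoned-VM-counter update helpers (the call sites are illustrative):

        /* interrupts are already off here, so the cheaper __ variant
         * (no local_irq save/restore) is safe */
        __inc_zone_page_state(page, NR_FILE_PAGES);     /* page added   */
        __dec_zone_page_state(page, NR_FILE_PAGES);     /* page removed */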

    Signed-off-by: Christoph Lameter
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     

23 Jun, 2006

1 commit

  • Add a new VMA operation to notify a filesystem or other driver about the
    MMU generating a fault because userspace attempted to write to a page
    mapped through a read-only PTE.

    This facility permits the filesystem or driver to:

    (*) Implement storage allocation/reservation on attempted write, and so to
    deal with problems such as ENOSPC more gracefully (perhaps by generating
    SIGBUS).

    (*) Delay making the page writable until the contents have been written to a
    backing cache. This is useful for NFS/AFS when using FS-Cache/CacheFS.
    It permits the filesystem to have some guarantee about the state of the
    cache.

    (*) Account for and limit the number of dirty pages. This is one piece of
    the puzzle needed to make shared writable mappings work safely in FUSE.

    Needed by cachefs (Or is it cachefiles? Or fscache?).

    At least four other groups have stated an interest in it or a desire to use
    the functionality it provides: FUSE, OCFS2, NTFS and JFFS2. Also, things like
    EXT3 really ought to use it to deal with the case of shared-writable mmap
    encountering ENOSPC before we permit the page to be dirtied.
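
    A sketch of how a filesystem might hook the new operation, using the
    signature described above; myfs_reserve_block() and the ops table are
    hypothetical:

        static int myfs_page_mkwrite(struct vm_area_struct *vma,
                                     struct page *page)
        {
                /* Reserve backing store now, so the faulting process
                 * sees the failure (e.g. as SIGBUS) instead of hitting
                 * ENOSPC at some later write-back. */
                if (!myfs_reserve_block(vma->vm_file, page))
                        return -ENOMEM;
                return 0;
        }

        static struct vm_operations_struct myfs_vm_ops = {
                .nopage       = filemap_nopage,         /* era-appropriate */
                .page_mkwrite = myfs_page_mkwrite,
        };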

    From: Peter Zijlstra

    get_user_pages(.write=1, .force=1) can generate COW hits on read-only
    shared mappings; this patch traps those as page_mkwrite candidates and
    fails to handle them the old way.

    Signed-off-by: David Howells
    Cc: Miklos Szeredi
    Cc: Joel Becker
    Cc: Mark Fasheh
    Cc: Anton Altaparmakov
    Cc: David Woodhouse
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     

11 Apr, 2006

2 commits

  • This patch is an enhancement of the OVERCOMMIT_GUESS algorithm in
    __vm_enough_memory() in mm/mmap.c.

    When the OVERCOMMIT_GUESS algorithm calculates the number of free pages,
    it now subtracts the number of reserved pages from the result of
    nr_free_pages().
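
    A one-line sketch of the adjustment, assuming a global count of
    reserved pages (the variable name is an assumption):

        free = nr_free_pages();
        free -= totalreserve_pages;     /* exclude kernel reserves */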

    Signed-off-by: Hideo Aoki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hideo AOKI
     
  • The code compares newbrk with oldbrk, which are page aligned, before
    checking the memory limit set for the data segment. If the memory limit
    is not page aligned, this bypasses the limit test whenever the
    allocation stays within the current page.
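
    A sketch of the reordered check in sys_brk()-style code (illustrative):

        /* check RLIMIT_DATA against the unaligned request first ... */
        rlim = current->signal->rlim[RLIMIT_DATA].rlim_cur;
        if (rlim < RLIM_INFINITY && brk - mm->start_data > rlim)
                goto out;

        /* ... and only then short-circuit same-page growth */
        newbrk = PAGE_ALIGN(brk);
        oldbrk = PAGE_ALIGN(mm->brk);
        if (oldbrk == newbrk)
                goto set_brk;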

    Signed-off-by: Ram Gupta
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ram Gupta
     

01 Apr, 2006

1 commit


26 Mar, 2006

1 commit


22 Mar, 2006

1 commit

  • Now that it's madvisable, remove two pieces of VM_DONTCOPY bogosity:

    1. There was and is no logical reason why VM_DONTCOPY should be in the
    list of flags which forbid vma merging (and those drivers which set
    it are also setting VM_IO, which itself forbids the merge).

    2. It's hard to understand the purpose of the VM_HUGETLB, VM_DONTCOPY
    block in vm_stat_account: but never mind, it's under CONFIG_HUGETLB,
    which (unlike CONFIG_HUGETLB_PAGE or CONFIG_HUGETLBFS) has never been
    defined.

    Signed-off-by: Hugh Dickins
    Cc: William Lee Irwin III
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

12 Jan, 2006

1 commit

  • - Move capable() from sched.h to capability.h;

    - Use <linux/capability.h> where capable() is used
    (in include/, block/, ipc/, kernel/, a few drivers/,
    mm/, security/, & sound/;
    many more drivers/ to go)

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy.Dunlap
     

17 Dec, 2005

1 commit


23 Nov, 2005

1 commit

  • The PageReserved removal in 2.6.15-rc1 issued a "deprecated" message when you
    tried to mmap or mprotect MAP_PRIVATE PROT_WRITE a VM_RESERVED, and failed
    with -EACCES: because do_wp_page lacks the refinement to COW pages in those
    areas, nor do we expect to find anonymous pages in them; and it seemed just
    bloat to add code for handling such a peculiar case. But immediately it
    caused vbetool and ddcprobe (using lrmi) to fail.

    So revert the "deprecated" messages, letting mmap and mprotect succeed. But
    leave do_wp_page's BUG_ON(vma->vm_flags & VM_RESERVED) in place until we've
    added the code to do it right: so this particular patch is only good if the
    app doesn't really need to write to that private area.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

19 Nov, 2005

1 commit


07 Nov, 2005

1 commit


31 Oct, 2005

1 commit

  • This patch is a rewrite of the one submitted on October 1st, using modules
    (http://marc.theaimsgroup.com/?l=linux-kernel&m=112819093522998&w=2).

    This rewrite adds a tristate CONFIG_RCU_TORTURE_TEST, which enables an
    intense torture test of the RCU infrastructure. This is needed due to the
    continued changes to the RCU infrastructure to accommodate dynamic ticks,
    CPU hotplug, realtime, and so on. Most of the code is in a separate file
    that is compiled only if the CONFIG variable is set. Documentation on how
    to run the test and interpret the output is also included.

    This code has been tested on i386 and ppc64, and an earlier version of the
    code has received extensive testing on a number of architectures as part of
    the PREEMPT_RT patchset.

    Signed-off-by: "Paul E. McKenney"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Paul E. McKenney
     

30 Oct, 2005

9 commits

  • Remove the page_table_lock from around the calls to unmap_vmas, and replace
    the pte_offset_map in zap_pte_range by pte_offset_map_lock: all callers are
    now safe to descend without page_table_lock.

    Don't attempt fancy locking for hugepages, just take page_table_lock in
    unmap_hugepage_range. Which makes zap_hugepage_range, and the hugetlb test in
    zap_page_range, redundant: unmap_vmas calls unmap_hugepage_range anyway. Nor
    does unmap_vmas have much use for its mm arg now.

    The tlb_start_vma and tlb_end_vma in unmap_page_range are now called without
    page_table_lock: if they're implemented at all, they typically come down to
    flush_cache_range (usually done outside page_table_lock) and flush_tlb_range
    (which we already audited for the mprotect case).
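
    A sketch of the pte_offset_map_lock pattern this relies on (the loop
    body is illustrative):

        spinlock_t *ptl;
        pte_t *pte;

        pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
        do {
                /* zap one entry at a time under the pte lock */
        } while (pte++, addr += PAGE_SIZE, addr != end);
        pte_unmap_unlock(pte - 1, ptl);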

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • In most places the descent from pgd to pud to pmd to pte holds mmap_sem
    (exclusively or not), which ensures that free_pgtables cannot be freeing page
    tables from any level at the same time. But truncation and reverse mapping
    descend without mmap_sem.

    No problem: just make sure that a vma is unlinked from its prio_tree (or
    nonlinear list) and from its anon_vma list, after zapping the vma, but before
    freeing its page tables. Then neither vmtruncate nor rmap can reach that vma
    whose page tables are now volatile (nor do they need to reach it, since all
    its page entries have been zapped by this stage).

    The i_mmap_lock and anon_vma->lock already serialize this correctly; but the
    locking hierarchy is such that we cannot take them while holding
    page_table_lock. Well, we're trying to push that down anyway. So in this
    patch, move anon_vma_unlink and unlink_file_vma into free_pgtables, at the
    same time as moving page_table_lock around calls to unmap_vmas.

    tlb_gather_mmu and tlb_finish_mmu then fall outside the page_table_lock, but
    we made them preempt_disable and preempt_enable earlier; and a long source
    audit of all the architectures has shown no problem with removing
    page_table_lock from them. free_pgtables doesn't need page_table_lock for
    itself, nor for what it calls; tlb->mm->nr_ptes is usually protected by
    page_table_lock, but partly by non-exclusive mmap_sem - here it's decremented
    with exclusive mmap_sem, or mm_users 0. update_hiwater_rss and
    vm_unacct_memory don't need page_table_lock either.
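
    A sketch of the resulting ordering inside free_pgtables() (the loop
    shape is illustrative):

        while (vma) {
                anon_vma_unlink(vma);   /* hide the vma from rmap */
                unlink_file_vma(vma);   /* ... and from vmtruncate */
                /* only now free this vma's page tables */
                vma = vma->vm_next;
        }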

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • ia64 has expand_backing_store function for growing its Register Backing Store
    vma upwards. But more complete code for this purpose is found in the
    CONFIG_STACK_GROWSUP part of mm/mmap.c. Uglify its #ifdefs further to provide
    expand_upwards for ia64 as well as expand_stack for parisc.

    The Register Backing Store vma should be marked VM_ACCOUNT. Implement the
    intention of growing it only a page at a time, instead of passing an address
    outside of the vma to handle_mm_fault, with unknown consequences.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • update_mem_hiwater has attracted various criticisms, in particular from
    those concerned with mm scalability. Originally it was called whenever rss
    or total_vm got raised. Then many of those callsites were replaced by a
    timer tick call from account_system_time. Now Frank van Maarseveen
    reports that to be inadequate. How about this? Works for Frank.

    Replace update_mem_hiwater, a poor combination of two unrelated ops, by macros
    update_hiwater_rss and update_hiwater_vm. Don't attempt to keep
    mm->hiwater_rss up to date at timer tick, nor every time we raise rss (usually
    by 1): those are hot paths. Do the opposite, update only when about to lower
    rss (usually by many), or just before final accounting in do_exit. Handle
    mm->hiwater_vm in the same way, though it's much less of an issue. Demand
    that whoever collects these hiwater statistics do the work of taking the
    maximum with rss or total_vm.

    And there has been no collector of these hiwater statistics in the tree. The
    new convention needs an example, so match Frank's usage by adding a VmPeak
    line above VmSize to /proc/<pid>/status, and also a VmHWM line above VmRSS
    (High-Water-Mark or High-Water-Memory).

    There was a particular anomaly during mremap move, that hiwater_vm might be
    captured too high. A fleeting such anomaly remains, but it's quickly
    corrected now, whereas before it would stick.

    What locking? None: if the app is racy then these statistics will be racy,
    it's not worth any overhead to make them exact. But whenever it suits,
    hiwater_vm is updated under exclusive mmap_sem, and hiwater_rss under
    page_table_lock (for now) or with preemption disabled (later on): without
    going to any trouble, minimize the time between reading current values and
    updating, to minimize those occasions when a racing thread bumps a count up
    and back down in between.
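
    A sketch of the new macros and their intended call point, assuming
    get_mm_rss() as the rss accessor (the definition is illustrative):

        #define update_hiwater_rss(mm)  do {                    \
                unsigned long _rss = get_mm_rss(mm);            \
                if ((mm)->hiwater_rss < _rss)                   \
                        (mm)->hiwater_rss = _rss;               \
        } while (0)

        /* called just before rss is about to drop, e.g. ahead of
         * unmap_vmas(), or just before final accounting in do_exit() */
        update_hiwater_rss(mm);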

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Remove PageReserved() calls from core code by tightening VM_RESERVED
    handling in mm/ to cover PageReserved functionality.

    PageReserved special casing is removed from get_page and put_page.

    All setting and clearing of PageReserved is retained, and it is now flagged
    in the page_alloc checks to help ensure we don't introduce any refcount
    based freeing of Reserved pages.

    MAP_PRIVATE, PROT_WRITE of VM_RESERVED regions is tentatively being
    deprecated. We never handled it completely correctly anyway, and it can
    be reintroduced in future if required (Hugh has a proof of concept).

    Once PageReserved() calls are removed from kernel/power/swsusp.c, and all
    arch/ and driver code, the Set and Clear calls, and the PG_reserved bit can
    be trivially removed.

    Last real user of PageReserved is swsusp, which uses PageReserved to
    determine whether a struct page points to valid memory or not. This still
    needs to be addressed (a generic page_is_ram() should work).

    A last caveat: the ZERO_PAGE is now refcounted and managed with rmap (and
    thus mapcounted and counted towards shared rss). These writes to the struct
    page could cause excessive cacheline bouncing on big systems. There are a
    number of ways this could be addressed if it is an issue.

    Signed-off-by: Nick Piggin

    Refcount bug fix for filemap_xip.c

    Signed-off-by: Carsten Otte
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • exit_mmap resets various mm_struct fields, but the mm is well on its way out,
    and none of those fields matter by this point.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Divide remove_vm_struct into two parts: first anon_vma_unlink plus
    unlink_file_vma, to unlink the vma from the list and tree by which rmap or
    vmtruncate might find it; then remove_vma to close, fput and free.

    The intention here is to do the anon_vma_unlink and unlink_file_vma earlier,
    in free_pgtables before freeing any page tables: so we can be sure that any
    page tables traversed by rmap and vmtruncate are stable (and other, ordinary
    cases are stabilized by holding mmap_sem).

    This will be crucial to traversing pgd,pud,pmd without page_table_lock. But
    testing the split-out patch showed that lifting the page_table_lock is
    symbiotically necessary to make this change - the lock ordering is wrong to
    move those unlinks into free_pgtables while it's under ptlock.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • unmap_vma doesn't amount to much, let's put it inside unmap_vma_list. Except
    it doesn't unmap anything, unmap_region just did the unmapping: rename it to
    remove_vma_list.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • The original vm_stat_account has fallen into disuse, with only one user, and
    only one user of vm_stat_unaccount. It's easier to keep track if we convert
    them all to __vm_stat_account, then free it from its __shackles.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

22 Sep, 2005

1 commit


15 Sep, 2005

1 commit

  • Pavel Emelianov and Kirill Korotaev observe that fs and arch users of
    security_vm_enough_memory tend to forget to vm_unacct_memory when a
    failure occurs further down (typically in setup_arg_pages variants).

    These are all users of insert_vm_struct, and that reservation will only
    be unaccounted on exit if the vma is marked VM_ACCOUNT: which in some
    cases it is (hidden inside VM_STACK_FLAGS) and in some cases it isn't.

    So x86_64 32-bit and ppc64 vDSO ELFs have been leaking memory into
    Committed_AS each time they're run. But don't add VM_ACCOUNT to them,
    it's inappropriate to reserve against the very unlikely case that gdb
    be used to COW a vDSO page - we ought to do something about that in
    do_wp_page, but there are yet other inconsistencies to be resolved.

    The safe and economical way to fix this is to let insert_vm_struct do
    the security_vm_enough_memory check when it finds VM_ACCOUNT is set.

    And the MIPS irix_brk has been calling security_vm_enough_memory before
    calling do_brk which repeats it, doubly accounting and so also leaking.
    Remove that, and all the fs and arch calls to security_vm_enough_memory:
    give it a less misleading name later on.
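
    A sketch of the check moving into insert_vm_struct() (illustrative):

        /* Reserve commit charge here, so every VM_ACCOUNT vma inserted
         * this way is guaranteed to be unaccounted again on exit. */
        if ((vma->vm_flags & VM_ACCOUNT) &&
            security_vm_enough_memory((vma->vm_end - vma->vm_start)
                                        >> PAGE_SHIFT))
                return -ENOMEM;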

    Signed-off-by: Hugh Dickins
    Signed-Off-By: Kirill Korotaev
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

08 Sep, 2005

2 commits


05 Aug, 2005

1 commit

  • We have found what seems to be a small bug in __vm_enough_memory() when
    sysctl_overcommit_memory is set to OVERCOMMIT_NEVER.

    When this bug occurs the system fails to boot, with /sbin/init whining
    about fork() returning ENOMEM.

    We hunted down the problem to this:

    The deferred update mechanism used in vm_acct_memory(), on an SMP system,
    allows the vm_committed_space counter to have a negative value.

    This should not be a problem since this counter is known to be inaccurate.

    But in __vm_enough_memory() this counter is compared to the `allowed'
    variable, which is an unsigned long. This comparison is broken since it
    will consider the negative values of vm_committed_space to be huge positive
    values, resulting in a memory allocation failure.
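
    A sketch of the failure mode, with illustrative values:

        long committed = -5;            /* transiently negative on SMP */
        unsigned long allowed = 1000;

        /* BROKEN: the signed value is promoted to unsigned for the
         * comparison, so -5 becomes a huge positive number and the
         * allocation is refused despite ample headroom. */
        if ((unsigned long)committed >= allowed)
                return -ENOMEM;

        /* Robust form: keep the comparison in signed arithmetic. */
        if (committed >= (long)allowed)
                return -ENOMEM;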

    Signed-off-by: Simon Derr
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Simon Derr
     

22 Jun, 2005

2 commits

  • The topdown changes in 2.6.12-rc1 can cause large allocations with a
    large stack limit to fail, despite there being space available. The
    expression mmap_base - len is only valid when len <= mmap_base, but
    nothing in the topdown allocator checks this. It's only (now) caught at
    a higher level, which causes the allocation to simply fail. The
    following change restores the fallback to the bottom-up path, which
    allows large allocations with a large stack limit to succeed again.
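
    A sketch of the restored fallback (illustrative):

        /* a request larger than the whole top-down window can never
         * fit below mmap_base; retry bottom-up instead of failing */
        if (len > mm->mmap_base)
                return arch_get_unmapped_area(filp, addr, len, pgoff, flags);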

    Signed-off-by: Chris Wright
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Wright
     
  • Ingo recently introduced a great speedup for allocating new mmaps using the
    free_area_cache pointer which boosts the specweb SSL benchmark by 4-5% and
    causes huge performance increases in thread creation.

    The downside of this patch is that it does lead to fragmentation in the
    mmap-ed areas (visible via /proc/self/maps), such that some applications
    that work fine under 2.4 kernels quickly run out of memory on any 2.6
    kernel.

    The problem is twofold:

    1) the free_area_cache is used to continue a search for memory where
    the last search ended. Before the change new areas were always
    searched from the base address on.

    So now new small areas are cluttering holes of all sizes
    throughout the whole mmap-able region whereas before small holes
    tended to close holes near the base leaving holes far from the base
    large and available for larger requests.

    2) the free_area_cache also is set to the location of the last
    munmap-ed area so in scenarios where we allocate e.g. five regions of
    1K each, then free regions 4 2 3 in this order the next request for 1K
    will be placed in the position of the old region 3, whereas before we
    appended it to the still active region 1, placing it at the location
    of the old region 2. Before we had 1 free region of 2K, now we only
    get two free regions of 1K -> fragmentation.

    The patch addresses these issues by introducing yet another cache
    descriptor, cached_hole_size, that contains the largest known hole size
    below the current free_area_cache. If a new request comes in, the size
    is compared against cached_hole_size and, if the request can be filled
    with a hole below free_area_cache, the search is started from the base
    instead.
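
    A sketch of the resulting search loop in an arch_get_unmapped_area()-
    style allocator (condensed from the description; illustrative):

        if (len <= mm->cached_hole_size) {
                mm->cached_hole_size = 0;
                mm->free_area_cache = TASK_UNMAPPED_BASE; /* restart low */
        }
        addr = mm->free_area_cache;
        for (vma = find_vma(mm, addr); ; vma = vma->vm_next) {
                if (!vma || addr + len <= vma->vm_start) {
                        mm->free_area_cache = addr + len;
                        return addr;
                }
                if (addr + mm->cached_hole_size < vma->vm_start)
                        mm->cached_hole_size = vma->vm_start - addr;
                addr = vma->vm_end;
        }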

    The results look promising. Whereas 2.6.12-rc4 fragments quickly (my
    earlier posted leakme.c test program terminates after 50000+ iterations
    with 96 distinct and fragmented maps in /proc/self/maps), it performs
    nicely with thread creation, as expected: Ingo's test_str02 with 20000
    threads requires 0.7s system time.

    Taking out Ingo's patch (un-patch available per request) by basically
    deleting all mentions of free_area_cache from the kernel and starting the
    search for new memory always at the respective bases, we observe: leakme
    terminates successfully with 11 distinct, hardly fragmented areas in
    /proc/self/maps, but thread creation is grindingly slow: 30+s(!) system
    time for Ingo's test_str02 with 20000 threads.

    Now - drumroll ;-) the appended patch works fine with leakme: it ends with
    only 7 distinct areas in /proc/self/maps and also thread creation seems
    sufficiently fast with 0.71s for 20000 threads.

    Signed-off-by: Wolfgang Wander
    Credit-to: "Richard Purdie"
    Signed-off-by: Ken Chen
    Acked-by: Ingo Molnar (partly)
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wolfgang Wander
     

20 May, 2005

1 commit


19 May, 2005

1 commit

  • Prevent the topdown allocator from allocating mmap areas all the way
    down to address zero.

    We still allow a MAP_FIXED mapping of page 0 (needed for various things,
    ranging from Wine and DOSEMU to people who want to allow speculative
    loads off a NULL pointer).
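
    A sketch of the guard in the top-down search (illustrative):

        /* never let the unhinted search descend to page 0; MAP_FIXED
         * requests take a different path and may still map it */
        addr = mm->free_area_cache;
        if (addr > len) {
                vma = find_vma(mm, addr - len);
                if (!vma || addr <= vma->vm_start)
                        return addr - len;      /* can never be 0 */
        }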

    Tested by Chris Wright.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

01 May, 2005

2 commits

  • Always use page counts when doing RLIMIT_MEMLOCK checking to avoid possible
    overflow.
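
    A sketch of the page-based check (illustrative):

        unsigned long locked = mm->locked_vm + (len >> PAGE_SHIFT);
        unsigned long lock_limit =
                current->signal->rlim[RLIMIT_MEMLOCK].rlim_cur >> PAGE_SHIFT;

        /* page counts cannot overflow the way byte sums can */
        if (locked > lock_limit && !capable(CAP_IPC_LOCK))
                return -EAGAIN;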

    Signed-off-by: Chris Wright
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chris Wright
     
  • Address bug #4508: there's potential for wraparound in the various places
    where we perform RLIMIT_AS checking.

    (I'm a bit worried about acct_stack_growth(). Are we sure that vma->vm_mm is
    always equal to current->mm? If not, then we're comparing some other
    process's total_vm with the calling process's rlimits).
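
    A sketch of a wraparound-safe RLIMIT_AS check, comparing page counts
    rather than byte sums (the helper name is an assumption):

        static int may_expand_vm(struct mm_struct *mm, unsigned long npages)
        {
                unsigned long cur = mm->total_vm;       /* in pages */
                unsigned long lim;

                lim = current->signal->rlim[RLIMIT_AS].rlim_cur >> PAGE_SHIFT;
                if (cur + npages > lim)
                        return 0;
                return 1;
        }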

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    akpm@osdl.org
     

20 Apr, 2005

2 commits

  • Once all the MMU architectures define FIRST_USER_ADDRESS, remove the hack
    from mmap.c which derived it from FIRST_USER_PGD_NR.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • The patches to free_pgtables by vma left problems on any architectures which
    leave some user address page table entries unencapsulated by vma. Andi has
    fixed the 32-bit vDSO on x86_64 to use a vma. Now fix arm (and arm26), whose
    first PAGE_SIZE is reserved (perhaps) for machine vectors.

    Our calls to free_pgtables must not touch that area, and exit_mmap's
    BUG_ON(nr_ptes) must allow that arm's get_pgd_slow may (or may not) have
    allocated an extra page table, which its free_pgd_slow would free later.

    FIRST_USER_PGD_NR has misled me and others: until all the arches define
    FIRST_USER_ADDRESS instead, a hack in mmap.c derives one from t'other.
    This patch fixes the bugs; the remaining patches just clean it up.
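
    A sketch of the interim hack (the exact form in mmap.c is an
    assumption):

        /* derive the lowest user address from the old per-arch constant
         * until every arch defines FIRST_USER_ADDRESS directly */
        #ifndef FIRST_USER_ADDRESS
        #define FIRST_USER_ADDRESS      (FIRST_USER_PGD_NR * PGDIR_SIZE)
        #endif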

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins