28 Apr, 2008

3 commits

  • This patch renames mpol_copy() to mpol_dup() because, well, that's what it
    does. Like, e.g., strdup() for strings, mpol_dup() takes a pointer to an
    existing mempolicy, allocates a new one and copies the contents.

    In a later patch, I want to use the name mpol_copy() to copy the contents from
    one mempolicy to another like, e.g., strcpy() does for strings.
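
    The naming distinction can be sketched in user space as follows. This is
    only an illustration: struct sketch_policy and both helpers are hypothetical
    stand-ins, not the kernel's struct mempolicy or its actual functions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct mempolicy; the fields are illustrative. */
struct sketch_policy {
	int mode;
	unsigned long nodes;
};

/* "dup" semantics, like strdup(): allocate a new object, then copy into it. */
struct sketch_policy *sketch_policy_dup(const struct sketch_policy *old)
{
	struct sketch_policy *new;

	assert(old != NULL);
	new = malloc(sizeof(*new));
	if (!new)
		return NULL;	/* caller must handle allocation failure */
	*new = *old;
	return new;
}

/* "copy" semantics, like strcpy(): overwrite a destination that already exists. */
void sketch_policy_copy(struct sketch_policy *dst, const struct sketch_policy *src)
{
	assert(dst != NULL && src != NULL);
	*dst = *src;
}
```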

    Signed-off-by: Lee Schermerhorn
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Mel Gorman
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lee Schermerhorn
     
  • This is a change that was requested some time ago by Mel Gorman. Makes sense
    to me, so here it is.

    Note: I retain the name "mpol_free_shared_policy()" because it actually does
    free the shared_policy, which is NOT a reference counted object. However, ...

    The mempolicy object[s] referenced by the shared_policy are reference counted,
    so mpol_put() is used to release the reference held by the shared_policy. The
    mempolicy might not be freed at this time, because some task attached to the
    shared object associated with the shared policy may be in the process of
    allocating a page based on the mempolicy. In that case, the task performing
    the allocation will hold a reference on the mempolicy, obtained via
    mpol_shared_policy_lookup(). The mempolicy will be freed when all tasks
    holding such a reference have called mpol_put() for the mempolicy.
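
    A minimal user-space sketch of that lifetime rule, assuming a plain
    (non-atomic) counter and a hypothetical struct sketch_obj in place of the
    mempolicy. The real code needs atomic operations; the "freed" flag exists
    only so the behaviour can be observed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reference-counted object standing in for a mempolicy. */
struct sketch_obj {
	int refcnt;
	bool freed;	/* set instead of calling free(), to keep the sketch observable */
};

/* Take an extra reference, as a lookup on behalf of an allocating task would. */
void sketch_get(struct sketch_obj *o)
{
	assert(o->refcnt > 0);
	o->refcnt++;
}

/* Drop one reference; the object is released only when the last one goes away. */
void sketch_put(struct sketch_obj *o)
{
	assert(o->refcnt > 0);
	if (--o->refcnt == 0)
		o->freed = true;
}
```

    With one reference held by the shared policy and one by a task in the
    middle of an allocation, the first put leaves the object alive and the
    second one releases it.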

    Signed-off-by: Lee Schermerhorn
    Cc: Christoph Lameter
    Cc: David Rientjes
    Cc: Mel Gorman
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Lee Schermerhorn
     
  • The "if (!file || !vma_merge())" code is not easy to understand; turn it
    into "if (file && vma_merge())". This makes it immediately obvious that
    the subsequent "if (file)" is superfluous.

    As Hugh Dickins pointed out, we can also factor out the ->i_writecount
    corrections, and add a small comment about that.

    Signed-off-by: Oleg Nesterov
    Cc: Miklos Szeredi
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Oleg Nesterov
     

09 Feb, 2008

1 commit

  • Convert special mapping install from nopage to fault.

    Because the "vm_file" is NULL for the special mapping, the generic VM
    code has messed up "vm_pgoff" thinking that it's an anonymous mapping
    and the offset doesn't matter. For that reason, we need to undo the
    vm_pgoff offset that got added into vmf->pgoff.
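
    The arithmetic can be sketched like this, assuming 4 KiB pages and assuming
    the fault offset is computed as the page distance from vm_start plus
    vm_pgoff. The helper names are hypothetical.

```c
#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Page offset as the generic fault path is assumed to compute it. */
unsigned long sketch_fault_pgoff(unsigned long addr, unsigned long vm_start,
				 unsigned long vm_pgoff)
{
	return ((addr - vm_start) >> SKETCH_PAGE_SHIFT) + vm_pgoff;
}

/* Undo the vm_pgoff bias to get an index into the special mapping's pages. */
unsigned long sketch_special_index(unsigned long pgoff, unsigned long vm_pgoff)
{
	return pgoff - vm_pgoff;
}
```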

    [ We _really_ should clean that up - either by making this whole special
    mapping code just use a real file entry rather than that ugly array of
    "struct page" pointers, or by just making the VM code realize that
    even if vm_file is NULL it may not be a regular anonymous mmap.
    - Linus ]

    Signed-off-by: Nick Piggin
    Cc: linux-mm@kvack.org
    Cc: linux-kernel@vger.kernel.org
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

07 Feb, 2008

1 commit

  • There is a check in sys_brk(), that tries to make sure that we do not
    underflow the area that is dedicated to brk heap.

    The check is however wrong, as it assumes that brk area starts immediately
    after the end of the code (+bss), which is wrong for example in
    environments with randomized brk start. The proper way is to check whether
    the address is not below the start_brk address.
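
    The difference between the two checks can be sketched as below. The struct
    is a hypothetical subset of the mm layout, and the "old" predicate reflects
    the assumption described above (heap starts right after the code), not a
    verbatim copy of the previous kernel code.

```c
#include <stdbool.h>

/* Hypothetical subset of the address-space layout relevant to the check. */
struct sketch_layout {
	unsigned long end_code;		/* end of the code segment */
	unsigned long start_brk;	/* actual (possibly randomized) heap start */
};

/* Old-style check: treats everything above the code as valid heap. */
bool sketch_brk_underflows_old(const struct sketch_layout *l, unsigned long brk)
{
	return brk < l->end_code;
}

/* Fixed check: the break may never move below the real heap start. */
bool sketch_brk_underflows_new(const struct sketch_layout *l, unsigned long brk)
{
	return brk < l->start_brk;
}
```

    With a randomized heap start well above the code, a request that lands
    between the two addresses slips past the old check but is rejected by the
    new one.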

    Signed-off-by: Jiri Kosina
    Signed-off-by: Ingo Molnar

    Jiri Kosina
     

06 Feb, 2008

1 commit

  • In order to change the layout of the page tables after an mmap has crossed the
    address space limit of the current page table layout, an architecture hook in
    get_unmapped_area is needed. The arguments are the address of the new mapping
    and its length.

    Cc: Benjamin Herrenschmidt
    Signed-off-by: Martin Schwidefsky
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Schwidefsky
     

04 Feb, 2008

1 commit

  • Drivers that register a ->fault handler, but do not range-check the
    offset argument, must set VM_DONTEXPAND in the vm_flags in order to
    prevent an expanding mremap from overflowing the resource.

    I've audited the tree and attempted to fix these problems (usually by
    adding VM_DONTEXPAND where it is not obvious).
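
    For handlers that do want to range-check instead, the check amounts to
    something like the following sketch. struct sketch_resource and
    sketch_fault_lookup() are hypothetical; a real ->fault handler has a
    different signature and would report a bus error rather than return NULL.

```c
#include <stddef.h>

/* Hypothetical device resource backed by a fixed number of pages. */
struct sketch_resource {
	unsigned long nr_pages;
	int *pages;	/* stand-in for the per-page backing objects */
};

/*
 * Stand-in for the lookup a fault handler performs: returns the backing
 * entry for pgoff, or NULL when the offset lies beyond the resource.
 */
int *sketch_fault_lookup(const struct sketch_resource *res, unsigned long pgoff)
{
	if (pgoff >= res->nr_pages)
		return NULL;	/* an expanded mapping must not reach past the end */
	return &res->pages[pgoff];
}
```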

    Signed-off-by: Nick Piggin
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

30 Jan, 2008

1 commit

  • Randomize the location of the heap (brk) for i386 and x86_64. The heap start
    is randomized within a range from the current brk location up to an offset
    of 0x02000000 for both architectures. This, together with
    pie-executable-randomization.patch and
    pie-executable-randomization-fix.patch, should make the address space
    randomization on i386 and x86_64 complete.
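
    As a rough illustration of the range involved, the sketch below picks a
    page-aligned start between the current brk and brk + 0x02000000. It assumes
    4 KiB pages and takes the random value as a parameter; the kernel's actual
    helper and entropy source are not reproduced here, and overflow near the
    top of the address space is ignored.

```c
#define SK_PAGE_SIZE     4096UL			/* assumption: 4 KiB pages */
#define SK_PAGE_MASK     (~(SK_PAGE_SIZE - 1))
#define SK_BRK_RND_RANGE 0x02000000UL		/* range quoted in the message */

/* Return a page-aligned heap start in [aligned brk, aligned brk + range). */
unsigned long sketch_randomize_brk(unsigned long brk, unsigned long rnd)
{
	unsigned long start = (brk + SK_PAGE_SIZE - 1) & SK_PAGE_MASK;

	return start + ((rnd % SK_BRK_RND_RANGE) & SK_PAGE_MASK);
}
```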

    Arjan says:

    This is known to break older versions of some emacs variants, whose dumper
    code assumed that the last variable declared in the program is equal to the
    start of the dynamically allocated memory region.

    (The dumper is the code where emacs effectively dumps core at the end of its
    compilation stage; this coredump is then loaded as the main program during
    normal use)

    iirc this was 5 years or so; we found this way back when I was at RH and we
    first did the security stuff there (including this brk randomization). It
    wasn't all variants of emacs, and it got fixed as a result (I vaguely remember
    that emacs already had code to deal with it for other archs/oses, just
    ifdeffed wrongly).

    It's a rare and wrong assumption as a general thing, just on x86 it mostly
    happened to be true (but to be honest, it'll break too if gcc does
    something fancy or if the linker does a non-standard order). Still, it's
    something we should at least document.

    Note 2: afaik it only broke the emacs *build*. I'm not 100% sure about that
    (it IS 5 years ago) though.

    [ akpm@linux-foundation.org: deuglification ]

    Signed-off-by: Jiri Kosina
    Cc: Arjan van de Ven
    Cc: Roland McGrath
    Cc: Jakub Jelinek
    Signed-off-by: Andrew Morton
    Signed-off-by: Ingo Molnar
    Signed-off-by: Thomas Gleixner

    Jiri Kosina
     

25 Jan, 2008

1 commit


05 Dec, 2007

3 commits

  • Given a specifically crafted binary do_brk() can be used to get low
    pages available in userspace virtual memory and can thus be used to
    circumvent the mmap_min_addr low memory protection. Add security checks
    in do_brk().

    Signed-off-by: Eric Paris
    Acked-by: Alan Cox
    Signed-off-by: James Morris

    Eric Paris
     
  • If mmap_min_addr is set and a process attempts to mmap (not fixed) with a
    non-null hint address less than mmap_min_addr the mapping will fail the
    security checks. Since this is just a hint address this patch will round
    such a hint address above mmap_min_addr.

    gcj was found to try to be very frugal with vm usage and give hint addresses
    in the 8k-32k range. Without this patch all such programs failed and with
    the patch they happily get a higher address.

    This patch is wrapped in CONFIG_SECURITY since mmap_min_addr doesn't exist
    without it and there would be no security check possible no matter what. So
    we should not bother compiling in this rounding if it is just a waste of
    time.
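
    The rounding can be sketched as below, assuming 4 KiB pages. The function
    name and the explicit min_addr parameter are choices made for this sketch;
    the kernel reads the tunable directly.

```c
#define SK_PAGE_SIZE 4096UL			/* assumption: 4 KiB pages */
#define SK_PAGE_MASK (~(SK_PAGE_SIZE - 1))

/*
 * Page-align the hint; if it is non-null but below min_addr, move it up to
 * the first page boundary at or above min_addr. A null hint stays null.
 */
unsigned long sketch_round_hint(unsigned long hint, unsigned long min_addr)
{
	hint &= SK_PAGE_MASK;
	if (hint != 0 && hint < min_addr)
		return (min_addr + SK_PAGE_SIZE - 1) & SK_PAGE_MASK;
	return hint;
}
```

    A hint in the 8k-32k range with mmap_min_addr at 64k therefore comes back
    as 64k, while a null hint or a hint already above the limit is left alone.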

    Signed-off-by: Eric Paris
    Signed-off-by: James Morris

    Eric Paris
     
  • Add security checks to make sure we are not attempting to expand the
    stack into memory protected by mmap_min_addr

    Signed-off-by: Eric Paris
    Signed-off-by: James Morris

    Eric Paris
     

23 Oct, 2007

1 commit

  • Fix mprotect bug in recent commit 3ed75eb8f1cd89565966599c4f77d2edb086d5b0
    (setup vma->vm_page_prot by vm_get_page_prot()): the vma_wants_writenotify
    case was setting the same prot as when not.

    Nothing wrong with the use of protection_map[] in mmap_region(),
    but use vm_get_page_prot() there too in the same ~VM_SHARED way.

    Signed-off-by: Hugh Dickins
    Cc: Coly Li
    Cc: Tony Luck
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

20 Oct, 2007

1 commit

  • This patch uses vm_get_page_prot() to set up vma->vm_page_prot.

    Though inside vm_get_page_prot() the protection flags are ANDed with
    (VM_READ|VM_WRITE|VM_EXEC|VM_SHARED), it does not hurt correct code.
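
    The masking step can be sketched as follows. The flag values are
    assumptions for this sketch, and the function returns only the table index
    that would be used for a 16-entry protection map, not a real pgprot_t.

```c
/* Flag values assumed for this sketch; real code uses the kernel headers. */
#define SK_VM_READ   0x1UL
#define SK_VM_WRITE  0x2UL
#define SK_VM_EXEC   0x4UL
#define SK_VM_SHARED 0x8UL

/* Mask off everything except the four bits that select a protection entry. */
unsigned long sketch_prot_index(unsigned long vm_flags)
{
	return vm_flags & (SK_VM_READ | SK_VM_WRITE | SK_VM_EXEC | SK_VM_SHARED);
}
```

    Callers that already pass only these four bits get the same index as
    before; any extra flag bits are simply ignored.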

    Signed-off-by: Coly Li
    Cc: Hugh Dickins
    Cc: Tony Luck
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Coly Li
     

17 Oct, 2007

2 commits

  • This patch contains the following cleanups that are now possible:
    - remove the unused security_operations->inode_xattr_getsuffix
    - remove the no longer used security_operations->unregister_security
    - remove some no longer required exit code
    - remove a bunch of no longer used exports

    Signed-off-by: Adrian Bunk
    Acked-by: James Morris
    Cc: Chris Wright
    Cc: Stephen Smalley
    Cc: Serge Hallyn
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adrian Bunk
     
  • mm.h doesn't use directly anything from mutex.h and backing-dev.h, so
    remove them and add them back to files which need them.

    Cross-compile tested on many configs and archs.

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

23 Aug, 2007

1 commit

  • The new exec code inserts an accounted vma into an mm struct which is not
    current->mm. The existing memory check code has a hard coded assumption
    that this does not happen as does the security code.

    As the correct mm is known we pass the mm to the security method and the
    helper function. A new security test is added for the case where we need
    to pass the mm and the existing one is modified to pass current->mm to
    avoid the need to change large amounts of code.

    (Thanks to Tobias for fixing rejects and testing)

    Signed-off-by: Alan Cox
    Cc: WU Fengguang
    Cc: James Morris
    Cc: Tobias Diedrich
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alan Cox
     

30 Jul, 2007

1 commit

  • Remove fs.h from mm.h. For this,
    1) Uninline vma_wants_writenotify(). It's pretty huge anyway.
    2) Add back fs.h or less bloated headers (err.h) to files that need it.

    As a result, on x86_64 allyesconfig, the number of files rebuilt because of
    fs.h dependencies is cut down from 3929 to 3444 (-12.3%).

    Cross-compile tested without regressions on my two usual configs and (sigh):

    alpha arm-mx1ads mips-bigsur powerpc-ebony
    alpha-allnoconfig arm-neponset mips-capcella powerpc-g5
    alpha-defconfig arm-netwinder mips-cobalt powerpc-holly
    alpha-up arm-netx mips-db1000 powerpc-iseries
    arm arm-ns9xxx mips-db1100 powerpc-linkstation
    arm-assabet arm-omap_h2_1610 mips-db1200 powerpc-lite5200
    arm-at91rm9200dk arm-onearm mips-db1500 powerpc-maple
    arm-at91rm9200ek arm-picotux200 mips-db1550 powerpc-mpc7448_hpc2
    arm-at91sam9260ek arm-pleb mips-ddb5477 powerpc-mpc8272_ads
    arm-at91sam9261ek arm-pnx4008 mips-decstation powerpc-mpc8313_rdb
    arm-at91sam9263ek arm-pxa255-idp mips-e55 powerpc-mpc832x_mds
    arm-at91sam9rlek arm-realview mips-emma2rh powerpc-mpc832x_rdb
    arm-ateb9200 arm-realview-smp mips-excite powerpc-mpc834x_itx
    arm-badge4 arm-rpc mips-fulong powerpc-mpc834x_itxgp
    arm-carmeva arm-s3c2410 mips-ip22 powerpc-mpc834x_mds
    arm-cerfcube arm-shannon mips-ip27 powerpc-mpc836x_mds
    arm-clps7500 arm-shark mips-ip32 powerpc-mpc8540_ads
    arm-collie arm-simpad mips-jazz powerpc-mpc8544_ds
    arm-corgi arm-spitz mips-jmr3927 powerpc-mpc8560_ads
    arm-csb337 arm-trizeps4 mips-malta powerpc-mpc8568mds
    arm-csb637 arm-versatile mips-mipssim powerpc-mpc85xx_cds
    arm-ebsa110 i386 mips-mpc30x powerpc-mpc8641_hpcn
    arm-edb7211 i386-allnoconfig mips-msp71xx powerpc-mpc866_ads
    arm-em_x270 i386-defconfig mips-ocelot powerpc-mpc885_ads
    arm-ep93xx i386-up mips-pb1100 powerpc-pasemi
    arm-footbridge ia64 mips-pb1500 powerpc-pmac32
    arm-fortunet ia64-allnoconfig mips-pb1550 powerpc-ppc64
    arm-h3600 ia64-bigsur mips-pnx8550-jbs powerpc-prpmc2800
    arm-h7201 ia64-defconfig mips-pnx8550-stb810 powerpc-ps3
    arm-h7202 ia64-gensparse mips-qemu powerpc-pseries
    arm-hackkit ia64-sim mips-rbhma4200 powerpc-up
    arm-integrator ia64-sn2 mips-rbhma4500 s390
    arm-iop13xx ia64-tiger mips-rm200 s390-allnoconfig
    arm-iop32x ia64-up mips-sb1250-swarm s390-defconfig
    arm-iop33x ia64-zx1 mips-sead s390-up
    arm-ixp2000 m68k mips-tb0219 sparc
    arm-ixp23xx m68k-amiga mips-tb0226 sparc-allnoconfig
    arm-ixp4xx m68k-apollo mips-tb0287 sparc-defconfig
    arm-jornada720 m68k-atari mips-workpad sparc-up
    arm-kafa m68k-bvme6000 mips-wrppmc sparc64
    arm-kb9202 m68k-hp300 mips-yosemite sparc64-allnoconfig
    arm-ks8695 m68k-mac parisc sparc64-defconfig
    arm-lart m68k-mvme147 parisc-allnoconfig sparc64-up
    arm-lpd270 m68k-mvme16x parisc-defconfig um-x86_64
    arm-lpd7a400 m68k-q40 parisc-up x86_64
    arm-lpd7a404 m68k-sun3 powerpc x86_64-allnoconfig
    arm-lubbock m68k-sun3x powerpc-cell x86_64-defconfig
    arm-lusl7200 mips powerpc-celleb x86_64-up
    arm-mainstone mips-atlas powerpc-chrp32

    Signed-off-by: Alexey Dobriyan
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     

20 Jul, 2007

2 commits

  • Remove the arg+env limit of MAX_ARG_PAGES by copying the strings directly from
    the old mm into the new mm.

    We create the new mm before the binfmt code runs, and place the new stack at
    the very top of the address space. Once the binfmt code runs and figures out
    where the stack should be, we move it downwards.

    It is a bit peculiar in that we have one task with two mm's, one of which is
    inactive.

    [a.p.zijlstra@chello.nl: limit stack size]
    Signed-off-by: Ollie Wild
    Signed-off-by: Peter Zijlstra
    Cc:
    Cc: Hugh Dickins
    [bunk@stusta.de: unexport bprm_mm_init]
    Signed-off-by: Adrian Bunk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ollie Wild
     
  • Nonlinear mappings are (AFAIKS) simply a virtual memory concept that encodes
    the virtual address -> file offset differently from linear mappings.

    ->populate is a layering violation because the filesystem/pagecache code
    should not need to know anything about the virtual memory mapping. The hitch here
    is that the ->nopage handler didn't pass down enough information (ie. pgoff).
    But it is more logical to pass pgoff rather than have the ->nopage function
    calculate it itself anyway (because that's a similar layering violation).

    Having the populate handler install the pte itself is likewise a nasty thing
    to be doing.

    This patch introduces a new fault handler that replaces ->nopage and
    ->populate and (later) ->nopfn. Most of the old mechanism is still in place
    so there is a lot of duplication and nice cleanups that can be removed if
    everyone switches over.

    The rationale for doing this in the first place is that nonlinear mappings are
    subject to the pagefault vs invalidate/truncate race too, and it seemed stupid
    to duplicate the synchronisation logic rather than just consolidate the two.

    After this patch, MAP_NONBLOCK no longer sets up ptes for pages present in
    pagecache. Seems like a fringe functionality anyway.

    NOPAGE_REFAULT is removed. This should be implemented with ->fault, and no
    users have hit mainline yet.

    [akpm@linux-foundation.org: cleanup]
    [randy.dunlap@oracle.com: doc. fixes for readahead]
    [akpm@linux-foundation.org: build fix]
    Signed-off-by: Nick Piggin
    Signed-off-by: Randy Dunlap
    Cc: Mark Fasheh
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

17 Jul, 2007

1 commit

  • This is a straightforward split of do_mmap_pgoff() into two functions:

    - do_mmap_pgoff() checks the parameters, and calculates the vma
    flags. Then it calls

    - mmap_region(), which does the actual mapping

    Signed-off-by: Miklos Szeredi
    Acked-by: Peter Zijlstra
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Miklos Szeredi
     

12 Jul, 2007

1 commit

  • Add a new security check on mmap operations to see if the user is attempting
    to mmap to low area of the address space. The amount of space protected is
    indicated by the new proc tunable /proc/sys/vm/mmap_min_addr and defaults to
    0, preserving existing behavior.

    This patch uses a new SELinux security class "memprotect." Policy already
    contains a number of allow rules like a_t self:process * (unconfined_t being
    one of them) which mean that putting this check in the process class (its
    best current fit) would make it useless as all user processes, which we also
    want to protect against, would be allowed. By taking the memprotect name of
    the new class it will also make it possible for us to move some of the other
    memory protect permissions out of 'process' and into the new class next time
    we bump the policy version number (which I also think is a good future idea)

    Acked-by: Stephen Smalley
    Acked-by: Chris Wright
    Signed-off-by: Eric Paris
    Signed-off-by: James Morris

    Eric Paris
     

22 Jun, 2007

1 commit

  • Function expand_upwards() did not guard against wrapping
    around to address 0. This fixes the adjtimex02 testcase from
    the Linux Test Project on a 32bit PARISC kernel.
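
    The kind of guard involved can be sketched as a page-align-up that detects
    wraparound. This assumes 4 KiB pages and uses a hypothetical helper; it is
    not the exact expression used in expand_upwards().

```c
#define SK_PAGE_SIZE 4096UL			/* assumption: 4 KiB pages */
#define SK_PAGE_MASK (~(SK_PAGE_SIZE - 1))

/*
 * Round addr up to a page boundary. Returns 0 and stores the result in *out,
 * or -1 if the rounding wrapped past the top of the address space.
 */
int sketch_page_align_checked(unsigned long addr, unsigned long *out)
{
	/* Unsigned arithmetic wraps modulo 2^N, so a wrap shows up as a
	 * result that is smaller than the input. */
	unsigned long aligned = (addr + SK_PAGE_SIZE - 1) & SK_PAGE_MASK;

	if (aligned < addr)
		return -1;
	*out = aligned;
	return 0;
}
```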

    [expand_upwards is only used on parisc and ia64; it looks like it does
    the right thing on both. --kyle]

    Signed-off-by: Helge Deller
    Cc: Tony Luck
    Signed-off-by: Kyle McMartin

    Helge Deller
     

09 May, 2007

2 commits


08 May, 2007

2 commits

  • Remove the hugetlbfs specific hacks in toplevel get_unmapped_area() now that
    all archs and hugetlbfs itself do the right thing for both cases.

    Signed-off-by: Benjamin Herrenschmidt
    Acked-by: William Irwin
    Cc: Paul Mackerras
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Russell King
    Cc: David Howells
    Cc: Andi Kleen
    Cc: "Luck, Tony"
    Cc: Kyle McMartin
    Cc: Grant Grundler
    Cc: Matthew Wilcox
    Cc: "David S. Miller"
    Cc: Adam Litke
    Cc: David Gibson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Benjamin Herrenschmidt
     
  • generic arch_get_unmapped_area() now handles MAP_FIXED. Now that all
    implementations have been fixed, change the toplevel get_unmapped_area() to
    call into arch or drivers for the MAP_FIXED case.

    Signed-off-by: Benjamin Herrenschmidt
    Cc: Paul Mackerras
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Russell King
    Cc: David Howells
    Cc: Andi Kleen
    Cc: "Luck, Tony"
    Cc: Kyle McMartin
    Cc: Grant Grundler
    Cc: Matthew Wilcox
    Cc: "David S. Miller"
    Cc: William Irwin
    Cc: Adam Litke
    Cc: David Gibson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Benjamin Herrenschmidt
     

03 May, 2007

1 commit

  • Add hooks to allow a paravirt implementation to track the lifetime of
    an mm. Paravirtualization requires three hooks, but only two are
    needed in common code. They are:

    arch_dup_mmap, which is called when a new mmap is created at fork

    arch_exit_mmap, which is called when the last process reference to an
    mm is dropped, which typically happens on exit and exec.

    The third hook is activate_mm, which is called from the arch-specific
    activate_mm() macro/function, and so doesn't need stub versions for
    other architectures. It's called when an mm is first used.

    Signed-off-by: Jeremy Fitzhardinge
    Signed-off-by: Andi Kleen
    Cc: linux-arch@vger.kernel.org
    Cc: James Bottomley
    Acked-by: Ingo Molnar

    Jeremy Fitzhardinge
     

02 Mar, 2007

1 commit

  • The code is seemingly trying to make sure that rb_next() brings us to
    successive increasing vma entries.

    But the two variables, prev and pend, used to perform these checks, are
    never advanced.
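
    A sketch of the intended check over a flat array instead of an rbtree; the
    struct and function names are hypothetical. The two assignments at the
    bottom of the loop are the step that was missing: without them prev and
    pend stay at zero and the comparisons can never fire.

```c
/* Hypothetical stand-in for the start/end of a vma. */
struct sketch_range {
	unsigned long start;
	unsigned long end;
};

/* Count ordering violations in a list that should be sorted and disjoint. */
int sketch_count_order_bugs(const struct sketch_range *v, int n)
{
	unsigned long prev = 0, pend = 0;
	int bugs = 0;

	for (int i = 0; i < n; i++) {
		if (v[i].start < prev)		/* starts before the previous start */
			bugs++;
		if (v[i].start < pend)		/* overlaps the previous range */
			bugs++;
		if (v[i].start > v[i].end)	/* range is inverted */
			bugs++;
		prev = v[i].start;		/* advance the trackers */
		pend = v[i].end;
	}
	return bugs;
}
```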

    Signed-off-by: David S. Miller
    Cc: Andrea Arcangeli
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Miller
     

10 Feb, 2007

1 commit

  • This patch adds a utility function install_special_mapping, for creating a
    special vma using a fixed set of preallocated pages as backing, such as for a
    vDSO. This consolidates some nearly identical code used for vDSO mapping
    reimplemented for different architectures.

    Signed-off-by: Roland McGrath
    Cc: Ingo Molnar
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Roland McGrath
     

31 Jan, 2007

1 commit

  • When expanding the stack, we don't currently check if the VMA will cross
    into an area of the address space that is reserved for hugetlb pages.
    Subsequent faults on the expanded portion of such a VMA will confuse the
    low-level MMU code, resulting in an OOPS. Check for this.

    Signed-off-by: Adam Litke
    Cc: David Gibson
    Cc: William Lee Irwin III
    Cc: Hugh Dickins
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Adam Litke
     

09 Dec, 2006

1 commit


08 Dec, 2006

1 commit


15 Nov, 2006

3 commits

  • Commit cb07c9a1864a8eac9f3123e428100d5b2a16e65a causes the wrong return
    value. is_hugepage_only_range() is a boolean, so we should return
    -EINVAL rather than 1.

    Also - we can use "mm" instead of looking up "current->mm" again.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Unlike mmap(), the codepath for brk() creates a vma without first checking
    that it doesn't touch a region exclusively reserved for hugepages. On
    powerpc, this can allow it to create a normal page vma in a hugepage
    region, causing oopses and other badness.

    Add a test to prevent this. With this patch, brk() will simply fail if it
    attempts to move the break into a hugepage reserved region.

    Signed-off-by: David Gibson
    Cc: Adam Litke
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Gibson
     
  • (David:)

    If hugetlbfs_file_mmap() returns a failure to do_mmap_pgoff() - for example,
    because the given file offset is not hugepage aligned - then do_mmap_pgoff
    will go to the unmap_and_free_vma backout path.

    But at this stage the vma hasn't been marked as hugepage, and the backout path
    will call unmap_region() on it. That will eventually call down to the
    non-hugepage version of unmap_page_range(). On ppc64, at least, that will
    cause serious problems if there are any existing hugepage pagetable entries in
    the vicinity - for example if there are any other hugepage mappings under the
    same PUD. unmap_page_range() will trigger a bad_pud() on the hugepage pud
    entries. I suspect this will also cause bad problems on ia64, though I don't
    have a machine to test it on.

    (Hugh:)

    prepare_hugepage_range() should check file offset alignment when it checks
    virtual address and length, to stop MAP_FIXED with a bad huge offset from
    unmapping before it fails further down. PowerPC should apply the same
    prepare_hugepage_range alignment checks as ia64 and all the others do.

    Then none of the alignment checks in hugetlbfs_file_mmap are required (nor
    is the check for too small a mapping); but even so, move up setting of
    VM_HUGETLB and add a comment to warn of what David Gibson discovered - if
    hugetlbfs_file_mmap fails before setting it, do_mmap_pgoff's unmap_region
    when unwinding from error will go the non-huge way, which may cause bad
    behaviour on architectures (powerpc and ia64) which segregate their huge
    mappings into a separate region of the address space.

    Signed-off-by: Hugh Dickins
    Cc: "Luck, Tony"
    Cc: "David S. Miller"
    Acked-by: Adam Litke
    Acked-by: David Gibson
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

16 Oct, 2006

1 commit

  • .. and clean up the file mapping code while at it. No point in having a
    "if (file)" repeated twice, and generally doing similar checks in two
    different sections of the same code

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

26 Sep, 2006

2 commits

  • Remove the atomic counter for slab_reclaim_pages and replace the counter
    and NR_SLAB with two ZVC counters that account for unreclaimable and
    reclaimable slab pages: NR_SLAB_RECLAIMABLE and NR_SLAB_UNRECLAIMABLE.

    Change the check in vmscan.c to refer to NR_SLAB_RECLAIMABLE. The
    intent seems to be to check for slab pages that could be freed.

    Signed-off-by: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Tracking of dirty pages in shared writeable mmap()s.

    The idea is simple: write protect clean shared writeable pages, catch the
    write-fault, make writeable and set dirty. On page write-back clean all the
    PTE dirty bits and write protect them once again.

    The implementation is a tad harder, mainly because the default
    backing_dev_info capabilities were too loosely maintained. Hence it is not
    enough to test the backing_dev_info for cap_account_dirty.

    The current heuristic is as follows, a VMA is eligible when:
    - it's shared writeable
    (vm_flags & (VM_WRITE|VM_SHARED)) == (VM_WRITE|VM_SHARED)
    - it is not a 'special' mapping
    (vm_flags & (VM_PFNMAP|VM_INSERTPAGE)) == 0
    - the backing_dev_info is cap_account_dirty
    mapping_cap_account_dirty(vma->vm_file->f_mapping)
    - f_op->mmap() didn't change the default page protection
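
    The four conditions can be combined into one predicate, sketched here in
    user space. The flag values are assumptions for this sketch, and the two
    boolean parameters stand in for the backing_dev_info capability test and
    the page-protection comparison, which need real kernel objects.

```c
#include <stdbool.h>

/* Flag values assumed for this sketch; real code uses the kernel headers. */
#define SK_VM_WRITE      0x00000002UL
#define SK_VM_SHARED     0x00000008UL
#define SK_VM_PFNMAP     0x00000400UL
#define SK_VM_INSERTPAGE 0x02000000UL

bool sketch_wants_dirty_tracking(unsigned long vm_flags,
				 bool cap_account_dirty,
				 bool prot_changed_by_mmap)
{
	/* Must be a shared writeable mapping. */
	if ((vm_flags & (SK_VM_WRITE | SK_VM_SHARED)) !=
	    (SK_VM_WRITE | SK_VM_SHARED))
		return false;
	/* 'Special' mappings are excluded. */
	if (vm_flags & (SK_VM_PFNMAP | SK_VM_INSERTPAGE))
		return false;
	/* The backing device must account dirty pages. */
	if (!cap_account_dirty)
		return false;
	/* f_op->mmap() must have left the default page protection alone. */
	return !prot_changed_by_mmap;
}
```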

    Pages from remap_pfn_range() are explicitly excluded because their COW
    semantics are already horrid enough (see vm_normal_page() in do_wp_page()) and
    because they don't have a backing store anyway.

    mprotect() is taught about the new behaviour as well. However it overrides
    the last condition.

    Cleaning the pages on write-back is done with page_mkclean(), a new rmap call.
    It can be called on any page, but is currently only implemented for mapped
    pages. If the page is found to belong to a VMA that accounts dirty pages, it
    will also wrprotect the PTE.

    Finally, in fs/buffer.c:try_to_free_buffers(), remove clear_page_dirty() from
    under ->private_lock. This seems to be safe, since ->private_lock is used to
    serialize access to the buffers, not the page itself. This is needed because
    clear_page_dirty() will call into page_mkclean() and would thereby violate
    locking order.

    [dhowells@redhat.com: Provide a page_mkclean() implementation for NOMMU]
    Signed-off-by: Peter Zijlstra
    Cc: Hugh Dickins
    Signed-off-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Peter Zijlstra
     

23 Sep, 2006

1 commit

  • * master.kernel.org:/pub/scm/linux/kernel/git/davej/agpgart:
    [AGPGART] Rework AGPv3 modesetting fallback.
    [AGPGART] Add suspend callback for i965
    [AGPGART] Fix number of aperture sizes in 830 gart structs.
    [AGPGART] Intel 965 Express support.
    [AGPGART] agp.h: constify struct agp_bridge_data::version
    [AGPGART] const'ify VIA AGP PCI table.
    [AGPGART] CONFIG_PM=n slim: drivers/char/agp/intel-agp.c
    [AGPGART] CONFIG_PM=n slim: drivers/char/agp/efficeon-agp.c
    [AGPGART] Const'ify the agpgart driver version.
    [AGPGART] remove private page protection map

    Linus Torvalds