17 Nov, 2011

1 commit

  • When mapping a foreign page with xenbus_map_ring_valloc() with the
    GNTTABOP_map_grant_ref hypercall, set the GNTMAP_contains_pte flag and
    pass a pointer to the PTE (in init_mm).

    After the page is mapped, the usual fault mechanism can be used to
    update additional MMs. This allows vmalloc_sync_all() to be removed
    from alloc_vm_area().

    Signed-off-by: David Vrabel
    Acked-by: Andrew Morton
    [v1: Squashed fix by Michal for no-mmu case]
    Signed-off-by: Konrad Rzeszutek Wilk
    Signed-off-by: Michal Simek

    David Vrabel
     


26 Jul, 2011

1 commit

  • - shmem pages are not immediately available, but they are not
    potentially available either: even if we swap them out, they just
    relocate from memory into swap, so the total amount of immediately and
    potentially available memory is not affected. We shouldn't count them
    as potentially free in the first place.

    - nr_free_pages() is no longer an expensive operation, so there is no
    need to split the decision making in two halves and repeat code.

    Signed-off-by: Dmitry Fink
    Reviewed-by: Minchan Kim
    Acked-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Fink
     

23 Jul, 2011

1 commit

  • * 'ptrace' of git://git.kernel.org/pub/scm/linux/kernel/git/oleg/misc: (39 commits)
    ptrace: do_wait(traced_leader_killed_by_mt_exec) can block forever
    ptrace: fix ptrace_signal() && STOP_DEQUEUED interaction
    connector: add an event for monitoring process tracers
    ptrace: dont send SIGSTOP on auto-attach if PT_SEIZED
    ptrace: mv send-SIGSTOP from do_fork() to ptrace_init_task()
    ptrace_init_task: initialize child->jobctl explicitly
    has_stopped_jobs: s/task_is_stopped/SIGNAL_STOP_STOPPED/
    ptrace: make former thread ID available via PTRACE_GETEVENTMSG after PTRACE_EVENT_EXEC stop
    ptrace: wait_consider_task: s/same_thread_group/ptrace_reparented/
    ptrace: kill real_parent_is_ptracer() in in favor of ptrace_reparented()
    ptrace: ptrace_reparented() should check same_thread_group()
    redefine thread_group_leader() as exit_signal >= 0
    do not change dead_task->exit_signal
    kill task_detached()
    reparent_leader: check EXIT_DEAD instead of task_detached()
    make do_notify_parent() __must_check, update the callers
    __ptrace_detach: avoid task_detached(), check do_notify_parent()
    kill tracehook_notify_death()
    make do_notify_parent() return bool
    ptrace: s/tracehook_tracer_task()/ptrace_parent()/
    ...

    Linus Torvalds
     

09 Jul, 2011

1 commit

  • remap_pfn_range() is supposed to map the physical address
    pfn << PAGE_SHIFT to a user address. On nommu it was implemented as
    vma->vm_start = pfn << PAGE_SHIFT, which is wrong according to the
    original meaning of this function, and a driver developer using
    remap_pfn_range() with correct parameters would get an unexpected
    result because vm_start is changed. It should be implemented like
    addr = pfn << PAGE_SHIFT, but that is meaningless on a nommu arch, so
    this patch makes it simply return.

    The parameter name and the setting of vma->vm_flags are also fixed.

    Signed-off-by: Bob Liu
    Cc: Geert Uytterhoeven
    Cc: David Howells
    Acked-by: Greg Ungerer
    Cc: Mike Frysinger
    Cc: Bob Liu
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bob Liu
     

23 Jun, 2011

1 commit

  • At this point, tracehooks aren't useful to the mainline kernel and
    mostly just add an extra layer of obfuscation. Although they have
    comments, without actual in-kernel users it is difficult to tell what
    their assumptions are and what they're actually trying to achieve. To
    the mainline kernel, they just aren't worth keeping around.

    This patch kills the following trivial tracehooks.

    * Ones testing whether task is ptraced. Replace with ->ptrace test.

    tracehook_expect_breakpoints()
    tracehook_consider_ignored_signal()
    tracehook_consider_fatal_signal()

    * ptrace_event() wrappers. Call directly.

    tracehook_report_exec()
    tracehook_report_exit()
    tracehook_report_vfork_done()

    * ptrace_release_task() wrapper. Call directly.

    tracehook_finish_release_task()

    * noop

    tracehook_prepare_release_task()
    tracehook_report_death()

    This doesn't introduce any behavior change.

    Signed-off-by: Tejun Heo
    Cc: Christoph Hellwig
    Cc: Martin Schwidefsky
    Signed-off-by: Oleg Nesterov

    Tejun Heo
     

25 May, 2011

8 commits

  • Currently on nommu arches mmap(), mremap() and munmap() don't do
    page_align(), which isn't consistent with mmu arches and causes some
    issues.

    First, some drivers' mmap() functions depend on
    vma->vm_end - vma->vm_start being page aligned, which is true on mmu
    arches but not on nommu, e.g. the uvc camera driver.

    Second, munmap() may return an -EINVAL [split file] error in cases
    where the end (passed in from userspace) is not page aligned but
    vma->vm_end is aligned due to a split or the driver's mmap() op.

    Add page alignment to fix those issues.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Bob Liu
    Cc: David Howells
    Cc: Paul Mundt
    Cc: Greg Ungerer
    Cc: Geert Uytterhoeven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bob Liu
     
  • Because 'ret' is declared as int, not unsigned long, there is no need
    to cast the error constants to unsigned long. If you compile this code
    on a 64-bit machine, you'll see the following warning:

    CC mm/nommu.o
    mm/nommu.c: In function `do_mmap_pgoff':
    mm/nommu.c:1411: warning: overflow in implicit constant conversion

    Signed-off-by: Namhyung Kim
    Acked-by: Greg Ungerer
    Cc: David Howells
    Cc: Paul Mundt
    Cc: Geert Uytterhoeven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Namhyung Kim
     
  • If f_op->read() fails and sysctl_nr_trim_pages > 1, there could be a
    memory leak between @region->vm_end and @region->vm_top.

    Signed-off-by: Namhyung Kim
    Acked-by: Greg Ungerer
    Cc: David Howells
    Cc: Paul Mundt
    Cc: Geert Uytterhoeven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Namhyung Kim
     
  • Now we have the sorted vma list, use it in do_munmap() to check that we
    have an exact match.

    Signed-off-by: Namhyung Kim
    Acked-by: Greg Ungerer
    Cc: David Howells
    Cc: Paul Mundt
    Cc: Geert Uytterhoeven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Namhyung Kim
     
  • Now we have the sorted vma list, use it in the find_vma[_exact]() rather
    than doing linear search on the rb-tree.

    Signed-off-by: Namhyung Kim
    Acked-by: Greg Ungerer
    Cc: David Howells
    Cc: Paul Mundt
    Cc: Geert Uytterhoeven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Namhyung Kim
     
  • Since commit 297c5eee3724 ("mm: make the vma list be doubly linked") made
    it a doubly linked list, we don't need to scan the list when deleting
    @vma.

    And the original code didn't update the prev pointer. Fix it too.

    Signed-off-by: Namhyung Kim
    Acked-by: Greg Ungerer
    Cc: David Howells
    Cc: Paul Mundt
    Cc: Geert Uytterhoeven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Namhyung Kim
     
  • When I was reading the nommu code, I found that it handles the vma
    list/tree in an unusual way. IIUC, because there can be more than one
    identical/overlapped vma in the list/tree, it sorts the tree more
    strictly and does a linear search on the tree. But this wasn't applied
    to the list (i.e. the list could be constructed in a different order
    than the tree, so the list can't be used to find the first vma in that
    order).

    Since inserting/sorting a vma in the tree and in the list is done at
    the same time, we can easily construct both of them in the same order.
    And since a linear search on the tree can be more costly than one on
    the list, the search can be converted to use the list.

    Also, after commit 297c5eee3724 ("mm: make the vma list be doubly
    linked") made the list doubly linked, a couple of places needed to be
    fixed to construct the list properly.

    Patch 1/6 is a preparation. It maintains the list sorted same as the tree
    and construct doubly-linked list properly. Patch 2/6 is a simple
    optimization for the vma deletion. Patch 3/6 and 4/6 convert tree
    traversal to list traversal and the rest are simple fixes and cleanups.

    This patch:

    @vma added into @mm should be sorted by start addr, end addr and VMA
    struct addr in that order because we may get identical VMAs in the @mm.
    However this was true only for the rbtree, not for the list.

    This patch fixes this by remembering 'rb_prev' during the tree traversal
    like find_vma_prepare() does and linking the @vma via __vma_link_list().
    After this patch, we can iterate the whole VMAs in correct order simply by
    using @mm->mmap list.

    [akpm@linux-foundation.org: avoid duplicating __vma_link_list()]
    Signed-off-by: Namhyung Kim
    Acked-by: Greg Ungerer
    Cc: David Howells
    Cc: Paul Mundt
    Cc: Geert Uytterhoeven
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Namhyung Kim
     
  • Architectures that implement their own show_mem() function did not pass
    the filter argument to show_free_areas() to appropriately avoid emitting
    the state of nodes that are disallowed in the current context. This patch
    now passes the filter argument to show_free_areas() so those nodes are now
    avoided.

    This patch also removes the show_free_areas() wrapper around
    __show_free_areas() and converts existing callers to pass an empty filter.

    ia64 emits additional information for each node, so skip_free_areas_zone()
    must be made global to filter disallowed nodes and it is converted to use
    a nid argument rather than a zone for this use case.

    Signed-off-by: David Rientjes
    Cc: Russell King
    Cc: Tony Luck
    Cc: Fenghua Yu
    Cc: Kyle McMartin
    Cc: Helge Deller
    Cc: James Bottomley
    Cc: "David S. Miller"
    Cc: Guan Xuetao
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Rientjes
     

29 Mar, 2011

1 commit

  • Recent vm changes brought in a new function which the core procfs code
    utilizes. So implement it for nommu systems too to avoid link failures.

    Signed-off-by: Mike Frysinger
    Signed-off-by: David Howells
    Tested-by: Simon Horman
    Tested-by: Ithamar Adema
    Acked-by: Greg Ungerer

    Mike Frysinger
     

25 Mar, 2011

1 commit

  • * 'for-2.6.39/core' of git://git.kernel.dk/linux-2.6-block: (65 commits)
    Documentation/iostats.txt: bit-size reference etc.
    cfq-iosched: removing unnecessary think time checking
    cfq-iosched: Don't clear queue stats when preempt.
    blk-throttle: Reset group slice when limits are changed
    blk-cgroup: Only give unaccounted_time under debug
    cfq-iosched: Don't set active queue in preempt
    block: fix non-atomic access to genhd inflight structures
    block: attempt to merge with existing requests on plug flush
    block: NULL dereference on error path in __blkdev_get()
    cfq-iosched: Don't update group weights when on service tree
    fs: assign sb->s_bdi to default_backing_dev_info if the bdi is going away
    block: Require subsystems to explicitly allocate bio_set integrity mempool
    jbd2: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
    jbd: finish conversion from WRITE_SYNC_PLUG to WRITE_SYNC and explicit plugging
    fs: make fsync_buffers_list() plug
    mm: make generic_writepages() use plugging
    blk-cgroup: Add unaccounted time to timeslice_used.
    block: fixup plugging stubs for !CONFIG_BLOCK
    block: remove obsolete comments for blkdev_issue_zeroout.
    blktrace: Use rq->cmd_flags directly in blk_add_trace_rq.
    ...

    Fix up conflicts in fs/{aio.c,super.c}

    Linus Torvalds
     


10 Mar, 2011

1 commit

  • Code has been converted over to the new explicit on-stack plugging,
    and delay users have been converted to use the new API for that.
    So lets kill off the old plugging along with aops->sync_page().

    Signed-off-by: Jens Axboe

    Jens Axboe
     

14 Jan, 2011

1 commit

  • __get_user_pages gets a new 'nonblocking' parameter to signal that the
    caller is prepared to re-acquire mmap_sem and retry the operation if
    needed. This is used to split off long operations if they are going to
    block on a disk transfer, or when we detect contention on the mmap_sem.

    [akpm@linux-foundation.org: remove ref to rwsem_is_contended()]
    Signed-off-by: Michel Lespinasse
    Cc: Hugh Dickins
    Cc: Rik van Riel
    Cc: Peter Zijlstra
    Cc: Nick Piggin
    Cc: KOSAKI Motohiro
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: Thomas Gleixner
    Cc: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michel Lespinasse
     


25 Nov, 2010

1 commit

  • Depending on processor speed, page size, and the amount of memory a
    process is allowed to amass, cleanup of a large VM may freeze the system
    for many seconds. This can result in a watchdog timeout.

    Make sure other tasks receive some service when cleaning up large VMs.

    Signed-off-by: Steven J. Magnani
    Cc: Greg Ungerer
    Reviewed-by: KOSAKI Motohiro
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Steven J. Magnani
     

30 Oct, 2010

1 commit

  • Normal syscall audit doesn't catch the 5th argument of a syscall. It
    also doesn't catch the contents of userland structures pointed to by a
    syscall argument, so for both the old and new mmap(2) ABIs it doesn't
    record the descriptor we are mapping. For the old one it also misses
    the flags.

    Signed-off-by: Al Viro

    Al Viro
     

27 Oct, 2010

1 commit

  • Add vzalloc() and vzalloc_node() to encapsulate the
    vmalloc-then-memset-zero operation.

    Use __GFP_ZERO to zero fill the allocated memory.

    Signed-off-by: Dave Young
    Cc: Christoph Lameter
    Acked-by: Greg Ungerer
    Cc: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Young
     

21 Aug, 2010

1 commit

  • It's a really simple list, and several of the users want to go backwards
    in it to find the previous vma. So rather than have to look up the
    previous entry with 'find_vma_prev()' or something similar, just make it
    doubly linked instead.

    Tested-by: Ian Campbell
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

14 Aug, 2010

1 commit

  • Remove an extraneous no_printk() in mm/nommu.c that got missed when the
    function got generalised from several things that used it in commit
    12fdff3fc248 ("Add a dummy printk function for the maintenance of unused
    printks").

    Without this, the following error is observed:

    mm/nommu.c:41: error: conflicting types for 'no_printk'
    include/linux/kernel.h:314: error: previous definition of 'no_printk' was here

    Reported-by: Michal Simek
    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    David Howells
     

26 May, 2010

1 commit

  • Slightly rearrange the logic that determines capabilities and vm_flags.
    Disable BDI_CAP_MAP_DIRECT in all cases if the device can't support the
    protections. Allow private readonly mappings of readonly backing devices.

    Signed-off-by: Bernd Schmidt
    Signed-off-by: Mike Frysinger
    Acked-by: David McCullough
    Acked-by: Greg Ungerer
    Acked-by: Paul Mundt
    Acked-by: David Howells
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Bernd Schmidt
     

26 Mar, 2010

2 commits

  • Fix __get_user_pages() to make it pin the last page on a buffer that doesn't
    begin at the start of a page, but is a multiple of PAGE_SIZE in size.

    The problem is that __get_user_pages() advances the pointer too much when it
    iterates to the next page if the page it's currently looking at isn't used from
    the first byte. This can cause the end of a short VMA to be reached
    prematurely, resulting in the last page being lost.

    Signed-off-by: Steven J. Magnani
    Signed-off-by: David Howells
    Signed-off-by: Linus Torvalds

    David Howells
     
  • Revert the following patch:

    commit c08c6e1f54c85fc299cf9f88cf330d6dd28a9a1d
    Author: Steven J. Magnani
    Date: Fri Mar 5 13:42:24 2010 -0800

    nommu: get_user_pages(): pin last page on non-page-aligned start

    As it assumes that the mappings begin at the start of pages - something that
    isn't necessarily true on NOMMU systems. On NOMMU systems, it is possible for
    a mapping to only occupy part of the page, and not necessarily touch either end
    of it; in fact it's also possible for multiple non-overlapping mappings to
    coexist on one page (consider direct mappings of ROMFS files, for example).

    Signed-off-by: David Howells
    Acked-by: Steven J. Magnani
    Signed-off-by: Linus Torvalds

    David Howells
     


13 Mar, 2010

1 commit

  • Add a generic implementation of the old mmap() syscall, which expects its
    argument in a memory block and switch all architectures over to use it.

    Signed-off-by: Christoph Hellwig
    Cc: Ralf Baechle
    Cc: Benjamin Herrenschmidt
    Cc: Paul Mundt
    Cc: Jeff Dike
    Cc: Hirokazu Takata
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Reviewed-by: H. Peter Anvin
    Cc: Al Viro
    Cc: Arnd Bergmann
    Cc: Heiko Carstens
    Cc: Martin Schwidefsky
    Cc: "Luck, Tony"
    Cc: James Morris
    Cc: Andreas Schwab
    Acked-by: Jesper Nilsson
    Acked-by: Russell King
    Acked-by: Greg Ungerer
    Acked-by: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Hellwig
     

07 Mar, 2010

2 commits

  • The noMMU version of get_user_pages() fails to pin the last page when the
    start address isn't page-aligned. The patch fixes this in a way that
    makes find_extend_vma() congruent to its MMU cousin.

    Signed-off-by: Steven J. Magnani
    Acked-by: Paul Mundt
    Cc: David Howells
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Steven J. Magnani
     
  • The old anon_vma code can lead to scalability issues with heavily forking
    workloads. Specifically, each anon_vma will be shared between the parent
    process and all its child processes.

    In a workload with 1000 child processes and a VMA with 1000 anonymous
    pages per process that get COWed, this leads to a system with a million
    anonymous pages in the same anon_vma, each of which is mapped in just one
    of the 1000 processes. However, the current rmap code needs to walk them
    all, leading to O(N) scanning complexity for each page.

    This can result in systems where one CPU is walking the page tables of
    1000 processes in page_referenced_one, while all other CPUs are stuck on
    the anon_vma lock. This leads to catastrophic failure for a benchmark
    like AIM7, where the total number of processes can reach in the tens of
    thousands. Real workloads are still a factor 10 less process intensive
    than AIM7, but they are catching up.

    This patch changes the way anon_vmas and VMAs are linked, which allows us
    to associate multiple anon_vmas with a VMA. At fork time, each child
    process gets its own anon_vmas, in which its COWed pages will be
    instantiated. The parents' anon_vma is also linked to the VMA, because
    non-COWed pages could be present in any of the children.

    This reduces rmap scanning complexity to O(1) for the pages of the 1000
    child processes, with O(N) complexity for at most 1/N pages in the system.
    This reduces the average scanning cost in heavily forking workloads from
    O(N) to 2.

    The only real complexity in this patch stems from the fact that linking a
    VMA to anon_vmas now involves memory allocations. This means vma_adjust
    can fail, if it needs to attach a VMA to anon_vma structures. This in
    turn means error handling needs to be added to the calling functions.

    A second source of complexity is that, because there can be multiple
    anon_vmas, the anon_vma linking in vma_adjust can no longer be done under
    "the" anon_vma lock. To prevent the rmap code from walking up an
    incomplete VMA, this patch introduces the VM_LOCK_RMAP VMA flag. This bit
    flag uses the same slot as the NOMMU VM_MAPPED_COPY, with an ifdef in mm.h
    to make sure it is impossible to compile a kernel that needs both symbolic
    values for the same bitflag.

    Some test results:

    Without the anon_vma changes, when AIM7 hits around 9.7k users (on a test
    box with 16GB RAM and not quite enough IO), the system ends up running
    >99% in system time, with every CPU on the same anon_vma lock in the
    pageout code.

    With these changes, AIM7 hits the cross-over point around 29.7k users.
    This happens with ~99% IO wait time, there never seems to be any spike in
    system time. The anon_vma lock contention appears to be resolved.

    [akpm@linux-foundation.org: cleanups]
    Signed-off-by: Rik van Riel
    Cc: KOSAKI Motohiro
    Cc: Larry Woodman
    Cc: Lee Schermerhorn
    Cc: Minchan Kim
    Cc: Andrea Arcangeli
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rik van Riel
     

17 Jan, 2010

4 commits

  • Fix a problem in NOMMU mmap with ramfs whereby a shared mmap can happen
    over the end of a truncation. The problem is that
    ramfs_nommu_check_mappings() checks that the reduced file size against the
    VMA tree, but not the vm_region tree.

    The following sequence of events can cause the problem:

    fd = open("/tmp/x", O_RDWR|O_TRUNC|O_CREAT, 0600);
    ftruncate(fd, 32 * 1024);
    a = mmap(NULL, 32 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
    b = mmap(NULL, 16 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
    munmap(a, 32 * 1024);
    ftruncate(fd, 16 * 1024);
    c = mmap(NULL, 32 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);

    Mapping 'a' creates a vm_region covering 32KB of the file. Mapping 'b'
    sees that the vm_region from 'a' is covering the region it wants and so
    shares it, pinning it in memory.

    Mapping 'a' then goes away and the file is truncated to the end of VMA
    'b'. However, the region allocated by 'a' is still in effect, and has
    _not_ been reduced.

    Mapping 'c' is then created, and because there's a vm_region covering the
    desired region, get_unmapped_area() is _not_ called to repeat the check,
    and the mapping is granted, even though the pages from the latter half of
    the mapping have been discarded.

    However:

    d = mmap(NULL, 16 * 1024, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);

    Mapping 'd' should work, and should end up sharing the region allocated by
    'a'.

    To deal with this, we shrink the vm_region struct during the truncation,
    lest do_mmap_pgoff() take it as licence to share the full region
    automatically without calling the get_unmapped_area() file op again.

    Signed-off-by: David Howells
    Acked-by: Al Viro
    Cc: Greg Ungerer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     
  • get_unmapped_area() is unnecessary for NOMMU as no-one calls it.

    Signed-off-by: David Howells
    Acked-by: Al Viro
    Cc: Greg Ungerer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     
  • In split_vma(), there's no need to check if the VMA being split has a
    region that's in use by more than one VMA because:

    (1) The preceding test prohibits splitting of non-anonymous VMAs and regions
    (eg: file or chardev backed VMAs).

    (2) Anonymous regions can't be mapped multiple times because there's no handle
    by which to refer to the already existing region.

    (3) If a VMA has previously been split, then the region backing it has also
    been split into two regions, each of usage 1.

    Signed-off-by: David Howells
    Acked-by: Al Viro
    Cc: Greg Ungerer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     
  • The vm_usage count field in struct vm_region does not need to be
    atomic as it's only ever modified whilst nommu_region_sem is write
    locked.
    Signed-off-by: David Howells
    Acked-by: Al Viro
    Cc: Greg Ungerer
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Howells
     

07 Jan, 2010

2 commits

  • The MMU code uses the copy_*_user_page() variants in access_process_vm()
    rather than copy_*_user() as the former includes an icache flush. This
    is important when doing things like setting software breakpoints with
    gdb. So switch the NOMMU code over to do the same.

    This patch makes the reasonable assumption that copy_from_user_page()
    won't fail - which is probably fine, as we've checked the VMA from which
    we're copying is usable, and the copy is not allowed to cross VMAs. The
    one case where it might go wrong is if the VMA is a device rather than
    RAM, and that device returns an error - in which case rubbish will be
    returned rather than EIO.

    Signed-off-by: Jie Zhang
    Signed-off-by: Mike Frysinger
    Signed-off-by: David Howells
    Acked-by: David McCullough
    Acked-by: Paul Mundt
    Acked-by: Greg Ungerer
    Signed-off-by: Linus Torvalds

    Jie Zhang
     
  • When working with FDPIC, there are many shared mappings of read-only
    code regions between applications (the C library, applet packages like
    busybox, etc.), but the current do_mmap_pgoff() function will issue an
    icache flush whenever a VMA is added to an MM instead of only doing it
    when the map is initially created.

    The flush can instead be done when a region is first mmapped PROT_EXEC.
    Note that we may not rely on the first mapping of a region being
    executable - it's possible for it to be PROT_READ only, so we have to
    remember whether we've flushed the region or not, and then flush the
    entire region when a bit of it is made executable.

    However, this also affects the brk area. That will no longer be
    executable. We can mprotect() it to PROT_EXEC on MPU-mode kernels, but
    for NOMMU mode kernels, when it increases the brk allocation, making
    sys_brk() flush the extra from the icache should suffice. The brk area
    probably isn't used by NOMMU programs since the brk area can only use up
    the leavings from the stack allocation, where the stack allocation is
    larger than requested.

    Signed-off-by: David Howells
    Signed-off-by: Mike Frysinger
    Signed-off-by: Linus Torvalds

    Mike Frysinger
     

31 Dec, 2009

1 commit

  • Move sys_mmap_pgoff() from mm/util.c to mm/mmap.c and mm/nommu.c,
    where we'd expect to find such code: especially now that it contains
    the MAP_HUGETLB handling. Revert mm/util.c to how it was in 2.6.32.

    This patch just ignores MAP_HUGETLB in the nommu case, as in 2.6.32,
    whereas 2.6.33-rc2 reported -ENOSYS. Perhaps validate_mmap_request()
    should reject it with -EINVAL? Add that later if necessary.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Linus Torvalds

    Hugh Dickins