09 Sep, 2015

1 commit

  • Now that we have hole punching support for hugetlbfs, we can also
    support the MADV_REMOVE interface to it.

    Signed-off-by: Dave Hansen
    Signed-off-by: Mike Kravetz
    Reviewed-by: Naoya Horiguchi
    Acked-by: Hillf Danton
    Cc: David Rientjes
    Cc: Hugh Dickins
    Cc: Davidlohr Bueso
    Cc: Aneesh Kumar
    Cc: Christoph Hellwig
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Kravetz
     

05 Sep, 2015

2 commits

  • This makes the madvise_behavior_valid() function return bool, since
    this function only ever returns one or zero.

    Signed-off-by: Nicholas Krause
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nicholas Krause
     
  • vma->vm_userfaultfd_ctx is yet another vma parameter that vma_merge
    must be aware of so that we can merge vmas back to the way they were
    before arming the userfaultfd on some memory range.

    Signed-off-by: Andrea Arcangeli
    Acked-by: Pavel Emelyanov
    Cc: Sanidhya Kashyap
    Cc: zhang.zhanghailiang@huawei.com
    Cc: "Kirill A. Shutemov"
    Cc: Andres Lagar-Cavilla
    Cc: Dave Hansen
    Cc: Paolo Bonzini
    Cc: Rik van Riel
    Cc: Mel Gorman
    Cc: Andy Lutomirski
    Cc: Hugh Dickins
    Cc: Peter Feiner
    Cc: "Dr. David Alan Gilbert"
    Cc: Johannes Weiner
    Cc: "Huangpeng (Peter)"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     

02 Jun, 2015

1 commit

  • With the planned cgroup writeback support, backing-dev related
    declarations will be more widely used across block and cgroup;
    unfortunately, including backing-dev.h from include/linux/blkdev.h
    makes cyclic include dependency quite likely.

    This patch separates out backing-dev-defs.h which only has the
    essential definitions and updates blkdev.h to include it. c files
    which need access to more backing-dev details now include
    backing-dev.h directly. This takes backing-dev.h off the common
    include dependency chain making it a lot easier to use it across block
    and cgroup.

    v2: fs/fat build failure fixed.

    Signed-off-by: Tejun Heo
    Reviewed-by: Jan Kara
    Cc: Jens Axboe
    Signed-off-by: Jens Axboe

    Tejun Heo
     

17 Feb, 2015

1 commit

  • All callers of get_xip_mem() are now gone. Remove checks for it,
    initialisers of it, documentation of it and the only implementation of it.
    Also remove mm/filemap_xip.c as it is now empty. Also remove
    documentation of the long-gone get_xip_page().

    Signed-off-by: Matthew Wilcox
    Cc: Andreas Dilger
    Cc: Boaz Harrosh
    Cc: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Jan Kara
    Cc: Jens Axboe
    Cc: Kirill A. Shutemov
    Cc: Mathieu Desnoyers
    Cc: Randy Dunlap
    Cc: Ross Zwisler
    Cc: Theodore Ts'o
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     

13 Feb, 2015

1 commit

  • Pull backing device changes from Jens Axboe:
    "This contains a cleanup of how the backing device is handled, in
    preparation for a rework of the life time rules. In this part, the
    most important change is to split the unrelated nommu mmap flags from
    it, but also removing a backing_dev_info pointer from the
    address_space (and inode), and a cleanup of other various minor bits.

    Christoph did all the work here, I just fixed an oops with pages that
    have a swap backing. Arnd fixed a missing export, and Oleg killed the
    lustre backing_dev_info from staging. Last patch was from Al,
    unexporting parts that are now no longer needed outside"

    * 'for-3.20/bdi' of git://git.kernel.dk/linux-block:
    Make super_blocks and sb_lock static
    mtd: export new mtd_mmap_capabilities
    fs: make inode_to_bdi() handle NULL inode
    staging/lustre/llite: get rid of backing_dev_info
    fs: remove default_backing_dev_info
    fs: don't reassign dirty inodes to default_backing_dev_info
    nfs: don't call bdi_unregister
    ceph: remove call to bdi_unregister
    fs: remove mapping->backing_dev_info
    fs: export inode_to_bdi and use it in favor of mapping->backing_dev_info
    nilfs2: set up s_bdi like the generic mount_bdev code
    block_dev: get bdev inode bdi directly from the block device
    block_dev: only write bdev inode on close
    fs: introduce f_op->mmap_capabilities for nommu mmap support
    fs: kill BDI_CAP_SWAP_BACKED
    fs: deduplicate noop_backing_dev_info

    Linus Torvalds
     

11 Feb, 2015

2 commits

  • One bit in ->vm_flags is unused now!

    Signed-off-by: Kirill A. Shutemov
    Cc: Dan Carpenter
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     
  • We have had remap_file_pages(2) emulation in the -mm tree for a few
    release cycles and we plan to have it in mainline in v3.20. This patchset
    removes the rest of the VM_NONLINEAR infrastructure.

    Patches 1-8 take care of generic code. They are pretty straightforward
    and can be applied without the other patches.

    The remaining patches remove pte_file()-related stuff from
    architecture-specific code. This usually frees up one bit in the
    non-present pte. I've tried to reuse that bit for the swap offset where
    I was able to figure out how to do that.

    For obvious reasons I cannot test all that arch-specific code and would
    like to see acks from maintainers.

    In total, remap_file_pages(2) required about 1.4K lines of not-so-trivial
    kernel code. That's too much for functionality nobody uses.

    Tested-by: Felipe Balbi

    This patch (of 38):

    We don't create non-linear mappings anymore. Let's drop code which
    handles them on unmap/zap.

    Signed-off-by: Kirill A. Shutemov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

21 Jan, 2015

1 commit

  • This bdi flag isn't too useful - we can determine that a vma is backed by
    either swap or shmem trivially in the caller.

    This also allows removing the backing_dev_info instances for swap and shmem
    in favor of noop_backing_dev_info.

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Tejun Heo
    Reviewed-by: Jan Kara
    Signed-off-by: Jens Axboe

    Christoph Hellwig
     

08 Nov, 2014

1 commit

  • This function needs to be exported so it can be used by the NFSD module
    when responding to the new ALLOCATE and DEALLOCATE operations in NFS
    v4.2. Christoph Hellwig suggested renaming the function to stay
    consistent with how other vfs functions are named.

    Signed-off-by: Anna Schumaker
    Signed-off-by: J. Bruce Fields

    Anna Schumaker
     

07 Aug, 2014

1 commit


24 May, 2014

1 commit

  • MADV_WILLNEED currently does not read swapped out shmem pages back in.

    Commit 0cd6144aadd2 ("mm + fs: prepare for non-page entries in page
    cache radix trees") made find_get_page() filter exceptional radix tree
    entries but failed to convert all find_get_page() callers that WANT
    exceptional entries over to find_get_entry(). One of them is shmem swap
    readahead in madvise, which now skips over any swap-out records.

    Convert it to find_get_entry().

    Fixes: 0cd6144aadd2 ("mm + fs: prepare for non-page entries in page cache radix trees")
    Signed-off-by: Johannes Weiner
    Reported-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

01 Oct, 2013

1 commit

  • madvise_hwpoison() doesn't check whether the page is a small page or a
    huge page and unconditionally traverses the range at small page
    granularity, which results in a printk flood of "MCE xxx: already
    hardware poisoned" if the page is a huge page.

    This patch fixes it by using compound_order(compound_head(page)) as the
    step for the huge page iterator.
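
    The difference in iteration count can be modelled in userspace. This is
    only a sketch of the stepping arithmetic, not the kernel loop itself:
    poison_walk_steps() is a hypothetical helper, and order 9 assumes 4 KiB
    base pages with 2 MiB huge pages, as in the testcase in this commit.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the loop iterations needed to walk nr_base_pages base pages when
 * every step advances by one compound page of the given order
 * (order 0 means a base page, order 9 means 512 base pages).
 */
size_t poison_walk_steps(size_t nr_base_pages, unsigned int order)
{
    size_t step = (size_t)1 << order;
    size_t steps = 0;

    assert(nr_base_pages > 0);
    for (size_t i = 0; i < nr_base_pages; i += step)
        steps++;
    return steps;
}
```

    For three huge pages (3 * 512 base pages), stepping with order 0 visits
    the range 1536 times, while stepping with order 9 visits it 3 times, once
    per huge page.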

    Testcase:

    #define _GNU_SOURCE
    #include <sys/mman.h>

    #define PAGES_TO_TEST 3
    #define PAGE_SIZE (4096 * 512)

    int main(void)
    {
            char *mem;
            int i;

            mem = mmap(NULL, PAGES_TO_TEST * PAGE_SIZE,
                       PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, 0, 0);

            if (madvise(mem, PAGES_TO_TEST * PAGE_SIZE, MADV_HWPOISON) == -1)
                    return -1;

            munmap(mem, PAGES_TO_TEST * PAGE_SIZE);

            return 0;
    }

    Signed-off-by: Wanpeng Li
    Reviewed-by: Naoya Horiguchi
    Acked-by: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wanpeng Li
     

12 Sep, 2013

5 commits

  • madvise_hwpoison() has two locals called "ret". Fix it all up.

    Cc: Wanpeng Li
    Cc: Naoya Horiguchi
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • The return value after the for loop is always zero, which means
    madvise_hwpoison() reports success even when soft_offline_page()
    returned a failure value.

    Signed-off-by: Wanpeng Li
    Reviewed-by: Naoya Horiguchi
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wanpeng Li
     
  • madvise hwpoison injection will poison the read-only empty zero page if
    there is no write access before the poison. The empty zero page's
    reference count is increased for hwpoison. A subsequent poison of the
    zero page returns directly because the page already has PG_hwpoison set,
    but the page reference count is still increased by
    get_user_pages_fast(). The unpoison process unpoisons the empty zero page
    and decreases the reference count successfully the first time, but a
    subsequent unpoison of the empty zero page returns directly because the
    page has already been unpoisoned, without decreasing the page reference
    count of the empty zero page.

    This patch fixes it by making madvise_hwpoison() put the page and return
    immediately (without calling memory_failure() or soft_offline_page()) when
    the page is already hwpoisoned.
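
    The leak and the fix can be illustrated with a small userspace model.
    This is a sketch under stated assumptions, not the kernel code: struct
    fake_page, inject_poison() and refcount_after() are hypothetical names,
    and the counter starts at zero rather than at the page's real baseline.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the two fields we model. */
struct fake_page {
    int refcount;
    bool hwpoison;
};

/* Model one poison injection into the page. */
void inject_poison(struct fake_page *page, bool put_if_poisoned)
{
    page->refcount++;               /* stands in for get_user_pages_fast() */
    if (page->hwpoison) {
        if (put_if_poisoned)
            page->refcount--;       /* the fix: put the page, then return */
        return;                     /* without the fix the reference leaks */
    }
    page->hwpoison = true;          /* first injection keeps its reference */
}

/* Inject the same page several times and report the resulting count. */
int refcount_after(int injections, bool put_if_poisoned)
{
    struct fake_page zero_page = { 0, false };

    assert(injections >= 0);
    for (int i = 0; i < injections; i++)
        inject_poison(&zero_page, put_if_poisoned);
    return zero_page.refcount;
}
```

    With three injections the model ends at a count of 3 without the put and
    at 1 with it, which mirrors the growth from 1 to 3 in the printk dump in
    this commit.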

    Testcase:

    #define _GNU_SOURCE
    #include <sys/mman.h>

    #define PAGES_TO_TEST 3
    #define PAGE_SIZE 4096

    int main(void)
    {
            char *mem;
            int i;

            mem = mmap(NULL, PAGES_TO_TEST * PAGE_SIZE,
                       PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, 0, 0);

            if (madvise(mem, PAGES_TO_TEST * PAGE_SIZE, MADV_HWPOISON) == -1)
                    return -1;

            munmap(mem, PAGES_TO_TEST * PAGE_SIZE);

            return 0;
    }

    Add printk to dump page reference count:

    [ 93.075959] Injecting memory failure for page 0x19d0 at 0xb77d8000
    [ 93.076207] MCE 0x19d0: non LRU page recovery: Ignored
    [ 93.076209] pfn 0x19d0, page count = 1 after memory failure
    [ 93.076220] Injecting memory failure for page 0x19d0 at 0xb77d9000
    [ 93.076221] MCE 0x19d0: already hardware poisoned
    [ 93.076222] pfn 0x19d0, page count = 2 after memory failure
    [ 93.076224] Injecting memory failure for page 0x19d0 at 0xb77da000
    [ 93.076224] MCE 0x19d0: already hardware poisoned
    [ 93.076225] pfn 0x19d0, page count = 3 after memory failure

    Signed-off-by: Wanpeng Li
    Suggested-by: Naoya Horiguchi
    Cc: Andi Kleen
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wanpeng Li
     
  • Add the '#' flag to the printk format in madvise_hwpoison() so that the
    page and address are printed with a 0x prefix.

    Before patch:

    [ 95.892866] Injecting memory failure for page 19d0 at b7786000
    [ 95.893151] MCE 0x19d0: non LRU page recovery: Ignored

    After patch:

    [ 95.892866] Injecting memory failure for page 0x19d0 at 0xb7786000
    [ 95.893151] MCE 0x19d0: non LRU page recovery: Ignored
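
    The same flag behaves the same way in userspace printf(), which makes the
    change easy to see outside the kernel. A minimal sketch; format_pfn() is
    a hypothetical helper, not a kernel function:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format a page frame number with or without the '#' flag.
 * "%lx" prints bare hex digits; "%#lx" prefixes a nonzero value with "0x".
 */
int format_pfn(char *buf, size_t len, unsigned long pfn, int use_hash)
{
    assert(buf != NULL && len > 0);
    if (use_hash)
        return snprintf(buf, len, "%#lx", pfn);
    return snprintf(buf, len, "%lx", pfn);
}
```

    Formatting 0x19d0 without the flag gives "19d0" and with the flag gives
    "0x19d0", matching the before and after log lines.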

    Signed-off-by: Wanpeng Li
    Reviewed-by: Naoya Horiguchi
    Cc: Andi Kleen
    Cc: Tony Luck
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wanpeng Li
     
  • This fixes following errors:
    - ERROR: "(foo*)" should be "(foo *)"
    - ERROR: "foo ** bar" should be "foo **bar"

    Signed-off-by: Vladimir Cernov
    Reviewed-by: Pekka Enberg
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Cernov
     

30 Apr, 2013

1 commit

  • In madvise(), there doesn't seem to be any reason for taking the
    &current->mm->mmap_sem before start and len_in have been validated.
    Incidentally, this removes the need for the out: label.

    [akpm@linux-foundation.org: s/out_plug/out/, per David]
    Signed-off-by: Rasmus Villemoes
    Acked-by: KOSAKI Motohiro
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rasmus Villemoes
     

24 Feb, 2013

1 commit

  • Make madvise(MADV_WILLNEED) support swap file prefetch. If memory has
    been swapped out, this syscall can prefetch it back in. It has no impact
    if the memory isn't swapped out.

    [akpm@linux-foundation.org: fix CONFIG_SWAP=n build]
    [sasha.levin@oracle.com: fix BUG on madvise early failure]
    Signed-off-by: Shaohua Li
    Cc: Hugh Dickins
    Cc: Rik van Riel
    Signed-off-by: Sasha Levin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Shaohua Li
     

09 Oct, 2012

1 commit

  • Rename VM_NODUMP to VM_DONTDUMP: this name matches other negative flags:
    VM_DONTEXPAND, VM_DONTCOPY. Currently this flag is used only for
    sys_madvise. The next patch will use it to replace the outdated flag
    VM_RESERVED.

    Also forbid madvise(MADV_DODUMP) for special kernel mappings VM_SPECIAL
    (VM_IO | VM_DONTEXPAND | VM_RESERVED | VM_PFNMAP)

    Signed-off-by: Konstantin Khlebnikov
    Cc: Alexander Viro
    Cc: Carsten Otte
    Cc: Chris Metcalf
    Cc: Cyrill Gorcunov
    Cc: Eric Paris
    Cc: H. Peter Anvin
    Cc: Hugh Dickins
    Cc: Ingo Molnar
    Cc: James Morris
    Cc: Jason Baron
    Cc: Kentaro Takeda
    Cc: Matt Helsley
    Cc: Nick Piggin
    Cc: Oleg Nesterov
    Cc: Peter Zijlstra
    Cc: Robert Richter
    Cc: Suresh Siddha
    Cc: Tetsuo Handa
    Cc: Venkatesh Pallipadi
    Acked-by: Linus Torvalds
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konstantin Khlebnikov
     

07 Jul, 2012

1 commit

  • Otherwise the code races with munmap (causing a use-after-free
    of the vma) or with close (causing a use-after-free of the struct
    file).

    The bug was introduced by commit 90ed52ebe481 ("[PATCH] holepunch: fix
    mmap_sem i_mutex deadlock")

    Cc: Hugh Dickins
    Cc: Miklos Szeredi
    Cc: Badari Pulavarty
    Cc: Nick Piggin
    Cc: stable@vger.kernel.org
    Signed-off-by: Andy Lutomirski
    Signed-off-by: Linus Torvalds

    Andy Lutomirski
     

30 May, 2012

1 commit

  • Now tmpfs supports hole-punching via fallocate(), switch madvise_remove()
    to use do_fallocate() instead of vmtruncate_range(): which extends
    madvise(,,MADV_REMOVE) support from tmpfs to ext4, ocfs2 and xfs.
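
    From userspace, the hole punch is visible as the range reading back as
    zeroes. The sketch below assumes a Linux system where a shared anonymous
    mapping is backed by shmem/tmpfs; remove_zeroes_shared_page() is a
    hypothetical helper, and a failure return may simply mean the running
    kernel does not support MADV_REMOVE on that mapping.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Returns 0 if the page reads back as zero after MADV_REMOVE,
 * 1 if the old data survived, and -1 if mmap() or madvise() failed.
 */
int remove_zeroes_shared_page(size_t page_size)
{
    unsigned char *p;
    int result;

    assert(page_size > 0);
    p = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
             MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    p[0] = 0xAB;                        /* dirty the page */
    if (madvise(p, page_size, MADV_REMOVE) != 0)
        result = -1;                    /* not supported for this mapping */
    else
        result = (p[0] == 0) ? 0 : 1;   /* a punched hole reads as zeroes */

    munmap(p, page_size);
    return result;
}
```

    Calling remove_zeroes_shared_page(4096) should return 0 where the
    operation is supported.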

    There is one more user of vmtruncate_range() in our tree,
    staging/android's ashmem_shrink(): convert it to use do_fallocate() too
    (but if its unpinned areas are already unmapped - I don't know - then it
    would do better to use shmem_truncate_range() directly).

    Based-on-patch-by: Cong Wang
    Signed-off-by: Hugh Dickins
    Cc: Christoph Hellwig
    Cc: Al Viro
    Cc: Colin Cross
    Cc: John Stultz
    Cc: Greg Kroah-Hartman
    Cc: "Theodore Ts'o"
    Cc: Andreas Dilger
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Dave Chinner
    Cc: Ben Myers
    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

24 Mar, 2012

1 commit

  • Since we no longer need the VM_ALWAYSDUMP flag, let's use the freed bit
    for 'VM_NODUMP' flag. The idea is to add a new madvise() flag:
    MADV_DONTDUMP, which can be set by applications to specifically request
    memory regions which should not dump core.

    The specific application I have in mind is qemu: we can add a flag there
    that wouldn't dump all of guest memory when qemu dumps core. This flag
    might also be useful for security sensitive apps that want to absolutely
    make sure that parts of memory are not dumped. To clear the flag use:
    MADV_DODUMP.
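
    An application could use the two flags roughly as follows. This is a
    sketch, assuming a Linux kernel and C library that define MADV_DONTDUMP
    and MADV_DODUMP; alloc_undumpable() and make_dumpable() are hypothetical
    helper names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map len bytes of anonymous memory and ask the kernel to leave the region
 * out of core dumps. Returns the mapping, or NULL if mmap() or madvise()
 * fails.
 */
void *alloc_undumpable(size_t len)
{
    void *p;

    assert(len > 0);
    p = mmap(NULL, len, PROT_READ | PROT_WRITE,
             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    if (madvise(p, len, MADV_DONTDUMP) != 0) {
        munmap(p, len);
        return NULL;
    }
    return p;
}

/* Clear the flag again; returns 0 on success and -1 on failure. */
int make_dumpable(void *p, size_t len)
{
    return madvise(p, len, MADV_DODUMP);
}
```

    A program holding secrets or a large guest memory image would allocate
    the buffer with alloc_undumpable() and only call make_dumpable() if it
    later wants the region back in core dumps.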

    [akpm@linux-foundation.org: s/MADV_NODUMP/MADV_DONTDUMP/, s/MADV_CLEAR_NODUMP/MADV_DODUMP/, per Roland]
    [akpm@linux-foundation.org: fix up the architectures which broke]
    Signed-off-by: Jason Baron
    Acked-by: Roland McGrath
    Cc: Chris Metcalf
    Cc: Avi Kivity
    Cc: Ralf Baechle
    Cc: Richard Henderson
    Cc: Ivan Kokshaysky
    Cc: Matt Turner
    Cc: "James E.J. Bottomley"
    Cc: Helge Deller
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jason Baron
     

04 Jan, 2012

1 commit

  • There is only one caller of memory_failure(), all other users call
    __memory_failure() and pass in the flags argument explicitly. The
    lone user of memory_failure() will soon need to pass flags too.

    Add flags argument to the callsite in mce.c. Delete the old memory_failure()
    function, and then rename __memory_failure() without the leading "__".

    Provide clearer message when action optional memory errors are ignored.

    Acked-by: Borislav Petkov
    Signed-off-by: Tony Luck

    Tony Luck
     

21 Jul, 2011

1 commit

  • i_alloc_sem is a rather special rw_semaphore. It's the last one that may
    be released by a non-owner, and its write side is always mirrored by
    real exclusion. Its intended use is to wait for all pending direct I/O
    requests to finish before starting a truncate.

    Replace it with a hand-grown construct:

    - exclusion for truncates is already guaranteed by i_mutex, so it can
      simply fall away
    - the reader side is replaced by an i_dio_count member in struct inode
      that counts the number of pending direct I/O requests. Truncate can't
      proceed as long as it's non-zero
    - when i_dio_count reaches zero we wake up a pending truncate using
      wake_up_bit on a new bit in i_flags
    - new references to i_dio_count can't appear while we are waiting for
      it to read zero because the direct I/O count always needs i_mutex
      (or an equivalent like XFS's i_iolock) for starting a new operation.

    This scheme is much simpler, and saves the space of a spinlock_t and a
    struct list_head in struct inode (typically 160 bits on a non-debug 64-bit
    system).
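
    The counting scheme can be modelled as below. This is a single-threaded
    sketch only: struct fake_inode, dio_begin(), dio_end() and
    truncate_may_proceed() are hypothetical names, and the real code needs an
    atomic counter plus an actual sleep and wake-up mechanism.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant part of struct inode. */
struct fake_inode {
    int dio_count;      /* number of pending direct I/O requests */
};

/* A direct I/O request starts: take a count on the inode. */
void dio_begin(struct fake_inode *inode)
{
    inode->dio_count++;
}

/*
 * A direct I/O request completes. Returns true when the count dropped to
 * zero, which is the point where a waiting truncate would be woken.
 */
bool dio_end(struct fake_inode *inode)
{
    assert(inode->dio_count > 0);
    inode->dio_count--;
    return inode->dio_count == 0;
}

/* Truncate may only proceed once no direct I/O is pending. */
bool truncate_may_proceed(const struct fake_inode *inode)
{
    return inode->dio_count == 0;
}
```

    With two requests in flight, the first dio_end() returns false and the
    second returns true, after which truncate_may_proceed() holds.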

    Signed-off-by: Christoph Hellwig
    Signed-off-by: Al Viro

    Christoph Hellwig
     

14 Jan, 2011

3 commits

  • MADV_HUGEPAGE and MADV_NOHUGEPAGE were fully effective only if run after
    mmap and before touching the memory. While this is enough for most
    usages, it's little effort to make madvise more dynamic at runtime on an
    existing mapping by making khugepaged aware about madvise.

    MADV_HUGEPAGE: register in khugepaged immediately without waiting a page
    fault (that may not ever happen if all pages are already mapped and the
    "enabled" knob was set to madvise during the initial page faults).

    MADV_NOHUGEPAGE: skip vmas marked VM_NOHUGEPAGE in khugepaged to stop
    collapsing pages where not needed.

    [akpm@linux-foundation.org: tweak comment]
    Signed-off-by: Andrea Arcangeli
    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     
  • Add madvise MADV_NOHUGEPAGE to mark regions that are not important to be
    hugepage backed. Return -EINVAL if the vma is not of an anonymous type,
    or the feature isn't built into the kernel. Never silently return
    success.

    Signed-off-by: Andrea Arcangeli
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     
  • Add madvise MADV_HUGEPAGE to mark regions that are important to be
    hugepage backed. Return -EINVAL if the vma is not of an anonymous type,
    or the feature isn't built into the kernel. Never silently return
    success.

    Signed-off-by: Andrea Arcangeli
    Acked-by: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     

16 Dec, 2009

4 commits


24 Sep, 2009

1 commit

  • * 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6: (21 commits)
    HWPOISON: Enable error_remove_page on btrfs
    HWPOISON: Add simple debugfs interface to inject hwpoison on arbitary PFNs
    HWPOISON: Add madvise() based injector for hardware poisoned pages v4
    HWPOISON: Enable error_remove_page for NFS
    HWPOISON: Enable .remove_error_page for migration aware file systems
    HWPOISON: The high level memory error handler in the VM v7
    HWPOISON: Add PR_MCE_KILL prctl to control early kill behaviour per process
    HWPOISON: shmem: call set_page_dirty() with locked page
    HWPOISON: Define a new error_remove_page address space op for async truncation
    HWPOISON: Add invalidate_inode_page
    HWPOISON: Refactor truncate to allow direct truncating of page v2
    HWPOISON: check and isolate corrupted free pages v2
    HWPOISON: Handle hardware poisoned pages in try_to_unmap
    HWPOISON: Use bitmask/action code for try_to_unmap behaviour
    HWPOISON: x86: Add VM_FAULT_HWPOISON handling to x86 page fault handler v2
    HWPOISON: Add poison check to page fault handling
    HWPOISON: Add basic support for poisoned pages in fault handler v3
    HWPOISON: Add new SIGBUS error codes for hardware poison signals
    HWPOISON: Add support for poison swap entries v2
    HWPOISON: Export some rmap vma locking to outside world
    ...

    Linus Torvalds
     

22 Sep, 2009

2 commits

  • This patch presents the mm interface to a dummy version of ksm.c, for
    better scrutiny of that interface: the real ksm.c follows later.

    When CONFIG_KSM is not set, madvise(2) rejects MADV_MERGEABLE and
    MADV_UNMERGEABLE with EINVAL, since that seems more helpful than
    pretending that they can be serviced. But when CONFIG_KSM=y, accept them
    even if KSM is not currently running, and even on areas which KSM will not
    touch (e.g. hugetlb or shared file or special driver mappings).

    Like other madvices, report ENOMEM despite success if any area in the
    range is unmapped, and use EAGAIN to report out of memory.

    Define vma flag VM_MERGEABLE to identify an area on which KSM may try
    merging pages: leave it to ksm_madvise() to decide whether to set it.
    Define mm flag MMF_VM_MERGEABLE to identify an mm which might contain
    VM_MERGEABLE areas, to minimize callouts when forking or exiting.

    Based upon earlier patches by Chris Wright and Izik Eidus.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Chris Wright
    Signed-off-by: Izik Eidus
    Cc: Michael Kerrisk
    Cc: Andrea Arcangeli
    Cc: Rik van Riel
    Cc: Wu Fengguang
    Cc: Balbir Singh
    Cc: Hugh Dickins
    Cc: KAMEZAWA Hiroyuki
    Cc: Lee Schermerhorn
    Cc: Avi Kivity
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • madvise.c has several levels of switch statements, what to do in which?
    Move MADV_DOFORK code down from madvise_vma() to madvise_behavior(), so
    madvise_vma() can be a simple router, to madvise_behavior() by default.

    vma->vm_flags is an unsigned long so use the same type for new_flags. Add
    missing comment lines to describe MADV_DONTFORK and MADV_DOFORK.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Chris Wright
    Signed-off-by: Izik Eidus
    Cc: Andrea Arcangeli
    Cc: Rik van Riel
    Cc: Wu Fengguang
    Cc: Balbir Singh
    Cc: Hugh Dickins
    Cc: KAMEZAWA Hiroyuki
    Cc: Lee Schermerhorn
    Cc: Avi Kivity
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

16 Sep, 2009

1 commit

  • Impact: optional, useful for debugging

    Add a new madvise subcommand to inject poison for some
    pages in a process' address space. This is useful for
    testing the poison page handling.

    This patch can allow root to tie up large amounts of memory.
    I got feedback from container developers and they didn't see any
    problem.

    v2: Use write flag for get_user_pages to make sure to always get
    a fresh page
    v3: Don't request write mapping (Fengguang Wu)
    v4: Move MADV_* number to avoid conflict with KSM (Hugh Dickins)

    Signed-off-by: Andi Kleen

    Andi Kleen
     

17 Jun, 2009

2 commits

  • The posix_madvise() function succeeds (and does nothing) when called with
    parameters (NULL, 0, -1); according to LSB tests, it should fail with
    EINVAL because -1 is not a valid flag.

    When called with a valid address and size, it correctly fails.

    So perform an initial check for valid flags first.

    Reported-by: Jiri Dluhos
    Signed-off-by: Nick Piggin
    Reviewed-and-Tested-by: WANG Cong
    Cc: Michael Kerrisk
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • Impact: code simplification.

    Cc: Nick Piggin
    Signed-off-by: Wu Fengguang
    Cc: Ying Han
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Wu Fengguang
     

13 May, 2009

1 commit