24 Nov, 2006

1 commit

  • find_min_pfn_for_node() and find_min_pfn_with_active_regions() both
    depend on a sorted early_node_map[]. However, sort_node_map() is being
    called after find_min_pfn_with_active_regions() in
    free_area_init_nodes().

    In most cases this is ok, but on at least one x86_64 machine, the SRAT
    table caused the E820 ranges to be registered out of order. This gave
    the wrong values for the min PFN range, resulting in some pages not
    being initialised.

    This patch sorts the early_node_map in find_min_pfn_for_node(). It has
    been boot tested on x86, x86_64, ppc64 and ia64.

    Signed-off-by: Mel Gorman
    Acked-by: Andre Noll
    Signed-off-by: Linus Torvalds

    Mel Gorman
     

17 Nov, 2006

1 commit

  • Recently, __get_vm_area_node() was changed as follows:

        if (unlikely(!area))
                return NULL;

    -   if (unlikely(!size)) {
    -           kfree(area);
    +   if (unlikely(!size))
                return NULL;
    -   }

    This leaks `area', and the original code already looked strange.
    This patch is probably what we wanted in the first place.
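
    A minimal sketch of the corrected path (not the actual patch; it
    assumes the fix simply frees the freshly allocated vm_struct before
    bailing out on a zero size):

        if (unlikely(!area))
                return NULL;

        /* free the vm_struct we just allocated instead of leaking it */
        if (unlikely(!size)) {
                kfree(area);
                return NULL;
        }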

    Signed-off-by: OGAWA Hirofumi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    OGAWA Hirofumi
     

15 Nov, 2006

3 commits

  • Commit cb07c9a1864a8eac9f3123e428100d5b2a16e65a causes the wrong value
    to be returned: is_hugepage_only_range() is a boolean, so we should
    return -EINVAL rather than 1.

    Also - we can use "mm" instead of looking up "current->mm" again.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Unlike mmap(), the codepath for brk() creates a vma without first checking
    that it doesn't touch a region exclusively reserved for hugepages. On
    powerpc, this can allow it to create a normal page vma in a hugepage
    region, causing oopses and other badness.

    Add a test to prevent this. With this patch, brk() will simply fail if it
    attempts to move the break into a hugepage reserved region.
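
    A minimal sketch of the kind of test added (helper and variable names
    assumed, not quoted from the actual patch):

        /* in sys_brk(): refuse to move the break into a region that is
         * reserved exclusively for hugepages */
        if (is_hugepage_only_range(mm, oldbrk, newbrk - oldbrk))
                goto out;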

    Signed-off-by: David Gibson
    Cc: Adam Litke
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Gibson
     
  • (David:)

    If hugetlbfs_file_mmap() returns a failure to do_mmap_pgoff() - for example,
    because the given file offset is not hugepage aligned - then do_mmap_pgoff
    will go to the unmap_and_free_vma backout path.

    But at this stage the vma hasn't been marked as hugepage, and the backout path
    will call unmap_region() on it. That will eventually call down to the
    non-hugepage version of unmap_page_range(). On ppc64, at least, that will
    cause serious problems if there are any existing hugepage pagetable entries in
    the vicinity - for example if there are any other hugepage mappings under the
    same PUD. unmap_page_range() will trigger a bad_pud() on the hugepage pud
    entries. I suspect this will also cause bad problems on ia64, though I don't
    have a machine to test it on.

    (Hugh:)

    prepare_hugepage_range() should check file offset alignment when it checks
    virtual address and length, to stop MAP_FIXED with a bad huge offset from
    unmapping before it fails further down. PowerPC should apply the same
    prepare_hugepage_range alignment checks as ia64 and all the others do.

    Then none of the alignment checks in hugetlbfs_file_mmap are required (nor
    is the check for too small a mapping); but even so, move up setting of
    VM_HUGETLB and add a comment to warn of what David Gibson discovered - if
    hugetlbfs_file_mmap fails before setting it, do_mmap_pgoff's unmap_region
    when unwinding from error will go the non-huge way, which may cause bad
    behaviour on architectures (powerpc and ia64) which segregate their huge
    mappings into a separate region of the address space.
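
    A minimal sketch of the generic alignment checks described above
    (exact constants and signature assumed, not quoted from the patch):

        int prepare_hugepage_range(unsigned long addr, unsigned long len,
                                   pgoff_t pgoff)
        {
                /* the file offset must be hugepage aligned ... */
                if (pgoff & (~HPAGE_MASK >> PAGE_SHIFT))
                        return -EINVAL;
                /* ... and so must the length and virtual address */
                if (len & ~HPAGE_MASK)
                        return -EINVAL;
                if (addr & ~HPAGE_MASK)
                        return -EINVAL;
                return 0;
        }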

    Signed-off-by: Hugh Dickins
    Cc: "Luck, Tony"
    Cc: "David S. Miller"
    Acked-by: Adam Litke
    Acked-by: David Gibson
    Cc: Paul Mackerras
    Cc: Benjamin Herrenschmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

13 Nov, 2006

1 commit

  • - Reorder 'struct vm_struct' to speed up lookups on CPUs with small
    cache lines. The fields 'next', 'addr' and 'size' are now in the same
    cache line, to speed up lookups.

    - One minor cleanup in __get_vm_area_node()

    - Bugfixes in vmalloc_user() and vmalloc_32_user(): NULL returns from
    __vmalloc() and __find_vm_area() were not tested.
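
    A rough sketch of the intended layout (field set assumed from the
    2.6.19-era structure, not quoted from the patch):

        struct vm_struct {
                /* next, addr, size kept adjacent: these are the fields
                 * walked during lookups, so they should share a line */
                struct vm_struct        *next;
                void                    *addr;
                unsigned long           size;
                unsigned long           flags;
                struct page             **pages;
                unsigned int            nr_pages;
                unsigned long           phys_addr;
        };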

    [akpm@osdl.org: remove redundant BUG_ONs]
    Signed-off-by: Eric Dumazet
    Cc: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
     

04 Nov, 2006

4 commits

  • sys_move_pages() uses vmalloc() to allocate an array of structures
    that it fills with information passed from user mode and then passes
    to do_pages_stat() (in the case where the node list is NULL).
    do_pages_stat() depends on a marker in the node field of the structure
    to decide how large the array is, and this marker is correctly
    inserted into the last element of the array. However, vmalloc()
    doesn't zero the memory it allocates, and if the user passes NULL for
    the node list, then the node fields are not filled in (except for the
    end marker). If the memory vmalloc() returned happened to have a word
    with the marker value in just the right place, do_pages_stat() will
    fail to fill the status field of part of the array and we will return
    (random) kernel data to user mode.
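
    A minimal sketch of the kind of fix needed (variable names assumed,
    not quoted from the patch):

        /* when no node list was passed, still initialise the node field
         * so stale vmalloc() contents can never be mistaken for the
         * end-of-array marker */
        if (!nodes)
                pm[i].node = 0;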

    Signed-off-by: Stephen Rothwell
    Cc: Christoph Lameter
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Stephen Rothwell
     
  • It looks like there is a bug in init_reap_node() in slab.c that can
    cause multiple oopses on certain ES7000 configurations. The variable
    reap_node is defined per CPU, but only initialized on a single CPU.
    This causes an oops in next_reap_node() when __get_cpu_var(reap_node)
    returns the wrong value. Fix is below.
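
    A minimal sketch of the kind of fix involved (assuming the per-CPU
    variable must be written for the CPU being initialised, rather than
    for whichever CPU happens to run the init code):

        static void init_reap_node(int cpu)
        {
                int node = next_node(cpu_to_node(cpu), node_online_map);

                if (node == MAX_NUMNODES)
                        node = first_node(node_online_map);

                /* was: __get_cpu_var(reap_node) = node; */
                per_cpu(reap_node, cpu) = node;
        }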

    Signed-off-by: Dan Yeisley
    Cc: Andi Kleen
    Acked-by: Christoph Lameter
    Cc: Pekka Enberg
    Cc: Manfred Spraul
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel Yeisley
     
  • Currently, read_pages() assumes that ->readpages() frees the passed
    pages.

    This patch frees the pages in read_pages() if they are still on the
    pages_list, so ->readpages() implementations can simply ignore any
    pages remaining on pages_list.

    Signed-off-by: OGAWA Hirofumi
    Cc: Steven French
    Cc: Miklos Szeredi
    Cc: Steven Whitehouse
    Cc: Trond Myklebust
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    OGAWA Hirofumi
     
  • An unneeded add-store operation wastes a few bytes: 8 bytes with -O2,
    on ppc.

    Signed-off-by: nkalmala
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    nkalmala
     

30 Oct, 2006

1 commit

  • As reported by Martin J. Bligh, we let through some non-slab bits to
    slab allocation through __get_vm_area_node when doing a vmalloc.

    I haven't been able to reproduce this, although I understand why it
    happens: vmalloc allocates memory with

    GFP_KERNEL | __GFP_HIGHMEM

    and commit 52fd24ca1db3a741f144bbc229beefe044202cac resulted in the
    same flags being passed down to cache_alloc_refill, causing the BUG.
    The following patch fixes it.

    Note that when calling kmalloc_node, I am masking off __GFP_HIGHMEM with
    GFP_LEVEL_MASK, whereas __vmalloc_area_node does the same with

    ~(__GFP_HIGHMEM | __GFP_ZERO).

    IMHO, using GFP_LEVEL_MASK is preferable, but either should fix this
    problem.
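
    A minimal sketch of the masking described above (the allocation call
    itself is assumed from context, not quoted from the patch):

        /* strip flags such as __GFP_HIGHMEM that make no sense for the
         * slab allocation of the vm_struct itself */
        area = kmalloc_node(sizeof(*area), gfp_mask & GFP_LEVEL_MASK, node);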

    Signed-off-by: Giridhar Pemmasani (pgiri@yahoo.com)
    Cc: Martin J. Bligh
    Cc: Andrew Morton
    Signed-off-by: Linus Torvalds

    Giridhar Pemmasani
     

29 Oct, 2006

7 commits

  • absent_pages_in_range() made the assumption that users of the
    arch-independent zone-sizing API would not care about holes beyond the
    end of physical memory. This was not the case and was "fixed" in a
    patch called "Account for holes that are outside the range of physical
    memory".
    However, when given a range that started before a hole in "real" memory and
    ended beyond the end of memory, it would get the result wrong. The bug is
    in mainline but a patch is below.

    It has been tested successfully on a number of machines and architectures.
    Additional credit to Keith Mannthey for discovering the problem, helping
    identify the correct fix and confirming it Worked For Him.

    Signed-off-by: Mel Gorman
    Cc: keith mannthey
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
     
  • If you truncated an mmap'ed hugetlbfs file, then faulted on the truncated
    area, /proc/meminfo's HugePages_Rsvd wrapped hugely "negative". Reinstate my
    preliminary i_size check before attempting to allocate the page (though this
    only fixes the most obvious case: more work will be needed here).

    Signed-off-by: Hugh Dickins
    Cc: Adam Litke
    Cc: David Gibson
    Cc: "Chen, Kenneth W"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • If __vmalloc is called to allocate memory with GFP_ATOMIC in atomic
    context, the chain of calls results in __get_vm_area_node allocating memory
    for vm_struct with GFP_KERNEL, causing the 'sleeping from invalid context'
    warning. This patch fixes it by passing the gfp flags along so
    __get_vm_area_node allocates memory for vm_struct with the same flags.

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Giridhar Pemmasani
     
  • Add the __GFP_NOWARN flag to the __alloc_pages() call in
    __kmalloc_section_memmap(). This cuts down on noisy failure messages.

    On ia64 the section size is 1 GB, which means an order-8 allocation is
    needed for each section's memmap. Under heavy memory pressure that is
    often a very hard requirement to satisfy, so __alloc_pages() gives up
    and prints a noisy stack trace for every section it could not get
    pages for (my current environment shows 32 such stack traces).

    But __kmalloc_section_memmap() falls back to vmalloc() when this
    fails, and that can still allocate the memmap successfully, so the
    stack trace warning is just noise and should not be shown.
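
    A minimal sketch of the change (names such as memmap_size assumed from
    context, not quoted from the patch):

        /* suppress the allocation-failure warning: the caller falls back
         * to vmalloc() for the memmap anyway */
        page = alloc_pages(GFP_KERNEL | __GFP_NOWARN,
                           get_order(memmap_size));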

    Signed-off-by: Yasunori Goto
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Yasunori Goto
     
  • If try_to_free_pages / balance_pgdat are called with a gfp_mask
    specifying GFP_IO and/or GFP_FS, they will reclaim the requisite
    number of pages and then reset prev_priority to DEF_PRIORITY (or to
    some other high, i.e. unurgent, value).

    However, another reclaimer without those gfp_mask flags set (say, GFP_NOIO)
    may still be struggling to reclaim pages. The concurrent overwrite of
    zone->prev_priority will cause this GFP_NOIO thread to unexpectedly cease
    deactivating mapped pages, thus causing reclaim difficulties.

    The fix is to key the distress calculation not just off
    zone->prev_priority, but also to take the local caller's priority into
    account by using min(zone->prev_priority, sc->priority).
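
    A minimal sketch of the resulting calculation (names taken from the
    description above, not quoted from the patch):

        /* take the more urgent (numerically lower) of the zone's recorded
         * priority and the current caller's priority */
        distress = 100 >> min(zone->prev_priority, sc->priority);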

    Signed-off-by: Martin J. Bligh
    Cc: Nick Piggin
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Bligh
     
  • The temp_priority field in zone is racy, as we can walk through a reclaim
    path, and just before we copy it into prev_priority, it can be overwritten
    (say with DEF_PRIORITY) by another reclaimer.

    The same bug is contained in both try_to_free_pages and balance_pgdat, but
    it is fixed slightly differently. In balance_pgdat, we keep a separate
    priority record per zone in a local array. In try_to_free_pages there is
    no need to do this, as the priority level is the same for all zones that we
    reclaim from.

    Impact of this bug is that temp_priority is copied into prev_priority, and
    setting this artificially high causes reclaimers to set distress
    artificially low. They then fail to reclaim mapped pages, when they are,
    in fact, under severe memory pressure (their priority may be as low as 0).
    This causes the OOM killer to fire incorrectly.

    From: Andrew Morton

    __zone_reclaim() isn't modifying zone->prev_priority. But
    zone->prev_priority is used in the decision whether or not to bring
    mapped pages onto the inactive list. Hence there's a risk here that
    __zone_reclaim() will fail because zone->prev_priority is large (ie:
    low urgency) and lots of mapped pages end up stuck on the active list.

    Fix that up by decreasing (ie making more urgent) zone->prev_priority as
    __zone_reclaim() scans the zone's pages.

    This bug perhaps explains why ZONE_RECLAIM_PRIORITY was created. It should be
    possible to remove that now, and to just start out at DEF_PRIORITY?

    Cc: Nick Piggin
    Cc: Christoph Lameter
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Martin Bligh
     
  • - Consolidate page_cache_alloc

    - Fix splice: only the pagecache pages and filesystem data need to use
    mapping_gfp_mask.

    - Fix grab_cache_page_nowait: same as splice, also honour NUMA placement.

    Signed-off-by: Nick Piggin
    Cc: Jens Axboe
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

22 Oct, 2006

3 commits

  • The zonelist may contain zones of nodes that have not been
    bootstrapped, and we will oops if we try to allocate from those zones.
    So check whether the node information for the slab and the node has
    been set up before attempting an allocation, and if it has not, skip
    that zone.

    Usually we will not encounter this situation, since the slab bootstrap
    code avoids falling back before we have set up the respective nodes,
    but we seem to have special needs on ppc.
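
    A minimal sketch of the kind of check involved (names assumed from the
    slab fallback path, not quoted from the patch):

        for (z = zonelist->zones; *z && !obj; z++) {
                int nid = zone_to_nid(*z);

                /* skip zones whose node has no slab bookkeeping yet */
                if (cpuset_zone_allowed(*z, flags) &&
                    cache->nodelists[nid])
                        obj = __cache_alloc_node(cache, flags, nid);
        }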

    Signed-off-by: Christoph Lameter
    Acked-by: Andy Whitcroft
    Cc: Paul Mackerras
    Cc: Mike Kravetz
    Cc: Benjamin Herrenschmidt
    Acked-by: Mel Gorman
    Acked-by: Will Schmidt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Christoph Lameter
     
  • Reintroduce NODES_SPAN_OTHER_NODES for powerpc

    Revert "[PATCH] Remove SPAN_OTHER_NODES config definition"
    This reverts commit f62859bb6871c5e4a8e591c60befc8caaf54db8c.
    Revert "[PATCH] mm: remove arch independent NODES_SPAN_OTHER_NODES"
    This reverts commit a94b3ab7eab4edcc9b2cb474b188f774c331adf7.

    Also update the comments to indicate that this is still required
    and where it's used.

    Signed-off-by: Andy Whitcroft
    Cc: Paul Mackerras
    Cc: Mike Kravetz
    Cc: Benjamin Herrenschmidt
    Acked-by: Mel Gorman
    Acked-by: Will Schmidt
    Cc: Christoph Lameter
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andy Whitcroft
     
  • * 'splice' of git://brick.kernel.dk/data/git/linux-2.6-block:
    [PATCH] Remove SUID when splicing into an inode
    [PATCH] Add lockless helpers for remove_suid()
    [PATCH] Introduce generic_file_splice_write_nolock()
    [PATCH] Take i_mutex in splice_from_pipe()

    Linus Torvalds
     

21 Oct, 2006

6 commits

  • Clarify lock-order comments now that sys_msync drops mmap_sem before
    calling do_fsync.

    Signed-off-by: Nick Piggin
    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • From mm/memory.c:
    static inline void cow_user_page(struct page *dst, struct page *src,
                                     unsigned long va)
    {
            /*
             * If the source page was a PFN mapping, we don't have
             * a "struct page" for it. We do a best-effort copy by
             * just copying from the original user address. If that
             * fails, we just zero-fill it. Live with it.
             */
            if (unlikely(!src)) {
                    void *kaddr = kmap_atomic(dst, KM_USER0);
                    void __user *uaddr = (void __user *)(va & PAGE_MASK);

                    /*
                     * This really shouldn't fail, because the page is there
                     * in the page tables. But it might just be unreadable,
                     * in which case we just give up and fill the result with
                     * zeroes.
                     */
                    if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE))
                            memset(kaddr, 0, PAGE_SIZE);
                    kunmap_atomic(kaddr, KM_USER0);
    #### The D-cache has to be flushed here.
    #### It seems it was simply forgotten.

                    return;
            }
            copy_user_highpage(dst, src, va);
    #### Ok here: flush_dcache_page() is called from this function if the
    #### arch needs it.
    }

    The following patch fixes this issue:
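
    (A minimal sketch of the fix it describes, not the actual patch:)

        if (__copy_from_user_inatomic(kaddr, uaddr, PAGE_SIZE))
                memset(kaddr, 0, PAGE_SIZE);
        kunmap_atomic(kaddr, KM_USER0);
        /* flush the D-cache for the destination page after writing to it
         * through the kernel mapping */
        flush_dcache_page(dst);
        return;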

    Signed-off-by: Dmitriy Monakhov
    Cc: "David S. Miller"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitriy Monakhov
     
  • Quoting Adrian:

    - net/sunrpc/svc.c uses highest_possible_node_id()

    - include/linux/nodemask.h says highest_possible_node_id() is
    out-of-line #if MAX_NUMNODES > 1

    - the out-of-line highest_possible_node_id() is in lib/cpumask.c

    - lib/Makefile: lib-$(CONFIG_SMP) += cpumask.o

    So with CONFIG_ARCH_DISCONTIGMEM_ENABLE=y, CONFIG_SMP=n, CONFIG_SUNRPC=y

    -> highest_possible_node_id() is used in net/sunrpc/svc.c

    and with CONFIG_NODES_SHIFT defined and > 0

    -> include/linux/numa.h: MAX_NUMNODES > 1

    -> compile error, because the out-of-line function is needed but
    cpumask.o is never built

    The bug is not present on architectures where ARCH_DISCONTIGMEM_ENABLE
    depends on NUMA (but m32r isn't the only affected architecture).

    So move the function into page_alloc.c

    Cc: Adrian Bunk
    Cc: Paul Jackson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Although mm.h is not an exported header, it does contain one thing
    that is part of the userspace ABI -- the value that disables the OOM
    killer for a given process. So,
    a) create and export include/linux/oom.h
    b) move OOM_DISABLE define there.
    c) turn bounding values of /proc/$PID/oom_adj into defines and export
    them too.

    Note: mass __KERNEL__ removal will be done later.
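
    A minimal sketch of what the new header ends up containing (values
    assumed from the 2.6.19-era /proc/$PID/oom_adj interface):

        #ifndef __INCLUDE_LINUX_OOM_H
        #define __INCLUDE_LINUX_OOM_H

        /* /proc/<pid>/oom_adj set to -17 disables OOM killing for that task */
        #define OOM_DISABLE     (-17)
        /* inclusive bounds for the tunable range of /proc/<pid>/oom_adj */
        #define OOM_ADJUST_MIN  (-16)
        #define OOM_ADJUST_MAX  15

        #endif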

    Signed-off-by: Alexey Dobriyan
    Cc: Nick Piggin
    Cc: David Woodhouse
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Dobriyan
     
  • Separate out the concept of "queue congestion" from "backing-dev congestion".
    Congestion is a backing-dev concept, not a queue concept.

    The blk_* congestion functions are retained, as wrappers around the core
    backing-dev congestion functions.

    This proper layering is needed so that NFS can cleanly use the congestion
    functions, and so that CONFIG_BLOCK=n actually links.

    Cc: "Thomas Maier"
    Cc: "Jens Axboe"
    Cc: Trond Myklebust
    Cc: David Howells
    Cc: Peter Osterlund
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • When direct-io falls back to buffered write, it will just leave the dirty data
    floating about in pagecache, pending regular writeback.

    But normal direct-io semantics are that IO is synchronous, and that it leaves
    no pagecache behind.

    So change the fallback-to-buffered-write code to sync the file region and to
    then strip away the pagecache, just as a regular direct-io write would do.

    Acked-by: Jeff Moyer
    Cc: Zach Brown
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jeff Moyer
     

20 Oct, 2006

1 commit

  • Right now users have to grab i_mutex before calling remove_suid(), in the
    unlikely event that a call to ->setattr() may be needed. Split up the
    function in two parts:

    - One to check if we need to remove suid
    - One to actually remove it

    The first we can call lockless.
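
    A minimal sketch of how a caller can use the split (helper names
    assumed from the patch titles, not quoted from the patch):

        /* cheap, lockless check first; only take i_mutex and go through
         * ->setattr() when the suid/sgid bits actually need clearing */
        if (unlikely(should_remove_suid(dentry))) {
                mutex_lock(&inode->i_mutex);
                err = remove_suid(dentry);
                mutex_unlock(&inode->i_mutex);
        }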

    Signed-off-by: Jens Axboe

    Jens Axboe
     

17 Oct, 2006

3 commits

  • A recent change to the vmalloc() code accidentally resulted in us
    passing __GFP_ZERO into the slab allocator. But we only wanted
    __GFP_ZERO for the actual pages which are being vmalloc()ed, and
    passing __GFP_ZERO into slab is not a rational thing to ask for.

    Cc: Jonathan Corbet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • We need to encode and decode the 'file' part of a handle. We simply
    use the inode number and generation number to construct the
    filehandle.

    The generation number is the time when the file was created. As inode numbers
    cycle through the full 32 bits before being reused, there is no real chance of
    the same inum being allocated to different files in the same second so this is
    suitably unique. Using time-of-day rather than e.g. jiffies makes it less
    likely that the same filehandle can be created after a reboot.

    In order to be able to decode a filehandle we need to be able to lookup by
    inum, which means that the inode needs to be added to the inode hash table
    (tmpfs doesn't currently hash inodes as there is never a need to lookup by
    inum). To avoid overhead when not exporting, we only hash an inode when it is
    first exported. This requires a lock to ensure it isn't hashed twice.

    This code is separate from the patch posted in June06 from Atal Shargorodsky
    which provided the same functionality, but does borrow slightly from it.

    Locking comment: Most filesystems that hash their inodes do so at the point
    where the 'struct inode' is initialised, and that has suitable locking
    (I_NEW). Here in shmem, we are hashing the inode later, the first time we
    need an NFS file handle for it. We no longer have I_NEW to ensure only one
    thread tries to add it to the hash table.

    Cc: Atal Shargorodsky
    Cc: Gilad Ben-Yossef
    Signed-off-by: David M. Grimes
    Signed-off-by: Neil Brown
    Acked-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David M. Grimes
     
  • If remove_mapping() failed to remove the page from its mapping, don't go and
    mark it not uptodate! Makes kernel go dead.

    (Actually, I don't think the ClearPageUptodate is needed there at all).

    Says Nick Piggin:

    "Right, it isn't needed because at this point the page is guaranteed
    by remove_mapping to have no references (except us) and cannot pick
    up any new ones because it is removed from pagecache.

    We can delete it."

    Signed-off-by: Andrew Morton
    Acked-by: Nick Piggin
    Signed-off-by: Linus Torvalds

    Andrew Morton
     

16 Oct, 2006

1 commit

  • .. and clean up the file mapping code while at it. No point in having
    an "if (file)" repeated twice, and generally doing similar checks in
    two different sections of the same code.

    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

12 Oct, 2006

8 commits

  • Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Aneesh Kumar
     
  • If try_to_release_page() is called with a zero gfp mask, then the
    filesystem is effectively denied the possibility of sleeping while
    attempting to release the page. There doesn't appear to be any valid
    reason why this should be banned, given that we're not calling this from a
    memory allocation context.

    For this reason, change the gfp_mask argument of the call to GFP_KERNEL.

    Signed-off-by: Trond Myklebust
    Cc: Steve Dickson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Trond Myklebust
     
  • A failure in invalidate_inode_pages2_range() can result in unpleasant things
    happening in NFS (at least). Stick a WARN_ON_ONCE() in there so we can find
    out if it happens, and maybe why.

    (akpm: might be a -mm-only patch, we'll see..)

    Cc: Chuck Lever
    Cc: Trond Myklebust
    Cc: Steve Dickson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrew Morton
     
  • Move the lock debug checks below the page reserved checks. Also, having
    debug_check_no_locks_freed in kernel_map_pages is wrong.

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • After the PG_reserved check was added, arch_free_page was being called in the
    wrong place (it could be called for a page we don't actually want to free).
    Fix that.

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     
  • With CONFIG_MIGRATION=n

    mm/mempolicy.c: In function 'do_mbind':
    mm/mempolicy.c:796: warning: passing argument 2 of 'migrate_pages' from incompatible pointer type

    Signed-off-by: Keith Owens
    Cc: Christoph Lameter
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Keith Owens
     
  • We have a persistent dribble of reports of this BUG triggering. Its extended
    diagnostics were recently made conditional on CONFIG_DEBUG_VM, which was a bad
    idea - we want to know about it.

    Signed-off-by: Dave Jones
    Cc: Nick Piggin
    Cc: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Jones
     
  • commit fe1668ae5bf0145014c71797febd9ad5670d5d05 causes the kernel to
    oops with the libhugetlbfs test suite. The problem is that hugetlb
    pages can be shared by multiple mappings. Multiple threads can fight
    over page->lru in the unmap path and bad things happen. We now
    serialize __unmap_hugepage_range to avoid concurrent linked list
    manipulation. Such serialization is also needed for the shared page
    table page on hugetlb areas. This patch fixes the bug and also serves
    as a preparatory patch for shared page tables.
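
    A minimal sketch of the serialization described above (the lock choice
    is assumed, not quoted from the patch):

        void unmap_hugepage_range(struct vm_area_struct *vma,
                                  unsigned long start, unsigned long end)
        {
                /* serialize walkers of page->lru in the hugepage unmap
                 * path, since the pages can be shared by many mappings */
                spin_lock(&vma->vm_file->f_mapping->i_mmap_lock);
                __unmap_hugepage_range(vma, start, end);
                spin_unlock(&vma->vm_file->f_mapping->i_mmap_lock);
        }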

    Signed-off-by: Ken Chen
    Cc: Hugh Dickins
    Cc: David Gibson
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Chen, Kenneth W