26 Jul, 2011

4 commits

  • Commit bae9c19bf1 ("thp: split_huge_page_mm/vma") changed the locking
    behavior of walk_page_range(), so this patch updates the comment
    accordingly.

    Signed-off-by: KOSAKI Motohiro
    Cc: Naoya Horiguchi
    Cc: Hiroyuki Kamezawa
    Cc: Andrea Arcangeli
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • Originally, walk_hugetlb_range() didn't require the caller to take any
    lock. But commit d33b9f45bd ("mm: hugetlb: fix hugepage memory leak in
    walk_page_range") changed that rule, because it added a find_vma() call
    in walk_hugetlb_range().

    Any commit that changes a locking rule should update the documentation
    too.

    [akpm@linux-foundation.org: clarify comment]
    Signed-off-by: KOSAKI Motohiro
    Cc: Naoya Horiguchi
    Cc: Hiroyuki Kamezawa
    Cc: Andrea Arcangeli
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • Currently, walk_page_range() calls find_vma() on every page-table walk
    iteration, but this is completely unnecessary if walk->hugetlb_entry is
    unused. And we shouldn't assume find_vma() is a lightweight operation.
    So this patch checks walk->hugetlb_entry and avoids the find_vma() call
    when possible.

    This patch also makes some cleanups: 1) removes the ugly
    uninitialized_var() and 2) removes an #ifdef from a function body.
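The guard described above can be sketched in plain C. The structures and the lookup below are simplified stand-ins for the kernel's (not the real mm_walk or find_vma()), used only to illustrate the gating logic:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (illustration only). */
struct vm_area_struct { unsigned long vm_start, vm_end; };

struct mm_walk {
    int (*hugetlb_entry)(unsigned long addr);   /* may be NULL */
    struct vm_area_struct *vma;                 /* cached lookup result */
    int find_vma_calls;                         /* instrumentation for the sketch */
};

/* Hypothetical lookup standing in for find_vma(); counts invocations. */
static struct vm_area_struct *lookup_vma(struct mm_walk *walk, unsigned long addr)
{
    static struct vm_area_struct v = { 0x1000, 0x2000 };
    walk->find_vma_calls++;
    return (addr < v.vm_end) ? &v : NULL;
}

/* One walk step: only pay for the VMA lookup when a hugetlb
 * callback is actually registered, as the patch does. */
void walk_step(struct mm_walk *walk, unsigned long addr)
{
    walk->vma = NULL;
    if (walk->hugetlb_entry)                    /* the new guard */
        walk->vma = lookup_vma(walk, addr);
    /* ... page-table walking would continue here ... */
}

/* Dummy callback so the non-NULL case can be exercised. */
static int dummy_hugetlb_entry(unsigned long addr) { (void)addr; return 0; }
```

With no hugetlb_entry registered, the lookup is skipped entirely, which is the point of the patch.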

    Signed-off-by: KOSAKI Motohiro
    Cc: Naoya Horiguchi
    Cc: Hiroyuki Kamezawa
    Cc: Andrea Arcangeli
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     
  • The documentation of find_vma() says:

    /* Look up the first VMA which satisfies addr < vm_end, NULL if none. */
    struct vm_area_struct *find_vma(struct mm_struct *mm, unsigned long addr)
    {
    (snip)

    Thus, the caller should confirm whether the returned vma matches the
    desired one.
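The rule above, that find_vma() returns the first VMA with addr < vm_end, which need not actually contain addr, can be demonstrated with a minimal user-space model (the list and lookup below are illustrative, not the kernel implementation):

```c
#include <assert.h>
#include <stddef.h>

struct vma { unsigned long vm_start, vm_end; struct vma *next; };

/* Minimal model of find_vma(): first VMA satisfying addr < vm_end,
 * NULL if none.  Note: the result may start *above* addr. */
struct vma *model_find_vma(struct vma *head, unsigned long addr)
{
    for (struct vma *v = head; v; v = v->next)
        if (addr < v->vm_end)
            return v;
    return NULL;
}

/* What callers must do: check vm_start too before trusting the hit. */
int vma_contains(struct vma *v, unsigned long addr)
{
    return v && v->vm_start <= addr && addr < v->vm_end;
}
```

An address in a gap between two VMAs still yields a non-NULL result: the next VMA, which does not contain it.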

    Signed-off-by: KOSAKI Motohiro
    Cc: Naoya Horiguchi
    Cc: Hiroyuki Kamezawa
    Cc: Andrea Arcangeli
    Cc: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    KOSAKI Motohiro
     

23 Mar, 2011

1 commit

  • Right now, if a mm_walk has either ->pte_entry or ->pmd_entry set, it will
    unconditionally split any transparent huge pages it runs into. In
    practice, that means that anyone doing a

    cat /proc/$pid/smaps

    will unconditionally break down every huge page in the process and depend
    on khugepaged to re-collapse it later. This is fairly suboptimal.

    This patch changes that behavior. It teaches each ->pmd_entry handler
    (there are five) that it must break down THPs itself. Also, the
    _generic_ code will never break down a THP unless a ->pte_entry handler
    is actually set.

    This means that the ->pmd_entry handlers can now choose to deal with THPs
    without breaking them down.
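A simplified model of the new dispatch: the generic walker hands a huge pmd to ->pmd_entry intact, and only falls back to splitting when a ->pte_entry handler needs individual ptes. All types here are illustrative stand-ins, not the kernel's:

```c
#include <assert.h>
#include <stddef.h>

struct pmd { int is_huge; int was_split; };

struct walker {
    int (*pmd_entry)(struct pmd *pmd);   /* may handle a huge pmd itself */
    int (*pte_entry)(unsigned long pte); /* needs individual ptes */
};

static void split_huge(struct pmd *pmd)
{
    pmd->is_huge = 0;
    pmd->was_split = 1;
}

/* Generic code: never split unless a pte_entry handler is set. */
void walk_pmd(struct walker *w, struct pmd *pmd)
{
    if (w->pmd_entry)
        w->pmd_entry(pmd);        /* handler may consume the THP whole */
    if (w->pte_entry) {
        if (pmd->is_huge)
            split_huge(pmd);      /* only now is splitting forced */
        /* ... walk ptes, calling w->pte_entry ... */
    }
}

/* Example pmd_entry that inspects a THP without splitting it. */
static int count_thp(struct pmd *pmd) { (void)pmd; return 0; }
```

A walker with only a pmd_entry (like the smaps case after this patch) leaves the huge page intact.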

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Dave Hansen
    Acked-by: Mel Gorman
    Acked-by: David Rientjes
    Reviewed-by: Eric B Munson
    Tested-by: Eric B Munson
    Cc: Michael J Wolf
    Cc: Andrea Arcangeli
    Cc: Johannes Weiner
    Cc: Matt Mackall
    Cc: Jeremy Fitzhardinge
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     

14 Jan, 2011

1 commit

  • split_huge_page_pmd compat code. Each one of those would need to be
    expanded to hundreds of lines of complex code without a fully reliable
    split_huge_page_pmd design.

    Signed-off-by: Andrea Arcangeli
    Acked-by: Rik van Riel
    Acked-by: Mel Gorman
    Signed-off-by: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     

25 Nov, 2010

1 commit

  • Commit d33b9f45 ("mm: hugetlb: fix hugepage memory leak in
    walk_page_range()") introduced a check for whether a vma is a hugetlbfs
    one; later, commit 5dc37642 ("mm hugetlb: add hugepage support to
    pagemap") moved the check under #ifdef CONFIG_HUGETLB_PAGE, but a
    needless find_vma() call was left behind, and its result is not used
    anywhere else in the function.

    The side effect of caching the vma for @addr inside walk->mm is
    utilized neither in walk_page_range() nor in the functions it calls.

    Signed-off-by: David Sterba
    Reviewed-by: Naoya Horiguchi
    Acked-by: Andi Kleen
    Cc: Andy Whitcroft
    Cc: David Rientjes
    Cc: Hugh Dickins
    Cc: Lee Schermerhorn
    Cc: Matt Mackall
    Acked-by: Mel Gorman
    Cc: Wu Fengguang
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Sterba
     

07 Apr, 2010

1 commit

  • When we look into the pagemap using page-types with option -p, the pfn
    values for hugepages look wrong (see below). This is because the pte was
    evaluated only once per vma, although it should be updated for each
    hugepage. This patch fixes that.

    $ page-types -p 3277 -Nl -b huge
    voffset offset len flags
    7f21e8a00 11e400 1 ___U___________H_G________________
    7f21e8a01 11e401 1ff ________________TG________________
    ^^^
    7f21e8c00 11e400 1 ___U___________H_G________________
    7f21e8c01 11e401 1ff ________________TG________________
    ^^^

    One hugepage contains 1 head page and 511 tail pages on x86_64, and
    each pair of lines represents one hugepage. Voffset and offset are the
    virtual and physical addresses in page units, respectively. Different
    hugepages should not have the same offset value.

    With this patch applied:

    $ page-types -p 3386 -Nl -b huge
    voffset offset len flags
    7fec7a600 112c00 1 ___UD__________H_G________________
    7fec7a601 112c01 1ff ________________TG________________
    ^^^
    7fec7a800 113200 1 ___UD__________H_G________________
    7fec7a801 113201 1ff ________________TG________________
    ^^^
    OK

    More info:

    - This patch modifies walk_page_range()'s hugepage walker, but the
    change only affects pagemap_read(), which is the only caller of the
    hugepage callback.

    - Without this patch, the hugetlb_entry() callback is called per vma,
    which doesn't match the natural expectation from its name.

    - With this patch, hugetlb_entry() is called per huge-pte entry, and
    the callback can become much simpler.
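The fixed iteration pattern can be modeled in user space: the walker re-evaluates the (model) huge pte on each hugepage-sized step, rather than once per vma. The size constant and the lookup function are illustrative assumptions, not kernel code:

```c
#include <assert.h>

#define HPAGE_SIZE 0x200000UL   /* 2MB hugepage on x86_64 (illustrative) */

/* Stand-in for huge_pte_offset(): each hugepage maps a distinct value,
 * a hypothetical mapping for demonstration only. */
static unsigned long model_huge_pte(unsigned long addr)
{
    return 0x100000UL + addr / HPAGE_SIZE;
}

/* Walk [start, end), calling the callback once per hugepage entry and
 * re-reading the pte each iteration (the bug was reading it only once). */
int walk_hugetlb(unsigned long start, unsigned long end,
                 int (*entry)(unsigned long pte, unsigned long addr))
{
    int calls = 0;
    for (unsigned long addr = start; addr < end; addr += HPAGE_SIZE) {
        unsigned long pte = model_huge_pte(addr);  /* fresh each step */
        entry(pte, addr);
        calls++;
    }
    return calls;
}

/* Recording callback so the per-entry values can be inspected. */
static unsigned long seen_ptes[4];
static int seen;
static int record(unsigned long pte, unsigned long addr)
{
    (void)addr;
    seen_ptes[seen++] = pte;
    return 0;
}
```

With the fix, two adjacent hugepages produce two callback invocations with distinct pte values, matching the distinct offsets in the page-types output above.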

    Signed-off-by: Naoya Horiguchi
    Signed-off-by: KAMEZAWA Hiroyuki
    Acked-by: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     

16 Dec, 2009

2 commits

  • This patch enables extraction of the pfn of a hugepage from
    /proc/pid/pagemap in an architecture independent manner.

    Details
    -------
    My test program (leak_pagemap) works as follows:
    - creat() and mmap() a file on hugetlbfs (file size is 200MB == 100 hugepages,)
    - read()/write() something on it,
    - call page-types with option -p,
    - munmap() and unlink() the file on hugetlbfs

    Without my patches
    ------------------
    $ ./leak_pagemap
    flags page-count MB symbolic-flags long-symbolic-flags
    0x0000000000000000 1 0 __________________________________
    0x0000000000000804 1 0 __R________M______________________ referenced,mmap
    0x000000000000086c 81 0 __RU_lA____M______________________ referenced,uptodate,lru,active,mmap
    0x0000000000005808 5 0 ___U_______Ma_b___________________ uptodate,mmap,anonymous,swapbacked
    0x0000000000005868 12 0 ___U_lA____Ma_b___________________ uptodate,lru,active,mmap,anonymous,swapbacked
    0x000000000000586c 1 0 __RU_lA____Ma_b___________________ referenced,uptodate,lru,active,mmap,anonymous,swapbacked
    total 101 0

    The output of page-types doesn't show any hugepages.

    With my patches
    ---------------
    $ ./leak_pagemap
    flags page-count MB symbolic-flags long-symbolic-flags
    0x0000000000000000 1 0 __________________________________
    0x0000000000030000 51100 199 ________________TG________________ compound_tail,huge
    0x0000000000028018 100 0 ___UD__________H_G________________ uptodate,dirty,compound_head,huge
    0x0000000000000804 1 0 __R________M______________________ referenced,mmap
    0x000000000000080c 1 0 __RU_______M______________________ referenced,uptodate,mmap
    0x000000000000086c 80 0 __RU_lA____M______________________ referenced,uptodate,lru,active,mmap
    0x0000000000005808 4 0 ___U_______Ma_b___________________ uptodate,mmap,anonymous,swapbacked
    0x0000000000005868 12 0 ___U_lA____Ma_b___________________ uptodate,lru,active,mmap,anonymous,swapbacked
    0x000000000000586c 1 0 __RU_lA____Ma_b___________________ referenced,uptodate,lru,active,mmap,anonymous,swapbacked
    total 51300 200

    The output of page-types shows 51200 pages contributing to hugepages,
    containing 100 head pages and 51100 tail pages as expected.

    [akpm@linux-foundation.org: build fix]
    Signed-off-by: Naoya Horiguchi
    Cc: Andi Kleen
    Cc: Wu Fengguang
    Cc: Hugh Dickins
    Cc: Mel Gorman
    Cc: Lee Schermerhorn
    Cc: Andy Whitcroft
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     
  • Most callers of pmd_none_or_clear_bad() check whether the target page
    is in a hugepage or not, but walk_page_range() does not. So if we read
    /proc/pid/pagemap for a hugepage on an x86 machine, the hugepage memory
    is leaked as shown below. This patch fixes it.

    Details
    =======
    My test program (leak_pagemap) works as follows:
    - creat() and mmap() a file on hugetlbfs (file size is 200MB == 100 hugepages,)
    - read()/write() something on it,
    - call page-types with option -p (walk around the page tables),
    - munmap() and unlink() the file on hugetlbfs

    Without my patches
    ------------------
    $ cat /proc/meminfo |grep "HugePage"
    HugePages_Total: 1000
    HugePages_Free: 1000
    HugePages_Rsvd: 0
    HugePages_Surp: 0
    $ ./leak_pagemap
    [snip output]
    $ cat /proc/meminfo |grep "HugePage"
    HugePages_Total: 1000
    HugePages_Free: 900
    HugePages_Rsvd: 0
    HugePages_Surp: 0
    $ ls /hugetlbfs/
    $

    100 hugepages are accounted as used while there is no file on hugetlbfs.

    With my patches
    ---------------
    $ cat /proc/meminfo |grep "HugePage"
    HugePages_Total: 1000
    HugePages_Free: 1000
    HugePages_Rsvd: 0
    HugePages_Surp: 0
    $ ./leak_pagemap
    [snip output]
    $ cat /proc/meminfo |grep "HugePage"
    HugePages_Total: 1000
    HugePages_Free: 1000
    HugePages_Rsvd: 0
    HugePages_Surp: 0
    $ ls /hugetlbfs
    $

    No memory leaks.
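The shape of the fix, diverting hugetlb VMAs to a dedicated walker before the generic pmd check can misinterpret (and clear) a huge pmd, can be sketched with simplified stand-in types; none of these structures are the kernel's:

```c
#include <assert.h>
#include <stddef.h>

struct vma { int is_hugetlb; };
struct pmd { int is_huge_entry; int cleared; };

/* Stand-in for pmd_none_or_clear_bad(): a huge pmd looks "bad" to it,
 * so it wrongly clears the entry -- this is the leak in the model. */
static int model_pmd_none_or_clear_bad(struct pmd *pmd)
{
    if (pmd->is_huge_entry) {
        pmd->cleared = 1;
        return 1;
    }
    return 0;
}

/* Fixed walk: check for a hugetlb vma first and take the hugetlb path,
 * never letting the generic pmd check touch a huge entry. */
void walk_one(struct vma *vma, struct pmd *pmd, int *hugetlb_walks)
{
    if (vma && vma->is_hugetlb) {
        (*hugetlb_walks)++;       /* walk_hugetlb_range() in the kernel */
        return;
    }
    if (model_pmd_none_or_clear_bad(pmd))
        return;
    /* ... normal pte walk ... */
}
```

Routing the hugetlb vma around the generic check is what keeps the huge pmd, and hence the hugepage accounting, intact.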

    Signed-off-by: Naoya Horiguchi
    Cc: Andi Kleen
    Cc: Wu Fengguang
    Cc: Hugh Dickins
    Cc: Mel Gorman
    Cc: Lee Schermerhorn
    Cc: Andy Whitcroft
    Cc: David Rientjes
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     

13 Jun, 2008

1 commit

  • We need this at least for huge page detection for now, because powerpc
    needs the vm_area_struct to determine whether a virtual address refers
    to a huge page (its pmd_huge() doesn't work).

    It might also come in handy for some of the other users.

    Signed-off-by: Dave Hansen
    Acked-by: Matt Mackall
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dave Hansen
     

28 Apr, 2008

1 commit

  • After the loop in walk_pte_range(), pte might point to the first
    address after the pmd it walks. pte_unmap() is then applied to
    something bad.
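In user-space terms, the fix restructures the loop so the cursor is never advanced past the last entry before being handed to unmap. The map/unmap pair below is a model of pte_offset_map()/pte_unmap(), not the kernel functions:

```c
#include <assert.h>
#include <stddef.h>

/* Model of pte_offset_map()/pte_unmap(): unmap must receive a pointer
 * into the range that was mapped, not one past the walked entries. */
static unsigned long *mapped_at, *unmapped_at;

static unsigned long *model_map(unsigned long *base)
{
    mapped_at = base;
    return base;
}

static void model_unmap(unsigned long *p)
{
    unmapped_at = p;
}

/* Fixed walk: break out before advancing past the final entry, so the
 * pointer handed to unmap still refers to the walked range. */
void walk_ptes(unsigned long *base, int nents,
               void (*entry)(unsigned long pte))
{
    unsigned long *pte = model_map(base);
    int i = 0;
    for (;;) {
        entry(*pte);
        if (++i >= nents)
            break;            /* do NOT advance past the final entry */
        pte++;
    }
    model_unmap(pte);         /* still points at the last entry walked */
}

static void noop_entry(unsigned long pte) { (void)pte; }
```

A plain `for (...; pte++)` loop would leave pte one past the end at exit, which is the "something bad" the commit describes.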

    Spotted by Roel Kluin and Andreas Schwab.

    Signed-off-by: Johannes Weiner
    Cc: Roel Kluin
    Cc: Andreas Schwab
    Acked-by: Matt Mackall
    Acked-by: Mikael Pettersson
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

20 Mar, 2008

1 commit

  • Fix various kernel-doc notation in mm/:

    filemap.c: add function short description; convert 2 to kernel-doc
    fremap.c: change parameter 'prot' to @prot
    pagewalk.c: change "-" in function parameters to ":"
    slab.c: fix short description of kmem_ptr_validate()
    swap.c: fix description & parameters of put_pages_list()
    swap_state.c: fix function parameters
    vmalloc.c: change "@returns" to "Returns:" since that is not a parameter

    Signed-off-by: Randy Dunlap
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Randy Dunlap
     

06 Feb, 2008

1 commit