04 Aug, 2011

2 commits

  • Make the radix_tree exceptional cases, mostly in filemap.c, clearer.

    It's hard to devise a suitable snappy name that illuminates the use by
    shmem/tmpfs for swap, while keeping filemap/pagecache/radix_tree
    generality. And akpm points out that /* radix_tree_deref_retry(page) */
    comments look like calls that have been commented out for unknown
    reason.

    Skirt the naming difficulty by rearranging these blocks to handle the
    transient radix_tree_deref_retry(page) case first; then just explain the
    remaining shmem/tmpfs swap case in a comment.

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • Remove PageSwapBacked (!page_is_file_cache) cases from
    add_to_page_cache_locked() and add_to_page_cache_lru(): those pages now
    go through shmem_add_to_page_cache().

    Remove a comment on maximum tmpfs size from fsstack_copy_inode_size(),
    and add a comment on swap entries to invalidate_mapping_pages().

    And mincore_page() uses find_get_page() on what might be shmem or a
    tmpfs file: allow for a radix_tree_exceptional_entry(), and proceed to
    find_get_page() on swapper_space if so (oh, swapper_space needs #ifdef).

    Signed-off-by: Hugh Dickins
    Acked-by: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

14 Jan, 2011

2 commits

  • Handle transparent huge page pmd entries natively instead of splitting
    them into subpages.

    Signed-off-by: Johannes Weiner
    Signed-off-by: Andrea Arcangeli
    Reviewed-by: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • split_huge_page_pmd compat code. Each one of those would need to be
    expanded to hundreds of lines of complex code without a fully reliable
    split_huge_page_pmd design.

    Signed-off-by: Andrea Arcangeli
    Acked-by: Rik van Riel
    Acked-by: Mel Gorman
    Signed-off-by: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Andrea Arcangeli
     

25 May, 2010

4 commits

  • Do page table walks with the well-known nested loops we use in several
    other places already.

    This avoids doing full page table walks after every pte range and also
    allows handling unmapped areas bigger than one pte range in one go.
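
    As a rough userspace illustration of that loop shape (not the kernel
    code itself), the sketch below walks an address range one PMD-sized span
    at a time and visits the pages inside each span. The names
    sketch_pmd_addr_end() and sketch_walk(), the 4 KiB page size and the
    512-entry table are assumptions made for the example. Unlike the real
    kernel helpers, this sketch does not guard against address wraparound.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sizes for illustration: 4 KiB pages, 512 pages per PMD span. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PMD_SIZE  (512UL * SKETCH_PAGE_SIZE)

/* Next PMD-sized boundary above addr, clamped to end (no overflow handling). */
static unsigned long sketch_pmd_addr_end(unsigned long addr, unsigned long end)
{
    unsigned long boundary = (addr + SKETCH_PMD_SIZE) & ~(SKETCH_PMD_SIZE - 1);

    return boundary < end ? boundary : end;
}

/*
 * Walk [addr, end) with nested loops: the outer loop advances one PMD span
 * at a time, the inner loop visits each page inside that span. The counters
 * report how many spans and pages were visited.
 */
static void sketch_walk(unsigned long addr, unsigned long end,
                        size_t *pmd_spans, size_t *pages)
{
    unsigned long next;

    /* Both ends are expected to be page aligned. */
    assert(addr % SKETCH_PAGE_SIZE == 0 && end % SKETCH_PAGE_SIZE == 0);

    *pmd_spans = 0;
    *pages = 0;
    while (addr < end) {
        next = sketch_pmd_addr_end(addr, end);
        (*pmd_spans)++;
        for (unsigned long a = addr; a < next; a += SKETCH_PAGE_SIZE)
            (*pages)++;
        addr = next;
    }
}
```

    With these assumed sizes, walking 0x1ff000..0x401000 visits three spans
    holding 1, 512 and 1 pages, without restarting the walk from the top
    for each span.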

    Signed-off-by: Johannes Weiner
    Cc: Andrea Arcangeli
    Cc: Naoya Horiguchi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • Instead of passing a start address and a number of pages into the helper
    functions, convert them to use a start and an end address.

    Signed-off-by: Johannes Weiner
    Cc: Andrea Arcangeli
    Cc: Naoya Horiguchi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • Split out functions to handle hugetlb ranges, pte ranges and unmapped
    ranges, to improve readability but also to prepare the file structure for
    nested page table walks.

    No semantic changes intended.

    Signed-off-by: Johannes Weiner
    Cc: Andrea Arcangeli
    Cc: Naoya Horiguchi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     
  • This fixes some minor issues that bugged me while going over the code:

    o adjust argument order of do_mincore() to match the syscall
    o simplify range length calculation
    o drop superfluous shift in huge tlb calculation, address is page aligned
    o drop dead nr_huge calculation
    o check pte_none() before pte_present()
    o comment and whitespace fixes

    No semantic changes intended.

    Signed-off-by: Johannes Weiner
    Cc: Andrea Arcangeli
    Cc: Naoya Horiguchi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.
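
    The placement rule in the second point can be sketched in C as follows.
    This is not the conversion script (which is a Python program); the
    function names are hypothetical, and reading "Christmas tree" as
    shortest-line-first and "rev-Xmas-tree" as longest-line-first is an
    assumption.

```c
#include <stddef.h>
#include <string.h>

typedef int (*sketch_cmp)(const char *, const char *);

/* Alphabetical ordering. */
static int cmp_alpha(const char *a, const char *b)
{
    return strcmp(a, b);
}

/* "Christmas tree": shorter lines first (assumed meaning). */
static int cmp_xmas(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);

    return (la > lb) - (la < lb);
}

/* "Reverse Christmas tree": longer lines first (assumed meaning). */
static int cmp_rev_xmas(const char *a, const char *b)
{
    return cmp_xmas(b, a);
}

/* Return 1 if the existing block is already sorted under cmp. */
static int follows_order(const char *const *inc, size_t n, sketch_cmp cmp)
{
    for (size_t i = 1; i < n; i++)
        if (cmp(inc[i - 1], inc[i]) > 0)
            return 0;
    return 1;
}

/*
 * Index at which new_inc should be inserted into the block inc[0..n).
 * The first ordering the block already follows wins; if none matches
 * (or the block is too short to tell), the new include goes at the end.
 */
size_t sketch_insert_position(const char *const *inc, size_t n,
                              const char *new_inc)
{
    const sketch_cmp orders[] = { cmp_alpha, cmp_xmas, cmp_rev_xmas };

    for (size_t k = 0; k < sizeof(orders) / sizeof(orders[0]); k++) {
        size_t i = 0;

        if (n < 2 || !follows_order(inc, n, orders[k]))
            continue;
        while (i < n && orders[k](inc[i], new_inc) <= 0)
            i++;
        return i;
    }
    return n;
}
```

    For an alphabetical block of linux/fs.h, linux/mm.h and linux/swap.h,
    the sketch places linux/slab.h between linux/mm.h and linux/swap.h.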

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

16 Dec, 2009

1 commit

  • Most callers of pmd_none_or_clear_bad() check whether the target page is
    in a hugepage or not, but mincore() and walk_page_range() do not check it.
    So if we use mincore() on a hugepage on x86 machine, the hugepage memory
    is leaked as shown below. This patch fixes it by extending mincore()
    system call to support hugepages.

    Details
    =======
    My test program (leak_mincore) works as follows:
    - creat() and mmap() a file on hugetlbfs (file size is 200MB == 100 hugepages),
    - read()/write() something on it,
    - call mincore() for first ten pages and printf() the values of *vec
    - munmap() and unlink() the file on hugetlbfs
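
    A minimal userspace sketch of the mincore() step of such a program
    follows. It is not the leak_mincore source: it assumes an anonymous
    private mapping in place of a hugetlbfs file, and the function name
    count_resident_pages() is made up for the example.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map npages anonymous pages, touch them so they are faulted in, and
 * return how many of them mincore() reports as resident.
 * Returns -1 on any error.
 */
int count_resident_pages(size_t npages)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char vec[64];          /* one status byte per page */
    int resident = 0;
    size_t len;
    char *map;

    if (page <= 0 || npages == 0 || npages > sizeof(vec))
        return -1;
    len = npages * (size_t)page;

    map = mmap(NULL, len, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED)
        return -1;

    memset(map, 0xaa, len);         /* write to every page to fault it in */

    if (mincore(map, len, vec) == 0) {
        /* Only the least significant bit of each byte is defined. */
        for (size_t i = 0; i < npages; i++)
            resident += vec[i] & 1;
    } else {
        resident = -1;
    }

    munmap(map, len);
    return resident;
}
```

    Calling count_resident_pages(10) mirrors the "first ten pages" check
    above; freshly written pages would normally all be reported as resident
    unless they have been swapped out in the meantime.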

    Without my patch
    ----------------
    $ cat /proc/meminfo| grep "HugePage"
    HugePages_Total: 1000
    HugePages_Free: 1000
    HugePages_Rsvd: 0
    HugePages_Surp: 0
    $ ./leak_mincore
    vec[0] 0
    vec[1] 0
    vec[2] 0
    vec[3] 0
    vec[4] 0
    vec[5] 0
    vec[6] 0
    vec[7] 0
    vec[8] 0
    vec[9] 0
    $ cat /proc/meminfo |grep "HugePage"
    HugePages_Total: 1000
    HugePages_Free: 999
    HugePages_Rsvd: 0
    HugePages_Surp: 0
    $ ls /hugetlbfs/
    $

    Return values in *vec from mincore() are set to 0, while the hugepage
    should be in memory, and 1 hugepage is still accounted as used while
    there is no file on hugetlbfs.

    With my patch
    -------------
    $ cat /proc/meminfo| grep "HugePage"
    HugePages_Total: 1000
    HugePages_Free: 1000
    HugePages_Rsvd: 0
    HugePages_Surp: 0
    $ ./leak_mincore
    vec[0] 1
    vec[1] 1
    vec[2] 1
    vec[3] 1
    vec[4] 1
    vec[5] 1
    vec[6] 1
    vec[7] 1
    vec[8] 1
    vec[9] 1
    $ cat /proc/meminfo |grep "HugePage"
    HugePages_Total: 1000
    HugePages_Free: 1000
    HugePages_Rsvd: 0
    HugePages_Surp: 0
    $ ls /hugetlbfs/
    $

    Return value in *vec set to 1 and no memory leaks.

    [akpm@linux-foundation.org: cleanup]
    [akpm@linux-foundation.org: build fix]
    Signed-off-by: Naoya Horiguchi
    Cc: Andi Kleen
    Cc: Wu Fengguang
    Cc: Hugh Dickins
    Cc: Mel Gorman
    Cc: Lee Schermerhorn
    Cc: Andy Whitcroft
    Cc: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Naoya Horiguchi
     

14 Jan, 2009

1 commit


28 Apr, 2008

1 commit

  • Nothing in the tree uses nopage any more. Remove support for it in the
    core mm code and documentation (and a few stray references to it in
    comments).

    Signed-off-by: Nick Piggin
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

16 Feb, 2007

3 commits


13 Feb, 2007

1 commit

  • Make mincore work for anon mappings, nonlinear, and migration entries.
    Based on patch from Linus Torvalds.

    Signed-off-by: Nick Piggin
    Acked-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Nick Piggin
     

18 Dec, 2006

1 commit


17 Dec, 2006

2 commits

  • Hugh Dickins correctly points out that mincore() is actually _supposed_
    to fail on an unmapped hole in the user address space, rather than
    return valid ("empty") information about the hole. This just simplifies
    the problem further (I had been misled by our previous confusing and
    complicated way of doing mincore()).

    Also, in the unlikely situation that we can't allocate a temporary
    kernel buffer, we should actually return EAGAIN, not ENOMEM, to keep the
    "unmapped hole" and "allocation failure" error cases separate.

    Finally, add a comment about our stupid historical lack of support for
    anonymous mappings. I'll fix that if somebody reminds me after 2.6.20
    is out.
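
    The ENOMEM result for an unmapped hole can be observed from userspace
    with a sketch like the one below. The helper name
    mincore_errno_on_hole() is hypothetical, and the sketch assumes a
    single-threaded caller so that nothing remaps the address between
    munmap() and mincore().

```c
#define _DEFAULT_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Create a one-page hole by mapping and then unmapping a page, and call
 * mincore() on it. Returns the errno reported by mincore(), 0 if the call
 * unexpectedly succeeded, or -1 if the setup itself failed.
 */
int mincore_errno_on_hole(void)
{
    long page = sysconf(_SC_PAGESIZE);
    unsigned char vec[1];
    void *addr;

    if (page <= 0)
        return -1;

    addr = mmap(NULL, (size_t)page, PROT_READ,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        return -1;
    if (munmap(addr, (size_t)page) != 0)
        return -1;

    /* addr is page aligned but no longer mapped. */
    if (mincore(addr, (size_t)page, vec) == 0)
        return 0;
    return errno;
}
```

    A caller can compare the result against ENOMEM to tell the "unmapped
    hole" case apart from EAGAIN.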

    Acked-by: Hugh Dickins
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     
  • Doug Chapman noticed that mincore() will do a "copy_to_user()" of the
    result while holding the mmap semaphore for reading, which is a big
    no-no. While a recursive read-lock on a semaphore in the case of a page
    fault happens to work, we don't actually allow them due to deadlock
    scenarios with writers due to fairness issues.

    Doug and Marcel sent in a patch to fix it, but I decided to just rewrite
    the mess instead - not just fixing the locking problem, but making the
    code smaller and (imho) much easier to understand.

    Cc: Doug Chapman
    Cc: Marcel Holtmann
    Cc: Hugh Dickins
    Cc: Andrew Morton
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

20 Apr, 2005

1 commit

  • Remove use of FIRST_USER_PGD_NR from sys_mincore: it's inconsistent (no other
    syscall refers to it), unnecessary (sys_mincore loops over vmas further down)
    and incorrect (misses user addresses in ARM's first pgd).

    Signed-off-by: Hugh Dickins
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

17 Apr, 2005

1 commit

  • Initial git repository build. I'm not bothering with the full history,
    even though we have it. We can create a separate "historical" git
    archive of that later if we want to, and in the meantime it's about
    3.2GB when imported into git - space that would just make the early
    git days unnecessarily complicated, when we don't have a lot of good
    infrastructure for it.

    Let it rip!

    Linus Torvalds