25 Sep, 2019

1 commit

  • Transparent Huge Pages are currently stored in i_pages as pointers to
    consecutive subpages. This patch changes that to storing consecutive
    pointers to the head page in preparation for storing huge pages more
    efficiently in i_pages.

    Large parts of this are "inspired" by Kirill's patch
    https://lore.kernel.org/lkml/20170126115819.58875-2-kirill.shutemov@linux.intel.com/

    Kirill and Huang Ying contributed several fixes.

    [willy@infradead.org: use compound_nr, squish uninit-var warning]
    Link: http://lkml.kernel.org/r/20190731210400.7419-1-willy@infradead.org
    Signed-off-by: Matthew Wilcox
    Acked-by: Jan Kara
    Reviewed-by: Kirill Shutemov
    Reviewed-by: Song Liu
    Tested-by: Song Liu
    Tested-by: William Kucharski
    Reviewed-by: William Kucharski
    Tested-by: Qian Cai
    Tested-by: Mikhail Gavrilov
    Cc: Hugh Dickins
    Cc: Chris Wilson
    Cc: Song Liu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
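
The layout change described in this commit can be modelled in user space. The sketch below is not kernel code: `struct model_page`, the slot arrays, and all `model_*`/`store_*` names are hypothetical stand-ins for `struct page` and `i_pages`. It assumes the huge page covers a power-of-two number of subpages and starts at a naturally aligned index, so the subpage can be recovered from the head pointer and the lookup index.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_SUBPAGES 8   /* hypothetical size; must be a power of two */
#define MODEL_NR_SLOTS    32  /* hypothetical size of the index -> page table */

/* Stand-in for struct page: only remembers its own page cache index. */
struct model_page {
    size_t index;
};

/* Old layout: slot i holds a pointer to the i-th subpage. */
static void store_as_subpages(struct model_page **slots,
                              struct model_page *head, size_t nr)
{
    for (size_t i = 0; i < nr; i++)
        slots[head->index + i] = head + i;
}

/* New layout: every slot covered by the huge page holds the head page. */
static void store_as_head(struct model_page **slots,
                          struct model_page *head, size_t nr)
{
    for (size_t i = 0; i < nr; i++)
        slots[head->index + i] = head;
}

/* Under the new layout, derive the subpage from the head and the index. */
static struct model_page *model_find_subpage(struct model_page *head,
                                             size_t index, size_t nr)
{
    assert(nr != 0 && (nr & (nr - 1)) == 0); /* power of two assumed */
    return head + (index & (nr - 1));
}

/* Returns 1 if both layouts resolve every index to the same subpage. */
int model_layouts_agree(void)
{
    struct model_page pages[MODEL_NR_SUBPAGES];
    struct model_page *old_slots[MODEL_NR_SLOTS] = { NULL };
    struct model_page *new_slots[MODEL_NR_SLOTS] = { NULL };
    const size_t base = 16; /* aligned to MODEL_NR_SUBPAGES */

    for (size_t i = 0; i < MODEL_NR_SUBPAGES; i++)
        pages[i].index = base + i;

    store_as_subpages(old_slots, pages, MODEL_NR_SUBPAGES);
    store_as_head(new_slots, pages, MODEL_NR_SUBPAGES);

    for (size_t i = base; i < base + MODEL_NR_SUBPAGES; i++) {
        struct model_page *sub =
            model_find_subpage(new_slots[i], i, MODEL_NR_SUBPAGES);
        if (sub != old_slots[i] || sub->index != i)
            return 0;
    }
    return 1;
}
```

Calling `model_layouts_agree()` from a small `main()` is enough to exercise the model. Because every slot now carries the same pointer, a later change could represent the whole huge page with a single entry, which is the "more efficient" storage the commit message is preparing for.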
     

06 Jul, 2019

1 commit

  • This reverts commit 5fd4ca2d84b249f0858ce28cf637cf25b61a398f.

    Mikhail Gavrilov reports that it causes the VM_BUG_ON_PAGE() in
    __delete_from_swap_cache() to trigger:

    page:ffffd6d34dff0000 refcount:1 mapcount:1 mapping:ffff97812323a689 index:0xfecec363
    anon
    flags: 0x17fffe00080034(uptodate|lru|active|swapbacked)
    raw: 0017fffe00080034 ffffd6d34c67c508 ffffd6d3504b8d48 ffff97812323a689
    raw: 00000000fecec363 0000000000000000 0000000100000000 ffff978433ace000
    page dumped because: VM_BUG_ON_PAGE(entry != page)
    page->mem_cgroup:ffff978433ace000
    ------------[ cut here ]------------
    kernel BUG at mm/swap_state.c:170!
    invalid opcode: 0000 [#1] SMP NOPTI
    CPU: 1 PID: 221 Comm: kswapd0 Not tainted 5.2.0-0.rc2.git0.1.fc31.x86_64 #1
    Hardware name: System manufacturer System Product Name/ROG STRIX X470-I GAMING, BIOS 2202 04/11/2019
    RIP: 0010:__delete_from_swap_cache+0x20d/0x240
    Code: 30 65 48 33 04 25 28 00 00 00 75 4a 48 83 c4 38 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 c7 c6 2f dc 0f 8a 48 89 c7 e8 93 1b fd ff 0b 48 c7 c6 a8 74 0f 8a e8 85 1b fd ff 0f 0b 48 c7 c6 a8 7d 0f
    RSP: 0018:ffffa982036e7980 EFLAGS: 00010046
    RAX: 0000000000000021 RBX: 0000000000000040 RCX: 0000000000000006
    RDX: 0000000000000000 RSI: 0000000000000086 RDI: ffff97843d657900
    RBP: 0000000000000001 R08: ffffa982036e7835 R09: 0000000000000535
    R10: ffff97845e21a46c R11: ffffa982036e7835 R12: ffff978426387120
    R13: 0000000000000000 R14: ffffd6d34dff0040 R15: ffffd6d34dff0000
    FS: 0000000000000000(0000) GS:ffff97843d640000(0000) knlGS:0000000000000000
    CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00002cba88ef5000 CR3: 000000078a97c000 CR4: 00000000003406e0
    Call Trace:
    delete_from_swap_cache+0x46/0xa0
    try_to_free_swap+0xbc/0x110
    swap_writepage+0x13/0x70
    pageout.isra.0+0x13c/0x350
    shrink_page_list+0xc14/0xdf0
    shrink_inactive_list+0x1e5/0x3c0
    shrink_node_memcg+0x202/0x760
    shrink_node+0xe0/0x470
    balance_pgdat+0x2d1/0x510
    kswapd+0x220/0x420
    kthread+0xfb/0x130
    ret_from_fork+0x22/0x40

    and it's not immediately obvious why it happens. It's too late in the
    rc cycle to do anything but revert for now.

    Link: https://lore.kernel.org/lkml/CABXGCsN9mYmBD-4GaaeW_NrDu+FDXLzr_6x+XNxfmFV6QkYCDg@mail.gmail.com/
    Reported-and-bisected-by: Mikhail Gavrilov
    Suggested-by: Jan Kara
    Cc: Michal Hocko
    Cc: Vlastimil Babka
    Cc: Matthew Wilcox
    Cc: Kirill Shutemov
    Cc: William Kucharski
    Cc: Andrew Morton
    Signed-off-by: Linus Torvalds

    Linus Torvalds
     

15 May, 2019

1 commit

  • Transparent Huge Pages are currently stored in i_pages as pointers to
    consecutive subpages. This patch changes that to storing consecutive
    pointers to the head page in preparation for storing huge pages more
    efficiently in i_pages.

    Large parts of this are "inspired" by Kirill's patch
    https://lore.kernel.org/lkml/20170126115819.58875-2-kirill.shutemov@linux.intel.com/

    [willy@infradead.org: fix swapcache pages]
    Link: http://lkml.kernel.org/r/20190324155441.GF10344@bombadil.infradead.org
    [kirill@shutemov.name: hugetlb stores pages in page cache differently]
    Link: http://lkml.kernel.org/r/20190404134553.vuvhgmghlkiw2hgl@kshutemo-mobl1
    Link: http://lkml.kernel.org/r/20190307153051.18815-1-willy@infradead.org
    Signed-off-by: Matthew Wilcox
    Acked-by: Jan Kara
    Reviewed-by: Kirill Shutemov
    Reviewed-and-tested-by: Song Liu
    Tested-by: William Kucharski
    Reviewed-by: William Kucharski
    Tested-by: Qian Cai
    Cc: Hugh Dickins
    Cc: Song Liu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     

06 Mar, 2019

1 commit

  • Android uses ashmem for sharing memory regions. We are looking forward
    to migrating all use cases of ashmem to memfd so that we can possibly
    remove the ashmem driver from staging in the future, while also
    benefiting from using memfd and contributing to it. Note that staging
    drivers are not ABI and can generally be removed at any time.

    One of the main use cases Android has is the ability to create a region
    and mmap it as writeable, then add protection against making any
    "future" writes while keeping the existing, already mmap'ed writeable
    region active. This allows us to implement a use case where
    receivers of the shared memory buffer can get a read-only view, while
    the sender continues to write to the buffer. See CursorWindow
    documentation in Android for more details:

    https://developer.android.com/reference/android/database/CursorWindow

    This use case cannot be implemented with the existing F_SEAL_WRITE seal.
    To support it, this patch adds a new F_SEAL_FUTURE_WRITE seal
    which prevents any future mmap and write syscalls from succeeding while
    keeping the existing mmap active.

    A better way to implement the F_SEAL_FUTURE_WRITE seal was discussed [1]
    last week, where we don't need to modify core VFS structures to get the
    same behavior of the seal. This solves several side effects pointed out
    by Andy. Self-tests are provided in a later patch to verify the expected
    semantics.

    [1] https://lore.kernel.org/lkml/20181111173650.GA256781@google.com/

    Thanks a lot to Andy for suggestions to improve code.

    Link: http://lkml.kernel.org/r/20190112203816.85534-2-joel@joelfernandes.org
    Signed-off-by: Joel Fernandes (Google)
    Acked-by: John Stultz
    Cc: Andy Lutomirski
    Cc: Minchan Kim
    Cc: Jann Horn
    Cc: Al Viro
    Cc: Hugh Dickins
    Cc: J. Bruce Fields
    Cc: Jeff Layton
    Cc: Marc-André Lureau
    Cc: Matthew Wilcox
    Cc: Mike Kravetz
    Cc: Shuah Khan
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Joel Fernandes (Google)
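
The semantics described in this commit can be observed from user space. The sketch below is an illustration, not part of the patch: the function name `demo_future_write_seal` and the memfd name are hypothetical, the fallback value for `F_SEAL_FUTURE_WRITE` is supplied only for older libc headers, and the `EPERM` checks reflect the behaviour expected on kernels that include this seal (5.1 and later).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef F_SEAL_FUTURE_WRITE
#define F_SEAL_FUTURE_WRITE 0x0010 /* fallback for older libc headers */
#endif

/*
 * Returns 0 if the observed behaviour matches the commit message,
 * 1 if the running kernel does not support memfd or this seal,
 * and -1 on any other failure.
 */
int demo_future_write_seal(void)
{
    long page_size = sysconf(_SC_PAGESIZE);
    int fd, write_blocked, mmap_blocked, ro_ok, result;
    char *rw, *ro;
    void *rw2;
    size_t len;

    assert(page_size > 0);
    len = (size_t)page_size;

    fd = memfd_create("future-write-demo", MFD_CLOEXEC | MFD_ALLOW_SEALING);
    if (fd < 0)
        return errno == ENOSYS ? 1 : -1;
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }

    /* The "sender" maps the region writeable before sealing. */
    rw = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (rw == MAP_FAILED) {
        close(fd);
        return -1;
    }

    if (fcntl(fd, F_ADD_SEALS, F_SEAL_FUTURE_WRITE) < 0) {
        result = (errno == EINVAL) ? 1 : -1; /* EINVAL: seal unknown */
    } else {
        /* The existing writeable mapping stays active. */
        strcpy(rw, "hello");

        /* New write() calls are expected to fail. */
        write_blocked = write(fd, "x", 1) < 0 && errno == EPERM;

        /* New shared writeable mappings are expected to fail. */
        rw2 = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        mmap_blocked = rw2 == MAP_FAILED && errno == EPERM;
        if (rw2 != MAP_FAILED)
            munmap(rw2, len);

        /* A "receiver" can still get a read-only view of the data. */
        ro = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
        ro_ok = ro != MAP_FAILED && strcmp(ro, "hello") == 0;
        if (ro != MAP_FAILED)
            munmap(ro, len);

        result = (write_blocked && mmap_blocked && ro_ok) ? 0 : -1;
    }

    munmap(rw, len);
    close(fd);
    return result;
}
```

Calling `demo_future_write_seal()` from `main()` and checking for a non-negative return value is enough to try it. In a real sender/receiver setup the file descriptor would be passed to another process after sealing; here both roles run in one process to keep the sketch short.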
     

21 Oct, 2018

2 commits


12 Jul, 2018

1 commit


08 Jun, 2018

1 commit

  • With the addition of memfd hugetlbfs support, we now have the situation
    where memfd depends on TMPFS -or- HUGETLBFS. Previously, memfd was only
    supported on tmpfs, so it made sense that the code resided in shmem.c.
    In the current code, memfd is only functional if TMPFS is defined. If
    HUGETLBFS is defined and TMPFS is not defined, then memfd functionality
    will not be available for hugetlbfs. This does not cause BUGs, just a
    lack of potentially desired functionality.

    Code is restructured in the following way:
    - include/linux/memfd.h is a new file containing memfd specific
    definitions previously contained in shmem_fs.h.
    - mm/memfd.c is a new file containing memfd specific code previously
    contained in shmem.c.
    - memfd specific code is removed from shmem_fs.h and shmem.c.
    - A new config option MEMFD_CREATE is added that is defined if TMPFS
    or HUGETLBFS is defined.

    No functional changes are made to the code: restructuring only.

    Link: http://lkml.kernel.org/r/20180415182119.4517-4-mike.kravetz@oracle.com
    Signed-off-by: Mike Kravetz
    Reviewed-by: Khalid Aziz
    Cc: Andrea Arcangeli
    Cc: David Herrmann
    Cc: Hugh Dickins
    Cc: Marc-André Lureau
    Cc: Matthew Wilcox
    Cc: Michal Hocko
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Kravetz