14 Feb, 2023
14 commits
-
Every caller of restore_reserve_on_error() is now passing in &folio->page,
change the function to take in a folio directly and clean up the call
sites.
Link: https://lkml.kernel.org/r/20230125170537.96973-6-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar
Cc: Gerald Schaefer
Cc: John Hubbard
Cc: Matthew Wilcox
Cc: Mike Kravetz
Cc: Muchun Song
Signed-off-by: Andrew Morton
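A minimal before/after sketch of the call-site cleanup described above, assuming a caller that already holds the hugetlb folio (the full prototype lives in include/linux/hugetlb.h):

/* Before: every caller converted its folio back to a page for this call. */
restore_reserve_on_error(h, vma, address, &folio->page);

/* After: the function takes the folio directly. */
restore_reserve_on_error(h, vma, address, folio);
-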
Change alloc_huge_page() to alloc_hugetlb_folio() by changing all callers
to handle the now folio return type of the function. In this conversion,
alloc_huge_page_vma() is also changed to alloc_hugetlb_folio_vma() and
hugepage_add_new_anon_rmap() is changed to take in a folio directly. Many
additions of '&folio->page' are cleaned up in subsequent patches.
hugetlbfs_fallocate() is also refactored to use the RCU + page_cache_next_miss() API.
Link: https://lkml.kernel.org/r/20230125170537.96973-5-sidhartha.kumar@oracle.com
Suggested-by: Mike Kravetz
Reported-by: kernel test robot
Signed-off-by: Sidhartha Kumar
Cc: Gerald Schaefer
Cc: John Hubbard
Cc: Matthew Wilcox
Cc: Muchun Song
Signed-off-by: Andrew Morton -
Convert putback_active_hugepage() to folio_putback_active_hugetlb(), this
removes one user of the Huge Page macros which take in a page. The
callers in migrate.c are also cleaned up by being able to directly use the
src and dst folio variables.
Link: https://lkml.kernel.org/r/20230125170537.96973-4-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar
Reviewed-by: Mike Kravetz
Cc: Gerald Schaefer
Cc: John Hubbard
Cc: Matthew Wilcox
Cc: Muchun Song
Signed-off-by: Andrew Morton -
Refactor hugetlbfs_pagecache_present() to avoid getting and dropping a
refcount on a page. Use RCU and page_cache_next_miss() instead.
Link: https://lkml.kernel.org/r/20230125170537.96973-3-sidhartha.kumar@oracle.com
Suggested-by: Matthew Wilcox
Signed-off-by: Sidhartha Kumar
Cc: Gerald Schaefer
Cc: John Hubbard
Cc: kernel test robot
Cc: Mike Kravetz
Cc: Muchun Song
Signed-off-by: Andrew Morton
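A minimal sketch of the refactor described above, assuming a helper shaped like hugetlbfs_pagecache_present() (the hstate/vma-to-index translation is elided):

static bool pagecache_present(struct address_space *mapping, pgoff_t index)
{
        bool present;

        /* No folio refcount is taken; RCU keeps the xarray walk safe. */
        rcu_read_lock();
        present = page_cache_next_miss(mapping, index, 1) != index;
        rcu_read_unlock();

        return present;
}
-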
Patch series "convert hugetlb fault functions to folios", v2.
This series converts the hugetlb page faulting functions to operate on
folios. These include hugetlb_no_page(), hugetlb_wp(),
copy_hugetlb_page_range(), and hugetlb_mcopy_atomic_pte().
This patch (of 8):
Change hugetlb_install_page() to hugetlb_install_folio(). This reduces
one user of the Huge Page flag macros which take in a page.
Link: https://lkml.kernel.org/r/20230125170537.96973-1-sidhartha.kumar@oracle.com
Link: https://lkml.kernel.org/r/20230125170537.96973-2-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar
Reviewed-by: Mike Kravetz
Cc: Gerald Schaefer
Cc: John Hubbard
Cc: Matthew Wilcox
Cc: Muchun Song
Signed-off-by: Andrew Morton -
Change demote_free_huge_page() to demote_free_hugetlb_folio() and change demote_pool_huge_page() to pass in a folio.
Link: https://lkml.kernel.org/r/20230113223057.173292-9-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar
Cc: John Hubbard
Cc: Matthew Wilcox
Cc: Mike Kravetz
Cc: Muchun Song
Signed-off-by: Andrew Morton -
Use the hugetlb folio flag macros inside restore_reserve_on_error() and
update the comments to reflect the use of folios.
Link: https://lkml.kernel.org/r/20230113223057.173292-8-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar
Reviewed-by: Mike Kravetz
Cc: John Hubbard
Cc: Matthew Wilcox
Cc: Muchun Song
Signed-off-by: Andrew Morton -
Change alloc_huge_page_nodemask() to alloc_hugetlb_folio_nodemask() and
alloc_migrate_huge_page() to alloc_migrate_hugetlb_folio(). Both
functions now return a folio rather than a page.
Link: https://lkml.kernel.org/r/20230113223057.173292-7-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar
Reviewed-by: Mike Kravetz
Cc: John Hubbard
Cc: Matthew Wilcox
Cc: Muchun Song
Signed-off-by: Andrew Morton -
Change hugetlb_cgroup_commit_charge{,_rsvd}(), dequeue_huge_page_vma() and
alloc_buddy_huge_page_with_mpol() to use folios so alloc_huge_page() is
cleaned up to operate on folios until its return.
Link: https://lkml.kernel.org/r/20230113223057.173292-6-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar
Reviewed-by: Mike Kravetz
Cc: John Hubbard
Cc: Matthew Wilcox
Cc: Muchun Song
Signed-off-by: Andrew Morton -
Change alloc_surplus_huge_page() to alloc_surplus_hugetlb_folio() and
update its callers.
Link: https://lkml.kernel.org/r/20230113223057.173292-5-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar
Reviewed-by: Mike Kravetz
Cc: John Hubbard
Cc: Matthew Wilcox
Cc: Muchun Song
Signed-off-by: Andrew Morton -
dequeue_huge_page_node_exact() is changed to dequeue_hugetlb_folio_node_exact() and dequeue_huge_page_nodemask() is changed to dequeue_hugetlb_folio_nodemask(). Update their callers to pass in a folio.
Link: https://lkml.kernel.org/r/20230113223057.173292-4-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar
Cc: John Hubbard
Cc: Matthew Wilcox
Cc: Mike Kravetz
Cc: Muchun Song
Signed-off-by: Andrew Morton -
Change __update_and_free_page() to __update_and_free_hugetlb_folio() by
changing its callers to pass in a folio.
Link: https://lkml.kernel.org/r/20230113223057.173292-3-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar
Reviewed-by: Mike Kravetz
Cc: John Hubbard
Cc: Matthew Wilcox
Cc: Muchun Song
Signed-off-by: Andrew Morton -
Patch series "continue hugetlb folio conversion", v3.
This series continues the conversion of core hugetlb functions to use
folios. This series converts many helper functions in the hugetlb fault
path. This is in preparation for another series to convert the hugetlb
fault code paths to operate on folios.
This patch (of 8):
Convert isolate_hugetlb() to take in a folio and convert its callers to
pass a folio. Using page_folio() to convert the callers is safe, as isolate_hugetlb() operates on a head page.
Link: https://lkml.kernel.org/r/20230113223057.173292-1-sidhartha.kumar@oracle.com
Link: https://lkml.kernel.org/r/20230113223057.173292-2-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar
Reviewed-by: Mike Kravetz
Cc: John Hubbard
Cc: Matthew Wilcox
Cc: Mike Kravetz
Cc: Muchun Song
Signed-off-by: Andrew Morton
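A minimal sketch of the caller-side conversion, assuming the post-series form where isolate_hugetlb() takes a folio and an isolation list (the return-value convention is ignored here; it has varied across kernel versions):

LIST_HEAD(pagelist);
struct folio *folio = page_folio(page);  /* safe: isolate_hugetlb() only ever saw head pages */

isolate_hugetlb(folio, &pagelist);       /* on success the folio now sits on pagelist */
-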
release_pte_pages() converts from a pfn to a folio by using pfn_folio().
If the pte is not mapped, pfn_folio() will result in undefined behavior
which ends up causing a kernel panic [1].
To fix the issue, only call pfn_folio() once we have validated that the pte is both valid and mapped.
[1] https://lore.kernel.org/linux-mm/ff300770-afe9-908d-23ed-d23e0796e899@samsung.com/
Link: https://lkml.kernel.org/r/20230213214324.34215-1-vishal.moola@gmail.com
Signed-off-by: Vishal Moola (Oracle)
Fixes: 9bdfeea46f49 ("mm/khugepaged: convert release_pte_pages() to use folios")
Reported-by: Marek Szyprowski
Tested-by: Marek Szyprowski
Debugged-by: Alexandre Ghiti
Cc: Matthew Wilcox
Signed-off-by: Andrew Morton
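A minimal sketch of the ordering the fix enforces inside the release loop, assuming the checks the message describes (khugepaged's real loop carries more state and also skips the zero pfn):

pte_t pteval = *_pte;

if (pte_none(pteval))                   /* not mapped: nothing was isolated */
        continue;
if (is_zero_pfn(pte_pfn(pteval)))       /* zero page: never isolated either */
        continue;

/* Only now is the pte known to map a real page, so pfn_folio() is well defined. */
folio = pfn_folio(pte_pfn(pteval));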
11 Feb, 2023
1 commit
-
To pick up depended-upon changes
10 Feb, 2023
25 commits
-
commit a4574f63edc6 ("mm/memremap_pages: convert to 'struct range'")
converted 'res' to 'range'; update the comment correspondingly.
Link: https://lkml.kernel.org/r/1675751220-2-1-git-send-email-lizhijian@fujitsu.com
Signed-off-by: Li Zhijian
Cc: Dan Williams
Signed-off-by: Andrew Morton -
Since commit ee6d3dd4ed48 ("driver core: make kobj_type constant.") the
driver core allows the usage of const struct kobj_type.
Take advantage of this to constify the structure definitions to prevent modification at runtime.
Link: https://lkml.kernel.org/r/20230207-kobj_type-damon-v1-1-9d4fea6a465b@weissschuh.net
Signed-off-by: Thomas Weißschuh
Reviewed-by: SeongJae Park
Signed-off-by: Andrew Morton -
Move the flags that should not/are not used outside gup.c and related into
mm/internal.h to discourage driver abuse.
To make this more maintainable going forward, compact the two FOLL ranges with new bit numbers from 0 to 11 and 16 to 21, using shifts so it is explicit.
Switch to an enum so the whole thing is easier to read.
Link: https://lkml.kernel.org/r/13-v2-987e91b59705+36b-gup_tidy_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe
Reviewed-by: John Hubbard
Acked-by: David Hildenbrand
Cc: David Howells
Cc: Christoph Hellwig
Cc: Claudio Imbrenda
Cc: Alistair Popple
Cc: Mike Rapoport (IBM)
Signed-off-by: Andrew Morton -
This function is only used in gup.c and closely related. It touches
FOLL_PIN so it must be moved before the next patch.
Link: https://lkml.kernel.org/r/12-v2-987e91b59705+36b-gup_tidy_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe
Reviewed-by: John Hubbard
Reviewed-by: David Hildenbrand
Cc: Alistair Popple
Cc: Christoph Hellwig
Cc: Claudio Imbrenda
Cc: David Howells
Cc: Mike Rapoport (IBM)
Signed-off-by: Andrew Morton -
There are only two callers, and both can handle the common return code:
- get_user_page_fast_only() checks == 1
- gfn_to_page_many_atomic() already returns -1, and the only caller checks for negative return values
Remove the restriction against returning negative values.
Link: https://lkml.kernel.org/r/11-v2-987e91b59705+36b-gup_tidy_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe
Acked-by: Mike Rapoport (IBM)
Reviewed-by: John Hubbard
Reviewed-by: David Hildenbrand
Cc: Alistair Popple
Cc: Christoph Hellwig
Cc: Claudio Imbrenda
Cc: David Howells
Signed-off-by: Andrew Morton -
Commit ed29c2691188 ("drm/i915: Fix userptr so we do not have to worry
about obj->mm.lock, v7.") removed the only caller; remove this dead code too.
Link: https://lkml.kernel.org/r/10-v2-987e91b59705+36b-gup_tidy_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe
Acked-by: Mike Rapoport (IBM)
Reviewed-by: John Hubbard
Reviewed-by: David Hildenbrand
Cc: Alistair Popple
Cc: Christoph Hellwig
Cc: Claudio Imbrenda
Cc: David Howells
Signed-off-by: Andrew Morton -
Now that NULL locked doesn't have a special meaning, we can just make it non-NULL in all cases and remove the special tests.
get_user_pages() and pin_user_pages() can safely pass in locked = 1; get_user_pages_remote() and pin_user_pages_remote() can swap in a local variable for locked if NULL is passed.
Remove all the NULL checks.
Link: https://lkml.kernel.org/r/9-v2-987e91b59705+36b-gup_tidy_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe
Acked-by: Mike Rapoport (IBM)
Reviewed-by: John Hubbard
Cc: Alistair Popple
Cc: Christoph Hellwig
Cc: Claudio Imbrenda
Cc: David Hildenbrand
Cc: David Howells
Signed-off-by: Andrew Morton -
Setting FOLL_UNLOCKABLE allows GUP to lock/unlock the mmap lock on its
own. It is a more explicit replacement for locked != NULL. This clears
the way for passing in locked = 1, without intending that the lock can be
unlocked.
Set the flag in all cases where it is used, e.g. locked is present in the external interface or locked is used internally with locked = 0.
Link: https://lkml.kernel.org/r/8-v2-987e91b59705+36b-gup_tidy_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe
Acked-by: Mike Rapoport (IBM)
Reviewed-by: John Hubbard
Cc: Alistair Popple
Cc: Christoph Hellwig
Cc: Claudio Imbrenda
Cc: David Hildenbrand
Cc: David Howells
Signed-off-by: Andrew Morton -
The only caller of this function always passes in a non-NULL locked, so
just remove this obsolete comment.
Link: https://lkml.kernel.org/r/7-v2-987e91b59705+36b-gup_tidy_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe
Reviewed-by: John Hubbard
Cc: Alistair Popple
Cc: Christoph Hellwig
Cc: Claudio Imbrenda
Cc: David Hildenbrand
Cc: David Howells
Cc: Mike Rapoport (IBM)
Signed-off-by: Andrew Morton -
Since commit 5b78ed24e8ec ("mm/pagemap: add mmap_assert_locked()
annotations to find_vma*()") we already have this assertion; it is just buried in find_vma():
  __get_user_pages_locked()
    __get_user_pages()
      find_extend_vma()
        find_vma()
Also check it at the top of __get_user_pages_locked() as a form of documentation.
Link: https://lkml.kernel.org/r/6-v2-987e91b59705+36b-gup_tidy_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe
Reviewed-by: John Hubbard
Cc: Alistair Popple
Cc: Christoph Hellwig
Cc: Claudio Imbrenda
Cc: David Hildenbrand
Cc: David Howells
Cc: Mike Rapoport (IBM)
Signed-off-by: Andrew Morton
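A minimal sketch of the documentation-style check being added, with the rest of __get_user_pages_locked() elided and its argument list abbreviated (the assertion only applies on the path where the caller claims to already hold mmap_lock):

static long __get_user_pages_locked(struct mm_struct *mm, unsigned long start,
                                    unsigned long nr_pages, struct page **pages,
                                    int *locked, unsigned int gup_flags)
{
        if (*locked)
                mmap_assert_locked(mm); /* find_vma() deeper down asserts this too */

        /* ... existing GUP body, which eventually reaches find_vma() ... */
        return 0;
}
-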
The GUP family of functions have a complex, but fairly well defined, set
of invariants for their arguments. Currently these are sprinkled about,
sometimes in duplicate through many functions.
Internally we don't follow all the invariants that the external interface has to follow, so place these checks directly at the exported interface. This ensures the internal functions never reach a violated invariant.
Remove the duplicated invariant checks.
The end result is to make these functions fully internal:
__get_user_pages_locked()
internal_get_user_pages_fast()
__gup_longterm_locked()
And all the other functions call directly into one of these.
Link: https://lkml.kernel.org/r/5-v2-987e91b59705+36b-gup_tidy_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe
Suggested-by: John Hubbard
Reviewed-by: John Hubbard
Acked-by: Mike Rapoport (IBM)
Cc: Alistair Popple
Cc: Christoph Hellwig
Cc: Claudio Imbrenda
Cc: David Hildenbrand
Cc: David Howells
Signed-off-by: Andrew Morton -
This is part of the internal function of gup.c and is only non-static so
that the GUP-related parts of huge_memory.c and hugetlb.c can call it.
Put it in internal.h beside the similarly purposed try_grab_folio().
Link: https://lkml.kernel.org/r/4-v2-987e91b59705+36b-gup_tidy_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe
Reviewed-by: John Hubbard
Cc: Alistair Popple
Cc: Christoph Hellwig
Cc: Claudio Imbrenda
Cc: David Hildenbrand
Cc: David Howells
Cc: Mike Rapoport (IBM)
Signed-off-by: Andrew Morton -
get_user_pages_remote(), get_user_pages_unlocked() and get_user_pages()
are never called with FOLL_LONGTERM, so directly call
__get_user_pages_locked().
The next patch will add an assertion for this.
Link: https://lkml.kernel.org/r/3-v2-987e91b59705+36b-gup_tidy_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe
Suggested-by: John Hubbard
Reviewed-by: John Hubbard
Acked-by: Mike Rapoport (IBM)
Cc: Alistair Popple
Cc: Christoph Hellwig
Cc: Claudio Imbrenda
Cc: David Hildenbrand
Cc: David Howells
Signed-off-by: Andrew Morton -
These days FOLL_LONGTERM is not allowed at all on any get_user_pages*()
functions; it must only be used with pin_user_pages*(), plus it now has universal support for all the pin_user_pages*() functions.
Link: https://lkml.kernel.org/r/2-v2-987e91b59705+36b-gup_tidy_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe
Reviewed-by: John Hubbard
Cc: Alistair Popple
Cc: Christoph Hellwig
Cc: Claudio Imbrenda
Cc: David Hildenbrand
Cc: David Howells
Cc: Mike Rapoport (IBM)
Signed-off-by: Andrew Morton -
Patch series "Simplify the external interface for GUP", v2.
It is quite a maze of EXPORTED symbols leading up to the three actual
worker functions of GUP. Simplify this by reorganizing some of the code so
the EXPORTED symbols directly call the correct internal function with
validated and consistent arguments.
Consolidate all the assertions into one place at the top of the call chains.
Remove some dead code.
Move more things into the mm/internal.h header.
This patch (of 13):
__get_user_pages_locked() and __gup_longterm_locked() both require the
mmap lock to be held. They have a slightly unusual locked parameter that
is used to allow these functions to unlock and relock the mmap lock and
convey that fact to the caller.
Several places wrap these functions with a simple mmap_read_lock() just so they can follow the optimized locked protocol.
Consolidate this internally to the functions. Allow internal callers to set locked = 0 to cause the functions to acquire and release the lock on their own.
Reorganize __gup_longterm_locked() to use the autolocking in __get_user_pages_locked().
Replace all the places obtaining the mmap_read_lock() just to call __get_user_pages_locked() with the new mechanism. Replace all the internal callers of get_user_pages_unlocked() with direct calls to __gup_longterm_locked() using the new mechanism.
A following patch will add assertions ensuring the external interface continues to always pass in locked = 1.
Link: https://lkml.kernel.org/r/0-v2-987e91b59705+36b-gup_tidy_jgg@nvidia.com
Link: https://lkml.kernel.org/r/1-v2-987e91b59705+36b-gup_tidy_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe
Acked-by: Mike Rapoport (IBM)
Reviewed-by: John Hubbard
Cc: Alistair Popple
Cc: Christoph Hellwig
Cc: David Hildenbrand
Cc: David Howells
Cc: Claudio Imbrenda
Signed-off-by: Andrew Morton
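A minimal sketch of the autolocking convention described above, assuming an internal caller that does not already hold mmap_lock (the argument list of __get_user_pages_locked() is abbreviated):

int locked = 0;  /* 0 asks the helper to take and drop mmap_lock itself */
long nr;

nr = __get_user_pages_locked(mm, start, nr_pages, pages, &locked, gup_flags);
/* On return the lock has already been released; no mmap_read_unlock() here. */
-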
Currently, for vmalloc areas with flag VM_IOREMAP set, apart from the specific alignment clamping in __get_vm_area_node(), they will be:
1) Shown as ioremap in /proc/vmallocinfo;
2) Ignored by /proc/kcore reading via vread()
So for the ioremap in __sq_remap() of sh, we should set VM_IOREMAP in flags to make it handled correctly as above.
Link: https://lkml.kernel.org/r/20230206084020.174506-8-bhe@redhat.com
Signed-off-by: Baoquan He
Reviewed-by: Lorenzo Stoakes
Reviewed-by: Uladzislau Rezki (Sony)
Cc: Dan Carpenter
Cc: Stephen Brennan
Signed-off-by: Andrew Morton -
Currently, for vmalloc areas with flag VM_IOREMAP set, apart from the specific alignment clamping in __get_vm_area_node(), they will be:
1) Shown as ioremap in /proc/vmallocinfo;
2) Ignored by /proc/kcore reading via vread()
So for the io mapping in ioremap_phb() of ppc, we should set VM_IOREMAP in flags to make it handled correctly as above.
Link: https://lkml.kernel.org/r/20230206084020.174506-7-bhe@redhat.com
Signed-off-by: Baoquan He
Reviewed-by: Lorenzo Stoakes
Reviewed-by: Uladzislau Rezki (Sony)
Cc: Dan Carpenter
Cc: Stephen Brennan
Signed-off-by: Andrew Morton -
For areas allocated via the vmalloc_xxx() APIs, an unmapped area is searched for and reserved, and new pages are allocated and mapped into it; please see __vmalloc_node_range(). During the process, flag VM_UNINITIALIZED is set in vm->flags to indicate that the page allocation and mapping haven't been done, until clear_vm_uninitialized_flag() is called to clear VM_UNINITIALIZED.
For this kind of area, if VM_UNINITIALIZED is still set, ignore it in vread() because the pages newly allocated and being mapped in that area only contain zero data; reading them out via aligned_vread() is a waste of time.
Link: https://lkml.kernel.org/r/20230206084020.174506-6-bhe@redhat.com
Signed-off-by: Baoquan He
Reviewed-by: Lorenzo Stoakes
Reviewed-by: Uladzislau Rezki (Sony)
Cc: Dan Carpenter
Cc: Stephen Brennan
Signed-off-by: Andrew Morton
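A minimal sketch of the skip described above, as it might sit inside vread()'s walk over the vmap areas (the surrounding loop and copy logic are elided):

/* vm can be NULL for vm_map_ram areas; those are handled elsewhere. */
if (vm && (vm->flags & VM_UNINITIALIZED))
        continue;       /* pages not fully set up yet: they would only read as zeros */
-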
Now, by marking VMAP_RAM in vmap_area->flags for vm_map_ram areas, we can clearly differentiate them from other vmalloc areas. So identify vm_map_ram areas by checking VMAP_RAM in vmap_area->flags when they are shown in /proc/vmallocinfo.
Meanwhile, the code comment above the vm_map_ram area check in s_show() is no longer needed; remove it here.
Link: https://lkml.kernel.org/r/20230206084020.174506-5-bhe@redhat.com
Signed-off-by: Baoquan He
Reviewed-by: Lorenzo Stoakes
Cc: Dan Carpenter
Cc: Stephen Brennan
Cc: Uladzislau Rezki (Sony)
Signed-off-by: Andrew Morton -
Currently, vread() can read out vmalloc areas which are associated with a vm_struct. However, this doesn't work for areas created via the vm_map_ram() interface because they don't have an associated vm_struct, so in vread() these areas are all skipped.
Here, add a new function vmap_ram_vread() to read out vm_map_ram areas. An area created directly with the vm_map_ram() interface can be handled like the other normal vmap areas with aligned_vread(), while areas which are further subdivided and managed with vmap_block need to be carefully read out as page-aligned small regions with the holes zero-filled.
Link: https://lkml.kernel.org/r/20230206084020.174506-4-bhe@redhat.com
Reported-by: Stephen Brennan
Signed-off-by: Baoquan He
Reviewed-by: Lorenzo Stoakes
Tested-by: Stephen Brennan
Cc: Dan Carpenter
Cc: Uladzislau Rezki (Sony)
Signed-off-by: Andrew Morton -
Through vmalloc API, a virtual kernel area is reserved for physical
address mapping. And vmap_area is used to track them, while vm_struct is
allocated to associate with the vmap_area to store more information and
passed out.
However, an area reserved via vm_map_ram() is an exception. It doesn't have a vm_struct to associate with the vmap_area. And we can't recognize such a vmap_area by '->vm == NULL' because the normal freeing path will set va->vm = NULL before unmapping; please see remove_vm_area().
Meanwhile, there are two kinds of handling for vm_map_ram areas. One is the whole vmap_area being reserved and mapped at one time through the vm_map_ram() interface; the other is the whole vmap_area with VMAP_BLOCK_SIZE size being reserved, while mapped into split regions with smaller size via vb_alloc().
To mark the area reserved through vm_map_ram(), add a flags field into struct vmap_area. Bit 0 indicates this is a vm_map_ram area created through the vm_map_ram() interface, while bit 1 marks out the type of vm_map_ram area which makes use of vmap_block to manage split regions via vb_alloc/free().
This is a preparation for later use.
Link: https://lkml.kernel.org/r/20230206084020.174506-3-bhe@redhat.com
Signed-off-by: Baoquan He
Reviewed-by: Lorenzo Stoakes
Reviewed-by: Uladzislau Rezki (Sony)
Cc: Dan Carpenter
Cc: Stephen Brennan
Signed-off-by: Andrew Morton
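A minimal sketch of the two flag bits described above, as they might be defined in mm/vmalloc.c (the names follow this series; treat the exact values as illustrative of bit 0 and bit 1):

#define VMAP_RAM        0x1     /* bit 0: area created through vm_map_ram() */
#define VMAP_BLOCK      0x2     /* bit 1: additionally managed through vmap_block */
#define VMAP_FLAGS_MASK 0x3
-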
Patch series "mm/vmalloc.c: allow vread() to read out vm_map_ram areas", v5.
Problem:
***
Stephen reported vread() will skip vm_map_ram areas when reading out /proc/kcore with the drgn utility. Please see the link below for more details.
  /proc/kcore reads 0's for vmap_block
  https://lore.kernel.org/all/87ilk6gos2.fsf@oracle.com/T/#u
Root cause:
***
The normal vmalloc API uses struct vmap_area to manage the virtual kernel area allocated, and associates a vm_struct to store more information and pass it out. However, an area reserved through the vm_map_ram() interface doesn't allocate a vm_struct to associate with. So the current code in vread() will skip the vm_map_ram area through the 'if (!va->vm)' conditional check.
Solution:
***
To mark the area reserved through the vm_map_ram() interface, add a field 'flags' into struct vmap_area. Bit 0 indicates this is a vm_map_ram area created through the vm_map_ram() interface, bit 1 marks out the type of vm_map_ram area which makes use of vmap_block to manage split regions via vb_alloc/free().
Also add a bitmap field 'used_map' into struct vmap_block to mark those further subdivided regions being used, to differentiate them from dirty and free regions in vmap_block.
With the help of the above vmap_area->flags and vmap_block->used_map, we can recognize and handle vm_map_ram areas successfully. All these are done in patches 1~3.
Meanwhile, do some improvement on areas related to vm_map_ram areas in patches 4 and 5. And also change the area flag from VM_ALLOC to VM_IOREMAP in patches 6 and 7, because this will show them as 'ioremap' in /proc/vmallocinfo and exclude them from /proc/kcore.
This patch (of 7):
In one vmap_block area, there can be three types of regions: regions being used, which are allocated through vb_alloc(); dirty regions, which were freed via vb_free(); and free regions. Among them, only used regions hold available data, yet there is currently no way to track those used regions.
Here, add a bitmap field used_map into vmap_block, and set/clear it when allocating or freeing regions of the vmap_block area.
This is a preparation for later use.
Link: https://lkml.kernel.org/r/20230206084020.174506-1-bhe@redhat.com
Link: https://lkml.kernel.org/r/20230206084020.174506-2-bhe@redhat.com
Signed-off-by: Baoquan He
Reviewed-by: Lorenzo Stoakes
Reviewed-by: Uladzislau Rezki (Sony)
Cc: Dan Carpenter
Cc: Stephen Brennan
Cc: Uladzislau Rezki (Sony)
Signed-off-by: Andrew Morton
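A minimal sketch of the bitmap added to struct vmap_block, with the existing members abbreviated (member layout here is an assumption; VMAP_BBMAP_BITS is the per-block bit count already used in mm/vmalloc.c):

struct vmap_block {
        spinlock_t lock;
        struct vmap_area *va;
        unsigned long free, dirty;
        DECLARE_BITMAP(used_map, VMAP_BBMAP_BITS); /* set in vb_alloc(), cleared in vb_free() */
        /* ... remaining members unchanged ... */
};
-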
With W=1 and CONFIG_SHMEM=n, shmem.c functions have no prototypes so the
compiler emits warnings.
Link: https://lkml.kernel.org/r/20230206190850.4054983-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle)
Cc: Mark Hemment
Cc: Charan Teja Kalla
Cc: David Rientjes
Cc: Hugh Dickins
Cc: Michal Hocko
Cc: Pavankumar Kondeti
Cc: Shakeel Butt
Cc: Suren Baghdasaryan
Cc: Vlastimil Babka
Signed-off-by: Andrew Morton -
These are the folio replacements for shmem_read_mapping_page() and
shmem_read_mapping_page_gfp().
[akpm@linux-foundation.org: fix shmem_read_mapping_page_gfp(), per Matthew]
Link: https://lkml.kernel.org/r/Y+QdJTuzxeBYejw2@casper.infradead.org
Link: https://lkml.kernel.org/r/20230206162520.4029022-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle)
Cc: Mark Hemment
Cc: Charan Teja Kalla
Cc: David Rientjes
Cc: Hugh Dickins
Cc: Michal Hocko
Cc: Pavankumar Kondeti
Cc: Shakeel Butt
Cc: Suren Baghdasaryan
Cc: Vlastimil Babka
Signed-off-by: Andrew Morton
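A minimal usage sketch, assuming the replacements are named shmem_read_folio() and shmem_read_folio_gfp() (the names are inferred from the description; check include/linux/shmem_fs.h in your tree):

struct folio *folio;

folio = shmem_read_folio(mapping, index);   /* folio counterpart of shmem_read_mapping_page() */
if (IS_ERR(folio))
        return PTR_ERR(folio);

/* ... use the folio ... */
folio_put(folio);
-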
This is like read_cache_page_gfp() except it returns the folio instead
of the precise page.
Link: https://lkml.kernel.org/r/20230206162520.4029022-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle)
Cc: Charan Teja Kalla
Cc: David Rientjes
Cc: Hugh Dickins
Cc: Mark Hemment
Cc: Michal Hocko
Cc: Pavankumar Kondeti
Cc: Shakeel Butt
Cc: Suren Baghdasaryan
Cc: Vlastimil Babka
Signed-off-by: Andrew Morton
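A minimal usage sketch, assuming the new helper is named mapping_read_folio_gfp() (the name is inferred from the description; it parallels read_cache_page_gfp() but hands back the containing folio):

struct folio *folio;

folio = mapping_read_folio_gfp(mapping, index, GFP_KERNEL);
if (IS_ERR(folio))
        return PTR_ERR(folio);

/* The folio may span more than one page; index lies somewhere inside it. */
folio_put(folio);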