06 Nov, 2020
1 commit
-
…rnel.org/pub/scm/linux/kernel/git/acme/linux") into android-mainline
Steps on the way to 5.10-rc3
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Ia09418a96a25f6c602af953db5d3258e032c0f30
03 Nov, 2020
1 commit
-
When flags in queue_pages_pte_range don't have MPOL_MF_MOVE or
MPOL_MF_MOVE_ALL bits, code breaks and passing origin pte - 1 to
pte_unmap_unlock seems like not a good idea.queue_pages_pte_range can run in MPOL_MF_MOVE_ALL mode which doesn't
migrate misplaced pages but returns with EIO when encountering such a
page. Since commit a7f40cfe3b7a ("mm: mempolicy: make mbind() return
-EIO when MPOL_MF_STRICT is specified") and early break on the first pte
in the range results in pte_unmap_unlock on an underflow pte. This can
lead to lockups later on when somebody tries to lock the pte resp.
page_table_lock again..Fixes: a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified")
Signed-off-by: Shijie Luo
Signed-off-by: Miaohe Lin
Signed-off-by: Andrew Morton
Reviewed-by: Oscar Salvador
Acked-by: Michal Hocko
Cc: Miaohe Lin
Cc: Feilong Lin
Cc: Shijie Luo
Cc:
Link: https://lkml.kernel.org/r/20201019074853.50856-1-luoshijie1@huawei.com
Signed-off-by: Linus Torvalds
25 Oct, 2020
1 commit
-
steps on the way to 5.10-rc1
Change-Id: Iddc84c25b6a9d71fa8542b927d6f69c364131c3d
Signed-off-by: Greg Kroah-Hartman
14 Oct, 2020
2 commits
-
No one use this macro anymore.
Also fix code style of policy_node().
Signed-off-by: Wei Yang
Signed-off-by: Andrew Morton
Reviewed-by: Andrew Morton
Link: https://lkml.kernel.org/r/20200921021401.84508-1-richard.weiyang@linux.alibaba.com
Signed-off-by: Linus Torvalds -
It is not necessary to hold the lock of current when setting nodemask of
a new policy.Signed-off-by: Wei Yang
Signed-off-by: Andrew Morton
Reviewed-by: Andrew Morton
Link: https://lkml.kernel.org/r/20200921040416.86185-1-richard.weiyang@linux.alibaba.com
Signed-off-by: Linus Torvalds
17 Aug, 2020
1 commit
-
…ux-block") into android-mainline
Steps on the way to 5.9-rc1
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I904678b5c31139b25fb49ee67cbacb4721a3c7bc
15 Aug, 2020
1 commit
-
The thp prefix is more frequently used than hpage and we should be
consistent between the various functions.[akpm@linux-foundation.org: fix mm/migrate.c]
Signed-off-by: Matthew Wilcox (Oracle)
Signed-off-by: Andrew Morton
Reviewed-by: William Kucharski
Reviewed-by: Zi Yan
Cc: Mike Kravetz
Cc: David Hildenbrand
Cc: "Kirill A. Shutemov"
Link: http://lkml.kernel.org/r/20200629151959.15779-6-willy@infradead.org
Signed-off-by: Linus Torvalds
13 Aug, 2020
6 commits
-
…ernel/git/abelloni/linux") into android-mainline
Steps on the way to 5.9-rc1.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: Iceded779988ff472863b7e1c54e22a9fa6383a30 -
There is a well-defined migration target allocation callback. Use it.
Signed-off-by: Joonsoo Kim
Signed-off-by: Andrew Morton
Acked-by: Michal Hocko
Acked-by: Vlastimil Babka
Cc: Christoph Hellwig
Cc: Mike Kravetz
Cc: Naoya Horiguchi
Cc: Roman Gushchin
Link: http://lkml.kernel.org/r/1594622517-20681-7-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Linus Torvalds -
There is no difference between two migration callback functions,
alloc_huge_page_node() and alloc_huge_page_nodemask(), except
__GFP_THISNODE handling. It's redundant to have two almost similar
functions in order to handle this flag. So, this patch tries to remove
one by introducing a new argument, gfp_mask, to
alloc_huge_page_nodemask().After introducing gfp_mask argument, it's caller's job to provide correct
gfp_mask. So, every callsites for alloc_huge_page_nodemask() are changed
to provide gfp_mask.Note that it's safe to remove a node id check in alloc_huge_page_node()
since there is no caller passing NUMA_NO_NODE as a node id.Signed-off-by: Joonsoo Kim
Signed-off-by: Andrew Morton
Reviewed-by: Mike Kravetz
Reviewed-by: Vlastimil Babka
Acked-by: Michal Hocko
Cc: Christoph Hellwig
Cc: Naoya Horiguchi
Cc: Roman Gushchin
Link: http://lkml.kernel.org/r/1594622517-20681-4-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Linus Torvalds -
Previous implementatoin calls untagged_addr() before error check, while if
the error check failed and return EINVAL, the untagged_addr() call is just
useless work.Signed-off-by: Wenchao Hao
Signed-off-by: Andrew Morton
Reviewed-by: Andrew Morton
Link: http://lkml.kernel.org/r/20200801090825.5597-1-haowenchao22@gmail.com
Signed-off-by: Linus Torvalds -
Fix W=1 compile warnings (invalid kerneldoc):
mm/mempolicy.c:137: warning: Function parameter or member 'node' not described in 'numa_map_to_online_node'
mm/mempolicy.c:137: warning: Excess function parameter 'nid' description in 'numa_map_to_online_node'Signed-off-by: Krzysztof Kozlowski
Signed-off-by: Andrew Morton
Reviewed-by: Andrew Morton
Link: http://lkml.kernel.org/r/20200728171109.28687-3-krzk@kernel.org
Signed-off-by: Linus Torvalds -
In the reservation routine, we only check whether the cpuset meets the
memory allocation requirements. But we ignore the mempolicy of MPOL_BIND
case. If someone mmap hugetlb succeeds, but the subsequent memory
allocation may fail due to mempolicy restrictions and receives the SIGBUS
signal. This can be reproduced by the follow steps.1) Compile the test case.
cd tools/testing/selftests/vm/
gcc map_hugetlb.c -o map_hugetlb2) Pre-allocate huge pages. Suppose there are 2 numa nodes in the
system. Each node will pre-allocate one huge page.
echo 2 > /proc/sys/vm/nr_hugepages3) Run test case(mmap 4MB). We receive the SIGBUS signal.
numactl --membind=3D0 ./map_hugetlb 4With this patch applied, the mmap will fail in the step 3) and throw
"mmap: Cannot allocate memory".[akpm@linux-foundation.org: include sched.h for `current']
Reported-by: Jianchao Guo
Suggested-by: Michal Hocko
Signed-off-by: Muchun Song
Signed-off-by: Andrew Morton
Reviewed-by: Mike Kravetz
Cc: David Rientjes
Cc: Mel Gorman
Cc: Michel Lespinasse
Cc: Baoquan He
Link: http://lkml.kernel.org/r/20200728034938.14993-1-songmuchun@bytedance.com
Signed-off-by: Linus Torvalds
07 Aug, 2020
1 commit
-
…into android-mainline
Steps on the way to 5.9-rc1
Resolves conflicts in:
drivers/irqchip/qcom-pdc.c
include/linux/device.h
net/xfrm/xfrm_state.c
security/lsm_audit.cSigned-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I4aeb3d04f4717714a421721eb3ce690c099bb30a
17 Jul, 2020
1 commit
-
Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings
(e.g. "unused variable"). If the compiler thinks it is uninitialized,
either simply initialize the variable or make compiler changes.In preparation for removing[2] the[3] macro[4], remove all remaining
needless uses with the following script:git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \
xargs perl -pi -e \
's/\buninitialized_var\(([^\)]+)\)/\1/g;
s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid
pathological white-space.No outstanding warnings were found building allmodconfig with GCC 9.3.0
for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64,
alpha, and m68k.[1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/
[2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/
[3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/
[4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/Reviewed-by: Leon Romanovsky # drivers/infiniband and mlx4/mlx5
Acked-by: Jason Gunthorpe # IB
Acked-by: Kalle Valo # wireless drivers
Reviewed-by: Chao Yu # erofs
Signed-off-by: Kees Cook
24 Jun, 2020
1 commit
-
…cm/linux/kernel/git/linkinjeon/exfat") into android-mainline
Steps on the way to 5.8-rc1.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I4bc42f572167ea2f815688b4d1eb6124b6d260d4
22 Jun, 2020
1 commit
-
Steps along the way to 5.8-rc1.
Signed-off-by: Greg Kroah-Hartman
Change-Id: I6cca4fa48322228c8182201d68dc05f9b72cfc50
10 Jun, 2020
3 commits
-
Convert comments that reference mmap_sem to reference mmap_lock instead.
[akpm@linux-foundation.org: fix up linux-next leftovers]
[akpm@linux-foundation.org: s/lockaphore/lock/, per Vlastimil]
[akpm@linux-foundation.org: more linux-next fixups, per Michel]Signed-off-by: Michel Lespinasse
Signed-off-by: Andrew Morton
Reviewed-by: Vlastimil Babka
Reviewed-by: Daniel Jordan
Cc: Davidlohr Bueso
Cc: David Rientjes
Cc: Hugh Dickins
Cc: Jason Gunthorpe
Cc: Jerome Glisse
Cc: John Hubbard
Cc: Laurent Dufour
Cc: Liam Howlett
Cc: Matthew Wilcox
Cc: Peter Zijlstra
Cc: Ying Han
Link: http://lkml.kernel.org/r/20200520052908.204642-13-walken@google.com
Signed-off-by: Linus Torvalds -
Convert comments that reference old mmap_sem APIs to reference
corresponding new mmap locking APIs instead.Signed-off-by: Michel Lespinasse
Signed-off-by: Andrew Morton
Reviewed-by: Vlastimil Babka
Reviewed-by: Davidlohr Bueso
Reviewed-by: Daniel Jordan
Cc: David Rientjes
Cc: Hugh Dickins
Cc: Jason Gunthorpe
Cc: Jerome Glisse
Cc: John Hubbard
Cc: Laurent Dufour
Cc: Liam Howlett
Cc: Matthew Wilcox
Cc: Peter Zijlstra
Cc: Ying Han
Link: http://lkml.kernel.org/r/20200520052908.204642-12-walken@google.com
Signed-off-by: Linus Torvalds -
This change converts the existing mmap_sem rwsem calls to use the new mmap
locking API instead.The change is generated using coccinelle with the following rule:
// spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .
@@
expression mm;
@@
(
-init_rwsem
+mmap_init_lock
|
-down_write
+mmap_write_lock
|
-down_write_killable
+mmap_write_lock_killable
|
-down_write_trylock
+mmap_write_trylock
|
-up_write
+mmap_write_unlock
|
-downgrade_write
+mmap_write_downgrade
|
-down_read
+mmap_read_lock
|
-down_read_killable
+mmap_read_lock_killable
|
-down_read_trylock
+mmap_read_trylock
|
-up_read
+mmap_read_unlock
)
-(&mm->mmap_sem)
+(mm)Signed-off-by: Michel Lespinasse
Signed-off-by: Andrew Morton
Reviewed-by: Daniel Jordan
Reviewed-by: Laurent Dufour
Reviewed-by: Vlastimil Babka
Cc: Davidlohr Bueso
Cc: David Rientjes
Cc: Hugh Dickins
Cc: Jason Gunthorpe
Cc: Jerome Glisse
Cc: John Hubbard
Cc: Liam Howlett
Cc: Matthew Wilcox
Cc: Peter Zijlstra
Cc: Ying Han
Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com
Signed-off-by: Linus Torvalds
04 Jun, 2020
1 commit
-
ba841078cd05 ("mm/mempolicy: Allow lookup_node() to handle fatal signal")
has added a special casing for 0 return value because that was a possible
gup return value when interrupted by fatal signal. This has been fixed by
ae46d2aa6a7f ("mm/gup: Let __get_user_pages_locked() return -EINTR for
fatal signal") in the mean time so ba841078cd05 can be reverted.This patch however doesn't go all the way to revert it because the check
for 0 is wrong and confusing here. Firstly it is inherently unsafe to
access the page when get_user_pages_locked returns 0 (aka no page
returned).Fortunatelly this will not happen because get_user_pages_locked will not
return 0 when nr_pages > 0 unless FOLL_NOWAIT is specified which is not
the case here. Document this potential error code in gup code while we
are at it.Signed-off-by: Michal Hocko
Signed-off-by: Andrew Morton
Cc: Peter Xu
Link: http://lkml.kernel.org/r/20200421071026.18394-1-mhocko@kernel.org
Signed-off-by: Linus Torvalds
10 Apr, 2020
1 commit
-
…x") into android-mainline
Baby steps on the way to 5.7-rc1
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I89095a90046a14eab189aab257a75b3dfdb5b1db
09 Apr, 2020
1 commit
-
Pull libnvdimm and dax updates from Dan Williams:
"There were multiple touches outside of drivers/nvdimm/ this round to
add cross arch compatibility to the devm_memremap_pages() interface,
enhance numa information for persistent memory ranges, and add a
zero_page_range() dax operation.This cycle I switched from the patchwork api to Konstantin's b4 script
for collecting tags (from x86, PowerPC, filesystem, and device-mapper
folks), and everything looks to have gone ok there. This has all
appeared in -next with no reported issues.Summary:
- Add support for region alignment configuration and enforcement to
fix compatibility across architectures and PowerPC page size
configurations.- Introduce 'zero_page_range' as a dax operation. This facilitates
filesystem-dax operation without a block-device.- Introduce phys_to_target_node() to facilitate drivers that want to
know resulting numa node if a given reserved address range was
onlined.- Advertise a persistence-domain for of_pmem and papr_scm. The
persistence domain indicates where cpu-store cycles need to reach
in the platform-memory subsystem before the platform will consider
them power-fail protected.- Promote numa_map_to_online_node() to a cross-kernel generic
facility.- Save x86 numa information to allow for node-id lookups for reserved
memory ranges, deploy that capability for the e820-pmem driver.- Pick up some miscellaneous minor fixes, that missed v5.6-final,
including a some smatch reports in the ioctl path and some unit
test compilation fixups.- Fixup some flexible-array declarations"
* tag 'libnvdimm-for-5.7' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: (29 commits)
dax: Move mandatory ->zero_page_range() check in alloc_dax()
dax,iomap: Add helper dax_iomap_zero() to zero a range
dax: Use new dax zero page method for zeroing a page
dm,dax: Add dax zero_page_range operation
s390,dcssblk,dax: Add dax zero_page_range operation to dcssblk driver
dax, pmem: Add a dax operation zero_page_range
pmem: Add functions for reading/writing page to/from pmem
libnvdimm: Update persistence domain value for of_pmem and papr_scm device
tools/test/nvdimm: Fix out of tree build
libnvdimm/region: Fix build error
libnvdimm/region: Replace zero-length array with flexible-array member
libnvdimm/label: Replace zero-length array with flexible-array member
ACPI: NFIT: Replace zero-length array with flexible-array member
libnvdimm/region: Introduce an 'align' attribute
libnvdimm/region: Introduce NDD_LABELING
libnvdimm/namespace: Enforce memremap_compat_align()
libnvdimm/pfn: Prevent raw mode fallback if pfn-infoblock valid
libnvdimm: Out of bounds read in __nd_ioctl()
acpi/nfit: improve bounds checking for 'func'
mm/memremap_pages: Introduce memremap_compat_align()
...
08 Apr, 2020
6 commits
-
lookup_node() uses gup to pin the page and get node information. It
checks against ret>=0 assuming the page will be filled in. However it's
also possible that gup will return zero, for example, when the thread is
quickly killed with a fatal signal. Teach lookup_node() to gracefully
return an error -EFAULT if it happens.Meanwhile, initialize "page" to NULL to avoid potential risk of
exploiting the pointer.Fixes: 4426e945df58 ("mm/gup: allow VM_FAULT_RETRY for multiple times")
Reported-by: syzbot+693dc11fcb53120b5559@syzkaller.appspotmail.com
Signed-off-by: Peter Xu
Signed-off-by: Linus Torvalds -
Convert the various /* fallthrough */ comments to the pseudo-keyword
fallthrough;Done via script:
https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe@perches.com/Signed-off-by: Joe Perches
Signed-off-by: Andrew Morton
Reviewed-by: Gustavo A. R. Silva
Link: http://lkml.kernel.org/r/f62fea5d10eb0ccfc05d87c242a620c261219b66.camel@perches.com
Signed-off-by: Linus Torvalds -
Sparse reports a warning at queue_pages_pmd()
context imbalance in queue_pages_pmd() - unexpected unlock
The root cause is the missing annotation at queue_pages_pmd()
Add the missing __releases(ptl)Signed-off-by: Jules Irenge
Signed-off-by: Andrew Morton
Link: http://lkml.kernel.org/r/20200214204741.94112-8-jbi.octave@gmail.com
Signed-off-by: Linus Torvalds -
change_protection() was used by either the NUMA or mprotect() code,
there's one parameter for each of the callers (dirty_accountable and
prot_numa). Further, these parameters are passed along the calls:- change_protection_range()
- change_p4d_range()
- change_pud_range()
- change_pmd_range()
- ...Now we introduce a flag for change_protect() and all these helpers to
replace these parameters. Then we can avoid passing multiple parameters
multiple times along the way.More importantly, it'll greatly simplify the work if we want to introduce
any new parameters to change_protection(). In the follow up patches, a
new parameter for userfaultfd write protection will be introduced.No functional change at all.
Signed-off-by: Peter Xu
Signed-off-by: Andrew Morton
Reviewed-by: Jerome Glisse
Cc: Andrea Arcangeli
Cc: Bobby Powers
Cc: Brian Geffon
Cc: David Hildenbrand
Cc: Denis Plotnikov
Cc: "Dr . David Alan Gilbert"
Cc: Hugh Dickins
Cc: Johannes Weiner
Cc: "Kirill A . Shutemov"
Cc: Martin Cracauer
Cc: Marty McFadden
Cc: Maya Gokhale
Cc: Mel Gorman
Cc: Mike Kravetz
Cc: Mike Rapoport
Cc: Pavel Emelyanov
Cc: Rik van Riel
Cc: Shaohua Li
Link: http://lkml.kernel.org/r/20200220163112.11409-7-peterx@redhat.com
Signed-off-by: Linus Torvalds -
Some comments for MADV_FREE is revised and added to help people understand
the MADV_FREE code, especially the page flag, PG_swapbacked. This makes
page_is_file_cache() isn't consistent with its comments. So the function
is renamed to page_is_file_lru() to make them consistent again. All these
are put in one patch as one logical change.Suggested-by: David Hildenbrand
Suggested-by: Johannes Weiner
Suggested-by: David Rientjes
Signed-off-by: "Huang, Ying"
Signed-off-by: Andrew Morton
Acked-by: Johannes Weiner
Acked-by: David Rientjes
Acked-by: Michal Hocko
Acked-by: Pankaj Gupta
Acked-by: Vlastimil Babka
Cc: Dave Hansen
Cc: Mel Gorman
Cc: Minchan Kim
Cc: Hugh Dickins
Cc: Rik van Riel
Link: http://lkml.kernel.org/r/20200317100342.2730705-1-ying.huang@intel.com
Signed-off-by: Linus Torvalds -
Lets move vma_is_accessible() helper to include/linux/mm.h which makes it
available for general use. While here, this replaces all remaining open
encodings for VMA access check with vma_is_accessible().Signed-off-by: Anshuman Khandual
Signed-off-by: Andrew Morton
Acked-by: Geert Uytterhoeven
Acked-by: Guo Ren
Acked-by: Vlastimil Babka
Cc: Guo Ren
Cc: Geert Uytterhoeven
Cc: Ralf Baechle
Cc: Paul Burton
Cc: Benjamin Herrenschmidt
Cc: Paul Mackerras
Cc: Michael Ellerman
Cc: Yoshinori Sato
Cc: Rich Felker
Cc: Dave Hansen
Cc: Andy Lutomirski
Cc: Peter Zijlstra
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Steven Rostedt
Cc: Mel Gorman
Cc: Alexander Viro
Cc: "Aneesh Kumar K.V"
Cc: Arnaldo Carvalho de Melo
Cc: Arnd Bergmann
Cc: Nick Piggin
Cc: Paul Mackerras
Cc: Will Deacon
Link: http://lkml.kernel.org/r/1582520593-30704-3-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Linus Torvalds
07 Apr, 2020
1 commit
-
…ux/kernel/git/mtd/linux") into android-mainline
Baby steps on the way to 5.7-rc1
Change-Id: I136ebb5242e3499873dcd5f5178ad7f68512d11c
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
03 Apr, 2020
4 commits
-
Using an empty (malformed) nodelist that is not caught during mount option
parsing leads to a stack-out-of-bounds access.The option string that was used was: "mpol=prefer:,". However,
MPOL_PREFERRED requires a single node number, which is not being provided
here.Add a check that 'nodes' is not empty after parsing for MPOL_PREFERRED's
nodeid.Fixes: 095f1fc4ebf3 ("mempolicy: rework shmem mpol parsing and display")
Reported-by: Entropy Moe
Reported-by: syzbot+b055b1a6b2b958707a21@syzkaller.appspotmail.com
Signed-off-by: Randy Dunlap
Signed-off-by: Andrew Morton
Tested-by: syzbot+b055b1a6b2b958707a21@syzkaller.appspotmail.com
Cc: Lee Schermerhorn
Link: http://lkml.kernel.org/r/89526377-7eb6-b662-e1d8-4430928abde9@infradead.org
Signed-off-by: Linus Torvalds -
The VM_BUG_ON() is already used by queue_pages_test_walk(), it sounds
better to dump more debug information by using VM_BUG_ON_VMA() to help
debugging.Signed-off-by: Yang Shi
Signed-off-by: Andrew Morton
Reviewed-by: Andrew Morton
Cc: "Li Xinhai"
Cc: Qian Cai
Link: http://lkml.kernel.org/r/1579068565-110432-1-git-send-email-yang.shi@linux.alibaba.com
Signed-off-by: Linus Torvalds -
vma_migratable() is called to check if pages in vma can be migrated before
go ahead to further actions. Currently it is used in below code path:- task_numa_work
- mbind
- move_pagesFor hugetlb mapping, whether vma is migratable or not is determined by:
- CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
- arch_hugetlb_migration_supportedIssue: current code only checks for CONFIG_ARCH_ENABLE_HUGEPAGE_MIGRATION
alone, and no code should use it directly. (note that current code in
vma_migratable don't cause failure or bug because
unmap_and_move_huge_page() will catch unsupported hugepage and handle it
properly)This patch checks the two factors by hugepage_migration_supported for
impoving code logic and robustness. It will enable early bail out of
hugepage migration procedure, but because currently all architecture
supporting hugepage migration is able to support all page size, we would
not see performance gain with this patch applied.vma_migratable() is moved to mm/mempolicy.c, because of the circular
reference of mempolicy.h and hugetlb.h cause defining it as inline not
feasible.Signed-off-by: Li Xinhai
Signed-off-by: Andrew Morton
Reviewed-by: Mike Kravetz
Acked-by: Michal Hocko
Cc: Anshuman Khandual
Cc: Naoya Horiguchi
Link: http://lkml.kernel.org/r/1579786179-30633-1-git-send-email-lixinhai.lxh@gmail.com
Signed-off-by: Linus Torvalds -
MPOL_MF_STRICT is used in mbind() for purposes:
(1) MPOL_MF_STRICT is set alone without MPOL_MF_MOVE or
MPOL_MF_MOVE_ALL, to check if there is misplaced page and return -EIO;(2) MPOL_MF_STRICT is set with MPOL_MF_MOVE or MPOL_MF_MOVE_ALL, to
check if there is misplaced page which is failed to isolate, or page
is success on isolate but failed to move, and return -EIO.For non hugepage mapping, (1) and (2) are implemented as expectation. For
hugepage mapping, (1) is not implemented. And in (2), the part about
failed to isolate and report -EIO is not implemented.This patch implements the missed parts for hugepage mapping. Benefits
with it applied:- User space can apply same code logic to handle mbind() on hugepage and
non hugepage mapping;- Reliably using MPOL_MF_STRICT alone to check whether there is
misplaced page or not when bind policy on address range, especially for
address range which contains both hugepage and non hugepage mapping.Analysis of potential impact to existing users:
- If MPOL_MF_STRICT alone was previously used, hugetlb pages not
following the memory policy would not cause an EIO error. After this
change, hugetlb pages are treated like all other pages. If
MPOL_MF_STRICT alone is used and hugetlb pages do not follow memory
policy an EIO error will be returned.- For users who using MPOL_MF_STRICT with MPOL_MF_MOVE or
MPOL_MF_MOVE_ALL, the semantic about some pages could not be moved will
not be changed by this patch, because failed to isolate and failed to
move have same effects to users, so their existing code will not be
impacted.In mbind man page, the note about 'MPOL_MF_STRICT is ignored on huge page
mappings' can be removed after this patch is applied.Mike:
: The current behavior with MPOL_MF_STRICT and hugetlb pages is inconsistent
: and does not match documentation (as described above). The special
: behavior for hugetlb pages ideally should have been removed when hugetlb
: page migration was introduced. It is unlikely that anyone relies on
: today's inconsistent behavior, and removing one more case of special
: handling for hugetlb pages is a good thing.Signed-off-by: Li Xinhai
Signed-off-by: Andrew Morton
Reviewed-by: Mike Kravetz
Reviewed-by: Naoya Horiguchi
Cc: Michal Hocko
Cc: linux-man
Link: http://lkml.kernel.org/r/1581559627-6206-1-git-send-email-lixinhai.lxh@gmail.com
Signed-off-by: Linus Torvalds
18 Feb, 2020
2 commits
-
Update numa_map_to_online_node() to stop falling back to numa node 0
when the input is NUMA_NO_NODE. Also, skip the lookup if @node is
online. This makes the routine compatible with other arch node mapping
routines.Reported-by: Aneesh Kumar K.V
Reviewed-by: Aneesh Kumar K.V
Link: https://lore.kernel.org/r/157401275716.43284.13185549705765009174.stgit@dwillia2-desk3.amr.corp.intel.com
Reviewed-by: Ingo Molnar
Signed-off-by: Dan Williams
Link: https://lore.kernel.org/r/158188325316.894464.15650888748083329531.stgit@dwillia2-desk3.amr.corp.intel.com -
The acpi_map_pxm_to_online_node() helper is used to find the closest
online node to a given proximity domain. This is used to map devices in
a proximity domain with no online memory or cpus to the closest online
node and populate a device's 'numa_node' property. The numa_node
property allows applications to be migrated "close" to a resource.In preparation for providing a generic facility to optionally map an
address range to its closest online node, or the node the range would
represent were it to be onlined (target_node), up-level the core of
acpi_map_pxm_to_online_node() to a generic mm/numa helper.Cc: Michal Hocko
Acked-by: Rafael J. Wysocki
Reviewed-by: Ingo Molnar
Signed-off-by: Dan Williams
Link: https://lore.kernel.org/r/158188324802.894464.13128795207831894206.stgit@dwillia2-desk3.amr.corp.intel.com
08 Feb, 2020
1 commit
-
…x/kernel/git/andersson/remoteproc") into android-mainline
Another "small" merge point to handle conflicts in a sane way.
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Change-Id: I5dc2f5f11275b29f3c9b5b8d4dd59864ceb6faf9
01 Feb, 2020
1 commit
-
What we are trying to do is change the '=' character to a NUL terminator
and then at the end of the function we restore it back to an '='. The
problem is there are two error paths where we jump to the end of the
function before we have replaced the '=' with NUL.We end up putting the '=' in the wrong place (possibly one element
before the start of the buffer).Link: http://lkml.kernel.org/r/20200115055426.vdjwvry44nfug7yy@kili.mountain
Reported-by: syzbot+e64a13c5369a194d67df@syzkaller.appspotmail.com
Fixes: 095f1fc4ebf3 ("mempolicy: rework shmem mpol parsing and display")
Signed-off-by: Dan Carpenter
Acked-by: Vlastimil Babka
Dmitry Vyukov
Cc: Michal Hocko
Cc: Dan Carpenter
Cc: Lee Schermerhorn
Cc: Andrea Arcangeli
Cc: Hugh Dickins
Cc:
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
20 Jan, 2020
1 commit
-
Linux 5.5-rc7
Signed-off-by: Greg Kroah-Hartman
Change-Id: Ibda9b40265c1a8e76cb8eb58107312438ecf687b
14 Jan, 2020
1 commit
-
THP page faults now attempt a __GFP_THISNODE allocation first, which
should only compact existing free memory, followed by another attempt
that can allocate from any node using reclaim/compaction effort
specified by global defrag setting and madvise.This patch makes the following changes to the scheme:
- Before the patch, the first allocation relies on a check for
pageblock order and __GFP_IO to prevent excessive reclaim. This
however affects also the second attempt, which is not limited to
single node.Instead of that, reuse the existing check for costly order
__GFP_NORETRY allocations, and make sure the first THP attempt uses
__GFP_NORETRY. As a side-effect, all costly order __GFP_NORETRY
allocations will bail out if compaction needs reclaim, while
previously they only bailed out when compaction was deferred due to
previous failures.This should be still acceptable within the __GFP_NORETRY semantics.
- Before the patch, the second allocation attempt (on all nodes) was
passing __GFP_NORETRY. This is redundant as the check for pageblock
order (discussed above) was stronger. It's also contrary to
madvise(MADV_HUGEPAGE) which means some effort to allocate THP is
requested.After this patch, the second attempt doesn't pass __GFP_THISNODE nor
__GFP_NORETRY.To sum up, THP page faults now try the following attempts:
1. local node only THP allocation with no reclaim, just compaction.
2. for madvised VMA's or when synchronous compaction is enabled always - THP
allocation from any node with effort determined by global defrag setting
and VMA madvise
3. fallback to base pages on any nodeLink: http://lkml.kernel.org/r/08a3f4dd-c3ce-0009-86c5-9ee51aba8557@suse.cz
Fixes: b39d0ee2632d ("mm, page_alloc: avoid expensive reclaim when compaction may not succeed")
Signed-off-by: Vlastimil Babka
Acked-by: Michal Hocko
Cc: Linus Torvalds
Cc: Andrea Arcangeli
Cc: Mel Gorman
Cc: "Kirill A. Shutemov"
Cc: David Rientjes
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds