12 Jul, 2022

1 commit

  • Release the refcount after xas_set() to fix a use-after-free (UAF) that may cause a panic like this:

    page:ffffea000491fa40 refcount:1 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x1247e9
    head:ffffea000491fa00 order:3 compound_mapcount:0 compound_pincount:0
    memcg:ffff888104f91091
    flags: 0x2fffff80010200(slab|head|node=0|zone=2|lastcpupid=0x1fffff)
    ...
    page dumped because: VM_BUG_ON_PAGE(PageTail(page))
    ------------[ cut here ]------------
    kernel BUG at include/linux/page-flags.h:632!
    invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN
    CPU: 1 PID: 7642 Comm: sh Not tainted 5.15.51-dirty #26
    ...
    Call Trace:

    __invalidate_mapping_pages+0xe7/0x540
    drop_pagecache_sb+0x159/0x320
    iterate_supers+0x120/0x240
    drop_caches_sysctl_handler+0xaa/0xe0
    proc_sys_call_handler+0x2b4/0x480
    new_sync_write+0x3d6/0x5c0
    vfs_write+0x446/0x7a0
    ksys_write+0x105/0x210
    do_syscall_64+0x35/0x80
    entry_SYSCALL_64_after_hwframe+0x44/0xae
    RIP: 0033:0x7f52b5733130
    ...

    This problem has been fixed on mainline by patch 6b24ca4a1a8d ("mm: Use
    multi-index entries in the page cache") since it deletes the related code.
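
    A hedged sketch of the ordering the fix restores (illustrative, not the
    exact stable diff): find_lock_entries() still needs page->index to skip
    past a THP, so the reference may only be dropped after xas_set() has
    consumed it.

        /* keep the reference while page->index is still needed */
        if (!xa_is_value(page) && PageTransHuge(page))
                xas_set(&xas, page->index + thp_nr_pages(page));
        put_page(page);         /* drop the refcount only after xas_set() */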

    Fixes: 5c211ba29deb ("mm: add and use find_lock_entries")
    Signed-off-by: Liu Shixin
    Acked-by: Matthew Wilcox (Oracle)
    Signed-off-by: Greg Kroah-Hartman

    Liu Shixin
     

01 May, 2022

2 commits

  • commit a6294593e8a1290091d0b078d5d33da5e0cd3dfe upstream

    Turn iov_iter_fault_in_readable into a function that returns the number
    of bytes not faulted in, similar to copy_to_user, instead of returning a
    non-zero value when any of the requested pages couldn't be faulted in.
    This supports the existing users that require all pages to be faulted in
    as well as new users that are happy if any pages can be faulted in.

    Rename iov_iter_fault_in_readable to fault_in_iov_iter_readable to make
    sure this change doesn't silently break things.
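
    As a hedged illustration of the new convention, an "all pages required"
    caller such as a buffered-write loop can still bail out when nothing at
    all could be faulted in:

        /* returns the number of bytes NOT faulted in; 0 means full success */
        if (unlikely(fault_in_iov_iter_readable(i, bytes) == bytes)) {
                status = -EFAULT;       /* not even one byte is accessible */
                break;
        }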

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Anand Jain
    Signed-off-by: Greg Kroah-Hartman

    Andreas Gruenbacher
     
  • commit bb523b406c849eef8f265a07cd7f320f1f177743 upstream

    Turn fault_in_pages_{readable,writeable} into versions that return the
    number of bytes not faulted in, similar to copy_to_user, instead of
    returning a non-zero value when any of the requested pages couldn't be
    faulted in. This supports the existing users that require all pages to
    be faulted in as well as new users that are happy if any pages can be
    faulted in.

    Rename the functions to fault_in_{readable,writeable} to make sure
    this change doesn't silently break things.

    Neither of these functions is entirely trivial and it doesn't seem
    useful to inline them, so move them to mm/gup.c.
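
    A hedged sketch of how a "partial progress is fine" user can consume the
    new return value (buf is an assumed char __user pointer, len its length):

        size_t left = fault_in_writeable(buf, len);   /* bytes NOT faulted in */

        if (left == len)
                return -EFAULT;         /* nothing could be faulted in */
        len -= left;                    /* operate on the now-accessible prefix */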

    Signed-off-by: Andreas Gruenbacher
    Signed-off-by: Anand Jain
    Signed-off-by: Greg Kroah-Hartman

    Andreas Gruenbacher
     

02 Mar, 2022

1 commit

  • When a THP is present in the page cache, we can return it several times,
    leading to userspace seeing the same data repeatedly if doing a read()
    that crosses a 64-page boundary. This is probably not a security issue
    (since the data all comes from the same file), but it can be interpreted
    as a transient data corruption issue. Fortunately, it is very rare as
    it can only occur when CONFIG_READ_ONLY_THP_FOR_FS is enabled, and it can
    only happen to executables. We don't often call read() on executables.

    This bug is fixed differently in v5.17 by commit 6b24ca4a1a8d
    ("mm: Use multi-index entries in the page cache"). That commit is
    unsuitable for backporting, so fix this in the clearest way. It
    sacrifices a little performance for clarity, but this should never
    be a performance path in these kernel versions.

    Fixes: cbd59c48ae2b ("mm/filemap: use head pages in generic_file_buffered_read")
    Cc: stable@vger.kernel.org # v5.15, v5.16
    Link: https://lore.kernel.org/r/df3b5d1c-a36b-2c73-3e27-99e74983de3a@suse.cz/
    Analyzed-by: Adam Majer
    Analyzed-by: Dirk Mueller
    Bisected-by: Takashi Iwai
    Reported-by: Vlastimil Babka
    Tested-by: Vlastimil Babka
    Signed-off-by: Matthew Wilcox (Oracle)
    Signed-off-by: Greg Kroah-Hartman

    Matthew Wilcox (Oracle)
     

19 Nov, 2021

1 commit

  • commit d417b49fff3e2f21043c834841e8623a6098741d upstream.

    It is not safe to check page->index without holding the page lock. It
    can be changed if the page is moved between the swap cache and the page
    cache for a shmem file, for example. There is a VM_BUG_ON below which
    checks page->index is correct after taking the page lock.
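
    A hedged sketch of the safe pattern the fix moves to, where page->index
    is only trusted once the page lock pins the page's identity:

        if (!trylock_page(page))
                continue;                       /* can't trust page->index yet */
        if (page->mapping != mapping || page->index != index) {
                unlock_page(page);              /* moved, e.g. shmem <-> swap cache */
                continue;
        }
        /* page->index is now stable; safe to operate on the page */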

    Link: https://lkml.kernel.org/r/20210818144932.940640-1-willy@infradead.org
    Fixes: 5c211ba29deb ("mm: add and use find_lock_entries")
    Signed-off-by: Matthew Wilcox (Oracle)
    Reported-by:
    Cc: Hugh Dickins
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Matthew Wilcox (Oracle)
     

04 Sep, 2021

2 commits

  • Merge misc updates from Andrew Morton:
    "173 patches.

    Subsystems affected by this series: ia64, ocfs2, block, and mm (debug,
    pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap,
    bootmem, sparsemem, vmalloc, kasan, pagealloc, memory-failure,
    hugetlb, userfaultfd, vmscan, compaction, mempolicy, memblock,
    oom-kill, migration, ksm, percpu, vmstat, and madvise)"

    * emailed patches from Andrew Morton : (173 commits)
    mm/madvise: add MADV_WILLNEED to process_madvise()
    mm/vmstat: remove unneeded return value
    mm/vmstat: simplify the array size calculation
    mm/vmstat: correct some wrong comments
    mm/percpu,c: remove obsolete comments of pcpu_chunk_populated()
    selftests: vm: add COW time test for KSM pages
    selftests: vm: add KSM merging time test
    mm: KSM: fix data type
    selftests: vm: add KSM merging across nodes test
    selftests: vm: add KSM zero page merging test
    selftests: vm: add KSM unmerge test
    selftests: vm: add KSM merge test
    mm/migrate: correct kernel-doc notation
    mm: wire up syscall process_mrelease
    mm: introduce process_mrelease system call
    memblock: make memblock_find_in_range method private
    mm/mempolicy.c: use in_task() in mempolicy_slab_node()
    mm/mempolicy: unify the create() func for bind/interleave/prefer-many policies
    mm/mempolicy: advertise new MPOL_PREFERRED_MANY
    mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY
    ...

    Linus Torvalds
     
  • The page cache deletion paths all have interrupts enabled, so no need to
    use irqsafe/irqrestore locking variants.

    They used to have irqs disabled by the memcg lock added in commit
    c4843a7593a9 ("memcg: add per cgroup dirty page accounting"), but that has
    since been replaced by memcg taking the page lock instead, commit
    0a31bc97c80c ("mm: memcontrol: rewrite uncharge API").
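
    A hedged before/after sketch of the change (XArray lock variants; the
    exact call sites are the page cache deletion paths):

        /* before: saves and restores an irq state it never needed to */
        xa_lock_irqsave(&mapping->i_pages, flags);
        __delete_from_page_cache(page, shadow);
        xa_unlock_irqrestore(&mapping->i_pages, flags);

        /* after: these paths always run with irqs enabled, so _irq suffices */
        xa_lock_irq(&mapping->i_pages);
        __delete_from_page_cache(page, shadow);
        xa_unlock_irq(&mapping->i_pages);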

    Link: https://lkml.kernel.org/r/20210614211904.14420-1-hannes@cmpxchg.org
    Signed-off-by: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

01 Sep, 2021

1 commit

  • Pull btrfs updates from David Sterba:
    "The highlights of this round are integrations with fs-verity and
    idmapped mounts, the rest is usual mix of minor improvements, speedups
    and cleanups.

    There are some patches outside of btrfs, namely updating some VFS
    interfaces, all straightforward and acked.

    Features:

    - fs-verity support, using standard ioctls, backward compatible with
    read-only limitation on inodes with previously enabled fs-verity

    - idmapped mount support

    - make mount with rescue=ibadroots more tolerant to partially damaged
    trees

    - allow raid0 on a single device and raid10 on two devices,
    degenerate cases but might be useful as an intermediate step during
    conversion to other profiles

    - zoned mode block group auto reclaim can be disabled via sysfs knob

    Performance improvements:

    - continue readahead of node siblings even if target node is in
    memory, could speed up full send (on sample test +11%)

    - batching of delayed items can speed up creating many files

    - fsync/tree-log speedups
        - avoid unnecessary work (gains +2% throughput, -2% run time on
          sample load)
        - reduced lock contention on renames (on dbench +4% throughput,
          up to -30% latency)

    Fixes:

    - various zoned mode fixes

    - preemptive flushing threshold tuning, avoid excessive work on
    almost full filesystems

    Core:

    - continued subpage support, preparation for implementing remaining
    features like compression and defragmentation; with some
    limitations, write is now enabled on 64K page systems with 4K
    sectors, still considered experimental
        - no readahead on compressed reads
        - inline extents disabled
        - disabled raid56 profile conversion and mount

    - improved flushing logic, fixing early ENOSPC on some workloads

    - inode flags have been internally split to read-only and read-write
    incompat bit parts, used by fs-verity

    - new tree items for fs-verity
        - descriptor item
        - Merkle tree item

    - inode operations extended to be namespace-aware

    - cleanups and refactoring

    Generic code changes:

    - fs: new export filemap_fdatawrite_wbc

    - fs: removed sync_inode

    - block: bio_trim argument type fixups

    - vfs: add namespace-aware lookup"

    * tag 'for-5.15-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (114 commits)
    btrfs: reset replace target device to allocation state on close
    btrfs: zoned: fix ordered extent boundary calculation
    btrfs: do not do preemptive flushing if the majority is global rsv
    btrfs: reduce the preemptive flushing threshold to 90%
    btrfs: tree-log: check btrfs_lookup_data_extent return value
    btrfs: avoid unnecessarily logging directories that had no changes
    btrfs: allow idmapped mount
    btrfs: handle ACLs on idmapped mounts
    btrfs: allow idmapped INO_LOOKUP_USER ioctl
    btrfs: allow idmapped SUBVOL_SETFLAGS ioctl
    btrfs: allow idmapped SET_RECEIVED_SUBVOL ioctls
    btrfs: relax restrictions for SNAP_DESTROY_V2 with subvolids
    btrfs: allow idmapped SNAP_DESTROY ioctls
    btrfs: allow idmapped SNAP_CREATE/SUBVOL_CREATE ioctls
    btrfs: check whether fsgid/fsuid are mapped during subvolume creation
    btrfs: allow idmapped permission inode op
    btrfs: allow idmapped setattr inode op
    btrfs: allow idmapped tmpfile inode op
    btrfs: allow idmapped symlink inode op
    btrfs: allow idmapped mkdir inode op
    ...

    Linus Torvalds
     

23 Aug, 2021

1 commit

  • Btrfs sometimes needs to flush dirty pages on a bunch of dirty inodes in
    order to reclaim metadata reservations. Unfortunately most helpers in
    this area are too smart for us:

    1) The normal filemap_fdata* helpers only take range and sync modes, and
    don't give any indication of how much was written, so we can only
    flush full inodes, which isn't what we want in most cases.
    2) The normal writeback path requires us to have the s_umount sem held,
    but we can't unconditionally take it in this path because we could
    deadlock.
    3) The normal writeback path also skips inodes with I_SYNC set if we
    write with WB_SYNC_NONE. This isn't the behavior we want under heavy
    ENOSPC pressure, we want to actually make sure the pages are under
    writeback before returning, and if another thread is in the middle of
    writing the file we may return before they're under writeback and
    miss our ordered extents and not properly wait for completion.
    4) sync_inode() uses the normal writeback path and has the same problem
    as #3.

    What we really want is to call do_writepages() with our wbc. This way
    we can make sure that writeback is actually started on the pages, and we
    can control how many pages are written as a whole as we write many
    inodes using the same wbc. Accomplish this with a new helper that does
    just that so we can use it for our ENOSPC flushing infrastructure.
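
    A hedged sketch of the helper's shape (the name filemap_fdatawrite_wbc
    matches the export listed in the btrfs pull above; details simplified):

        int filemap_fdatawrite_wbc(struct address_space *mapping,
                                   struct writeback_control *wbc)
        {
                int ret;

                if (!mapping_can_writeback(mapping) ||
                    !mapping_tagged(mapping, PAGECACHE_TAG_DIRTY))
                        return 0;

                wbc_attach_fdatawrite_inode(wbc, mapping->host);
                ret = do_writepages(mapping, wbc);  /* honours wbc->nr_to_write */
                wbc_detach_inode(wbc);
                return ret;
        }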

    Reviewed-by: Nikolay Borisov
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Josef Bacik
    Reviewed-by: David Sterba
    Signed-off-by: David Sterba

    Josef Bacik
     

13 Jul, 2021

3 commits

  • Some operations such as reflinking blocks among files will need to lock
    invalidate_lock for two mappings. Add helper functions to do that.
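
    A hedged sketch of such a helper, using the usual order-by-address trick
    to avoid ABBA deadlocks (name and details illustrative):

        void filemap_invalidate_lock_two(struct address_space *mapping1,
                                         struct address_space *mapping2)
        {
                if (mapping1 > mapping2)
                        swap(mapping1, mapping2);       /* stable lock order */
                if (mapping1)
                        down_write(&mapping1->invalidate_lock);
                if (mapping2 && mapping2 != mapping1)
                        down_write_nested(&mapping2->invalidate_lock, 1);
        }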

    Reviewed-by: Darrick J. Wong
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Jan Kara
     
  • Currently, serializing operations such as page fault, read, or readahead
    against hole punching is rather difficult. The basic race scheme is
    like:

    fallocate(FALLOC_FL_PUNCH_HOLE)              read / fault / ..
      truncate_inode_pages_range()
                                                 <create pages in page
                                                  cache here>
      <update fs block mapping and free blocks>

    Now the problem is in this way read / page fault / readahead can
    instantiate pages in page cache with potentially stale data (if blocks
    get quickly reused). Avoiding this race is not simple - page locks do
    not work because we want to make sure there are *no* pages in given
    range. inode->i_rwsem does not work because page fault happens under
    mmap_sem which ranks below inode->i_rwsem. Also using it for reads makes
    the performance for mixed read-write workloads suffer.

    So create a new rw_semaphore in the address_space - invalidate_lock -
    that protects adding of pages to page cache for page faults / reads /
    readahead.
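
    A hedged usage sketch of the two sides of the race with the new lock
    (helper names as introduced by this series; illustrative only):

        /* hole punch side: exclude page instantiation over the whole range */
        filemap_invalidate_lock(mapping);       /* down_write(invalidate_lock) */
        truncate_inode_pages_range(mapping, start, end);
        /* ... update the fs block mapping and free the blocks ... */
        filemap_invalidate_unlock(mapping);

        /* read / fault / readahead side: take it shared while adding pages */
        filemap_invalidate_lock_shared(mapping);
        /* ... instantiate page cache pages and read them in ... */
        filemap_invalidate_unlock_shared(mapping);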

    Reviewed-by: Darrick J. Wong
    Reviewed-by: Christoph Hellwig
    Signed-off-by: Jan Kara

    Jan Kara
     
  • inode->i_mutex has been replaced with inode->i_rwsem long ago. Fix
    comments still mentioning i_mutex.

    Reviewed-by: Christoph Hellwig
    Reviewed-by: Darrick J. Wong
    Acked-by: Hugh Dickins
    Signed-off-by: Jan Kara

    Jan Kara
     

04 Jul, 2021

1 commit

  • Pull iov_iter updates from Al Viro:
    "iov_iter cleanups and fixes.

    There are followups, but this is what had sat in -next this cycle. IMO
    the macro forest in there became much thinner and easier to follow..."

    * 'work.iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (37 commits)
    csum_and_copy_to_pipe_iter(): leave handling of csum_state to caller
    clean up copy_mc_pipe_to_iter()
    pipe_zero(): we don't need no stinkin' kmap_atomic()...
    iov_iter: clean csum_and_copy_...() primitives up a bit
    copy_page_from_iter(): don't need kmap_atomic() for kvec/bvec cases
    copy_page_to_iter(): don't bother with kmap_atomic() for bvec/kvec cases
    iterate_xarray(): only of the first iteration we might get offset != 0
    pull handling of ->iov_offset into iterate_{iovec,bvec,xarray}
    iov_iter: make iterator callbacks use base and len instead of iovec
    iov_iter: make the amount already copied available to iterator callbacks
    iov_iter: get rid of separate bvec and xarray callbacks
    iov_iter: teach iterate_{bvec,xarray}() about possible short copies
    iterate_bvec(): expand bvec.h macro forest, massage a bit
    iov_iter: unify iterate_iovec and iterate_kvec
    iov_iter: massage iterate_iovec and iterate_kvec to logics similar to iterate_bvec
    iterate_and_advance(): get rid of magic in case when n is 0
    csum_and_copy_to_iter(): massage into form closer to csum_and_copy_from_iter()
    iov_iter: replace iov_iter_copy_from_user_atomic() with iterator-advancing variant
    [xarray] iov_iter_npages(): just use DIV_ROUND_UP()
    iov_iter_npages(): don't bother with iterate_all_kinds()
    ...

    Linus Torvalds
     

30 Jun, 2021

1 commit

  • set_active_memcg() worked for kernel allocations but was silently ignored
    for user pages.

    This patch establishes a precedence order for who gets charged:

    1. If there is a memcg associated with the page already, that memcg is
    charged. This happens during swapin.

    2. If an explicit mm is passed, mm->memcg is charged. This happens
    during page faults, which can be triggered in remote VMs (eg gup).

    3. Otherwise consult the current process context. If there is an
    active_memcg, use that. Otherwise, current->mm->memcg.

    Previously, if a NULL mm was passed to mem_cgroup_charge (case 3) it would
    always charge the root cgroup. Now it looks up the active_memcg first
    (falling back to charging the root cgroup if not set).
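
    A hedged pseudo-C sketch of that precedence (accessor names simplified;
    active_memcg() here stands for whatever set_active_memcg() installed):

        struct mem_cgroup *memcg;

        if (page_memcg(page))                   /* 1. swapin: already charged */
                memcg = page_memcg(page);
        else if (mm)                            /* 2. explicit mm (remote fault/gup) */
                memcg = get_mem_cgroup_from_mm(mm);
        else {                                  /* 3. consult the current task */
                memcg = active_memcg();         /* simplified accessor */
                if (!memcg)
                        memcg = get_mem_cgroup_from_mm(current->mm);
        }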

    Link: https://lkml.kernel.org/r/20210610173944.1203706-3-schatzberg.dan@gmail.com
    Signed-off-by: Dan Schatzberg
    Acked-by: Johannes Weiner
    Acked-by: Tejun Heo
    Acked-by: Chris Down
    Acked-by: Jens Axboe
    Reviewed-by: Shakeel Butt
    Reviewed-by: Michal Koutný
    Cc: Michal Hocko
    Cc: Ming Lei
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Schatzberg
     

10 Jun, 2021

1 commit

  • Replacement is called copy_page_from_iter_atomic(); unlike the old primitive the
    callers do *not* need to do iov_iter_advance() after it. In case when they end
    up consuming less than they'd been given they need to do iov_iter_revert() on
    everything they had not consumed. That, however, needs to be done only on slow
    paths.

    All in-tree callers converted. And that kills the last user of iterate_all_kinds()
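
    A hedged sketch of the new caller pattern (the commit step is a
    hypothetical stand-in for whatever the filesystem does with the data):

        copied = copy_page_from_iter_atomic(page, offset, bytes, i);
        /* the iterator has already been advanced by 'copied' bytes */
        status = commit_written_data(page, pos, copied);   /* hypothetical */
        if (status < copied)                               /* slow path only */
                iov_iter_revert(i, copied - status);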

    Signed-off-by: Al Viro

    Al Viro
     

03 Jun, 2021

1 commit


07 May, 2021

1 commit

  • Fix ~94 single-word typos in locking code comments, plus a few
    very obvious grammar mistakes.

    Link: https://lkml.kernel.org/r/20210322212624.GA1963421@gmail.com
    Link: https://lore.kernel.org/r/20210322205203.GB1959563@gmail.com
    Signed-off-by: Ingo Molnar
    Reviewed-by: Matthew Wilcox (Oracle)
    Reviewed-by: Randy Dunlap
    Cc: Bhaskar Chowdhury
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ingo Molnar
     

06 May, 2021

3 commits

  • Various coding style tweaks to various files under mm/

    [daizhiyuan@phytium.com.cn: mm/swapfile: minor coding style tweaks]
    Link: https://lkml.kernel.org/r/1614223624-16055-1-git-send-email-daizhiyuan@phytium.com.cn
    [daizhiyuan@phytium.com.cn: mm/sparse: minor coding style tweaks]
    Link: https://lkml.kernel.org/r/1614227288-19363-1-git-send-email-daizhiyuan@phytium.com.cn
    [daizhiyuan@phytium.com.cn: mm/vmscan: minor coding style tweaks]
    Link: https://lkml.kernel.org/r/1614227649-19853-1-git-send-email-daizhiyuan@phytium.com.cn
    [daizhiyuan@phytium.com.cn: mm/compaction: minor coding style tweaks]
    Link: https://lkml.kernel.org/r/1614228218-20770-1-git-send-email-daizhiyuan@phytium.com.cn
    [daizhiyuan@phytium.com.cn: mm/oom_kill: minor coding style tweaks]
    Link: https://lkml.kernel.org/r/1614228360-21168-1-git-send-email-daizhiyuan@phytium.com.cn
    [daizhiyuan@phytium.com.cn: mm/shmem: minor coding style tweaks]
    Link: https://lkml.kernel.org/r/1614228504-21491-1-git-send-email-daizhiyuan@phytium.com.cn
    [daizhiyuan@phytium.com.cn: mm/page_alloc: minor coding style tweaks]
    Link: https://lkml.kernel.org/r/1614228613-21754-1-git-send-email-daizhiyuan@phytium.com.cn
    [daizhiyuan@phytium.com.cn: mm/filemap: minor coding style tweaks]
    Link: https://lkml.kernel.org/r/1614228936-22337-1-git-send-email-daizhiyuan@phytium.com.cn
    [daizhiyuan@phytium.com.cn: mm/mlock: minor coding style tweaks]
    Link: https://lkml.kernel.org/r/1613956588-2453-1-git-send-email-daizhiyuan@phytium.com.cn
    [daizhiyuan@phytium.com.cn: mm/frontswap: minor coding style tweaks]
    Link: https://lkml.kernel.org/r/1613962668-15045-1-git-send-email-daizhiyuan@phytium.com.cn
    [daizhiyuan@phytium.com.cn: mm/vmalloc: minor coding style tweaks]
    Link: https://lkml.kernel.org/r/1613963379-15988-1-git-send-email-daizhiyuan@phytium.com.cn
    [daizhiyuan@phytium.com.cn: mm/memory_hotplug: minor coding style tweaks]
    Link: https://lkml.kernel.org/r/1613971784-24878-1-git-send-email-daizhiyuan@phytium.com.cn
    [daizhiyuan@phytium.com.cn: mm/mempolicy: minor coding style tweaks]
    Link: https://lkml.kernel.org/r/1613972228-25501-1-git-send-email-daizhiyuan@phytium.com.cn

    Link: https://lkml.kernel.org/r/1614222374-13805-1-git-send-email-daizhiyuan@phytium.com.cn
    Signed-off-by: Zhiyuan Dai
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Zhiyuan Dai
     
  • Simplify mapping_needs_writeback() by accounting DAX entries as pages
    instead of exceptional entries.
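
    With DAX entries counted in nrpages, the check collapses to something
    like this hedged sketch:

        static bool mapping_needs_writeback(struct address_space *mapping)
        {
                /* no separate DAX/exceptional-entry special case any more */
                return mapping->nrpages;
        }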

    Link: https://lkml.kernel.org/r/20201026151849.24232-4-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle)
    Tested-by: Vishal Verma
    Acked-by: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     
  • We no longer need to keep track of how many shadow entries are present in
    a mapping. This saves a few writes to the inode and memory barriers.

    Link: https://lkml.kernel.org/r/20201026151849.24232-3-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle)
    Tested-by: Vishal Verma
    Acked-by: Johannes Weiner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     

01 May, 2021

5 commits

  • Commit a6de4b4873e1 ("mm: convert find_get_entry to return the head page")
    uses @index instead of @offset, but the comment is stale, update it.

    Link: https://lkml.kernel.org/r/1617948260-50724-1-git-send-email-zhangshaokun@hisilicon.com
    Signed-off-by: Rui Sun
    Signed-off-by: Shaokun Zhang
    Cc: Matthew Wilcox (Oracle)
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Rui Sun
     
  • If the I/O completed successfully, the page will remain Uptodate, even
    if it is subsequently truncated. If the I/O completed with an error,
    this check would cause us to retry the I/O if the page were truncated
    before we woke up. There is no need to retry the I/O; the I/O to fill
    the page failed, so we can legitimately just return -EIO.

    This code was originally added by commit 56f0d5fe6851 ("[PATCH]
    readpage-vs-invalidate fix") in 2005 (this commit ID is from the
    linux-fullhistory tree; it is also commit ba1f08f14b52 in tglx-history).

    At the time, truncate_complete_page() called ClearPageUptodate(), and so
    this was fixing a real bug. In 2008, commit 84209e02de48 ("mm: dont clear
    PG_uptodate on truncate/invalidate") removed the call to
    ClearPageUptodate, and this check has been unnecessary ever since.

    It doesn't do any real harm, but there's no need to keep it.

    Link: https://lkml.kernel.org/r/20210303222547.1056428-1-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle)
    Acked-by: William Kucharski
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     
  • After splitting generic_file_buffered_read() into smaller parts, it turns
    out we can reuse one of the parts in filemap_fault(). This fixes an
    oversight -- waiting for the I/O to complete is now interruptible by a
    fatal signal. And it saves us a few bytes of text in an unlikely path.

    $ ./scripts/bloat-o-meter before.o after.o
    add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-207 (-207)
    Function                                     old     new   delta
    filemap_fault                               2187    1980    -207
    Total: Before=37491, After=37284, chg -0.55%

    Link: https://lkml.kernel.org/r/20210226140011.2883498-1-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle)
    Reviewed-by: Andrew Morton
    Cc: Kent Overstreet
    Cc: Christoph Hellwig
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     
  • For the generic page cache read helper, use the better variant of checking
    for the need to call filemap_write_and_wait_range() when doing O_DIRECT
    reads. This avoids falling back to the slow path for IOCB_NOWAIT, if
    there are no pages to wait for (or write out).
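
    A hedged sketch of the caller-side check in the O_DIRECT read path
    (iocb, mapping and count are assumed locals):

        if (iocb->ki_flags & IOCB_NOWAIT) {
                if (filemap_range_needs_writeback(mapping, iocb->ki_pos,
                                                  iocb->ki_pos + count - 1))
                        return -EAGAIN;   /* only punt if there is real work */
        } else {
                retval = filemap_write_and_wait_range(mapping, iocb->ki_pos,
                                                      iocb->ki_pos + count - 1);
                if (retval < 0)
                        return retval;
        }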

    Link: https://lkml.kernel.org/r/20210224164455.1096727-3-axboe@kernel.dk
    Signed-off-by: Jens Axboe
    Reviewed-by: Matthew Wilcox (Oracle)
    Reviewed-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jens Axboe
     
  • Patch series "Improve IOCB_NOWAIT O_DIRECT reads", v3.

    An internal workload complained because it was using too much CPU, and
    when I took a look, we had a lot of io_uring workers going to town.

    For an async buffered read like workload, I am normally expecting _zero_
    offloads to a worker thread, but this one had tons of them. I'd drop
    caches and things would look good again, but then a minute later we'd
    regress back to using workers. Turns out that every minute something
    was reading parts of the device, which would add page cache for that
    inode. I put patches like these in for our kernel, and the problem was
    solved.

    Don't -EAGAIN IOCB_NOWAIT dio reads just because we have page cache
    entries for the given range. This causes unnecessary work from the
    callers side, when the IO could have been issued totally fine without
    blocking on writeback when there is none.

    This patch (of 3):

    For O_DIRECT reads/writes, we check if we need to issue a call to
    filemap_write_and_wait_range() to issue and/or wait for writeback for any
    page in the given range. The existing mechanism just checks for a page in
    the range, which is suboptimal for IOCB_NOWAIT as we'll fallback to the
    slow path (and needing retry) if there's just a clean page cache page in
    the range.

    Provide filemap_range_needs_writeback() which tries a little harder to
    check if we actually need to issue and/or wait for writeback in the range.
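
    A hedged sketch of what the helper checks, condensed from the description
    above (not the verbatim upstream code):

        bool filemap_range_needs_writeback(struct address_space *mapping,
                                           loff_t start_byte, loff_t end_byte)
        {
                XA_STATE(xas, &mapping->i_pages, start_byte >> PAGE_SHIFT);
                pgoff_t max = end_byte >> PAGE_SHIFT;
                struct page *page;

                if (!mapping->nrpages)
                        return false;
                if (!mapping_tagged(mapping, PAGECACHE_TAG_DIRTY) &&
                    !mapping_tagged(mapping, PAGECACHE_TAG_WRITEBACK))
                        return false;

                rcu_read_lock();
                xas_for_each(&xas, page, max) {
                        if (xas_retry(&xas, page) || xa_is_value(page))
                                continue;
                        if (PageDirty(page) || PageLocked(page) ||
                            PageWriteback(page))
                                break;  /* found work that could block */
                }
                rcu_read_unlock();
                return page != NULL;
        }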

    Link: https://lkml.kernel.org/r/20210224164455.1096727-1-axboe@kernel.dk
    Link: https://lkml.kernel.org/r/20210224164455.1096727-2-axboe@kernel.dk
    Signed-off-by: Jens Axboe
    Reviewed-by: Matthew Wilcox (Oracle)
    Reviewed-by: Jan Kara
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jens Axboe
     

28 Apr, 2021

1 commit

  • Pull network filesystem helper library updates from David Howells:
    "Here's a set of patches for 5.13 to begin the process of overhauling
    the local caching API for network filesystems. This set consists of
    two parts:

    (1) Add a helper library to handle the new VM readahead interface.

    This is intended to be used unconditionally by the filesystem
    (whether or not caching is enabled) and provides a common
    framework for doing caching, transparent huge pages and, in the
    future, possibly fscrypt and read bandwidth maximisation. It also
    allows the netfs and the cache to align, expand and slice up a
    read request from the VM in various ways; the netfs need only
    provide a function to read a stretch of data to the pagecache and
    the helper takes care of the rest.

    (2) Add an alternative fscache/cachfiles I/O API that uses the kiocb
    facility to do async DIO to transfer data to/from the netfs's
    pages, rather than using readpage with wait queue snooping on one
    side and vfs_write() on the other. It also uses less memory, since
    it doesn't do buffered I/O on the backing file.

    Note that this uses SEEK_HOLE/SEEK_DATA to locate the data
    available to be read from the cache. Whilst this is an improvement
    from the bmap interface, it still has a problem with regard to a
    modern extent-based filesystem inserting or removing bridging
    blocks of zeros. Fixing that requires a much greater overhaul.

    This is a step towards overhauling the fscache API. The change is
    opt-in on the part of the network filesystem. A netfs should not try
    to mix the old and the new API because of conflicting ways of handling
    pages and the PG_fscache page flag and because it would be mixing DIO
    with buffered I/O. Further, the helper library can't be used with the
    old API.

    This does not change any of the fscache cookie handling APIs or the
    way invalidation is done at this time.

    In the near term, I intend to deprecate and remove the old I/O API
    (fscache_allocate_page{,s}(), fscache_read_or_alloc_page{,s}(),
    fscache_write_page() and fscache_uncache_page()) and eventually
    replace most of fscache/cachefiles with something simpler and easier
    to follow.

    This patchset contains the following parts:

    - Some helper patches, including provision of an ITER_XARRAY iov
    iterator and a function to do readahead expansion.

    - Patches to add the netfs helper library.

    - A patch to add the fscache/cachefiles kiocb API.

    - A pair of patches to fix some review issues in the ITER_XARRAY and
    read helpers as spotted by Al and Willy.

    Jeff Layton has patches to add support in Ceph for this that he
    intends for this merge window. I have a set of patches to support AFS
    that I will post a separate pull request for.

    With this, AFS without a cache passes all expected xfstests; with a
    cache, there's an extra failure, but that's also there before these
    patches. Fixing that probably requires a greater overhaul. Ceph also
    passes the expected tests.

    I also have patches in a separate branch to tidy up the handling of
    PG_fscache/PG_private_2 and their contribution to page refcounting in
    the core kernel here, but I haven't included them in this set and will
    route them separately"

    Link: https://lore.kernel.org/lkml/3779937.1619478404@warthog.procyon.org.uk/

    * tag 'netfs-lib-20210426' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs:
    netfs: Miscellaneous fixes
    iov_iter: Four fixes for ITER_XARRAY
    fscache, cachefiles: Add alternate API to use kiocb for read/write to cache
    netfs: Add a tracepoint to log failures that would be otherwise unseen
    netfs: Define an interface to talk to a cache
    netfs: Add write_begin helper
    netfs: Gather stats
    netfs: Add tracepoints
    netfs: Provide readahead and readpage netfs helpers
    netfs, mm: Add set/end/wait_on_page_fscache() aliases
    netfs, mm: Move PG_fscache helper funcs to linux/netfs.h
    netfs: Documentation for helper library
    netfs: Make a netfs helper module
    mm: Implement readahead_control pageset expansion
    mm/readahead: Handle ractl nr_pages being modified
    fs: Document file_ra_state
    mm/filemap: Pass the file_ra_state in the ractl
    mm: Add set/end/wait functions for PG_private_2
    iov_iter: Add ITER_XARRAY

    Linus Torvalds
     

24 Apr, 2021

2 commits

  • No problem on 64-bit, or without huge pages, but xfstests generic/285
    and other SEEK_HOLE/SEEK_DATA tests have regressed on huge tmpfs, and on
    32-bit architectures, with the new mapping_seek_hole_data(). Several
    different bugs turned out to need fixing.

    u64 cast to stop losing bits when converting unsigned long to loff_t
    (and let's use shifts throughout, rather than mixed with * and /).

    Use round_up() when advancing pos, to stop assuming that pos was already
    THP-aligned when advancing it by THP-size. (This use of round_up()
    assumes that any THP has THP-aligned index: true at present and true
    going forward, but could be recoded to avoid the assumption.)

    Use xas_set() when iterating away from a THP, so that xa_index stays in
    synch with start, instead of drifting away to return bogus offset.

    Check start against end to avoid wrapping 32-bit xa_index to 0 (and to
    handle these additional cases, seek_data or not, it's easier to break
    the loop than goto: so rearrange exit from the function).
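
    Hedged, illustrative fragments of the four fixes (variable names assumed;
    not the upstream diff):

        loff_t pos = (u64)xas.xa_index << PAGE_SHIFT;  /* u64 cast: no bit loss */
        pos = round_up(pos + 1, thp_size(page));       /* advance by THP size without
                                                          assuming pos was aligned */
        xas_set(&xas, pos >> PAGE_SHIFT);              /* keep xa_index in sync */
        if ((pos >> PAGE_SHIFT) >= end)                /* break instead of letting
                                                          32-bit xa_index wrap to 0 */
                break;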

    [hughd@google.com: remove unneeded u64 casts, per Matthew]
    Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2104221347240.1170@eggly.anvils

    Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2104211737410.3299@eggly.anvils
    Fixes: 41139aa4c3a3 ("mm/filemap: add mapping_seek_hole_data")
    Signed-off-by: Hugh Dickins
    Cc: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Jan Kara
    Cc: Johannes Weiner
    Cc: "Kirill A. Shutemov"
    Cc: Matthew Wilcox
    Cc: William Kucharski
    Cc: Yang Shi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     
  • No problem on 64-bit, or without huge pages, but xfstests generic/308
    hung uninterruptibly on 32-bit huge tmpfs.

    Since commit 0cc3b0ec23ce ("Clarify (and fix) in 4.13 MAX_LFS_FILESIZE
    macros"), MAX_LFS_FILESIZE is only a PAGE_SIZE away from wrapping 32-bit
    xa_index to 0, so the new find_lock_entries() has to be extra careful
    when handling a THP.

    Link: https://lkml.kernel.org/r/alpine.LSU.2.11.2104211735430.3299@eggly.anvils
    Fixes: 5c211ba29deb ("mm: add and use find_lock_entries")
    Signed-off-by: Hugh Dickins
    Cc: Matthew Wilcox
    Cc: William Kucharski
    Cc: Christoph Hellwig
    Cc: Jan Kara
    Cc: Dave Chinner
    Cc: Johannes Weiner
    Cc: "Kirill A. Shutemov"
    Cc: Yang Shi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Hugh Dickins
     

23 Apr, 2021

2 commits

  • For readahead_expand(), we need to modify the file ra_state, so pass it
    down by adding it to the ractl. We have to do this because it's not always
    the same as f_ra in the struct file that is already being passed.
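
    A hedged sketch of the structural change (field placement illustrative):

        struct readahead_control {
                struct file *file;
                struct address_space *mapping;
                struct file_ra_state *ra;   /* new: not always file->f_ra */
                /* private: internal use only */
                pgoff_t _index;
                unsigned int _nr_pages;
                unsigned int _batch_count;
        };

    Callers (and DEFINE_READAHEAD()) then pass the ra_state explicitly,
    usually &file->f_ra, but a netfs can substitute its own.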

    Signed-off-by: Matthew Wilcox (Oracle)
    Signed-off-by: David Howells
    Tested-by: Jeff Layton
    Tested-by: Dave Wysochanski
    Tested-By: Marc Dionne
    Link: https://lore.kernel.org/r/20210407201857.3582797-2-willy@infradead.org/
    Link: https://lore.kernel.org/r/161789067431.6155.8063840447229665720.stgit@warthog.procyon.org.uk/ # v6

    Matthew Wilcox (Oracle)
     
  • Add three functions to manipulate PG_private_2:

    (*) set_page_private_2() - Set the flag and take an appropriate reference
    on the flagged page.

    (*) end_page_private_2() - Clear the flag, drop the reference and wake up
    any waiters, somewhat analogously with end_page_writeback().

    (*) wait_on_page_private_2() - Wait for the flag to be cleared.

    Wrappers will need to be placed in the netfs lib header in the patch that
    adds that.

    [This implements a suggestion by Linus[1] to not mix the terminology of
    PG_private_2 and PG_fscache in the mm core function]
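
    A hedged usage sketch of the three helpers from a netfs-style caller:

        /* before handing the page to the cache for a background write */
        set_page_private_2(page);       /* set PG_private_2, take a page ref */

        /* ... when the async write to the cache completes ... */
        end_page_private_2(page);       /* clear flag, drop ref, wake waiters */

        /* elsewhere, e.g. before invalidating the page */
        wait_on_page_private_2(page);   /* sleep until the flag is cleared */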

    Changes:
    v7:
    - Use compound_head() in all the functions to make them THP safe[6].

    v5:
    - Add set and end functions, calling the end function end rather than
    unlock[3].
    - Keep a ref on the page when PG_private_2 is set[4][5].

    v4:
    - Remove extern from the declaration[2].

    Suggested-by: Linus Torvalds
    Signed-off-by: David Howells
    Reviewed-by: Matthew Wilcox (Oracle)
    Tested-by: Jeff Layton
    Tested-by: Dave Wysochanski
    Tested-By: Marc Dionne
    cc: Alexander Viro
    cc: Christoph Hellwig
    cc: linux-mm@kvack.org
    cc: linux-cachefs@redhat.com
    cc: linux-afs@lists.infradead.org
    cc: linux-nfs@vger.kernel.org
    cc: linux-cifs@vger.kernel.org
    cc: ceph-devel@vger.kernel.org
    cc: v9fs-developer@lists.sourceforge.net
    cc: linux-fsdevel@vger.kernel.org
    Link: https://lore.kernel.org/r/1330473.1612974547@warthog.procyon.org.uk/ # v1
    Link: https://lore.kernel.org/r/CAHk-=wjgA-74ddehziVk=XAEMTKswPu1Yw4uaro1R3ibs27ztw@mail.gmail.com/ [1]
    Link: https://lore.kernel.org/r/20210216102659.GA27714@lst.de/ [2]
    Link: https://lore.kernel.org/r/161340387944.1303470.7944159520278177652.stgit@warthog.procyon.org.uk/ # v3
    Link: https://lore.kernel.org/r/161539528910.286939.1252328699383291173.stgit@warthog.procyon.org.uk # v4
    Link: https://lore.kernel.org/r/20210321105309.GG3420@casper.infradead.org [3]
    Link: https://lore.kernel.org/r/CAHk-=wh+2gbF7XEjYc=HV9w_2uVzVf7vs60BPz0gFA=+pUm3ww@mail.gmail.com/ [4]
    Link: https://lore.kernel.org/r/CAHk-=wjSGsRj7xwhSMQ6dAQiz53xA39pOG+XA_WeTgwBBu4uqg@mail.gmail.com/ [5]
    Link: https://lore.kernel.org/r/20210408145057.GN2531743@casper.infradead.org/ [6]
    Link: https://lore.kernel.org/r/161653788200.2770958.9517755716374927208.stgit@warthog.procyon.org.uk/ # v5
    Link: https://lore.kernel.org/r/161789066013.6155.9816857201817288382.stgit@warthog.procyon.org.uk/ # v6

    David Howells
     

27 Feb, 2021

9 commits

  • All callers of find_get_entries() use a pvec, so pass it directly instead
    of manipulating it in the caller.

    Link: https://lkml.kernel.org/r/20201112212641.27837-14-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle)
    Reviewed-by: Jan Kara
    Reviewed-by: William Kucharski
    Cc: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Hugh Dickins
    Cc: Johannes Weiner
    Cc: Kirill A. Shutemov
    Cc: Yang Shi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     
  • This simplifies the callers and leads to a more efficient implementation
    since the XArray has this functionality already.

    Link: https://lkml.kernel.org/r/20201112212641.27837-11-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle)
    Reviewed-by: Jan Kara
    Reviewed-by: William Kucharski
    Reviewed-by: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Hugh Dickins
    Cc: Johannes Weiner
    Cc: Kirill A. Shutemov
    Cc: Yang Shi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     
  • We have three functions (shmem_undo_range(), truncate_inode_pages_range()
    and invalidate_mapping_pages()) which want exactly this function, so add
    it to filemap.c. Before this patch, shmem_undo_range() would split any
    compound page which overlaps either end of the range being punched in both
    the first and second loops through the address space. After this patch,
    that functionality is left for the second loop, which is arguably more
    appropriate since the first loop is supposed to run through all the pages
    quickly, and splitting a page can sleep.

    [willy@infradead.org: add assertion]
    Link: https://lkml.kernel.org/r/20201124041507.28996-3-willy@infradead.org

    Link: https://lkml.kernel.org/r/20201112212641.27837-10-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle)
    Reviewed-by: Jan Kara
    Reviewed-by: William Kucharski
    Reviewed-by: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Hugh Dickins
    Cc: Johannes Weiner
    Cc: Kirill A. Shutemov
    Cc: Yang Shi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     
  • Enhance mapping_seek_hole_data() to handle partially uptodate pages and
    convert the iomap seek code to call it.

    Link: https://lkml.kernel.org/r/20201112212641.27837-9-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle)
    Cc: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Hugh Dickins
    Cc: Jan Kara
    Cc: Johannes Weiner
    Cc: Kirill A. Shutemov
    Cc: William Kucharski
    Cc: Yang Shi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     
  • Rewrite shmem_seek_hole_data() and move it to filemap.c.

    [willy@infradead.org: don't put an xa_is_value() page]
    Link: https://lkml.kernel.org/r/20201124041507.28996-4-willy@infradead.org

    Link: https://lkml.kernel.org/r/20201112212641.27837-8-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle)
    Reviewed-by: William Kucharski
    Reviewed-by: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Hugh Dickins
    Cc: Jan Kara
    Cc: Johannes Weiner
    Cc: Kirill A. Shutemov
    Cc: Yang Shi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     
  • There is a lot of common code in find_get_entries(),
    find_get_pages_range() and find_get_pages_range_tag(). Factor out
    find_get_entry() which simplifies all three functions.

    [willy@infradead.org: remove VM_BUG_ON_PAGE()]
    Link: https://lkml.kernel.org/r/20201124041507.28996-2-willy@infradead.org
    Link: https://lkml.kernel.org/r/20201112212641.27837-7-willy@infradead.org

    Signed-off-by: Matthew Wilcox (Oracle)
    Reviewed-by: Jan Kara
    Reviewed-by: William Kucharski
    Reviewed-by: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Hugh Dickins
    Cc: Johannes Weiner
    Cc: Kirill A. Shutemov
    Cc: Yang Shi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     
  • find_get_entry doesn't "find" anything. It returns the entry at a
    particular index.

    Link: https://lkml.kernel.org/r/20201112212641.27837-6-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle)
    Reviewed-by: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Hugh Dickins
    Cc: Jan Kara
    Cc: Johannes Weiner
    Cc: Kirill A. Shutemov
    Cc: William Kucharski
    Cc: Yang Shi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     
  • The functionality of find_lock_entry() and find_get_entry() can be
    provided by pagecache_get_page(), which lets us delete find_lock_entry()
    and make find_get_entry() static.
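
    A hedged sketch of the equivalences; FGP_ENTRY is assumed to be the flag
    this series adds so that pagecache_get_page() will return value entries:

        /* roughly find_get_entry(mapping, index): */
        page = pagecache_get_page(mapping, index, FGP_ENTRY, 0);

        /* roughly find_lock_entry(mapping, index): */
        page = pagecache_get_page(mapping, index, FGP_ENTRY | FGP_LOCK, 0);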

    Link: https://lkml.kernel.org/r/20201112212641.27837-5-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle)
    Reviewed-by: Christoph Hellwig
    Cc: Dave Chinner
    Cc: Hugh Dickins
    Cc: Jan Kara
    Cc: Johannes Weiner
    Cc: Kirill A. Shutemov
    Cc: William Kucharski
    Cc: Yang Shi
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     
  • Patch series "Overhaul multi-page lookups for THP", v4.

    This THP prep patchset changes several page cache iteration APIs to only
    return head pages.

    - It's only possible to tag head pages in the page cache, so only
    return head pages, not all their subpages.
    - Factor a lot of common code out of the various batch lookup routines
    - Add mapping_seek_hole_data()
    - Unify find_get_entries() and pagevec_lookup_entries()
    - Make find_get_entries only return head pages, like find_get_entry().

    These are only loosely connected, but they seem to make sense together as
    a series.

    This patch (of 14):

    Pagecache tags are used for dirty page writeback. Since dirtiness is
    tracked on a per-THP basis, we only want to return the head page rather
    than each subpage of a tagged page. All the filesystems which use huge
    pages today are in-memory, so there are no tagged huge pages today.

    Link: https://lkml.kernel.org/r/20201112212641.27837-2-willy@infradead.org
    Signed-off-by: Matthew Wilcox (Oracle)
    Reviewed-by: Jan Kara
    Reviewed-by: William Kucharski
    Reviewed-by: Christoph Hellwig
    Cc: Hugh Dickins
    Cc: Johannes Weiner
    Cc: Yang Shi
    Cc: Dave Chinner
    Cc: Kirill A. Shutemov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox (Oracle)
     

25 Feb, 2021

1 commit

    Currently we use struct per_cpu_nodestat to cache the vmstat counters,
    which leads to inaccurate statistics, especially for the THP vmstat
    counters. On systems with hundreds of processors the error can amount to
    GBs of memory: for example, on a 96-CPU system the per-CPU threshold is
    at most 125, and the per-cpu counters can cache 23.4375 GB in total.
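
    Assuming 2 MB THPs, that worst case is simply the per-CPU error bound
    summed over all CPUs: 96 CPUs * 125 pending events * 2 MB per THP =
    24,000 MB = 23.4375 GB.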

    A THP page is already a form of batched addition (it adds 512 pages'
    worth of memory in one go), so skipping the per-cpu batching seems
    sensible. Every THP stats update then overflows the per-cpu counter and
    resorts to an atomic global update, but this makes the statistics more
    accurate for the THP vmstat counters.

    So convert the NR_SHMEM_THPS accounting to pages. This is consistent
    with 8f182270dfec ("mm/swap.c: flush lru pvecs on compound page
    arrival") and also makes the units of the vmstat counters more uniform.
    The units are now pages, kB and bytes: a B/KB suffix indicates bytes or
    kB, and counters without a suffix are in pages.

    Link: https://lkml.kernel.org/r/20201228164110.2838-5-songmuchun@bytedance.com
    Signed-off-by: Muchun Song
    Cc: Alexey Dobriyan
    Cc: Feng Tang
    Cc: Greg Kroah-Hartman
    Cc: Hugh Dickins
    Cc: Johannes Weiner
    Cc: Joonsoo Kim
    Cc: Michal Hocko
    Cc: NeilBrown
    Cc: Pankaj Gupta
    Cc: Rafael. J. Wysocki
    Cc: Randy Dunlap
    Cc: Roman Gushchin
    Cc: Sami Tolvanen
    Cc: Shakeel Butt
    Cc: Vladimir Davydov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Muchun Song