08 Oct, 2019

1 commit

  • There's a hard-to-reproduce race in z3fold between z3fold_free()
    and z3fold_reclaim_page(): z3fold_reclaim_page() can claim the page
    after z3fold_free() has checked whether the page was claimed, and
    z3fold_free() will then schedule this page for compaction, which may
    in turn lead to random page faults (since the page will have been
    reclaimed by then).

    Fix that by claiming the page at the beginning of z3fold_free() and
    not forgetting to clear the claim at the end.
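
    A minimal sketch of the claim-first approach (illustrative only;
    helper names and the exact error handling are simplified from the
    actual patch):

    /* Sketch, not the actual patch: claim the page up front so that
     * z3fold_reclaim_page() cannot grab it concurrently. */
    static void z3fold_free(struct z3fold_pool *pool, unsigned long handle)
    {
            struct z3fold_header *zhdr = handle_to_z3fold_header(handle);
            struct page *page = virt_to_page(zhdr);

            if (test_and_set_bit(PAGE_CLAIMED, &page->private))
                    return;  /* reclaim owns the page; let it finish */

            /* ... free the object, possibly schedule compaction ... */

            clear_bit(PAGE_CLAIMED, &page->private);  /* drop the claim */
    }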

    [vitalywool@gmail.com: v2]
    Link: http://lkml.kernel.org/r/20190928113456.152742cf@bigdell
    Link: http://lkml.kernel.org/r/20190926104844.4f0c6efa1366b8f5741eaba9@gmail.com
    Signed-off-by: Vitaly Wool
    Reported-by: Markus Linnala
    Cc: Dan Streetman
    Cc: Vlastimil Babka
    Cc: Henry Burns
    Cc: Shakeel Butt
    Cc: Markus Linnala
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     

25 Sep, 2019

3 commits

  • Currently there is a leak in init_z3fold_page(): it allocates handles
    from the kmem cache even for headless pages, but they are then never
    used and never freed, so eventually the kmem cache may get exhausted.
    This patch provides a fix for that.
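
    The shape of the fix, sketched (the real function signature may
    differ; alloc_slots() is the existing slots allocator):

    /* Sketch: allocate handle slots only for pages that will carry a
     * z3fold header, so pool->c_handle is not drained by never-freed
     * slots of headless pages. */
    static struct z3fold_header *init_z3fold_page(struct page *page,
                    bool headless, struct z3fold_pool *pool, gfp_t gfp)
    {
            struct z3fold_header *zhdr = page_address(page);
            struct z3fold_buddy_slots *slots = NULL;

            if (!headless) {
                    slots = alloc_slots(pool, gfp);
                    if (!slots)
                            return NULL;
            }
            /* ... initialize zhdr fields and attach slots ... */
            return zhdr;
    }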

    Link: http://lkml.kernel.org/r/20190917185352.44cf285d3ebd9e64548de5de@gmail.com
    Signed-off-by: Vitaly Wool
    Reported-by: Markus Linnala
    Tested-by: Markus Linnala
    Cc: Dan Streetman
    Cc: Henry Burns
    Cc: Shakeel Butt
    Cc: Vlastimil Babka
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     
  • z3fold_reclaim_page()'s retry mechanism is broken: on a second
    iteration it will still have zhdr from the first one, so zhdr is no
    longer in line with struct page. That leads to crashes when the
    system is stressed.

    Fix that by moving the zhdr assignment up.

    While at it, protect against using already freed handles by using a
    local slots structure in z3fold_reclaim_page().
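
    Roughly, the corrected loop looks like this (sketch; most of the
    real state handling is elided):

    /* Sketch: re-derive zhdr inside the retry loop so it always
     * matches the page picked on this iteration, and work on a local
     * copy of the handle slots so handles freed by another thread are
     * never dereferenced. */
    for (i = 0; i < retries; i++) {
            page = list_last_entry(&pool->lru, struct page, lru);
            zhdr = page_address(page);  /* refreshed on every pass */
            /* ... try to reclaim; on failure, retry with a new page ... */
    }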

    Link: http://lkml.kernel.org/r/20190908162919.830388dc7404d1e2c80f4095@gmail.com
    Signed-off-by: Vitaly Wool
    Reported-by: Markus Linnala
    Reported-by: Chris Murphy
    Reported-by: Agustin Dall'Alba
    Cc: "Maciej S. Szmigiero"
    Cc: Shakeel Butt
    Cc: Henry Burns
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     
  • With the original commit applied, z3fold_zpool_destroy() may get
    blocked in wait_event() for an indefinite time. Revert that commit
    for the time being to get rid of this problem, since the issue the
    original commit addresses is less severe.

    Link: http://lkml.kernel.org/r/20190910123142.7a9c8d2de4d0acbc0977c602@gmail.com
    Fixes: d776aaa9895eb6eb77 ("mm/z3fold.c: fix race between migration and destruction")
    Reported-by: Agustín Dall'Alba
    Signed-off-by: Vitaly Wool
    Cc: Vlastimil Babka
    Cc: Vitaly Wool
    Cc: Shakeel Butt
    Cc: Jonathan Adams
    Cc: Henry Burns
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     

31 Aug, 2019

1 commit

  • Fix lock/unlock imbalance by unlocking *zhdr* before return.

    Addresses Coverity ID 1452811 ("Missing unlock")

    Link: http://lkml.kernel.org/r/20190826030634.GA4379@embeddedor
    Fixes: d776aaa9895e ("mm/z3fold.c: fix race between migration and destruction")
    Signed-off-by: Gustavo A. R. Silva
    Reviewed-by: Andrew Morton
    Cc: Henry Burns
    Cc: Vitaly Wool
    Cc: Shakeel Butt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Gustavo A. R. Silva
     

25 Aug, 2019

1 commit

  • In z3fold_destroy_pool() we call destroy_workqueue(pool->compact_wq),
    but we have no guarantee that migration isn't happening in the
    background at that time.

    Migration directly calls queue_work_on(pool->compact_wq); if
    destruction wins that race, we end up using a destroyed workqueue.

    Link: http://lkml.kernel.org/r/20190809213828.202833-1-henryburns@google.com
    Signed-off-by: Henry Burns
    Cc: Vitaly Wool
    Cc: Shakeel Butt
    Cc: Jonathan Adams
    Cc: Henry Burns
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Henry Burns
     

14 Aug, 2019

2 commits

  • The constraint from the zpool use of z3fold_destroy_pool() is that
    there are no outstanding handles to memory (so no active allocations),
    but it is possible for there to be outstanding work on either of the
    two workqueues in the pool.

    Calling z3fold_unregister_migration() before the workqueues are
    drained means that there can be allocated pages referencing a freed
    inode, and any thread in compaction can then trip over the bad
    pointer in PageMovable().
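
    The resulting teardown order, sketched (simplified from the actual
    function):

    /* Sketch: drain both workqueues before tearing down the inode that
     * PageMovable() dereferences through page->mapping. */
    static void z3fold_destroy_pool(struct z3fold_pool *pool)
    {
            kmem_cache_destroy(pool->c_handle);
            destroy_workqueue(pool->compact_wq);  /* flushes pending work */
            destroy_workqueue(pool->release_wq);
            z3fold_unregister_migration(pool);    /* only after the wqs */
            kfree(pool);
    }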

    Link: http://lkml.kernel.org/r/20190726224810.79660-2-henryburns@google.com
    Fixes: 1f862989b04a ("mm/z3fold.c: support page migration")
    Signed-off-by: Henry Burns
    Reviewed-by: Shakeel Butt
    Reviewed-by: Jonathan Adams
    Cc: Vitaly Vul
    Cc: Vitaly Wool
    Cc: David Howells
    Cc: Thomas Gleixner
    Cc: Al Viro
    Cc: Henry Burns
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Henry Burns
     
  • The constraint from the zpool use of z3fold_destroy_pool() is that
    there are no outstanding handles to memory (so no active allocations),
    but it is possible for there to be outstanding work on either of the
    two workqueues in the pool.

    If there is work queued on pool->compact_wq when z3fold_destroy_pool()
    is called, it will do:

    z3fold_destroy_pool()
      destroy_workqueue(pool->release_wq)
      destroy_workqueue(pool->compact_wq)
        drain_workqueue(pool->compact_wq)
          do_compact_page(zhdr)
            kref_put(&zhdr->refcount)
              __release_z3fold_page(zhdr, ...)
                queue_work_on(pool->release_wq, &pool->work) *BOOM*

    So compact_wq needs to be destroyed before release_wq.
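
    In sketch form, the safe order is simply:

    destroy_workqueue(pool->compact_wq);  /* may still queue on release_wq */
    destroy_workqueue(pool->release_wq);  /* so this one goes second */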

    Link: http://lkml.kernel.org/r/20190726224810.79660-1-henryburns@google.com
    Fixes: 5d03a6613957 ("mm/z3fold.c: use kref to prevent page free/compact race")
    Signed-off-by: Henry Burns
    Reviewed-by: Shakeel Butt
    Reviewed-by: Jonathan Adams
    Cc: Vitaly Vul
    Cc: Vitaly Wool
    Cc: David Howells
    Cc: Thomas Gleixner
    Cc: Al Viro
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Henry Burns
     

20 Jul, 2019

1 commit

  • Pull vfs mount updates from Al Viro:
    "The first part of mount updates.

    Convert filesystems to use the new mount API"

    * 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
    mnt_init(): call shmem_init() unconditionally
    constify ksys_mount() string arguments
    don't bother with registering rootfs
    init_rootfs(): don't bother with init_ramfs_fs()
    vfs: Convert smackfs to use the new mount API
    vfs: Convert selinuxfs to use the new mount API
    vfs: Convert securityfs to use the new mount API
    vfs: Convert apparmorfs to use the new mount API
    vfs: Convert openpromfs to use the new mount API
    vfs: Convert xenfs to use the new mount API
    vfs: Convert gadgetfs to use the new mount API
    vfs: Convert oprofilefs to use the new mount API
    vfs: Convert ibmasmfs to use the new mount API
    vfs: Convert qib_fs/ipathfs to use the new mount API
    vfs: Convert efivarfs to use the new mount API
    vfs: Convert configfs to use the new mount API
    vfs: Convert binfmt_misc to use the new mount API
    convenience helper: get_tree_single()
    convenience helper get_tree_nodev()
    vfs: Kill sget_userns()
    ...

    Linus Torvalds
     

17 Jul, 2019

4 commits

  • z3fold_page_migrate() calls memcpy(new_zhdr, zhdr, PAGE_SIZE).
    However, zhdr contains fields that can't be directly copied over
    (e.g. list_head, a circular linked list). We only need to initialize
    the linked lists in new_zhdr, as z3fold_page_isolate() already
    ensures that these lists are empty.

    Additionally, it is possible that zhdr->work has been placed in a
    workqueue. In this case we shouldn't migrate the page, as zhdr->work
    references zhdr as opposed to new_zhdr.
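
    A sketch of the corrected migration step (simplified; only the
    buddy list is shown):

    /* Sketch: refuse to migrate while zhdr->work is still queued,
     * since the queued work item points at the old header; after the
     * raw copy, reinitialize the list heads that memcpy() corrupted. */
    if (work_pending(&zhdr->work))
            return -EAGAIN;                /* let migration retry later */

    memcpy(new_zhdr, zhdr, PAGE_SIZE);
    INIT_LIST_HEAD(&new_zhdr->buddy);      /* list_heads can't be memcpy'd */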

    Link: http://lkml.kernel.org/r/20190716000520.230595-1-henryburns@google.com
    Fixes: 1f862989b04ade61d3 ("mm/z3fold.c: support page migration")
    Signed-off-by: Henry Burns
    Reviewed-by: Shakeel Butt
    Cc: Vitaly Vul
    Cc: Vitaly Wool
    Cc: Jonathan Adams
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Henry Burns
     
  • z3fold_page_migrate() will never succeed because it attempts to acquire
    a lock that has already been taken by migrate.c in __unmap_and_move().

    __unmap_and_move()                          [migrate.c]
      trylock_page(oldpage)
      move_to_new_page(oldpage, newpage)
        a_ops->migrate_page(oldpage, newpage)
          z3fold_page_migrate(oldpage, newpage)
            trylock_page(oldpage)
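
    The fix, sketched: the old page arrives locked from migrate.c, so
    assert that instead of taking the lock again (illustrative):

    /* Sketch: __unmap_and_move() already holds the old page's lock. */
    VM_BUG_ON_PAGE(!PageLocked(page), page);  /* don't trylock again */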

    Link: http://lkml.kernel.org/r/20190710213238.91835-1-henryburns@google.com
    Fixes: 1f862989b04a ("mm/z3fold.c: support page migration")
    Signed-off-by: Henry Burns
    Reviewed-by: Shakeel Butt
    Cc: Vitaly Wool
    Cc: Vitaly Vul
    Cc: Jonathan Adams
    Cc: Greg Kroah-Hartman
    Cc: Snild Dolkow
    Cc: Thomas Gleixner
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Henry Burns
     
  • One of the gfp flags used to show that a page is movable is
    __GFP_HIGHMEM. Currently z3fold_alloc() fails when __GFP_HIGHMEM is
    passed. Now that z3fold pages are movable, we allow __GFP_HIGHMEM. We
    strip the movability-related flags from the call to kmem_cache_alloc()
    for our slots, since that is a kernel allocation.
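
    The flag handling, sketched (the exact mask used by the patch may
    differ):

    /* Sketch: movability flags make sense for the z3fold page itself,
     * but not for the slots object coming from a kernel slab cache. */
    slots = kmem_cache_alloc(pool->c_handle,
                             gfp & ~(__GFP_HIGHMEM | __GFP_MOVABLE));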

    [akpm@linux-foundation.org: coding-style fixes]
    Link: http://lkml.kernel.org/r/20190712222118.108192-1-henryburns@google.com
    Signed-off-by: Henry Burns
    Acked-by: Vitaly Wool
    Reviewed-by: Shakeel Butt
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Henry Burns
     
  • As reported by Henry Burns:

    Running z3fold stress testing with address sanitization showed zhdr->slots
    was being used after it was freed.

    z3fold_free(z3fold_pool, handle)
      free_handle(handle)
        kmem_cache_free(pool->c_handle, zhdr->slots)
      release_z3fold_page_locked_list(kref)
        __release_z3fold_page(zhdr, true)
          zhdr_to_pool(zhdr)
            slots_to_pool(zhdr->slots) *BOOM*

    To fix this, add a pointer to the pool back to z3fold_header and
    modify zhdr_to_pool() to return zhdr->pool.
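
    Sketched, the fix looks like this (simplified):

    /* Sketch: a pool backpointer in the header means resolving the
     * pool no longer touches zhdr->slots, which may already be freed. */
    struct z3fold_header {
            /* ... existing fields ... */
            struct z3fold_pool *pool;  /* set when the page is inited */
    };

    static inline struct z3fold_pool *zhdr_to_pool(struct z3fold_header *zhdr)
    {
            return zhdr->pool;
    }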

    Link: http://lkml.kernel.org/r/20190708134808.e89f3bfadd9f6ffd7eff9ba9@gmail.com
    Fixes: 7c2b8baa61fe ("mm/z3fold.c: add structure for buddy handles")
    Signed-off-by: Vitaly Wool
    Reported-by: Henry Burns
    Reviewed-by: Shakeel Butt
    Cc: Jonathan Adams
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     

13 Jul, 2019

1 commit

  • Following zsmalloc.c's example we call trylock_page() and unlock_page().
    Also make z3fold_page_migrate() assert that newpage is passed in locked,
    as per the documentation.
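
    Sketched (simplified; whether the real patch can sleep here depends
    on the gfp flags):

    /* Sketch: lock the page around __SetPageMovable(), as zsmalloc
     * does, falling back to trylock in atomic context. */
    if (can_sleep) {
            lock_page(page);
            __SetPageMovable(page, pool->inode->i_mapping);
            unlock_page(page);
    } else if (trylock_page(page)) {
            __SetPageMovable(page, pool->inode->i_mapping);
            unlock_page(page);
    }

    /* ... and in z3fold_page_migrate(): */
    VM_BUG_ON_PAGE(!PageLocked(newpage), newpage);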

    [akpm@linux-foundation.org: fix trylock_page return value test, per Shakeel]
    Link: http://lkml.kernel.org/r/20190702005122.41036-1-henryburns@google.com
    Link: http://lkml.kernel.org/r/20190702233538.52793-1-henryburns@google.com
    Signed-off-by: Henry Burns
    Suggested-by: Vitaly Wool
    Acked-by: Vitaly Wool
    Acked-by: David Rientjes
    Reviewed-by: Shakeel Butt
    Cc: Vitaly Vul
    Cc: Mike Rapoport
    Cc: Xidong Wang
    Cc: Jonathan Adams
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Henry Burns
     

02 Jun, 2019

1 commit

  • kmem_cache_alloc() may be called from z3fold_alloc() in atomic
    context, so we need to pass the correct gfp flags to avoid a
    "scheduling while atomic" bug.
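
    One way to sketch it (the patch's exact flag choice may differ):

    /* Sketch: never hand sleep-permitting flags to the slab allocator
     * when the caller may be in atomic context. */
    gfp_t slot_gfp = gfpflags_allow_blocking(gfp) ? GFP_KERNEL : GFP_ATOMIC;

    slots = kmem_cache_alloc(pool->c_handle, slot_gfp);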

    Link: http://lkml.kernel.org/r/20190523153245.119dfeed55927e8755250ddd@gmail.com
    Fixes: 7c2b8baa61fe5 ("mm/z3fold.c: add structure for buddy handles")
    Signed-off-by: Vitaly Wool
    Reviewed-by: Andrew Morton
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     

26 May, 2019

2 commits

  • Convert the z3fold filesystem to the new internal mount API as the old one
    will be obsoleted and removed. This allows greater flexibility in
    communication of mount parameters between userspace, the VFS and the
    filesystem.

    See Documentation/filesystems/mount_api.txt for more information.

    Signed-off-by: David Howells

    David Howells
     
  • Once upon a time we used to set ->d_name of e.g. pipefs root
    so that d_path() on pipes would work. These days it's
    completely pointless - dentries of pipes are not even connected
    to pipefs root. However, mount_pseudo() had set the root
    dentry name (passed as the second argument) and callers
    kept inventing names to pass to it. Including those that
    didn't *have* any non-root dentries to start with...

    All of that had been pointless for about 8 years now; it's
    time to get rid of that cargo-culting...

    Signed-off-by: Al Viro

    Al Viro
     

15 May, 2019

4 commits

  • Now that we are not using the page address in handles directly, we
    can make z3fold pages movable to decrease the memory fragmentation
    z3fold may create over time.

    This patch starts advertising non-headless z3fold pages as movable
    and uses the existing kernel infrastructure to implement moving of
    such pages per the memory management subsystem's request. It thus
    implements the 3 required callbacks for page migration:

    * isolation callback, z3fold_page_isolate(): try to isolate the page
    by removing it from all lists. Pages scheduled for some activity and
    mapped pages will not be isolated. Returns true if isolation was
    successful or false otherwise.

    * migration callback, z3fold_page_migrate(): re-check critical
    conditions and migrate page contents to the new page provided by the
    memory subsystem. Returns 0 on success or a negative error code
    otherwise.

    * putback callback, z3fold_page_putback(): put back the page if
    z3fold_page_migrate() for it failed permanently (i.e. not with
    -EAGAIN).
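
    These are wired up through the movable-page interface of that era,
    roughly as follows (sketch):

    /* Sketch: the three callbacks as address_space_operations, which
     * is how non-LRU page migration was hooked up at the time. */
    static const struct address_space_operations z3fold_aops = {
            .isolate_page   = z3fold_page_isolate,
            .migratepage    = z3fold_page_migrate,
            .putback_page   = z3fold_page_putback,
    };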

    [lkp@intel.com: z3fold_page_isolate() can be static]
    Link: http://lkml.kernel.org/r/20190419130924.GA161478@ivb42
    Link: http://lkml.kernel.org/r/20190417103922.31253da5c366c4ebe0419cfc@gmail.com
    Signed-off-by: Vitaly Wool
    Signed-off-by: kbuild test robot
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Dan Streetman
    Cc: Krzysztof Kozlowski
    Cc: Oleksiy Avramchenko
    Cc: Uladzislau Rezki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     
  • For z3fold to be able to move its pages per request of the memory
    subsystem, it should not use direct object addresses in handles.
    Instead, it will create abstract handles (3 per page) which will
    contain pointers to z3fold objects. Thus, it will be possible to
    change these pointers when a z3fold page is moved.
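
    In sketch form (simplified; the real structure carries extra
    housekeeping state):

    /* Sketch: per-page object addresses live in a separate slab
     * object; a handle is a pointer to one of the slots, so migration
     * only has to rewrite the slot contents, not the handles. */
    struct z3fold_buddy_slots {
            unsigned long slot[3];  /* one per buddy: first, middle, last */
    };
    /* handle = (unsigned long)&slots->slot[buddy_index]; */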

    Link: http://lkml.kernel.org/r/20190417103826.484eaf18c1294d682769880f@gmail.com
    Signed-off-by: Vitaly Wool
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Dan Streetman
    Cc: Krzysztof Kozlowski
    Cc: Oleksiy Avramchenko
    Cc: Uladzislau Rezki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     
  • The current z3fold implementation only searches this CPU's page lists
    for a fitting page to put a new object into. This patch adds a quick
    search for very well fitting pages (i.e. those having exactly the
    required amount of free space) on other CPUs too, before allocating a
    new page for that object.
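
    The added lookup, sketched (locking and the not-found case elided):

    /* Sketch: before allocating a fresh page, peek at other CPUs'
     * unbuddied lists, but only for an exact fit. */
    for_each_online_cpu(cpu) {
            struct list_head *l =
                    &per_cpu_ptr(pool->unbuddied, cpu)[chunks];

            zhdr = list_first_entry_or_null(l, struct z3fold_header, buddy);
            if (zhdr && z3fold_page_trylock(zhdr))
                    break;  /* well-fitting page found */
    }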

    Link: http://lkml.kernel.org/r/20190417103733.72ae81abe1552397c95a008e@gmail.com
    Signed-off-by: Vitaly Wool
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Dan Streetman
    Cc: Krzysztof Kozlowski
    Cc: Oleksiy Avramchenko
    Cc: Uladzislau Rezki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     
  • Patch series "z3fold: support page migration", v2.

    This patchset implements page migration support and slightly better
    buddy search. To implement page migration support, z3fold has to
    move away from the current scheme of handle encoding, i.e. stop
    encoding the page address in handles. Instead, a small per-page
    structure is created which will contain actual addresses for z3fold
    objects, while pointers to fields of that structure will be used as
    handles.

    Thus, it will be possible to change the underlying addresses to reflect
    page migration.

    To support migration itself, 3 callbacks will be implemented:

    1: isolation callback, z3fold_page_isolate(): try to isolate the page
    by removing it from all lists. Pages scheduled for some activity and
    mapped pages will not be isolated. Returns true if isolation was
    successful or false otherwise.

    2: migration callback, z3fold_page_migrate(): re-check critical
    conditions and migrate page contents to the new page provided by the
    system. Returns 0 on success or a negative error code otherwise.

    3: putback callback, z3fold_page_putback(): put back the page if
    z3fold_page_migrate() for it failed permanently (i.e. not with
    -EAGAIN).

    To make sure an isolated page doesn't get freed, its kref is incremented
    in z3fold_page_isolate() and decremented during post-migration compaction,
    if migration was successful, or by z3fold_page_putback() in the other
    case.

    Since the new handle encoding scheme implies a slight increase in
    memory consumption, better buddy search (which decreases memory
    consumption) is included in this patchset.

    This patch (of 4):

    Introduce a separate helper function for object allocation, as well as 2
    smaller helpers to add a buddy to the list and to get a pointer to the
    pool from the z3fold header. No functional changes here.

    Link: http://lkml.kernel.org/r/20190417103633.a4bb770b5bf0fb7e43ce1666@gmail.com
    Signed-off-by: Vitaly Wool
    Cc: Dan Streetman
    Cc: Bartlomiej Zolnierkiewicz
    Cc: Krzysztof Kozlowski
    Cc: Oleksiy Avramchenko
    Cc: Uladzislau Rezki
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     

19 Nov, 2018

1 commit

  • Reclaim and free can race on an object, which is basically fine, but
    in order for reclaim to be able to map a "freed" object we need to
    encode the object length in the handle. handle_to_chunks() is then
    introduced to extract the object length from a handle and use it
    during mapping.
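
    The encoding, sketched (constants and names as in mm/z3fold.c of
    that time, details simplified):

    /* Sketch: a handle is the page-aligned header address plus a buddy
     * index in the low bits; for the last buddy, the object size in
     * chunks is stashed in those low bits as well, so reclaim can map
     * an object even after it has been freed. */
    handle = (unsigned long)zhdr;
    if (bud == LAST)
            handle |= (zhdr->last_chunks << BUDDY_SHIFT);
    handle |= (bud + zhdr->first_num) & BUDDY_MASK;

    static unsigned short handle_to_chunks(unsigned long handle)
    {
            return (handle & ~PAGE_MASK) >> BUDDY_SHIFT;
    }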

    Moreover, to avoid racing on a z3fold "headless" page release, we
    should not try to free that page in z3fold_free() if the reclaim bit
    is set. Also, in the unlikely case of trying to reclaim a page that
    is being freed, we should not proceed with that page.

    While at it, fix the page accounting in reclaim function.

    This patch supersedes "[PATCH] z3fold: fix reclaim lock-ups".

    Link: http://lkml.kernel.org/r/20181105162225.74e8837d03583a9b707cf559@gmail.com
    Signed-off-by: Vitaly Wool
    Signed-off-by: Jongseok Kim
    Reported-by: Jongseok Kim
    Reviewed-by: Snild Dolkow
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     

12 May, 2018

1 commit

  • Do not try to optimize in-page object layout while the page is under
    reclaim. This fixes lock-ups on reclaim and improves reclaim
    performance at the same time.
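
    In sketch form (bit name as in the patch era, logic simplified):

    /* Sketch: the compaction path backs off while reclaim owns the
     * page. */
    if (test_bit(UNDER_RECLAIM, &page->private))
            return;  /* page is being reclaimed, don't touch it */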

    [akpm@linux-foundation.org: coding-style fixes]
    Link: http://lkml.kernel.org/r/20180430125800.444cae9706489f412ad12621@gmail.com
    Signed-off-by: Vitaly Wool
    Reported-by: Guenter Roeck
    Tested-by: Guenter Roeck
    Cc:
    Cc: Matthew Wilcox
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     

12 Apr, 2018

2 commits

  • We have a perfectly good macro to determine whether the gfp flags allow
    you to sleep or not; use it instead of trying to infer it.
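
    The macro in question, used roughly like this (sketch):

    /* Sketch: let the gfp flags say whether sleeping is allowed,
     * instead of guessing from the call site. */
    bool can_sleep = gfpflags_allow_blocking(gfp);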

    Link: http://lkml.kernel.org/r/20180408062206.GC16007@bombadil.infradead.org
    Signed-off-by: Matthew Wilcox
    Reviewed-by: Andrew Morton
    Cc: Vitaly Wool
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Matthew Wilcox
     
  • In z3fold_create_pool(), the memory allocated by __alloc_percpu() is
    not released on the error path where pool->compact_wq, which holds
    the return value of create_singlethread_workqueue(), is NULL. This
    results in a memory leak.
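
    The usual goto-unwind shape fixes this (sketch, abbreviated; label
    names are illustrative):

    /* Sketch: every allocation made after __alloc_percpu() must unwind
     * it on failure. */
    pool->unbuddied = __alloc_percpu(sizeof(struct list_head) * NCHUNKS,
                                     __alignof__(struct list_head));
    if (!pool->unbuddied)
            goto out_pool;
    pool->compact_wq = create_singlethread_workqueue("z3fold");
    if (!pool->compact_wq)
            goto out_unbuddied;  /* was leaked before the fix */
    /* ... */
    out_unbuddied:
            free_percpu(pool->unbuddied);
    out_pool:
            kfree(pool);
            return NULL;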

    [akpm@linux-foundation.org: fix oops on kzalloc() failure, check __alloc_percpu() retval]
    Link: http://lkml.kernel.org/r/1522803111-29209-1-git-send-email-wangxidong_97@163.com
    Signed-off-by: Xidong Wang
    Reviewed-by: Andrew Morton
    Cc: Vitaly Wool
    Cc: Mike Rapoport
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Xidong Wang
     

06 Apr, 2018

1 commit

  • Currently if z3fold couldn't find an unbuddied page it would first try
    to pull a page off the stale list. The problem with this approach is
    that we can't 100% guarantee that the page is not processed by the
    workqueue thread at the same time unless we run cancel_work_sync() on
    it, which we can't do if we're in an atomic context. So let's just
    limit stale list usage to non-atomic contexts only.
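
    Sketched:

    /* Sketch: only raid the stale list when sleeping is allowed,
     * because cancel_work_sync() may block. */
    if (can_sleep && !list_empty(&pool->stale)) {
            zhdr = list_first_entry(&pool->stale,
                                    struct z3fold_header, buddy);
            cancel_work_sync(&zhdr->work);  /* safe: not in atomic context */
            /* ... reuse the page ... */
    }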

    Link: http://lkml.kernel.org/r/47ab51e7-e9c1-d30e-ab17-f734dbc3abce@gmail.com
    Signed-off-by: Vitaly Vul
    Reviewed-by: Andrew Morton
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     

07 Feb, 2018

1 commit

  • There are several places where parameter descriptions do not match
    the actual code. Fix it.

    Link: http://lkml.kernel.org/r/1516700871-22279-3-git-send-email-rppt@linux.vnet.ibm.com
    Signed-off-by: Mike Rapoport
    Cc: Jonathan Corbet
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mike Rapoport
     

18 Nov, 2017

1 commit

  • There is a race in the current z3fold implementation between
    do_compact() called in a workqueue context and the page release
    procedure when the page's kref goes to 0.

    do_compact() may be waiting for the page lock, which is released by
    release_z3fold_page_locked() right before putting the page onto the
    "stale" list, and then the page may be freed as do_compact() modifies
    its contents.

    The mechanism currently implemented to handle that (checking the
    PAGE_STALE flag) is not reliable enough. Instead, we'll use the
    page's kref counter to guarantee that the page is not released if
    its compaction is scheduled. It then becomes the compaction
    function's responsibility to decrease the counter and quit
    immediately if the page was actually freed.
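
    The pattern, sketched (callback name illustrative, locking elided):

    /* Sketch: hold a reference on behalf of the queued work; the
     * worker drops it and bails out if that put releases the page. */
    kref_get(&zhdr->refcount);  /* before queueing */
    queue_work_on(zhdr->cpu, pool->compact_wq, &zhdr->work);

    /* ... in the compaction worker: */
    if (kref_put(&zhdr->refcount, release_z3fold_page))
            return;  /* page was freed, quit */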

    Link: http://lkml.kernel.org/r/20171117092032.00ea56f42affbed19f4fcc6c@gmail.com
    Signed-off-by: Vitaly Wool
    Cc:
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     

04 Oct, 2017

2 commits

  • Fix the situation when clear_bit() is called for page->private before
    the page pointer is actually assigned. While at it, remove the
    work_busy() check because it is costly and does not give a 100%
    guarantee anyway.

    Signed-off-by: Vitaly Wool
    Cc: Dan Streetman
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     
  • It is possible that on a (partially) unsuccessful page reclaim, the
    kref_put() called in z3fold_reclaim_page() does not yield a page
    release, but the page is released shortly afterwards by another
    thread. Then z3fold_reclaim_page() would try to list_add() that
    (released) page again, which is obviously a bug.

    To avoid that, the spin_lock() has to be taken earlier, before the
    kref_put() call mentioned above.
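
    That is, sketched (locking simplified):

    /* Sketch: hold the pool lock across the put so a concurrent
     * release cannot interleave with our list_add(). */
    spin_lock(&pool->lock);
    if (kref_put(&zhdr->refcount, release_z3fold_page)) {
            spin_unlock(&pool->lock);
            return 0;  /* page is gone; don't re-add it */
    }
    list_add(&page->lru, &pool->lru);
    spin_unlock(&pool->lock);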

    Link: http://lkml.kernel.org/r/20170913162937.bfff21c7d12b12a5f47639fd@gmail.com
    Signed-off-by: Vitaly Wool
    Cc: Dan Streetman
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     

07 Sep, 2017

1 commit

  • It's been noted that z3fold doesn't scale well when it's run in a large
    number of threads on many cores, which can be easily reproduced with fio
    'randrw' test with --numjobs=32. E.g. the result for 1 cluster (4 cores)
    is:

    Run status group 0 (all jobs):
    READ: io=244785MB, aggrb=496883KB/s, minb=15527KB/s, ...
    WRITE: io=246735MB, aggrb=500841KB/s, minb=15651KB/s, ...

    While for 8 cores (2 clusters) the result is:

    Run status group 0 (all jobs):
    READ: io=244785MB, aggrb=265942KB/s, minb=8310KB/s, ...
    WRITE: io=246735MB, aggrb=268060KB/s, minb=8376KB/s, ...

    The bottleneck here is the pool lock, which many threads end up
    waiting on. To reduce that spinlock contention, z3fold can operate
    only on the lists local to the current CPU whenever possible. Due to
    the nature of z3fold unbuddied list handling (it only takes the first
    entry off the list on a hot path), if the z3fold pool is big enough
    and balanced well enough, limiting the search to only the local
    unbuddied list doesn't lead to a significant compression ratio
    degradation (2.57x vs 2.65x in our measurements).
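
    The core data-structure change, sketched:

    /* Sketch: per-CPU arrays of unbuddied lists, indexed by free-chunk
     * count, replace a single global set of lists. */
    struct z3fold_pool {
            struct list_head __percpu *unbuddied;  /* NCHUNKS lists per CPU */
            /* ... */
    };

    unbuddied = get_cpu_ptr(pool->unbuddied);      /* this CPU's lists */
    zhdr = list_first_entry_or_null(&unbuddied[chunks],
                                    struct z3fold_header, buddy);
    put_cpu_ptr(pool->unbuddied);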

    This patch also introduces two worker threads: one for async in-page
    object layout optimization and one for releasing freed pages. This is
    done to speed up z3fold_free() which is often on a hot path.

    The fio results for 8-core case are now the following:

    Run status group 0 (all jobs):
    READ: io=244785MB, aggrb=1568.3MB/s, minb=50182KB/s, ...
    WRITE: io=246735MB, aggrb=1580.8MB/s, minb=50582KB/s, ...

    So we're in for an almost 6x performance increase.

    Link: http://lkml.kernel.org/r/20170806181443.f9b65018f8bde25ef990f9e8@gmail.com
    Signed-off-by: Vitaly Wool
    Cc: Dan Streetman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     

14 Apr, 2017

1 commit

  • Stress testing of the current z3fold implementation on an 8-core
    system revealed it was possible for a z3fold page deleted from its
    unbuddied list in z3fold_alloc() to be put on another unbuddied list
    by z3fold_free() while z3fold_alloc() was still processing it. This
    was introduced by commit 5a27aa822 ("z3fold: add kref refcounting")
    due to the removal of the special handling of a z3fold page not on
    any list in z3fold_free().

    To fix this, the z3fold page lock should be taken in z3fold_alloc()
    before the pool lock is released. To avoid deadlocking, we just try to
    lock the page as soon as we get a hold of it, and if trylock fails, we
    drop this page and take the next one.
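
    Sketched (the no-page-found case is elided):

    /* Sketch: try the page lock while still holding the pool lock; if
     * someone else holds the page, skip it rather than risk deadlock. */
    spin_lock(&pool->lock);
    list_for_each_entry(zhdr, &unbuddied[i], buddy) {
            if (!z3fold_page_trylock(zhdr))
                    continue;  /* busy page, take the next one */
            list_del_init(&zhdr->buddy);
            break;
    }
    spin_unlock(&pool->lock);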

    Signed-off-by: Vitaly Wool
    Cc: Dan Streetman
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     

17 Mar, 2017

1 commit

  • Commit 5a27aa822029 ("z3fold: add kref refcounting") introduced a bug
    in z3fold_reclaim_page(): a function exit path may leave the
    pool->lock spinlock held. Here comes the trivial fix.

    Fixes: 5a27aa822029 ("z3fold: add kref refcounting")
    Link: http://lkml.kernel.org/r/20170311222239.7b83d8e7ef1914e05497649f@gmail.com
    Reported-by: Alexey Khoroshilov
    Signed-off-by: Vitaly Wool
    Cc: Dan Streetman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     

25 Feb, 2017

5 commits

  • With both the coming and the already present locking optimizations,
    introducing a kref to reference-count z3fold objects is the right
    thing to do. Moreover, it makes the buddied list no longer necessary
    and allows for simpler handling of headless pages.

    [akpm@linux-foundation.org: coding-style fixes]
    Link: http://lkml.kernel.org/r/20170131214650.8ea78033d91ded233f552bc0@gmail.com
    Signed-off-by: Vitaly Wool
    Reviewed-by: Dan Streetman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     
  • Most z3fold operations are in-page, such as modifying the z3fold page
    header or moving z3fold objects within a page. Taking the per-pool
    spinlock to protect per-page objects is therefore suboptimal, and the
    idea of having a per-page spinlock (or rwlock) has been around for
    some time.

    This patch implements a spinlock-based per-page locking mechanism
    which is lightweight enough to normally fit into the z3fold header.
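
    Sketched:

    /* Sketch: the lock lives right in the header at the start of the
     * z3fold page. */
    struct z3fold_header {
            spinlock_t page_lock;
            /* ... chunk counts, list heads ... */
    };

    static inline void z3fold_page_lock(struct z3fold_header *zhdr)
    {
            spin_lock(&zhdr->page_lock);
    }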

    Link: http://lkml.kernel.org/r/20170131214438.433e0a5fda908337b63206d3@gmail.com
    Signed-off-by: Vitaly Wool
    Reviewed-by: Dan Streetman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     
  • z3fold_compact_page() currently only handles the situation when
    there's a single middle chunk within the z3fold page. However, it
    may be worth moving the middle chunk closer to either the first or
    the last chunk, whichever is there, if the gap between them is big
    enough.

    This patch adds the relevant code, using the BIG_CHUNK_GAP define as
    a threshold for the middle chunk to be worth moving.
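
    In sketch form (condition and threshold value simplified):

    /* Sketch: only move the middle chunk towards the first chunk when
     * the gap is at least BIG_CHUNK_GAP chunks wide. */
    #define BIG_CHUNK_GAP 3

    if (zhdr->first_chunks != 0 && zhdr->middle_chunks != 0 &&
        zhdr->start_middle - zhdr->first_chunks >= BIG_CHUNK_GAP)
            mchunk_memmove(zhdr, zhdr->first_chunks + 1);  /* move down */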

    Link: http://lkml.kernel.org/r/20170131214334.c4f3eac9a477af0fa9a22c46@gmail.com
    Signed-off-by: Vitaly Wool
    Reviewed-by: Dan Streetman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     
  • Currently the whole kernel build will be stopped if the size of struct
    z3fold_header is greater than the size of one chunk, which is 64 bytes
    by default. This patch instead defines the offset for z3fold objects as
    the size of the z3fold header in chunks.

    Also fixed are the calculation of num_free_chunks() and the address
    to move the middle chunk to in case of in-page compaction in
    z3fold_compact_page().
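
    Sketched:

    /* Sketch: express the header size in whole chunks instead of
     * insisting that it fit in a single chunk. */
    #define ZHDR_SIZE_ALIGNED round_up(sizeof(struct z3fold_header), CHUNK_SIZE)
    #define ZHDR_CHUNKS       (ZHDR_SIZE_ALIGNED >> CHUNK_SHIFT)

    /* the first object now starts ZHDR_CHUNKS chunks into the page */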

    Link: http://lkml.kernel.org/r/20170131214057.d98677032bc7b1c6c59a80c9@gmail.com
    Signed-off-by: Vitaly Wool
    Reviewed-by: Dan Streetman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool
     
  • Convert the pages_nr per-pool counter to atomic64_t.
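
    Sketched:

    /* Sketch: atomic64_t allows lock-free updates from the several
     * contexts that create and free pages. */
    atomic64_inc(&pool->pages_nr);  /* page allocated */
    atomic64_dec(&pool->pages_nr);  /* page freed */
    u64 total = atomic64_read(&pool->pages_nr);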

    Link: http://lkml.kernel.org/r/20170131213946.b828676ab17bbea42022c213@gmail.com
    Signed-off-by: Vitaly Wool
    Reviewed-by: Dan Streetman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vitaly Wool