09 Aug, 2014

2 commits

  • zswap_entry_cache_destroy() is only called by __init init_zswap(), so it
    can be marked __init as well.

    This patch also fixes a typo in the function name:
    s/zswap_entry_cache_destory/zswap_entry_cache_destroy/

    Signed-off-by: Fabian Frederick
    Acked-by: Seth Jennings
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     
  • The memcg uncharging code that is involved towards the end of a page's
    lifetime - truncation, reclaim, swapout, migration - is impressively
    complicated and fragile.

    Because anonymous and file pages were always charged before they had their
    page->mapping established, uncharges had to happen when the page type
    could still be known from the context; as in unmap for anonymous, page
    cache removal for file and shmem pages, and swap cache truncation for swap
    pages. However, these operations happen well before the page is actually
    freed, and so a lot of synchronization is necessary:

    - Charging, uncharging, page migration, and charge migration all need
    to take a per-page bit spinlock as they could race with uncharging.

    - Swap cache truncation happens during both swap-in and swap-out, and
    possibly repeatedly before the page is actually freed. This means
    that the memcg swapout code is called from many contexts that make
    no sense and it has to figure out the direction from page state to
    make sure memory and memory+swap are always correctly charged.

    - On page migration, the old page might be unmapped but then reused,
    so memcg code has to prevent untimely uncharging in that case.
    Because this code - which should be a simple charge transfer - is so
    special-cased, it is not reusable for replace_page_cache().

    But now that charged pages always have a page->mapping, introduce
    mem_cgroup_uncharge(), which is called after the final put_page(), when we
    know for sure that nobody is looking at the page anymore.

    For page migration, introduce mem_cgroup_migrate(), which is called after
    the migration is successful and the new page is fully rmapped. Because
    the old page is no longer uncharged after migration, prevent double
    charges by decoupling the page's memcg association (PCG_USED and
    pc->mem_cgroup) from the page holding an actual charge. The new bits
    PCG_MEM and PCG_MEMSW represent the respective charges and are transferred
    to the new page during migration.

    mem_cgroup_migrate() is suitable for replace_page_cache() as well,
    which gets rid of mem_cgroup_replace_page_cache(). However, care
    needs to be taken because both the source and the target page can
    already be charged and on the LRU when fuse is splicing: grab the page
    lock on the charge moving side to prevent changing pc->mem_cgroup of a
    page under migration. Also, the lruvecs of both pages change as we
    uncharge the old and charge the new during migration, and putback may
    race with us, so grab the lru lock and isolate the pages iff on LRU to
    prevent races and ensure the pages are on the right lruvec afterward.

    Swap accounting is massively simplified: because the page is no longer
    uncharged as early as swap cache deletion, a new mem_cgroup_swapout() can
    transfer the page's memory+swap charge (PCG_MEMSW) to the swap entry
    before the final put_page() in page reclaim.

    Finally, page_cgroup changes are now protected by whatever protection the
    page itself offers: anonymous pages are charged under the page table lock,
    whereas page cache insertions, swapin, and migration hold the page lock.
    Uncharging happens under full exclusion with no outstanding references.
    Charging and uncharging also ensure that the page is off-LRU, which
    serializes against charge migration. Remove the very costly page_cgroup
    lock and set pc->flags non-atomically.

    [mhocko@suse.cz: mem_cgroup_charge_statistics needs preempt_disable]
    [vdavydov@parallels.com: fix flags definition]
    Signed-off-by: Johannes Weiner
    Cc: Hugh Dickins
    Cc: Tejun Heo
    Cc: Vladimir Davydov
    Tested-by: Jet Chen
    Acked-by: Michal Hocko
    Tested-by: Felipe Balbi
    Signed-off-by: Vladimir Davydov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
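    A minimal sketch of the new entry points described in this entry; the
    prototypes follow the changelog, while the exact parameters of
    mem_cgroup_migrate() and mem_cgroup_swapout() are assumptions:

    /* Uncharge a page. Called from the page-freeing path, i.e. only after
     * the final put_page(), when nobody can look at the page anymore. */
    void mem_cgroup_uncharge(struct page *page);

    /* Transfer the PCG_MEM/PCG_MEMSW charges from oldpage to newpage once
     * migration succeeded and newpage is fully rmapped; lrucare covers the
     * replace_page_cache() case where both pages may already be on the LRU. */
    void mem_cgroup_migrate(struct page *oldpage, struct page *newpage,
                            bool lrucare);

    /* Transfer the page's memory+swap charge to the swap entry before the
     * final put_page() in page reclaim. */
    void mem_cgroup_swapout(struct page *page, swp_entry_t entry);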
     

07 Aug, 2014

1 commit

  • Change zswap to use the zpool api instead of directly using zbud. Add a
    boot-time param to allow selecting which zpool implementation to use,
    with zbud as the default.

    Signed-off-by: Dan Streetman
    Tested-by: Seth Jennings
    Cc: Weijie Yang
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Streetman
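    A minimal sketch of the boot-time selector described in this entry; the
    parameter name "zpool" and the 0444 permission are assumptions, and only
    the shape of the change is shown:

    static char *zswap_zpool_type = "zbud";  /* default backend */
    module_param_named(zpool, zswap_zpool_type, charp, 0444);

    /* At pool-creation time the chosen backend name is handed to the zpool
     * layer (zpool_create_pool()) instead of calling zbud directly. */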
     

05 Jun, 2014

1 commit

  • zswap_dstmem is a percpu block of memory which should be allocated using
    kmalloc_node() to get better NUMA locality.

    Without it, all the blocks are allocated from a single node.

    Signed-off-by: Eric Dumazet
    Acked-by: Seth Jennings
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
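    A sketch of the idea, with the surrounding CPU notifier code omitted
    (buffer size and variable names follow zswap): allocate each CPU's
    destination buffer on that CPU's own NUMA node rather than on whichever
    node happens to run the notifier.

    u8 *dst;

    dst = kmalloc_node(PAGE_SIZE * 2, GFP_KERNEL, cpu_to_node(cpu));
    if (!dst)
            return NOTIFY_BAD;
    per_cpu(zswap_dstmem, cpu) = dst;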
     

08 Apr, 2014

5 commits

  • Merge second patch-bomb from Andrew Morton:
    - the rest of MM
    - zram updates
    - zswap updates
    - exit
    - procfs
    - exec
    - wait
    - crash dump
    - lib/idr
    - rapidio
    - adfs, affs, bfs, ufs
    - cris
    - Kconfig things
    - initramfs
    - small amount of IPC material
    - percpu enhancements
    - early ioremap support
    - various other misc things

    * emailed patches from Andrew Morton : (156 commits)
    MAINTAINERS: update Intel C600 SAS driver maintainers
    fs/ufs: remove unused ufs_super_block_third pointer
    fs/ufs: remove unused ufs_super_block_second pointer
    fs/ufs: remove unused ufs_super_block_first pointer
    fs/ufs/super.c: add __init to init_inodecache()
    doc/kernel-parameters.txt: add early_ioremap_debug
    arm64: add early_ioremap support
    arm64: initialize pgprot info earlier in boot
    x86: use generic early_ioremap
    mm: create generic early_ioremap() support
    x86/mm: sparse warning fix for early_memremap
    lglock: map to spinlock when !CONFIG_SMP
    percpu: add preemption checks to __this_cpu ops
    vmstat: use raw_cpu_ops to avoid false positives on preemption checks
    slub: use raw_cpu_inc for incrementing statistics
    net: replace __this_cpu_inc in route.c with raw_cpu_inc
    modules: use raw_cpu_write for initialization of per cpu refcount.
    mm: use raw_cpu ops for determining current NUMA node
    percpu: add raw_cpu_ops
    slub: fix leak of 'name' in sysfs_slab_add
    ...

    Linus Torvalds
     
  • Fix the following trivial checkpatch error:

    ERROR: return is not a function, parentheses are not required

    Signed-off-by: SeongJae Park
    Acked-by: Seth Jennings
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    SeongJae Park
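    A generic example of the pattern that triggers this error (not the exact
    zswap line):

    /* before: checkpatch flags the parentheses */
    return (-ENOMEM);

    /* after */
    return -ENOMEM;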
     
  • Cai Liu reported that zbud pool page counting has a problem when multiple
    swap devices are used, because it counts only one swap device instead of
    all of them, so zswap cannot control writeback properly. The result is
    unnecessary writeback, or no writeback when we really should write back.

    IOW, it made zswap crazy.

    Another problem in zswap is:

    For example, let's assume we use two swap devices A and B with different
    priorities, and A was charged to 19% a long time ago. A is now full, so
    the VM starts using B, which has recently been charged to 1%. That means
    zswap's charge (19% + 1%) is full under the default limit. Then, if the
    VM wants to swap out more pages to B, zbud_reclaim_page will evict one of
    the pages in B's pool, and this repeats continuously. It is a complete
    LRU inversion problem, and swap thrashing on B would happen.

    This patch makes zswap consider multiple swap devices by creating *a
    single* zbud pool which is shared by all of them, so all zswap pages
    across swap devices keep LRU order, which prevents the two problems
    above.

    Signed-off-by: Minchan Kim
    Reported-by: Cai Liu
    Suggested-by: Weijie Yang
    Cc: Seth Jennings
    Reviewed-by: Bob Liu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
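    A rough sketch of the structural change described in this entry (fields
    are simplified from the actual code): the zbud pool moves out of the
    per-swap-device tree into a single pool shared by all swap devices.

    /* before: one pool per swap device, so each pool has its own LRU */
    struct zswap_tree {
            struct rb_root rbroot;
            struct zbud_pool *pool;
            spinlock_t lock;
    };

    /* after: one shared pool, so LRU order is global across swap devices */
    static struct zbud_pool *zswap_pool;

    struct zswap_tree {
            struct rb_root rbroot;
            spinlock_t lock;
    };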
     
  • zswap used zsmalloc before and now uses zbud, but some comments still say
    it uses zsmalloc. Fix these trivial problems.

    Signed-off-by: SeongJae Park
    Cc: Seth Jennings
    Cc: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    SeongJae Park
     
  • Signed-off-by: SeongJae Park
    Cc: Seth Jennings
    Cc: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    SeongJae Park
     

20 Mar, 2014

1 commit

  • Subsystems that want to register CPU hotplug callbacks, as well as perform
    initialization for the CPUs that are already online, often do it as shown
    below:

    get_online_cpus();

    for_each_online_cpu(cpu)
            init_cpu(cpu);

    register_cpu_notifier(&foobar_cpu_notifier);

    put_online_cpus();

    This is wrong, since it is prone to ABBA deadlocks involving the
    cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
    with CPU hotplug operations).

    Instead, the correct and race-free way of performing the callback
    registration is:

    cpu_notifier_register_begin();

    for_each_online_cpu(cpu)
            init_cpu(cpu);

    /* Note the use of the double underscored version of the API */
    __register_cpu_notifier(&foobar_cpu_notifier);

    cpu_notifier_register_done();

    Fix the zswap code by using this latter form of callback registration.

    Cc: Ingo Molnar
    Signed-off-by: Srivatsa S. Bhat
    Signed-off-by: Rafael J. Wysocki

    Srivatsa S. Bhat
     

24 Jan, 2014

1 commit

  • The "compressor" and "enabled" params are currently hidden, this changes
    them to read-only, so userspace can tell if zswap is enabled or not and
    see what compressor is in use.

    Signed-off-by: Dan Streetman
    Cc: Vladimir Murzin
    Cc: Bob Liu
    Cc: Minchan Kim
    Cc: Weijie Yang
    Acked-by: Seth Jennings
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Streetman
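    A minimal sketch of the change (symbol names follow zswap; previously the
    permission argument was 0, so the params did not appear in sysfs):

    module_param_named(enabled, zswap_enabled, bool, 0444);
    module_param_named(compressor, zswap_compressor, charp, 0444);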
     

13 Nov, 2013

3 commits

  • The refcount routine did not fit the kernel get/put semantics exactly:
    there were too many judgement statements on the refcount, and it could
    go negative.

    This patch does the following:

    - move the refcount judgement into zswap_entry_put() to hide the
    resource-freeing function.

    - add a new function, zswap_entry_find_get(), so that callers can easily
    use the following pattern:

    zswap_entry_find_get()
    ... /* do something */
    zswap_entry_put()

    - move some function declarations to eliminate a compile error.

    This patch is based on Minchan Kim's idea and suggestion.

    Signed-off-by: Weijie Yang
    Cc: Seth Jennings
    Acked-by: Minchan Kim
    Cc: Bob Liu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Weijie Yang
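    A simplified sketch of the get/put semantics described in this entry
    (locking and rbtree details omitted; helper names follow the existing
    zswap code): the last put is the only place that frees the entry.

    static void zswap_entry_get(struct zswap_entry *entry)
    {
            entry->refcount++;
    }

    /* remove from the tree and free the entry when the last reference drops */
    static void zswap_entry_put(struct zswap_tree *tree,
                                struct zswap_entry *entry)
    {
            if (--entry->refcount == 0) {
                    zswap_rb_erase(&tree->rbroot, entry);
                    zswap_free_entry(tree, entry);
            }
    }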
     
  • Consider the following scenario:

    thread 0: reclaim entry x (get the refcount, but do not yet call
    zswap_get_swap_cache_page)
    thread 1: call zswap_frontswap_invalidate_page to invalidate entry x;
    when it finishes, entry x and its zbud are not freed because the
    refcount != 0; now swap_map[x] = 0
    thread 0: now call zswap_get_swap_cache_page;
    swapcache_prepare returns -ENOENT because entry x is no longer in use;
    zswap_get_swap_cache_page returns ZSWAP_SWAPCACHE_NOMEM;
    zswap_writeback_entry does nothing except put the refcount

    Now the memory of zswap_entry x and its zpage leaks.

    Modify:
    - check the refcount in the fail path and free the memory if it is no
    longer referenced.

    - use ZSWAP_SWAPCACHE_FAIL instead of ZSWAP_SWAPCACHE_NOMEM, as the fail
    path can be caused not only by nomem but also by invalidation.

    Signed-off-by: Weijie Yang
    Reviewed-by: Bob Liu
    Reviewed-by: Minchan Kim
    Acked-by: Seth Jennings
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Weijie Yang
     
  • Add SetPageReclaim() before __swap_writepage() so that the page can be
    moved to the tail of the inactive list, which avoids unnecessary page
    scanning since this page was already reclaimed by the swap subsystem.

    Signed-off-by: Weijie Yang
    Reviewed-by: Bob Liu
    Reviewed-by: Minchan Kim
    Acked-by: Seth Jennings
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Weijie Yang
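    A sketch of the change inside zswap_writeback_entry(), with the
    surrounding code omitted: the page is flagged before being handed to the
    swap writeback path, so end-of-writeback rotates it to the tail of the
    inactive list instead of leaving it for the scanner to find again.

    /* move it to the tail of the inactive list after end_writeback */
    SetPageReclaim(page);

    /* start writeback */
    __swap_writepage(page, &wbc, end_swap_bio_write);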
     

17 Oct, 2013

1 commit

  • zswap_tree is not freed on swapoff, and it gets re-kmalloced on swapon,
    so a memory leak occurs.

    Free the memory of zswap_tree in zswap_frontswap_invalidate_area().

    Signed-off-by: Weijie Yang
    Reviewed-by: Bob Liu
    Cc: Minchan Kim
    Reviewed-by: Minchan Kim
    [akpm@linux-foundation.org: coding-style fixes]
    Acked-by: Seth Jennings

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Weijie Yang
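    A sketch of the shape of the fix, assuming the cleanup happens at the end
    of zswap_frontswap_invalidate_area() once the rbtree has been emptied
    (exact steps may differ):

    zbud_destroy_pool(tree->pool);
    kfree(tree);
    zswap_trees[type] = NULL;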
     

12 Sep, 2013

2 commits


11 Jul, 2013

1 commit

  • zswap is a thin backend for frontswap that takes pages that are in the
    process of being swapped out and attempts to compress them and store
    them in a RAM-based memory pool. This can result in a significant I/O
    reduction on the swap device and, in the case where decompressing from
    RAM is faster than reading from the swap device, can also improve
    workload performance.

    It also has support for evicting swap pages that are currently
    compressed in zswap to the swap device on an LRU(ish) basis. This
    functionality makes zswap a true cache in that, once the cache is full,
    the oldest pages can be moved out of zswap to the swap device so newer
    pages can be compressed and stored in zswap.

    This patch adds the zswap driver to mm/

    Signed-off-by: Seth Jennings
    Acked-by: Rik van Riel
    Cc: Greg Kroah-Hartman
    Cc: Nitin Gupta
    Cc: Minchan Kim
    Cc: Konrad Rzeszutek Wilk
    Cc: Dan Magenheimer
    Cc: Robert Jennings
    Cc: Jenifer Hopper
    Cc: Mel Gorman
    Cc: Johannes Weiner
    Cc: Larry Woodman
    Cc: Benjamin Herrenschmidt
    Cc: Dave Hansen
    Cc: Joe Perches
    Cc: Joonsoo Kim
    Cc: Cody P Schafer
    Cc: Hugh Dickins
    Cc: Paul Mackerras
    Cc: Fengguang Wu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Seth Jennings
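    For context, a sketch of how zswap plugs into frontswap; the ops members
    match the frontswap API of this era, and the handlers are the ones
    implemented in mm/zswap.c:

    static struct frontswap_ops zswap_frontswap_ops = {
            .store = zswap_frontswap_store,
            .load = zswap_frontswap_load,
            .invalidate_page = zswap_frontswap_invalidate_page,
            .invalidate_area = zswap_frontswap_invalidate_area,
            .init = zswap_frontswap_init
    };

    /* registered from init_zswap() */
    frontswap_register_ops(&zswap_frontswap_ops);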