09 Feb, 2017

1 commit

  • commit d7b028f56a971a2e4d8d7887540a144eeefcd4ab upstream.

    Add zswap_init_failed bool that prevents changing any of the module
    params, if init_zswap() fails, and set zswap_enabled to false. Change
    'enabled' param to a callback, and check zswap_init_failed before
    allowing any change to 'enabled', 'zpool', or 'compressor' params.

    Any driver that is built into the kernel will not be unloaded if its
    init function returns an error, and its module params remain accessible
    for users to change via sysfs. Since zswap uses param callbacks, which
    assume that zswap has been initialized, changing the zswap params after
    a failed initialization will result in a WARNING, because the param
    callbacks expect a pool to already exist. Prevent that by immediately
    exiting any of the param callbacks if initialization failed (a sketch
    of the guard follows this entry).

    This was reported here:
    https://marc.info/?l=linux-mm&m=147004228125528&w=4

    And fixes this WARNING:
    [ 429.723476] WARNING: CPU: 0 PID: 5140 at mm/zswap.c:503 __zswap_pool_current+0x56/0x60

    The warning is just noise, and not serious. However, when init fails,
    zswap frees all its percpu dstmem pages and its kmem cache. The kmem
    cache might be serious, if kmem_cache_alloc(NULL, gfp) has problems; but
    the percpu dstmem pages are definitely a problem, as they're used as a
    temporary buffer for compressed pages before copying into place in the
    zpool.

    If the user does get zswap enabled after an init failure, then zswap
    will likely Oops on the first page it tries to compress (or worse, start
    corrupting memory).

    Fixes: 90b0fc26d5db ("zswap: change zpool/compressor at runtime")
    Link: http://lkml.kernel.org/r/20170124200259.16191-2-ddstreet@ieee.org
    Signed-off-by: Dan Streetman
    Reported-by: Marcin Miroslaw
    Cc: Seth Jennings
    Cc: Michal Hocko
    Cc: Sergey Senozhatsky
    Cc: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds
    Signed-off-by: Greg Kroah-Hartman

    Dan Streetman
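
    A minimal sketch of the guard described above. The flag, callback, and
    param names follow the commit text, but the surrounding code is
    illustrative rather than the exact mm/zswap.c implementation:

    #include <linux/kernel.h>
    #include <linux/module.h>
    #include <linux/moduleparam.h>

    /* set by init_zswap() on failure, per the commit text */
    static bool zswap_init_failed;
    static bool zswap_enabled;

    /* 'enabled' becomes a callback so a store can be rejected after a failed init */
    static int zswap_enabled_param_set(const char *val,
                                       const struct kernel_param *kp)
    {
            if (zswap_init_failed) {
                    pr_err("zswap: can't enable, initialization failed\n");
                    return -ENODEV;
            }
            return param_set_bool(val, kp);
    }

    static const struct kernel_param_ops zswap_enabled_param_ops = {
            .set = zswap_enabled_param_set,
            .get = param_get_bool,
    };
    module_param_cb(enabled, &zswap_enabled_param_ops, &zswap_enabled, 0644);

    The 'zpool' and 'compressor' callbacks get the same early check, so no
    callback ever touches a pool that was never created.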
     

21 May, 2016

1 commit

  • Add a work_struct to struct zswap_pool, and change __zswap_pool_empty to
    use the workqueue instead of using call_rcu().

    When zswap destroys a pool no longer in use, it uses call_rcu() to
    perform the destruction/freeing. Since that executes in softirq
    context, it must not sleep. However, actually destroying the pool
    involves freeing the per-cpu compressors (which requires locking the
    cpu_add_remove_lock mutex) and freeing the zpool, for which the
    implementation may sleep (e.g. zsmalloc calls kmem_cache_destroy, which
    locks the slab_mutex). So if either mutex is currently taken, or any
    other part of the compressor or zpool implementation sleeps, it will
    result in a BUG().

    It's not easy to reproduce this when changing zswap's params normally.
    In testing with a loaded system, this does not fail:

    $ cd /sys/module/zswap/parameters
    $ echo lz4 > compressor ; echo zsmalloc > zpool

    nor does this:

    $ while true ; do
    > echo lzo > compressor ; echo zbud > zpool
    > sleep 1
    > echo lz4 > compressor ; echo zsmalloc > zpool
    > sleep 1
    > done

    although it's still possible either of those might fail, depending on
    whether anything else besides zswap has locked the mutexes.

    However, changing a parameter with no delay immediately causes the
    "scheduling while atomic" BUG:

    $ while true ; do
    > echo lzo > compressor ; echo lz4 > compressor
    > done

    This is essentially the same as Yu Zhao's proposed patch to zsmalloc,
    but moved to zswap, to cover both compressor and zpool freeing (a sketch
    of the pattern follows this entry).

    Fixes: f1c54846ee45 ("zswap: dynamic pool creation")
    Signed-off-by: Dan Streetman
    Reported-by: Yu Zhao
    Reviewed-by: Sergey Senozhatsky
    Cc: Minchan Kim
    Cc: Dan Streetman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Streetman
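
    A hedged sketch of the pattern the commit describes: the release path
    only queues a work item, and the work handler, which runs in process
    context, does the teardown that may sleep. Field and function names
    here are illustrative, not the exact mm/zswap.c symbols:

    #include <linux/kernel.h>
    #include <linux/rcupdate.h>
    #include <linux/slab.h>
    #include <linux/workqueue.h>

    struct zswap_pool {
            /* zpool and per-cpu compressor members elided */
            struct work_struct release_work;
    };

    /* Runs from the system workqueue in process context, so it may sleep:
     * wait out RCU readers, then do the teardown that takes mutexes. */
    static void zswap_pool_release_workfn(struct work_struct *work)
    {
            struct zswap_pool *pool = container_of(work, struct zswap_pool,
                                                   release_work);

            synchronize_rcu();
            /* free the per-cpu compressors and destroy the zpool here */
            kfree(pool);
    }

    /* Called when the pool's last reference goes away; this context must
     * not sleep, so it only schedules the work. */
    static void zswap_pool_schedule_release(struct zswap_pool *pool)
    {
            INIT_WORK(&pool->release_work, zswap_pool_release_workfn);
            schedule_work(&pool->release_work);
    }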
     

06 May, 2016

1 commit

  • Instead of using "zswap" as the name for all zpools created, add an
    atomic counter and use "zswap%x" with the counter number for each zpool
    created, to provide a unique name for each new zpool.

    zsmalloc, one of the zpool implementations, requires/expects a unique
    name for each pool created, so zswap should provide one. zsmalloc pool
    creation does not fail if a new pool with a conflicting name is created,
    unless CONFIG_ZSMALLOC_STAT is enabled; in that case, zsmalloc pool
    creation fails with -ENOMEM. zswap is then unable to change its
    compressor parameter if its zpool is zsmalloc; it is also unable to
    change its zpool parameter back to zsmalloc if any existing old zpool
    using zsmalloc still has page(s) in it. Attempts to change the
    parameters result in failure to create the zpool. This change makes
    zswap provide a unique name for each zpool creation (a sketch follows
    this entry).

    Fixes: f1c54846ee45 ("zswap: dynamic pool creation")
    Signed-off-by: Dan Streetman
    Reported-by: Sergey Senozhatsky
    Reviewed-by: Sergey Senozhatsky
    Cc: Dan Streetman
    Cc: Minchan Kim
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Streetman
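
    A short sketch of the naming scheme; the helper wrapper is hypothetical
    (the commit formats the name inline when creating the zpool):

    #include <linux/atomic.h>
    #include <linux/kernel.h>

    static atomic_t zswap_pools_count = ATOMIC_INIT(0);

    /* Produce "zswap1", "zswap2", ... so zsmalloc (and its debugfs dir
     * under CONFIG_ZSMALLOC_STAT) never sees a duplicate pool name. */
    static void zswap_pool_name(char *name, size_t len)
    {
            snprintf(name, len, "zswap%x", atomic_inc_return(&zswap_pools_count));
    }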
     

05 Apr, 2016

1 commit

  • PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced a *long* time
    ago with the promise that one day it would be possible to implement the
    page cache with bigger chunks than PAGE_SIZE.

    This promise never materialized, and it is unlikely it ever will.

    We have many places where PAGE_CACHE_SIZE is assumed to be equal to
    PAGE_SIZE, and it's a constant source of confusion on whether the
    PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
    especially on the border between fs and mm.

    Globally switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause too much
    breakage to be doable.

    Let's stop pretending that pages in page cache are special. They are
    not.

    The changes are pretty straightforward:

    - << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <nothing>;

    - >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <nothing>;

    - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};

    - page_cache_get() -> get_page();

    - page_cache_release() -> put_page();

    This patch contains automated changes generated with coccinelle using
    script below. For some reason, coccinelle doesn't patch header files.
    I've called spatch for them manually.

    The only adjustment after coccinelle is reverting the changes to the
    PAGE_CACHE_ALIGN definition: we are going to drop it later.

    There are a few places in the code that coccinelle didn't reach. I'll
    fix them manually in a separate patch. Comments and documentation will
    also be addressed in a separate patch.

    virtual patch

    @@
    expression E;
    @@
    - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    expression E;
    @@
    - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
    + E

    @@
    @@
    - PAGE_CACHE_SHIFT
    + PAGE_SHIFT

    @@
    @@
    - PAGE_CACHE_SIZE
    + PAGE_SIZE

    @@
    @@
    - PAGE_CACHE_MASK
    + PAGE_MASK

    @@
    expression E;
    @@
    - PAGE_CACHE_ALIGN(E)
    + PAGE_ALIGN(E)

    @@
    expression E;
    @@
    - page_cache_get(E)
    + get_page(E)

    @@
    expression E;
    @@
    - page_cache_release(E)
    + put_page(E)

    Signed-off-by: Kirill A. Shutemov
    Acked-by: Michal Hocko
    Signed-off-by: Linus Torvalds

    Kirill A. Shutemov
     

19 Dec, 2015

1 commit

  • Change the use of strncmp in zswap_pool_find_get() to strcmp.

    The use of strncmp is no longer correct now that zswap_zpool_type is
    not an array; sizeof() returns the size of a pointer, which isn't the
    right length to compare. We don't need strncmp anyway, because both the
    existing params and the passed-in params are guaranteed to be
    NUL-terminated, so strcmp should be used (a small illustration of the
    pitfall follows this entry).

    Signed-off-by: Dan Streetman
    Reported-by: Weijie Yang
    Cc: Seth Jennings
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Streetman
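
    A small userspace illustration of the pitfall (not zswap code): with a
    pointer, sizeof() yields the pointer size, so strncmp can report two
    different strings as equal.

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
            const char *zswap_zpool_type = "zsmalloc";   /* a pointer, not an array */
            const char *requested = "zsmalloc-extra";

            /* sizeof(zswap_zpool_type) is sizeof(char *), e.g. 8, so only the
             * first few bytes are compared and this prints 0 ("equal");
             * modern compilers even warn about sizeof on a pointer here. */
            printf("strncmp: %d\n",
                   strncmp(zswap_zpool_type, requested, sizeof(zswap_zpool_type)));

            /* both strings are NUL-terminated, so plain strcmp is correct */
            printf("strcmp:  %d\n", strcmp(zswap_zpool_type, requested));
            return 0;
    }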
     

07 Nov, 2015

3 commits

  • Instead of using a fixed-length string for the zswap params, use charp.
    This simplifies the code and uses less memory, as most zswap param
    strings will be shorter than the current maximum length (a sketch
    follows this entry).

    Signed-off-by: Dan Streetman
    Cc: Rusty Russell
    Cc: Seth Jennings
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Streetman
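
    A sketch of what the charp form looks like; the defaults and permissions
    below are only illustrative:

    #include <linux/module.h>
    #include <linux/moduleparam.h>

    /* Before: a fixed-size buffer plus module_param_string(). After: a plain
     * char pointer, so only as much memory as the value itself is used. */
    static char *zswap_compressor = "lzo";
    module_param_named(compressor, zswap_compressor, charp, 0644);

    static char *zswap_zpool_type = "zbud";
    module_param_named(zpool, zswap_zpool_type, charp, 0644);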
     
  • The entry variable is re-initialized on the next line, so there is no
    need to initialize it to NULL.

    Signed-off-by: Alexey Klimov
    Cc: Seth Jennings
    Cc: Dan Streetman
    Cc: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Alexey Klimov
     
  • …d avoiding waking kswapd

    __GFP_WAIT has been used to identify atomic context in callers that hold
    spinlocks or are in interrupts. They are expected to be high priority and
    have access to one of two watermarks lower than "min", which can be
    referred to as the "atomic reserve". __GFP_HIGH users get access to the
    first lower watermark and can be called the "high priority reserve".

    Over time, callers had a requirement to not block when fallback options
    were available. Some have abused __GFP_WAIT, leading to a situation where
    an optimistic allocation with a fallback option can access atomic
    reserves.

    This patch uses __GFP_ATOMIC to identify callers that are truly atomic,
    cannot sleep, and have no alternative. High priority users continue to
    use __GFP_HIGH. __GFP_DIRECT_RECLAIM identifies callers that can sleep
    and are willing to enter direct reclaim. __GFP_KSWAPD_RECLAIM identifies
    callers that want to wake kswapd for background reclaim. __GFP_WAIT is
    redefined as a caller that is willing to enter direct reclaim and wake
    kswapd for background reclaim.

    This patch then converts a number of sites

    o __GFP_ATOMIC is used by callers that are high priority and have memory
    pools for those requests. GFP_ATOMIC uses this flag.

    o Callers that have a limited mempool to guarantee forward progress clear
    __GFP_DIRECT_RECLAIM but keep __GFP_KSWAPD_RECLAIM. bio allocations fall
    into this category where kswapd will still be woken but atomic reserves
    are not used as there is a one-entry mempool to guarantee progress.

    o Callers that are checking if they are non-blocking should use the
    helper gfpflags_allow_blocking() where possible (see the sketch after
    this entry). This is because checking for __GFP_WAIT, as was done
    historically, can now trigger false positives. Some exceptions like
    dm-crypt.c exist, where the code intent is clearer if
    __GFP_DIRECT_RECLAIM is used instead of the helper due to flag
    manipulations.

    o Callers that built their own GFP flags instead of starting with GFP_KERNEL
    and friends now also need to specify __GFP_KSWAPD_RECLAIM.

    The first key hazard to watch out for is callers that removed __GFP_WAIT
    and were depending on access to atomic reserves for inconspicuous
    reasons. In some cases it may be appropriate for them to use __GFP_HIGH.

    The second key hazard is callers that assembled their own combination of
    GFP flags instead of starting with something like GFP_KERNEL. They may
    now wish to specify __GFP_KSWAPD_RECLAIM. It's almost certainly harmless
    if it's missed in most cases as other activity will wake kswapd.

    Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
    Acked-by: Vlastimil Babka <vbabka@suse.cz>
    Acked-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Johannes Weiner <hannes@cmpxchg.org>
    Cc: Christoph Lameter <cl@linux.com>
    Cc: David Rientjes <rientjes@google.com>
    Cc: Vitaly Wool <vitalywool@gmail.com>
    Cc: Rik van Riel <riel@redhat.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

    Mel Gorman
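
    A hedged sketch of the recommended check; the allocation helper below is
    hypothetical, only gfpflags_allow_blocking() comes from this series:

    #include <linux/gfp.h>
    #include <linux/slab.h>
    #include <linux/vmalloc.h>

    /* Illustrative allocator with a sleeping fallback path. */
    static void *alloc_buf(size_t len, gfp_t gfp)
    {
            void *buf = kmalloc(len, gfp);

            if (buf)
                    return buf;

            /* Don't open-code a check of __GFP_WAIT (or __GFP_DIRECT_RECLAIM);
             * the helper states the intent and avoids false positives. */
            if (!gfpflags_allow_blocking(gfp))
                    return NULL;

            /* The caller allows blocking, so a sleeping fallback is fine.
             * (A caller of this sketch would free the result with kvfree().) */
            return vzalloc(len);
    }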
     

11 Sep, 2015

2 commits

  • Update the zpool and compressor parameters to be changeable at runtime.
    When changed, a new pool is created with the requested zpool/compressor
    and added as the current pool at the front of the pool list. Previous
    pools remain on the list only so that existing compressed pages can
    still be removed from them, and the old pool(s) are removed once they
    become empty (a sketch of the pool list follows this entry).

    Signed-off-by: Dan Streetman
    Acked-by: Seth Jennings
    Cc: Sergey Senozhatsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Streetman
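
    A simplified sketch of the pool list described above; the struct
    contents, locking details, and helper names are illustrative:

    #include <linux/list.h>
    #include <linux/rculist.h>
    #include <linux/spinlock.h>

    struct zswap_pool {
            struct list_head list;
            /* zpool, per-cpu compressors, and refcount elided */
    };

    static LIST_HEAD(zswap_pools);
    static DEFINE_SPINLOCK(zswap_pools_lock);

    /* The current pool is simply the first entry on the RCU-protected list;
     * callers hold rcu_read_lock(). */
    static struct zswap_pool *zswap_pool_current(void)
    {
            return list_first_or_null_rcu(&zswap_pools, struct zswap_pool, list);
    }

    /* Switching zpool/compressor: put the newly created pool at the front.
     * Older pools stay on the list only so their pages can still be drained,
     * and are destroyed once they become empty. */
    static void zswap_pool_make_current(struct zswap_pool *pool)
    {
            spin_lock(&zswap_pools_lock);
            list_add_rcu(&pool->list, &zswap_pools);
            spin_unlock(&zswap_pools_lock);
    }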
     
  • Add dynamic creation of pools. Move the static per-cpu crypto
    compression transforms into each pool. Add to zswap_entry a pointer to
    the pool it is in.

    This is required by the following patch which enables changing the zswap
    zpool and compressor params at runtime.

    [akpm@linux-foundation.org: fix merge snafus]
    Signed-off-by: Dan Streetman
    Acked-by: Seth Jennings
    Cc: Sergey Senozhatsky
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Streetman
     

09 Sep, 2015

2 commits

  • The structure zpool_ops is not modified so make the pointer to it a
    pointer to const.

    Signed-off-by: Krzysztof Kozlowski
    Acked-by: Dan Streetman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Krzysztof Kozlowski
     
  • zswap_get_swap_cache_page and read_swap_cache_async have pretty much
    the same code, with the only significant differences being the return
    value and the usage of swap_readpage.

    I add a helper, __read_swap_cache_async(), with the common code.
    Behavior change: zswap_get_swap_cache_page will now use
    radix_tree_maybe_preload instead of radix_tree_preload. It looks like
    this was left unchanged before only because of the code duplication.

    Signed-off-by: Dmitry Safonov
    Cc: Johannes Weiner
    Cc: Vladimir Davydov
    Cc: Michal Hocko
    Cc: Hugh Dickins
    Cc: Minchan Kim
    Cc: Tejun Heo
    Cc: Jens Axboe
    Cc: Christoph Hellwig
    Cc: David Herrmann
    Cc: Seth Jennings
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dmitry Safonov
     

26 Jun, 2015

1 commit

  • Change the "enabled" parameter to be configurable at runtime. Remove
    the enabled check from init(), and move it into the frontswap store()
    function: when enabled, pages are stored, and when disabled, pages are
    not stored (a sketch follows this entry).

    This is almost identical to Seth's patch from 2 years ago:
    http://lkml.iu.edu/hypermail/linux/kernel/1307.2/04289.html

    [akpm@linux-foundation.org: tweak documentation]
    Signed-off-by: Dan Streetman
    Suggested-by: Seth Jennings
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Streetman
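
    A rough sketch of where the check ends up; the store hook's body is
    elided and the error value is only illustrative:

    #include <linux/mm.h>
    #include <linux/module.h>
    #include <linux/moduleparam.h>

    static bool zswap_enabled;
    module_param_named(enabled, zswap_enabled, bool, 0644);

    /* Instead of refusing to initialize when disabled, just reject new pages
     * while 'enabled' is false; flipping the param back on at runtime makes
     * stores start working again. */
    static int zswap_frontswap_store(unsigned type, pgoff_t offset,
                                     struct page *page)
    {
            if (!zswap_enabled)
                    return -ENODEV; /* any non-zero return falls back to real swap */

            /* compress the page and stash it in the pool ... */
            return 0;
    }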
     

13 Feb, 2015

1 commit

  • Currently the zpool backends, zsmalloc and zbud, do not know who
    created them; there is no way for zsmalloc/zbud to find out which
    caller they belong to.

    Now we want to add statistics collection in zsmalloc, and we need to
    name the debugfs dir for each pool created. The way suggested by
    Minchan Kim is to use a name passed by the caller (such as zram) to
    create the zsmalloc pool:

    /sys/kernel/debug/zsmalloc/zram0

    This patch adds an argument `name' to zs_create_pool() and other related
    functions.

    Signed-off-by: Ganesh Mahendran
    Acked-by: Minchan Kim
    Cc: Seth Jennings
    Cc: Nitin Gupta
    Cc: Dan Streetman
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Ganesh Mahendran
     

14 Dec, 2014

2 commits


20 Nov, 2014

1 commit


13 Nov, 2014

1 commit


09 Aug, 2014

2 commits

  • zswap_entry_cache_destroy() is only called by __init init_zswap().

    This patch also fixes the function name:
    s/zswap_entry_cache_destory/zswap_entry_cache_destroy/.

    Signed-off-by: Fabian Frederick
    Acked-by: Seth Jennings
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Fabian Frederick
     
  • The memcg uncharging code that is involved towards the end of a page's
    lifetime - truncation, reclaim, swapout, migration - is impressively
    complicated and fragile.

    Because anonymous and file pages were always charged before they had their
    page->mapping established, uncharges had to happen when the page type
    could still be known from the context; as in unmap for anonymous, page
    cache removal for file and shmem pages, and swap cache truncation for swap
    pages. However, these operations happen well before the page is actually
    freed, and so a lot of synchronization is necessary:

    - Charging, uncharging, page migration, and charge migration all need
    to take a per-page bit spinlock as they could race with uncharging.

    - Swap cache truncation happens during both swap-in and swap-out, and
    possibly repeatedly before the page is actually freed. This means
    that the memcg swapout code is called from many contexts that make
    no sense and it has to figure out the direction from page state to
    make sure memory and memory+swap are always correctly charged.

    - On page migration, the old page might be unmapped but then reused,
    so memcg code has to prevent untimely uncharging in that case.
    Because this code - which should be a simple charge transfer - is so
    special-cased, it is not reusable for replace_page_cache().

    But now that charged pages always have a page->mapping, introduce
    mem_cgroup_uncharge(), which is called after the final put_page(), when we
    know for sure that nobody is looking at the page anymore.

    For page migration, introduce mem_cgroup_migrate(), which is called after
    the migration is successful and the new page is fully rmapped. Because
    the old page is no longer uncharged after migration, prevent double
    charges by decoupling the page's memcg association (PCG_USED and
    pc->mem_cgroup) from the page holding an actual charge. The new bits
    PCG_MEM and PCG_MEMSW represent the respective charges and are transferred
    to the new page during migration.

    mem_cgroup_migrate() is suitable for replace_page_cache() as well,
    which gets rid of mem_cgroup_replace_page_cache(). However, care
    needs to be taken because both the source and the target page can
    already be charged and on the LRU when fuse is splicing: grab the page
    lock on the charge moving side to prevent changing pc->mem_cgroup of a
    page under migration. Also, the lruvecs of both pages change as we
    uncharge the old and charge the new during migration, and putback may
    race with us, so grab the lru lock and isolate the pages iff on LRU to
    prevent races and ensure the pages are on the right lruvec afterward.

    Swap accounting is massively simplified: because the page is no longer
    uncharged as early as swap cache deletion, a new mem_cgroup_swapout() can
    transfer the page's memory+swap charge (PCG_MEMSW) to the swap entry
    before the final put_page() in page reclaim.

    Finally, page_cgroup changes are now protected by whatever protection the
    page itself offers: anonymous pages are charged under the page table lock,
    whereas page cache insertions, swapin, and migration hold the page lock.
    Uncharging happens under full exclusion with no outstanding references.
    Charging and uncharging also ensure that the page is off-LRU, which
    serializes against charge migration. Remove the very costly page_cgroup
    lock and set pc->flags non-atomically.

    [mhocko@suse.cz: mem_cgroup_charge_statistics needs preempt_disable]
    [vdavydov@parallels.com: fix flags definition]
    Signed-off-by: Johannes Weiner
    Cc: Hugh Dickins
    Cc: Tejun Heo
    Cc: Vladimir Davydov
    Tested-by: Jet Chen
    Acked-by: Michal Hocko
    Tested-by: Felipe Balbi
    Signed-off-by: Vladimir Davydov
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Johannes Weiner
     

07 Aug, 2014

1 commit

  • Change zswap to use the zpool api instead of directly using zbud. Add
    a boot-time param to allow selecting which zpool implementation to use,
    with zbud as the default (a sketch follows this entry).

    Signed-off-by: Dan Streetman
    Tested-by: Seth Jennings
    Cc: Weijie Yang
    Cc: Minchan Kim
    Cc: Nitin Gupta
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Streetman
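
    A hedged sketch of the boot-time selection. Note that the exact
    zpool_create_pool() signature has varied across kernel versions (the
    call below uses a later form that also takes a pool name), and the init
    helper and global pool pointer are hypothetical:

    #include <linux/gfp.h>
    #include <linux/init.h>
    #include <linux/module.h>
    #include <linux/moduleparam.h>
    #include <linux/zpool.h>

    /* Boot-time selection of the allocator backend; zbud stays the default. */
    static char *zswap_zpool_type = "zbud";
    module_param_named(zpool, zswap_zpool_type, charp, 0444);

    static struct zpool *zswap_pool;

    static int __init zswap_zpool_init(void)
    {
            /* zpool dispatches to zbud or zsmalloc based on the type string */
            zswap_pool = zpool_create_pool(zswap_zpool_type, "zswap",
                                           GFP_KERNEL, NULL);
            return zswap_pool ? 0 : -ENOMEM;
    }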
     

05 Jun, 2014

1 commit

  • zswap_dstmem is a percpu block of memory, which should be allocated
    using kmalloc_node() to get better NUMA locality (a sketch follows this
    entry).

    Without it, all the blocks are allocated from a single node.

    Signed-off-by: Eric Dumazet
    Acked-by: Seth Jennings
    Acked-by: David Rientjes
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Eric Dumazet
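
    A sketch of the per-cpu, per-node allocation; the setup helper and the
    buffer size are illustrative:

    #include <linux/mm.h>
    #include <linux/percpu.h>
    #include <linux/slab.h>
    #include <linux/topology.h>

    static DEFINE_PER_CPU(u8 *, zswap_dstmem);

    /* Allocate each CPU's dstmem buffer on that CPU's own node instead of
     * letting a plain kmalloc() place every buffer on one node. */
    static int zswap_dstmem_prepare(unsigned int cpu)
    {
            u8 *dst = kmalloc_node(PAGE_SIZE * 2, GFP_KERNEL, cpu_to_node(cpu));

            if (!dst)
                    return -ENOMEM;

            per_cpu(zswap_dstmem, cpu) = dst;
            return 0;
    }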
     

08 Apr, 2014

5 commits

  • Merge second patch-bomb from Andrew Morton:
    - the rest of MM
    - zram updates
    - zswap updates
    - exit
    - procfs
    - exec
    - wait
    - crash dump
    - lib/idr
    - rapidio
    - adfs, affs, bfs, ufs
    - cris
    - Kconfig things
    - initramfs
    - small amount of IPC material
    - percpu enhancements
    - early ioremap support
    - various other misc things

    * emailed patches from Andrew Morton: (156 commits)
    MAINTAINERS: update Intel C600 SAS driver maintainers
    fs/ufs: remove unused ufs_super_block_third pointer
    fs/ufs: remove unused ufs_super_block_second pointer
    fs/ufs: remove unused ufs_super_block_first pointer
    fs/ufs/super.c: add __init to init_inodecache()
    doc/kernel-parameters.txt: add early_ioremap_debug
    arm64: add early_ioremap support
    arm64: initialize pgprot info earlier in boot
    x86: use generic early_ioremap
    mm: create generic early_ioremap() support
    x86/mm: sparse warning fix for early_memremap
    lglock: map to spinlock when !CONFIG_SMP
    percpu: add preemption checks to __this_cpu ops
    vmstat: use raw_cpu_ops to avoid false positives on preemption checks
    slub: use raw_cpu_inc for incrementing statistics
    net: replace __this_cpu_inc in route.c with raw_cpu_inc
    modules: use raw_cpu_write for initialization of per cpu refcount.
    mm: use raw_cpu ops for determining current NUMA node
    percpu: add raw_cpu_ops
    slub: fix leak of 'name' in sysfs_slab_add
    ...

    Linus Torvalds
     
  • Fix following trivial checkpatch error:

    ERROR: return is not a function, parentheses are not required

    Signed-off-by: SeongJae Park
    Acked-by: Seth Jennings
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    SeongJae Park
     
  • Cai Liu reported that zbud pool page counting now has a problem when
    multiple swap devices are used, because it counts only one swap device
    instead of all of them, so zswap cannot control writeback properly. The
    result is unnecessary writeback, or no writeback when we really should
    write back.

    IOW, it made zswap crazy.

    Another problem in zswap is:

    For example, let's assume we use two swap devices, A and B, with
    different priorities: A was charged to 19% a long time ago and is now
    full, so the VM starts using B, which has recently been charged to 1%.
    That means zswap's charge (19% + 1%) is full by default. Then, if the
    VM wants to swap out more pages to B, zbud_reclaim_page will evict one
    of the pages in B's pool, and this repeats continuously. It is a
    complete LRU-inversion problem, and swap thrashing on B will happen.

    This patch makes zswap handle multiple swap devices by creating *a*
    single zbud pool that is shared by all of them, so all zswap pages
    across the swap devices keep their LRU order, which prevents both
    problems above.

    Signed-off-by: Minchan Kim
    Reported-by: Cai Liu
    Suggested-by: Weijie Yang
    Cc: Seth Jennings
    Reviewed-by: Bob Liu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Minchan Kim
     
  • zswap used zsmalloc before and now uses zbud, but some comments still
    say it uses zsmalloc. Fix these trivial problems.

    Signed-off-by: SeongJae Park
    Cc: Seth Jennings
    Cc: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    SeongJae Park
     
  • Signed-off-by: SeongJae Park
    Cc: Seth Jennings
    Cc: Minchan Kim
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    SeongJae Park
     

20 Mar, 2014

1 commit

  • Subsystems that want to register CPU hotplug callbacks, as well as perform
    initialization for the CPUs that are already online, often do it as shown
    below:

    get_online_cpus();

    for_each_online_cpu(cpu)
    init_cpu(cpu);

    register_cpu_notifier(&foobar_cpu_notifier);

    put_online_cpus();

    This is wrong, since it is prone to ABBA deadlocks involving the
    cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
    with CPU hotplug operations).

    Instead, the correct and race-free way of performing the callback
    registration is:

    cpu_notifier_register_begin();

    for_each_online_cpu(cpu)
    init_cpu(cpu);

    /* Note the use of the double underscored version of the API */
    __register_cpu_notifier(&foobar_cpu_notifier);

    cpu_notifier_register_done();

    Fix the zswap code by using this latter form of callback registration.

    Cc: Ingo Molnar
    Signed-off-by: Srivatsa S. Bhat
    Signed-off-by: Rafael J. Wysocki

    Srivatsa S. Bhat
     

24 Jan, 2014

1 commit

  • The "compressor" and "enabled" params are currently hidden, this changes
    them to read-only, so userspace can tell if zswap is enabled or not and
    see what compressor is in use.

    Signed-off-by: Dan Streetman
    Cc: Vladimir Murzin
    Cc: Bob Liu
    Cc: Minchan Kim
    Cc: Weijie Yang
    Acked-by: Seth Jennings
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Streetman
     

13 Nov, 2013

3 commits

  • The refcount routine did not fit the kernel get/put semantics exactly;
    there were too many judgement statements on the refcount, and it could
    go negative.

    This patch does the following:

    - move the refcount judgement into zswap_entry_put() to hide the
    resource-freeing function.

    - add a new function, zswap_entry_find_get(), so that callers can
    easily use the following pattern:

    zswap_entry_find_get
    .../* do something */
    zswap_entry_put

    - to eliminate a compile error, move some function declarations

    This patch is based on Minchan Kim's idea and suggestion (a simplified
    sketch of the get/put pair follows this entry).

    Signed-off-by: Weijie Yang
    Cc: Seth Jennings
    Acked-by: Minchan Kim
    Cc: Bob Liu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Weijie Yang
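
    A simplified sketch of the get/put pair (tree locking, rbtree removal,
    and zswap_entry_find_get() are elided or reduced to comments):

    #include <linux/bug.h>
    #include <linux/rbtree.h>
    #include <linux/slab.h>

    struct zswap_entry {
            struct rb_node rbnode;
            int refcount;
            /* offset, handle, length elided */
    };

    /* caller must hold the tree lock */
    static void zswap_entry_get(struct zswap_entry *entry)
    {
            entry->refcount++;
    }

    /* caller must hold the tree lock; the free decision lives here, so no
     * caller ever inspects the refcount itself (zswap_entry_find_get() is
     * just an rbtree lookup followed by zswap_entry_get()) */
    static void zswap_entry_put(struct zswap_entry *entry)
    {
            int refcount = --entry->refcount;

            BUG_ON(refcount < 0);
            if (refcount == 0) {
                    /* erase from the rbtree and free the compressed data */
                    kfree(entry);
            }
    }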
     
  • Consider the following scenario:

    thread 0: reclaim entry x (get refcount, but not call zswap_get_swap_cache_page)
    thread 1: call zswap_frontswap_invalidate_page to invalidate entry x.
    finished; entry x and its zbud are not freed, as its refcount != 0
    now, the swap_map[x] = 0
    thread 0: now call zswap_get_swap_cache_page
    swapcache_prepare return -ENOENT because entry x is not used any more
    zswap_get_swap_cache_page return ZSWAP_SWAPCACHE_NOMEM
    zswap_writeback_entry do nothing except put refcount

    Now, the memory of zswap_entry x and its zpage leak.

    Modify:
    - check the refcount in the fail path and free the memory if it is not
    referenced.

    - use ZSWAP_SWAPCACHE_FAIL instead of ZSWAP_SWAPCACHE_NOMEM, as the fail
    path can be caused not only by nomem but also by invalidate.

    Signed-off-by: Weijie Yang
    Reviewed-by: Bob Liu
    Reviewed-by: Minchan Kim
    Acked-by: Seth Jennings
    Cc:
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Weijie Yang
     
  • Add SetPageReclaim() before __swap_writepage() so that the page can be
    moved to the tail of the inactive list, which avoids unnecessary page
    scanning, as this page was already reclaimed by the swap subsystem.

    Signed-off-by: Weijie Yang
    Reviewed-by: Bob Liu
    Reviewed-by: Minchan Kim
    Acked-by: Seth Jennings
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Weijie Yang
     

17 Oct, 2013

1 commit

  • zswap_tree is not freed on swapoff, and it gets re-kmalloced on swapon,
    so a memory leak occurs.

    Free the memory of zswap_tree in zswap_frontswap_invalidate_area() (a
    simplified sketch follows this entry).

    Signed-off-by: Weijie Yang
    Reviewed-by: Bob Liu
    Cc: Minchan Kim
    Reviewed-by: Minchan Kim
    Cc:
    From: Weijie Yang
    Subject: mm/zswap: bugfix: memory leak when invalidate and reclaim occur concurrently

    Consider the following scenario:
    thread 0: reclaim entry x (get refcount, but not call zswap_get_swap_cache_page)
    thread 1: call zswap_frontswap_invalidate_page to invalidate entry x.
    finished; entry x and its zbud are not freed, as its refcount != 0
    now, the swap_map[x] = 0
    thread 0: now call zswap_get_swap_cache_page
    swapcache_prepare return -ENOENT because entry x is not used any more
    zswap_get_swap_cache_page return ZSWAP_SWAPCACHE_NOMEM
    zswap_writeback_entry do nothing except put refcount
    Now, the memory of zswap_entry x and its zpage leak.

    Modify:
    - check the refcount in the fail path and free the memory if it is not
    referenced.

    - use ZSWAP_SWAPCACHE_FAIL instead of ZSWAP_SWAPCACHE_NOMEM, as the fail
    path can be caused not only by nomem but also by invalidate.

    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Weijie Yang
    Reviewed-by: Bob Liu
    Cc: Minchan Kim
    Cc:
    Acked-by: Seth Jennings

    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Weijie Yang
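
    A simplified sketch of the fix for the swapoff leak; the entry-freeing
    walk is reduced to a comment:

    #include <linux/rbtree.h>
    #include <linux/slab.h>
    #include <linux/spinlock.h>
    #include <linux/swap.h>

    struct zswap_tree {
            struct rb_root rbroot;
            spinlock_t lock;
    };

    static struct zswap_tree *zswap_trees[MAX_SWAPFILES];

    /* Called on swapoff: previously only the entries were freed, so the
     * zswap_tree itself leaked and was kmalloced again on the next swapon. */
    static void zswap_frontswap_invalidate_area(unsigned type)
    {
            struct zswap_tree *tree = zswap_trees[type];

            if (!tree)
                    return;

            spin_lock(&tree->lock);
            /* walk the rbtree and free every entry (elided) */
            tree->rbroot = RB_ROOT;
            spin_unlock(&tree->lock);

            kfree(tree);                 /* the fix: free the tree itself... */
            zswap_trees[type] = NULL;    /* ...and forget the stale pointer */
    }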
     

12 Sep, 2013

2 commits


11 Jul, 2013

1 commit

  • zswap is a thin backend for frontswap that takes pages that are in the
    process of being swapped out and attempts to compress them and store
    them in a RAM-based memory pool. This can result in a significant I/O
    reduction on the swap device and, in the case where decompressing from
    RAM is faster than reading from the swap device, can also improve
    workload performance.

    It also has support for evicting swap pages that are currently
    compressed in zswap to the swap device on an LRU(ish) basis. This
    functionality makes zswap a true cache in that, once the cache is full,
    the oldest pages can be moved out of zswap to the swap device so newer
    pages can be compressed and stored in zswap.

    This patch adds the zswap driver to mm/

    Signed-off-by: Seth Jennings
    Acked-by: Rik van Riel
    Cc: Greg Kroah-Hartman
    Cc: Nitin Gupta
    Cc: Minchan Kim
    Cc: Konrad Rzeszutek Wilk
    Cc: Dan Magenheimer
    Cc: Robert Jennings
    Cc: Jenifer Hopper
    Cc: Mel Gorman
    Cc: Johannes Weiner
    Cc: Larry Woodman
    Cc: Benjamin Herrenschmidt
    Cc: Dave Hansen
    Cc: Joe Perches
    Cc: Joonsoo Kim
    Cc: Cody P Schafer
    Cc: Hugh Dickins
    Cc: Paul Mackerras
    Cc: Fengguang Wu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Seth Jennings