22 Feb, 2018

1 commit

  • Sergey Senozhatsky reported that if THP (Transparent Huge Page) and
    frontswap (via zswap) are both enabled, then when memory runs low and
    swap is triggered, segfaults and memory corruption occur in random
    user space applications, as follows:

    kernel: urxvt[338]: segfault at 20 ip 00007fc08889ae0d sp 00007ffc73a7fc40 error 6 in libc-2.26.so[7fc08881a000+1ae000]
    #0 0x00007fc08889ae0d _int_malloc (libc.so.6)
    #1 0x00007fc08889c2f3 malloc (libc.so.6)
    #2 0x0000560e6004bff7 _Z14rxvt_wcstoutf8PKwi (urxvt)
    #3 0x0000560e6005e75c n/a (urxvt)
    #4 0x0000560e6007d9f1 _ZN16rxvt_perl_interp6invokeEP9rxvt_term9hook_typez (urxvt)
    #5 0x0000560e6003d988 _ZN9rxvt_term9cmd_parseEv (urxvt)
    #6 0x0000560e60042804 _ZN9rxvt_term6pty_cbERN2ev2ioEi (urxvt)
    #7 0x0000560e6005c10f _Z17ev_invoke_pendingv (urxvt)
    #8 0x0000560e6005cb55 ev_run (urxvt)
    #9 0x0000560e6003b9b9 main (urxvt)
    #10 0x00007fc08883af4a __libc_start_main (libc.so.6)
    #11 0x0000560e6003f9da _start (urxvt)

    After bisection, it was found the first bad commit is bd4c82c22c36 ("mm,
    THP, swap: delay splitting THP after swapped out").

    The root cause is as follows:

    When pages are written to the swap device during swap-out in
    swap_writepage(), zswap (frontswap) first tries to compress them to
    improve performance. But zswap (frontswap) treats a THP as a normal
    page, so only the head page is saved. After swap-in, the tail pages
    are not restored to their original contents, causing memory
    corruption in the applications.

    This is fixed by refusing to save the page in the frontswap store
    functions if the page is a THP, so that the THP is swapped out to the
    swap device instead.

    Another choice is to split the THP if frontswap is enabled. But it was
    found that frontswap enabling isn't flexible. For example, if
    CONFIG_ZSWAP=y (it cannot be a module), frontswap will be enabled even
    if zswap itself isn't enabled.

    Frontswap has multiple backends. To make it easy for one backend to
    enable THP support, the THP check is put in the backend frontswap store
    functions instead of the general interfaces.
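
    The check described above can be sketched in userspace C as follows.
    This is a minimal illustration, not the kernel patch itself: struct
    mock_page, backend_store() and swap_out() are hypothetical stand-ins
    for struct page, a backend frontswap store function and the swap-out
    path.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct page. */
struct mock_page {
    bool is_thp;          /* true if this is a transparent huge page */
    unsigned long index;  /* swap offset that would be used as the key */
};

/*
 * Hypothetical backend store hook.
 * Returns 0 if the page was saved, -1 if the backend refuses it.
 */
static int backend_store(const struct mock_page *page)
{
    /* The backend can only save one base page, so refuse a THP. */
    if (page->is_thp)
        return -1;

    /* ... a real backend would compress and save the page here ... */
    return 0;
}

/*
 * Hypothetical swap-out step.
 * Returns true if the backend kept the page, false if the page has to be
 * written to the swap device instead.
 */
static bool swap_out(const struct mock_page *page)
{
    return backend_store(page) == 0;
}
```

    Because the refusal lives in the backend hook rather than in the
    generic caller, a backend that later learns to handle huge pages would
    only need to drop its own check.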

    Link: http://lkml.kernel.org/r/20180209084947.22749-1-ying.huang@intel.com
    Fixes: bd4c82c22c367e068 ("mm, THP, swap: delay splitting THP after swapped out")
    Signed-off-by: "Huang, Ying"
    Reported-by: Sergey Senozhatsky
    Tested-by: Sergey Senozhatsky
    Suggested-by: Minchan Kim [put THP checking in backend]
    Cc: Konrad Rzeszutek Wilk
    Cc: Dan Streetman
    Cc: Seth Jennings
    Cc: Tetsuo Handa
    Cc: Shaohua Li
    Cc: Michal Hocko
    Cc: Johannes Weiner
    Cc: Mel Gorman
    Cc: Shakeel Butt
    Cc: Boris Ostrovsky
    Cc: Juergen Gross
    Cc: [4.14]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Huang Ying
     

05 Jun, 2017

1 commit

  • For some file systems we still memcpy into it, but in various places this
    already allows us to use the proper uuid helpers. More to come..

    Signed-off-by: Christoph Hellwig
    Reviewed-by: Amir Goldstein
    Acked-by: Mimi Zohar (Changes to IMA/EVM)
    Reviewed-by: Andy Shevchenko

    Christoph Hellwig
     

27 Jan, 2016

1 commit


09 Sep, 2015

2 commits

  • All callers of xen_tmem_{get,put}_page have a struct page * in hand
    and call pfn_to_gfn only for the benefit of these 2 functions.

    Rather than passing the pfn as a parameter, pass the page directly and
    use xen_page_to_gfn.

    Signed-off-by: Julien Grall
    Reviewed-by: Stefano Stabellini
    Signed-off-by: David Vrabel

    Julien Grall
     
  • Based on include/xen/mm.h [1], Linux is mistakenly using MFN when GFN
    is meant. I suspect this is because the first support for Xen was for
    PV. This resulted in some misimplementation of helpers on ARM and
    confused developers about the expected behavior.

    For instance, with pfn_to_mfn, we expect to get an MFN based on the name.
    However, if we look at the implementation on x86, it returns a GFN.

    For clarity and to avoid new confusion, replace any reference to mfn with
    gfn in any helpers used by PV drivers. The x86 code will still keep some
    references to pfn_to_mfn, which may be used by all kinds of guests.
    No changes have been made to the hypercall fields, even
    though they may be invalid, in order to keep them the same as the
    definitions in the xen repo.

    Note that page_to_mfn has been renamed to xen_page_to_gfn to avoid a
    name too close to the KVM function gfn_to_page.

    Also take the opportunity to simplify simple constructions such
    as pfn_to_mfn(page_to_pfn(page)) into xen_page_to_gfn. More complex
    cleanups will come in follow-up patches.
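
    The simplification mentioned above can be sketched in userspace C.
    Everything prefixed with mock_ is a hypothetical stand-in, and the
    pfn-to-gfn mapping (a fixed offset) is an arbitrary assumption chosen
    only so that the two values differ.

```c
/* Hypothetical stand-in for struct page: it only remembers its pfn. */
struct mock_page {
    unsigned long pfn;
};

/* Arbitrary offset so that pfn and gfn are distinguishable in this sketch. */
#define MOCK_GFN_OFFSET 0x100UL

static unsigned long mock_page_to_pfn(const struct mock_page *page)
{
    return page->pfn;
}

static unsigned long mock_pfn_to_gfn(unsigned long pfn)
{
    return pfn + MOCK_GFN_OFFSET;
}

/* One helper replacing the nested conversion at every call site. */
static unsigned long mock_xen_page_to_gfn(const struct mock_page *page)
{
    return mock_pfn_to_gfn(mock_page_to_pfn(page));
}

/* Old calling convention: the caller converts the page to a pfn first. */
static unsigned long put_page_old(unsigned long pfn)
{
    return mock_pfn_to_gfn(pfn);
}

/* New calling convention: the callee takes the page and converts it. */
static unsigned long put_page_new(const struct mock_page *page)
{
    return mock_xen_page_to_gfn(page);
}
```

    Both conventions produce the same frame number; the new one simply
    moves the conversion out of the callers.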

    [1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=e758ed14f390342513405dd766e874934573e6cb

    Signed-off-by: Julien Grall
    Reviewed-by: Stefano Stabellini
    Acked-by: Dmitry Torokhov
    Acked-by: Wei Liu
    Signed-off-by: David Vrabel

    Julien Grall
     

02 Jul, 2015

1 commit

  • Pull xen updates from David Vrabel:
    "Xen features and cleanups for 4.2-rc0:

    - add "make xenconfig" to assist in generating configs for Xen guests

    - preparatory cleanups necessary for supporting 64 KiB pages in ARM
    guests

    - automatically use hvc0 as the default console in ARM guests"

    * tag 'for-linus-4.2-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    block/xen-blkback: s/nr_pages/nr_segs/
    block/xen-blkfront: Remove invalid comment
    block/xen-blkfront: Remove unused macro MAXIMUM_OUTSTANDING_BLOCK_REQS
    arm/xen: Drop duplicate define mfn_to_virt
    xen/grant-table: Remove unused macro SPP
    xen/xenbus: client: Fix call of virt_to_mfn in xenbus_grant_ring
    xen: Include xen/page.h rather than asm/xen/page.h
    kconfig: add xenconfig defconfig helper
    kconfig: clarify kvmconfig is for kvm
    xen/pcifront: Remove usage of struct timeval
    xen/tmem: use BUILD_BUG_ON() in favor of BUG_ON()
    hvc_xen: avoid uninitialized variable warning
    xenbus: avoid uninitialized variable warning
    xen/arm: allow console=hvc0 to be omitted for guests
    arm,arm64/xen: move Xen initialization earlier
    arm/xen: Correctly check if the event channel interrupt is present

    Linus Torvalds
     

25 Jun, 2015

1 commit

  • Change frontswap single pointer to a singly linked list of frontswap
    implementations. Update Xen tmem implementation as register no longer
    returns anything.

    Frontswap currently keeps track of only a single implementation; any
    implementation that registers second (or later) replaces the
    previously registered implementation and gets a pointer to it, and
    the new implementation is expected to pass on to it any frontswap
    function it can't handle itself. However, that method doesn't make
    much sense, as passing that work on to every implementation adds
    unnecessary work to implementations; instead, frontswap should simply
    keep a list of all registered implementations and try each
    implementation for any function. Most importantly, neither of the two
    currently existing frontswap implementations in the kernel actually
    does anything with any previous frontswap implementation that it
    replaces when registering.

    This allows frontswap to successfully manage multiple implementations by
    keeping a list of them all.
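
    A minimal userspace sketch of the list-based approach follows. The
    names (fs_ops, fs_register_ops, fs_store and the two example backends)
    are hypothetical, and the sketch ignores the concurrency that a real
    kernel implementation has to handle.

```c
#include <stddef.h>

/* Hypothetical, reduced version of a frontswap implementation. */
struct fs_ops {
    int (*store)(unsigned type, unsigned long offset); /* 0 = stored */
    struct fs_ops *next;                               /* next implementation */
};

/* Head of the singly linked list of registered implementations. */
static struct fs_ops *fs_ops_list;

/* Registration no longer returns anything: it just pushes onto the list. */
static void fs_register_ops(struct fs_ops *ops)
{
    ops->next = fs_ops_list;
    fs_ops_list = ops;
}

/* Try each registered implementation until one accepts the page. */
static int fs_store(unsigned type, unsigned long offset)
{
    int ret = -1;

    for (struct fs_ops *ops = fs_ops_list; ops != NULL; ops = ops->next) {
        ret = ops->store(type, offset);
        if (ret == 0)
            break;
    }
    return ret;
}

/* Example backend that refuses every page. */
static int refuse_all_store(unsigned type, unsigned long offset)
{
    (void)type;
    (void)offset;
    return -1;
}

/* Example backend that only accepts even offsets. */
static int accept_even_store(unsigned type, unsigned long offset)
{
    (void)type;
    return (offset % 2 == 0) ? 0 : -1;
}
```

    With this layout no implementation needs to know about, or forward
    work to, any other implementation.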

    Signed-off-by: Dan Streetman
    Cc: Konrad Rzeszutek Wilk
    Cc: Boris Ostrovsky
    Cc: David Vrabel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Streetman
     

17 Jun, 2015

1 commit

  • Using xen/page.h will be necessary later for using common xen page
    helpers.

    As xen/page.h already includes asm/xen/page.h, always use the former.

    Signed-off-by: Julien Grall
    Reviewed-by: David Vrabel
    Cc: Stefano Stabellini
    Cc: Ian Campbell
    Cc: Wei Liu
    Cc: Konrad Rzeszutek Wilk
    Cc: Boris Ostrovsky
    Cc: netdev@vger.kernel.org
    Signed-off-by: David Vrabel

    Julien Grall
     

28 May, 2015

1 commit


15 Apr, 2015

1 commit

  • Currently, cleancache_register_ops returns the previous value of
    cleancache_ops to allow chaining. However, chaining, as it is
    implemented now, is extremely dangerous due to possible pool id
    collisions. Suppose a new cleancache driver is registered after the
    previous one assigned an id to a super block. If the new driver assigns
    the same id to another super block, which is perfectly possible, we will
    have two different filesystems using the same id. No matter if the new
    driver implements chaining or not, we are likely to get data corruption
    with such a configuration eventually.

    This patch therefore disables the ability to override cleancache_ops
    altogether as potentially dangerous. If a cleancache driver is already
    registered, all further calls to cleancache_register_ops will
    return EBUSY. Since no user of cleancache implements chaining, we only
    need to make minor changes to the code outside the cleancache core.
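
    The single-registration rule can be sketched in userspace C as below.
    The cc_ prefixed names are hypothetical, the ops structure is reduced
    to one member, and the sketch is not safe against concurrent
    registration, which a kernel implementation would have to be.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, reduced cleancache ops structure. */
struct cc_ops {
    int (*init_fs)(size_t pagesize);
};

/* The one registered driver, or NULL if none has registered yet. */
static const struct cc_ops *cc_ops;

/*
 * Register a driver. The first caller wins; every later caller gets
 * -EBUSY, so an existing driver (and the pool ids it handed out) can
 * never be overridden.
 */
static int cc_register_ops(const struct cc_ops *ops)
{
    if (cc_ops != NULL)
        return -EBUSY;

    cc_ops = ops;
    return 0;
}
```

    Returning an error code instead of the previous pointer removes the
    chaining mechanism entirely, which is what rules out pool id
    collisions between drivers.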

    Signed-off-by: Vladimir Davydov
    Cc: Konrad Rzeszutek Wilk
    Cc: Boris Ostrovsky
    Cc: David Vrabel
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Stefan Hengelein
    Cc: Florian Schmaus
    Cc: Andor Daam
    Cc: Dan Magenheimer
    Cc: Bob Liu
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vladimir Davydov
     

23 Jan, 2015

1 commit


04 Jul, 2013

1 commit

  • …nux/kernel/git/konrad/xen

    Pull Xen bugfixes from Konrad Rzeszutek Wilk:
    - Fix memory leak when CPU hotplugging.
    - Compile bugs with various #ifdefs
    - Fix state changes in Xen PCI front not dealing well with new
    toolstack.
    - Cleanups in code (use pr_*, fix 80 characters splits, etc)
    - Long standing bug in double-reporting the steal time

    * tag 'stable/for-linus-3.11-rc0-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
    xen/time: remove blocked time accounting from xen "clockchip"
    xen: Convert printks to pr_<level>
    xen: ifdef CONFIG_HIBERNATE_CALLBACKS xen_*_suspend
    xen/pcifront: Deal with toolstack missing 'XenbusStateClosing' state.
    xen/time: Free onlined per-cpu data structure if we want to online it again.
    xen/time: Check that the per_cpu data structure has data before freeing.
    xen/time: Don't leak interrupt name when offlining.
    xen/time: Encapsulate the struct clock_event_device in another structure.
    xen/spinlock: Don't leak interrupt name when offlining.
    xen/smp: Don't leak interrupt name when offlining.
    xen/smp: Set the per-cpu IRQ number to a valid default.
    xen/smp: Introduce a common structure to contain the IRQ name and interrupt line.
    xen/smp: Coalesce the free_irq calls in one function.
    xen-pciback: fix error return code in pcistub_irq_handler_switch()

    Linus Torvalds
     

28 Jun, 2013

1 commit

  • Convert printks to pr_<level> (excludes printk(KERN_DEBUG...))
    to be more consistent throughout the xen subsystem.

    Add pr_fmt with KBUILD_MODNAME or "xen:" KBUILD_MODNAME
    Coalesce formats and add missing word spaces
    Add missing newlines
    Align arguments and reflow to 80 columns
    Remove DRV_NAME from formats as pr_fmt adds the same content

    This does change some of the prefixes of these messages
    but it also does make them more consistent.
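
    How a pr_fmt definition adds a consistent prefix can be sketched in
    userspace C. KBUILD_MODNAME is normally supplied by the kernel build
    system; here it is hard-coded to "tmem" as an assumption, and
    format_msg() is a hypothetical stand-in for a pr_<level> call that
    writes into a buffer instead of the kernel log.

```c
#include <stdio.h>

/* Assumption: the build system would normally define this per module. */
#define KBUILD_MODNAME "tmem"

/* Prefix every format string with "xen:<module>: " at compile time. */
#define pr_fmt(fmt) "xen:" KBUILD_MODNAME ": " fmt

/*
 * Hypothetical stand-in for a pr_<level> call: formats one message with
 * the module prefix into buf and returns the number of characters that
 * the full message needs (as snprintf does).
 */
static int format_msg(char *buf, size_t size, const char *msg)
{
    return snprintf(buf, size, pr_fmt("%s\n"), msg);
}
```

    Since the prefix comes from adjacent string literal concatenation, the
    individual messages no longer need to repeat a driver name such as
    DRV_NAME in their own format strings.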

    Signed-off-by: Joe Perches
    Signed-off-by: Konrad Rzeszutek Wilk

    Joe Perches
     

10 Jun, 2013

1 commit

  • Commit 10a7a0771399a57a297fca9615450dbb3f88081a ("xen: tmem: enable Xen
    tmem shim to be built/loaded as a module") allows the tmem module
    to be loaded at any time. For this to work, the frontswap API had to
    be able to call tmem_frontswap_init asynchronously, before
    or after the swap image had been set. That was added in git
    commit 905cd0e1bf9ffe82d6906a01fd974ea0f70be97a
    ("mm: frontswap: lazy initialization to allow tmem backends to build/run as modules").

    Which means we could do this (The common case):

    modprobe tmem [so calls frontswap_register_ops, no ->init]
    modifies tmem_frontswap_poolid = -1
    swapon /dev/xvda1 [__frontswap_init, calls -> init, tmem_frontswap_poolid is
    < 0 so tmem hypercall done]

    Or the failing one:

    swapon /dev/xvda1 [calls __frontswap_init, sets the need_init bitmap]
    modprobe tmem [calls frontswap_register_ops, -->init calls, finds out
    tmem_frontswap_poolid is 0, does not make a hypercall.
    Later in the module_init, sets tmem_frontswap_poolid=-1]

    Which meant that in the failing case we would not call the hypercall
    to initialize the pool and never be able to make any frontswap
    backend calls.

    Moving the frontswap_register_ops after setting the tmem_frontswap_poolid
    fixes it.
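
    The ordering problem can be modelled in userspace C. All functions
    below are hypothetical mocks: mock_register_ops() runs the init hook
    immediately when swapon has already happened, and the init hook only
    "makes the hypercall" while the pool id is still negative, as in the
    two sequences described above.

```c
#include <stdbool.h>

static int  tmem_frontswap_poolid;  /* 0 until module init sets it to -1 */
static bool swap_already_on;        /* stands in for the need_init bitmap */
static bool hypercall_done;         /* set when the mock hypercall runs */

/* Put the model back into its initial state. */
static void reset_state(bool swapon_first)
{
    tmem_frontswap_poolid = 0;
    swap_already_on = swapon_first;
    hypercall_done = false;
}

/* Mock ->init hook: creates the pool only while the id is negative. */
static void mock_frontswap_init(void)
{
    if (tmem_frontswap_poolid < 0) {
        hypercall_done = true;
        tmem_frontswap_poolid = 1;  /* arbitrary valid pool id */
    }
}

/* Mock registration: calls ->init right away if swapon came first. */
static void mock_register_ops(void)
{
    if (swap_already_on)
        mock_frontswap_init();
}

/* Old order: register first, then set the pool id to -1. */
static bool module_init_old_order(void)
{
    mock_register_ops();
    tmem_frontswap_poolid = -1;
    return hypercall_done;
}

/* New order: set the pool id to -1 first, then register. */
static bool module_init_new_order(void)
{
    tmem_frontswap_poolid = -1;
    mock_register_ops();
    return hypercall_done;
}
```

    In this model, with swapon done first, the old order never makes the
    hypercall while the new order does; with modprobe done first, both
    orders defer the hypercall to the later init call.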

    Signed-off-by: Konrad Rzeszutek Wilk
    Reviewed-by: Bob Liu

    Konrad Rzeszutek Wilk
     

28 May, 2013

1 commit

  • In the (not so useful) kernel configuration where CONFIG_SWAP
    is undefined and CONFIG_XEN_SELFBALLOONING is defined,
    xen_tmem_init would use undefined variable 'static bool frontswap'.

    Added #else to have #define frontswap (0) in the case where
    CONFIG_FRONTSWAP is not defined.
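
    The #else fallback can be sketched in userspace C. The function name
    selfballooning_uses_frontswap() is hypothetical, and the value true in
    the CONFIG_FRONTSWAP branch is an arbitrary choice for this sketch;
    build with -DCONFIG_FRONTSWAP to model a kernel that has the option
    enabled.

```c
#include <stdbool.h>

#ifdef CONFIG_FRONTSWAP
/* With frontswap built in, this is a real variable. */
static bool frontswap = true;
#else
/* Without it, the name still exists but is a constant 0. */
#define frontswap (0)
#endif

/*
 * Hypothetical user of the flag. It compiles in both configurations
 * because 'frontswap' is always defined, either as a variable or as a
 * macro.
 */
static bool selfballooning_uses_frontswap(void)
{
    return frontswap;
}
```

    Code that merely tests the flag therefore needs no #ifdef of its own.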

    Signed-off-by: Frederico Cadete
    Signed-off-by: Konrad Rzeszutek Wilk

    Frederico Cadete
     

15 May, 2013

7 commits


01 May, 2013

4 commits

  • In the past it either used to be NULL or the "older" backend. Now we
    also return -Exx error codes.

    Signed-off-by: Konrad Rzeszutek Wilk
    Signed-off-by: Bob Liu
    Cc: Wanpeng Li
    Cc: Andor Daam
    Cc: Dan Magenheimer
    Cc: Florian Schmaus
    Cc: Minchan Kim
    Cc: Stefan Hengelein
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konrad Rzeszutek Wilk
     
  • Allow Xen tmem shim to be built/loaded as a module. Xen self-ballooning
    and frontswap-selfshrinking are now also "lazily" initialized when the
    Xen tmem shim is loaded as a module, unless explicitly disabled by
    module parameters.

    Note runtime dependency disallows loading if cleancache/frontswap lazy
    initialization patches are not present.

    If built-in (not built as a module), the original mechanism of enabling
    via a kernel boot parameter is retained, but this should be considered
    deprecated.

    Note that module unload is explicitly not yet supported.

    [v1: Removed the [CLEANCACHE|FRONTSWAP]_HAS_LAZY_INIT ifdef]
    [v2: Squashed the xen/tmem: Remove the subsys call patch in]
    [akpm@linux-foundation.org: fix build (disable_frontswap_selfshrinking undeclared)]
    Signed-off-by: Dan Magenheimer
    Signed-off-by: Konrad Rzeszutek Wilk
    Signed-off-by: Bob Liu
    Cc: Wanpeng Li
    Cc: Andor Daam
    Cc: Florian Schmaus
    Cc: Minchan Kim
    Cc: Stefan Hengelein
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Dan Magenheimer
     
  • Use the cleancache_ops pointer itself, instead of a separate
    backend_registered flag, to determine whether a backend is
    enabled. This allows us to remove the backend_registered check and just
    do 'if (cleancache_ops)'.

    [v1: Rebase on top of b97c4b430b0a (ramster->zcache move)]
    Signed-off-by: Konrad Rzeszutek Wilk
    Signed-off-by: Bob Liu
    Cc: Wanpeng Li
    Cc: Andor Daam
    Cc: Dan Magenheimer
    Cc: Florian Schmaus
    Cc: Minchan Kim
    Cc: Stefan Hengelein
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konrad Rzeszutek Wilk
     
  • This simplifies the code in the frontswap - we can get rid of the
    'backend_registered' test and instead check against frontswap_ops.

    [v1: Rebase on top of 703ba7fe5e0 (ramster->zcache move)]
    Signed-off-by: Konrad Rzeszutek Wilk
    Signed-off-by: Bob Liu
    Cc: Wanpeng Li
    Cc: Andor Daam
    Cc: Dan Magenheimer
    Cc: Florian Schmaus
    Cc: Minchan Kim
    Cc: Stefan Hengelein
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Konrad Rzeszutek Wilk
     

20 Feb, 2013

1 commit


22 Aug, 2012

1 commit


15 May, 2012

1 commit


25 Mar, 2012

1 commit

  • Pull more xen updates from Konrad Rzeszutek Wilk:
    "One tiny feature that accidentally got lost in the initial git pull:
    * Add fast-EOI acking of interrupts (clear a bit instead of
    hypercall)
    And bug-fixes:
    * Fix CPU bring-up code missing a call to notify other subsystems.
    * Fix reading /sys/hypervisor even if PVonHVM drivers are not loaded.
    * In Xen ACPI processor driver: remove too verbose WARN messages, fix
    up the Kconfig dependency to be a module by default, and add
    dependency on CPU_FREQ.
    * Disable CPU frequency drivers from loading when booting under Xen
    (as we want the Xen ACPI processor to be used instead).
    * Cleanups in tmem code."

    * tag 'stable/for-linus-3.4-tag-two' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
    xen/acpi: Fix Kconfig dependency on CPU_FREQ
    xen: initialize platform-pci even if xen_emul_unplug=never
    xen/smp: Fix bringup bug in AP code.
    xen/acpi: Remove the WARN's as they just create noise.
    xen/tmem: cleanup
    xen: support pirq_eoi_map
    xen/acpi-processor: Do not depend on CPU frequency scaling drivers.
    xen/cpufreq: Disable the cpu frequency scaling drivers from loading.
    provide disable_cpufreq() function to disable the API.

    Linus Torvalds
     

21 Mar, 2012

1 commit

  • Use 'bool' for boolean variables. Do proper section placement.
    Eliminate an unnecessary export.

    Signed-off-by: Jan Beulich
    Acked-by: Dan Magenheimer
    Signed-off-by: Konrad Rzeszutek Wilk

    Jan Beulich
     

24 Jan, 2012

1 commit

  • Complete the renaming from "flush" to "invalidate" across
    both tmem frontends (cleancache and frontswap) and both tmem backends
    (Xen and zcache), as required by akpm.

    This change is completely cosmetic.

    [v10: no change]
    [v9: akpm@linux-foundation.org: change "flush" to "invalidate", part 3]
    Signed-off-by: Dan Magenheimer
    Cc: Kamezawa Hiroyuki
    Cc: Jan Beulich
    Acked-by: Seth Jennings
    Cc: Jeremy Fitzhardinge
    Cc: Hugh Dickins
    Cc: Johannes Weiner
    Cc: Nitin Gupta
    Cc: Matthew Wilcox
    Cc: Chris Mason
    Cc: Rik Riel
    Cc: Andrew Morton
    [v11: Remove the frontswap part]
    Signed-off-by: Konrad Rzeszutek Wilk

    Dan Magenheimer
     

18 Jun, 2011

1 commit


27 May, 2011

1 commit

  • This patch provides a shim between the kernel-internal cleancache
    API (see Documentation/mm/cleancache.txt) and the Xen Transcendent
    Memory ABI (see http://oss.oracle.com/projects/tmem).

    Xen tmem provides "hypervisor RAM" as an ephemeral page-oriented
    pseudo-RAM store for cleancache pages, shared cleancache pages,
    and frontswap pages. Tmem provides enterprise-quality concurrency,
    full save/restore and live migration support, compression
    and deduplication.

    A presentation showing up to 8% faster performance and up to 52%
    reduction in sectors read on a kernel compile workload, despite
    aggressive in-kernel page reclamation ("self-ballooning") can be
    found at:

    http://oss.oracle.com/projects/tmem/dist/documentation/presentations/TranscendentMemoryXenSummit2010.pdf

    Signed-off-by: Dan Magenheimer
    Reviewed-by: Jeremy Fitzhardinge
    Cc: Konrad Rzeszutek Wilk
    Cc: Matthew Wilcox
    Cc: Nick Piggin
    Cc: Mel Gorman
    Cc: Rik Van Riel
    Cc: Jan Beulich
    Cc: Chris Mason
    Cc: Andreas Dilger
    Cc: Ted Ts'o
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Nitin Gupta

    Dan Magenheimer