07 Mar, 2019

1 commit

  • Pull char/misc driver updates from Greg KH:
    "Here is the big char/misc driver patch pull request for 5.1-rc1.

    The largest thing by far is the new habanalabs driver for their AI
    accelerator chip. For now it is in the drivers/misc directory but will
    probably move to a new directory soon along with other drivers of this
    type.

    Other than that, just the usual set of individual driver updates and
    fixes. There's an "odd" merge in here from the DRM tree that they
    asked me to do as the MEI driver is starting to interact with the i915
    driver, and it needed some coordination. All of those patches have
    been properly acked by the relevant subsystem maintainers.

    All of these have been in linux-next with no reported issues, most for
    quite some time"

    * tag 'char-misc-5.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (219 commits)
    habanalabs: adjust Kconfig to fix build errors
    habanalabs: use %px instead of %p in error print
    habanalabs: use do_div for 64-bit divisions
    intel_th: gth: Fix an off-by-one in output unassigning
    habanalabs: fix little-endian<->cpu conversion warnings
    habanalabs: use NULL to initialize array of pointers
    habanalabs: fix little-endian<->cpu conversion warnings
    habanalabs: soft-reset device if context-switch fails
    habanalabs: print pointer using %p
    habanalabs: fix memory leak with CBs with unaligned size
    habanalabs: return correct error code on MMU mapping failure
    habanalabs: add comments in uapi/misc/habanalabs.h
    habanalabs: extend QMAN0 job timeout
    habanalabs: set DMA0 completion to SOB 1007
    habanalabs: fix validation of WREG32 to DMA completion
    habanalabs: fix mmu cache registers init
    habanalabs: disable CPU access on timeouts
    habanalabs: add MMU DRAM default page mapping
    habanalabs: Dissociate RAZWI info from event types
    misc/habanalabs: adjust Kconfig to fix build errors
    ...

    Linus Torvalds
     

06 Mar, 2019

1 commit

  • Mark inflated and never onlined pages PG_offline, to tell the world that
    the content is stale and should not be dumped.

    [david@redhat.com: use vmballoon_page_in_frames more widely]
    Link: http://lkml.kernel.org/r/20181122100627.5189-7-david@redhat.com
    Link: http://lkml.kernel.org/r/20181119101616.8901-7-david@redhat.com
    Signed-off-by: David Hildenbrand
    Acked-by: Nadav Amit
    Cc: Xavier Deguillard
    Cc: Nadav Amit
    Cc: Arnd Bergmann
    Cc: Greg Kroah-Hartman
    Cc: Julien Freche
    Cc: Matthew Wilcox
    Cc: Michal Hocko
    Cc: "Michael S. Tsirkin"
    Cc: Alexander Duyck
    Cc: Alexey Dobriyan
    Cc: Baoquan He
    Cc: Borislav Petkov
    Cc: Boris Ostrovsky
    Cc: Christian Hansen
    Cc: Dave Young
    Cc: David Rientjes
    Cc: Haiyang Zhang
    Cc: Jonathan Corbet
    Cc: Juergen Gross
    Cc: Kairui Song
    Cc: Kazuhito Hagio
    Cc: "Kirill A. Shutemov"
    Cc: Konstantin Khlebnikov
    Cc: "K. Y. Srinivasan"
    Cc: Len Brown
    Cc: Lianbo Jiang
    Cc: Michal Hocko
    Cc: Mike Rapoport
    Cc: Miles Chen
    Cc: Naoya Horiguchi
    Cc: Omar Sandoval
    Cc: Pankaj gupta
    Cc: Pavel Machek
    Cc: Pavel Tatashin
    Cc: Rafael J. Wysocki
    Cc: "Rafael J. Wysocki"
    Cc: Stefano Stabellini
    Cc: Stephen Hemminger
    Cc: Stephen Rothwell
    Cc: Vitaly Kuznetsov
    Cc: Vlastimil Babka
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    David Hildenbrand
     

12 Feb, 2019

1 commit


08 Feb, 2019

2 commits

  • Currently, the balloon driver would fail to run if memory is greater
    than 16TB of vRAM. Previous patches have already converted the balloon
    target and size to 64-bit, so all that is left to do is to avoid
    asserting that memory is smaller than 16TB if the hypervisor supports a
    64-bit target.

    The driver advertises a new capability VMW_BALLOON_64_BITS_TARGET.
    Hypervisors that support 16TB of memory or more will report that this
    capability is enabled.

    Signed-off-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Xavier Deguillard
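
The 16TB limit follows from a 32-bit page count with 4KB pages: 2^32 pages of 4KB each are 16TB. The following is a minimal userspace sketch of that check, not the driver's code. The capability bit value and every `demo_` name are assumptions made for illustration; only the name VMW_BALLOON_64_BITS_TARGET comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12 /* assumed 4KB base pages */

/* Placeholder bit value (assumption); the driver defines its own. */
#define VMW_BALLOON_64_BITS_TARGET (1u << 5)

/* 2^32 pages of 4KB each are exactly 16TB. */
static_assert((((uint64_t)1 << 32) << DEMO_PAGE_SHIFT) == ((uint64_t)16 << 40),
	      "a 32-bit page count tops out at 16TB");

/* True when the guest's page count no longer fits in 32 bits. */
static bool demo_needs_64bit_target(uint64_t ram_bytes)
{
	return (ram_bytes >> DEMO_PAGE_SHIFT) > UINT32_MAX;
}

/* The balloon can run if RAM is small enough or the host reports the capability. */
static bool demo_balloon_can_run(uint64_t ram_bytes, uint32_t host_caps)
{
	if (!demo_needs_64bit_target(ram_bytes))
		return true;
	return (host_caps & VMW_BALLOON_64_BITS_TARGET) != 0;
}
```

With this sketch, a guest with 32TB of RAM can run the balloon only when the host capability mask contains the 64-bit target bit, while an 8TB guest runs either way.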
     
  • Following the new kernel policy.

    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     

29 Dec, 2018

2 commits

  • Pull char/misc driver updates from Greg KH:
    "Here is the big set of char and misc driver patches for 4.21-rc1.

    Lots of different types of driver things in here, as this tree seems
    to be the "collection of various driver subsystems not big enough to
    have their own git tree" lately.

    Anyway, some highlights of the changes in here:

    - binderfs: is it a rule that all driver subsystems will eventually
    grow to have their own filesystem? Binder now has one to handle the
    use of it in containerized systems.

    This was discussed at the Plumbers conference a few months ago and
    knocked into mergeable shape very fast by Christian Brauner. Who
    also has signed up to be another binder maintainer, showing a
    distinct lack of good judgement :)

    - binder updates and fixes

    - mei driver updates

    - fpga driver updates and additions

    - thunderbolt driver updates

    - soundwire driver updates

    - extcon driver updates

    - nvmem driver updates

    - hyper-v driver updates

    - coresight driver updates

    - pvpanic driver additions and reworking for more device support

    - lp driver updates. Yes really, it's _finally_ moved to the proper
    parallel port driver model, something I never thought I would see
    happen. Good stuff.

    - other tiny driver updates and fixes.

    All of these have been in linux-next for a while with no reported
    issues"

    * tag 'char-misc-4.21-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (116 commits)
    MAINTAINERS: add another Android binder maintainer
    intel_th: msu: Fix an off-by-one in attribute store
    stm class: Add a reference to the SyS-T document
    stm class: Fix a module refcount leak in policy creation error path
    char: lp: use new parport device model
    char: lp: properly count the lp devices
    char: lp: use first unused lp number while registering
    char: lp: detach the device when parallel port is removed
    char: lp: introduce list to save port number
    bus: qcom: remove duplicated include from qcom-ebi2.c
    VMCI: Use memdup_user() rather than duplicating its implementation
    char/rtc: Use of_node_name_eq for node name comparisons
    misc: mic: fix a DMA pool free failure
    ptp: fix an IS_ERR() vs NULL check
    genwqe: Fix size check
    binder: implement binderfs
    binder: fix use-after-free due to ksys_close() during fdget()
    bus: fsl-mc: remove duplicated include files
    bus: fsl-mc: explicitly define the fsl_mc_command endianness
    misc: ti-st: make array read_ver_cmd static, shrinks object size
    ...

    Linus Torvalds
     
  • totalram_pages and totalhigh_pages are made static inline functions.

    The main motivation was that managed_page_count_lock handling was
    complicating things. It was discussed at length here:
    https://lore.kernel.org/patchwork/patch/995739/#1181785 It seems
    better to remove the lock and convert the variables to atomic, with
    prevention of potential store-to-read tearing as a bonus.

    [akpm@linux-foundation.org: coding style fixes]
    Link: http://lkml.kernel.org/r/1542090790-21750-4-git-send-email-arunks@codeaurora.org
    Signed-off-by: Arun KS
    Suggested-by: Michal Hocko
    Suggested-by: Vlastimil Babka
    Reviewed-by: Konstantin Khlebnikov
    Reviewed-by: Pavel Tatashin
    Acked-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Cc: David Hildenbrand
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun KS
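
The pattern described above (an atomic counter hidden behind static inline accessors instead of a plain variable guarded by a lock) can be sketched in userspace with C11 atomics. This is an illustration only: the `demo_` names are hypothetical, and the kernel uses its own atomic_long_t API rather than <stdatomic.h>.

```c
#include <stdatomic.h>

/* Stand-in for the kernel's page counter; the name is illustrative. */
static atomic_long demo_totalram;

/* Readers call an accessor instead of touching the variable directly. */
static inline unsigned long demo_totalram_pages(void)
{
	return (unsigned long)atomic_load(&demo_totalram);
}

/* Writers update the counter atomically, so no lock is required. */
static inline void demo_totalram_pages_add(long count)
{
	atomic_fetch_add(&demo_totalram, count);
}
```

Because every load and store goes through an atomic operation, a reader can never observe a half-written value, which is the tearing concern the commit message mentions.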
     

06 Dec, 2018

1 commit

  • We already have DEFINE_SHOW_ATTRIBUTE. There is no need to define
    such a macro, so remove GENWQE_DEBUGFS_RO. Also use
    DEFINE_SHOW_ATTRIBUTE to simplify some code.

    Signed-off-by: Yangtao Li
    Signed-off-by: Greg Kroah-Hartman

    Yangtao Li
     

26 Sep, 2018

14 commits

  • It is useful to expose how many times the balloon resets. If it happens
    more than very rarely, this indicates a problem.

    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • Change all the remaining return values to int to avoid mistakes. Reduce
    indentation when possible.

    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • In preparation for supporting compaction and OOM notification, this
    patch reworks the inflate/deflate loops. The main idea is to separate
    the allocation, communication with the hypervisor, and the handling of
    errors from each other. Doing so will allow us to perform concurrent
    inflation and deflation, excluding the actual communication with the
    hypervisor.

    To do so, we need to get rid of the remaining global state that is kept
    in the balloon struct, specifically the refuse_list. When the VM
    communicates with the hypervisor, it does not free or put back pages
    to the balloon list and instead only moves the pages whose status
    indicated failure into a refuse_list on the stack. Once the operation
    completes, the inflation or deflation functions handle the list
    appropriately.

    As we do that, we can consolidate the communication with the hypervisor
    for both the lock and unlock operations into a single function. We also
    reuse the deflation function for popping the balloon.

    As a preparation for preventing races, we hold a spinlock when the
    communication actually takes place, and use atomic operations for
    updating the balloon size. The balloon page list is still racy and will
    be handled in the next patch.

    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • To allow the balloon statistics to be updated concurrently, we change
    the statistics to be held per core and aggregate it when needed.

    To avoid the memory overhead of keeping the statistics per core, and
    since it is likely not used by most users, we start updating the
    statistics only after the first use. A read-write semaphore is used to
    protect the statistics initialization and avoid races. This semaphore
    will also be used to protect configuration changes during reset.

    While we are at it, address some other issues: change the statistics
    update to inline functions instead of defines; use ulong for saving the
    statistics; and clean up the statistics printouts.

    Note that this patch changes the format of the outputs. If there are any
    automatic tools that use the statistics, they might fail.

    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
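
As an illustration of statistics that are held per core and aggregated only when read, here is a minimal userspace sketch. The fixed CPU count and all `demo_` names are assumptions; the lazy allocation and the read-write semaphore described above are deliberately left out.

```c
#include <stddef.h>

#define DEMO_NR_CPUS 4 /* hypothetical fixed CPU count */

enum demo_stat { DEMO_STAT_INFLATE, DEMO_STAT_DEFLATE, DEMO_STAT_NUM };

/* One row of counters per CPU, so updates never touch another CPU's row. */
struct demo_stats {
	unsigned long counters[DEMO_NR_CPUS][DEMO_STAT_NUM];
};

/* Update path: an inline function that bumps only the caller's row. */
static inline void demo_stat_inc(struct demo_stats *s, unsigned int cpu,
				 enum demo_stat stat)
{
	s->counters[cpu][stat]++;
}

/* Read path: sum the rows only when somebody asks for the value. */
static unsigned long demo_stat_read(const struct demo_stats *s,
				    enum demo_stat stat)
{
	unsigned long sum = 0;

	for (size_t cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		sum += s->counters[cpu][stat];
	return sum;
}
```

The trade-off is the one the commit message names: updates become cheap and contention-free, at the cost of one counter array per core.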
     
  • As we want to leave as little as possible on the global balloon
    structure, to avoid possible future races, we want to get rid of sysinfo.
    We can actually get the total_ram directly, and simplify the logic of
    vmballoon_send_get_target() a little.

    While we are doing that, let's return int and avoid mistakes due to
    bool/int conversions.

    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • The required change in the balloon size is currently computed in
    vmballoon_work(), vmballoon_inflate() and vmballoon_deflate(). Refactor
    it to simplify the next patches.

    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • The name of the macro VMW_BALLOON_2M_SHIFT is misleading. The value
    reflects 2M huge-page order. Unfortunately, we cannot use
    HPAGE_PMD_ORDER, since it is not defined when transparent huge-pages are
    off, so we need to define our own one.

    Rename it to VMW_BALLOON_2M_ORDER. No functional change.

    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
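
The difference between a shift and an order can be made concrete with a small sketch: a shift of 21 turns 1 into the number of bytes in 2MB, while an order of 9 gives the number of 4KB pages in 2MB. The values below assume 4KB base pages and 2MB huge pages (as on x86-64), and the `demo_` helpers are hypothetical.

```c
#include <assert.h>

#define DEMO_PAGE_SHIFT 12                    /* assumed 4KB base pages */
#define DEMO_2M_SHIFT   21                    /* log2 of 2MB, in bytes */
#define VMW_BALLOON_2M_ORDER (DEMO_2M_SHIFT - DEMO_PAGE_SHIFT) /* 9 under these assumptions */

static_assert(VMW_BALLOON_2M_ORDER == 9, "2MB / 4KB = 512 = 1 << 9");

/* Number of base pages covered by an allocation of the given order. */
static unsigned long demo_pages_per_order(unsigned int order)
{
	return 1UL << order;
}

/* Number of bytes covered by an allocation of the given order. */
static unsigned long demo_bytes_per_order(unsigned int order)
{
	return demo_pages_per_order(order) << DEMO_PAGE_SHIFT;
}
```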
     
  • Currently, when the hypervisor rejects a page during lock operation, the
    VM treats pages differently according to the error-code: in certain
    cases the page is immediately freed, and in others it is put on a
    rejection list and only freed later.

    The behavior does not make too much sense. If the page is freed
    immediately it is very likely to be used again in the next batch of
    allocations, and be rejected again.

    In addition, for support of compaction and OOM notifiers, we wish to
    separate the logic that communicates with the hypervisor (as well as
    analyzes the status of each page) from the logic that allocates or frees
    pages.

    Treat all errors the same way, queuing the pages on the refuse list.
    Move to the next allocation size (4k) when too many pages are refused.
    Free the refused pages when moving to the next size to avoid situations
    in which too much memory is waiting to be freed on the refused list.

    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • The current abstractions for batch vs single operations seem suboptimal
    and complicate the implementation of additional features (OOM,
    compaction).

    The immediate problem of the current abstractions is that they cause
    differences in how operations are handled when batching is on or off.
    For example, the refused_alloc counter is not updated when batching is
    on. These discrepancies are caused by code redundancies.

    Instead, this patch presents three types of operations, according to
    whether batching is on or off: (1) add page, (2) communication with the
    hypervisor and (3) retrieving the status of a page.

    To avoid the overhead of virtual functions, and since we do not expect
    additional interfaces for communication with the hypervisor, we use
    static keys instead of virtual functions.

    Finally, while we are at it, change vmballoon_init_batching() to return
    int instead of bool, to be consistent in the return type and avoid
    potential coding errors.

    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • Splitting the allocations between sleeping and non-sleeping made some
    sort of sense as long as rate-limiting was enabled. Now that it is
    removed, we need to decide - either we want sleeping allocations or not.

    Since no other Linux balloon driver (hv, Xen, virtio) uses sleeping
    allocations, use the same approach.

    We do distinguish, however, between 2MB allocations and 4kB allocations
    and prevent reclamation on 2MB. In both cases, we avoid using emergency
    low-memory pools, as it may cause undesired effects.

    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • The use of accessors for batch entries complicates the code and makes it
    less readable. Remove them and use bit-fields instead.

    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
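
To show what replacing accessors with bit-fields can look like, here is a hedged sketch of a 64-bit batch entry. The field widths (5 status bits, 7 reserved bits, 52 bits of page frame number) and the `demo_` names are assumptions for illustration, and bit-field layout is compiler-defined, so the sketch only relies on the fields reading back what was stored.

```c
#include <assert.h>
#include <stdint.h>

/* One batch entry packed into a single 64-bit word (assumed layout). */
struct demo_batch_entry {
	uint64_t status : 5;    /* per-page status written by the host */
	uint64_t reserved : 7;  /* padding up to the 4KB page boundary bits */
	uint64_t pfn : 52;      /* page frame number supplied by the guest */
};

static_assert(sizeof(struct demo_batch_entry) == sizeof(uint64_t),
	      "an entry must stay one 64-bit word");

/* Build an entry for a page; no masks or shifts are needed. */
static struct demo_batch_entry demo_make_entry(uint64_t pfn)
{
	struct demo_batch_entry e = { .status = 0, .reserved = 0, .pfn = pfn };

	return e;
}
```

Compared with accessor helpers built from masks and shifts, the fields are read and written by name, which is the readability gain the commit message refers to.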
     
  • The lock and unlock code paths are very similar, so avoid the duplicate
    code by merging them together.

    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • Now that we have a single point, unify the tracing and collecting the
    statistics for commands and their failure. While it might somewhat
    reduce the control over debugging, it cleans the code a lot.

    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • By inlining the hypercall interface, we can unify several operations
    into one central point in the code:

    - Updating the target.
    - Updating when a reset is needed.
    - Update statistics (which will be done later in the patch-set).
    - Print debug-messages (although they cannot be enabled as selectively).

    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     

16 Jul, 2018

1 commit


03 Jul, 2018

7 commits

  • Embarrassingly, the recent fix introduced a worse problem than it solved,
    causing the balloon not to inflate. The VM informed the hypervisor that
    the pages for lock/unlock are sitting at the wrong address, as it used
    the uninitialized page variable.

    Fixes: b23220fe054e9 ("vmw_balloon: fixing double free when batching mode is off")
    Cc: stable@vger.kernel.org
    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • Remove the GPL wording and replace it with an SPDX tag. The immediate
    trigger for doing it now is the need to remove the list of maintainers
    from the source file, as the maintainer list changed.

    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • Since commit 33d268ed0019 ("VMware balloon: Do not limit the amount of
    frees and allocations in non-sleep mode."), the allocations are not
    increased, and therefore balloon inflation rate limiting is in practice
    broken.

    While we can restore rate limiting, in practice we see that it can
    result in adverse effects, as the hypervisor throttles down the VM if it
    does not respond well enough, or alternatively causes it to perform very
    poorly as the host swaps out the VM memory. Throttling the VM down can
    even have a cascading effect, in which the VM reclaims memory even
    more slowly and is consequently throttled down even further.

    We therefore remove all the rate limiting mechanisms, including the slow
    allocation cycles, as they are likely to do more harm than good.

    Fixes: 33d268ed0019 ("VMware balloon: Do not limit the amount of frees and allocations in non-sleep mode.")
    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • Currently, when all modules, including VMCI and the VMware balloon, are built
    into the kernel, the initialization of the balloon happens before the
    VMCI is probed. As a result, the balloon fails to initialize the VMCI
    doorbell, which it uses to get asynchronous requests for balloon size
    changes.

    The problem can be seen in the logs, in the form of the following
    message:
    "vmw_balloon: failed to initialize vmci doorbell"

    The driver would work correctly but slightly less efficiently, probing
    for requests periodically. This patch changes the balloon to be
    initialized using late_initcall() instead of module_init() to address
    this issue. It does not address a situation in which VMCI is built as a
    module and the balloon is built into the kernel.

    Fixes: 48e3d668b790 ("VMware balloon: Enable notification via VMCI")
    Cc: stable@vger.kernel.org
    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • When vmballoon_vmci_init() sets a doorbell using VMCI_DOORBELL_SET, for
    some reason it does not consider the status and looks at the result.
    However, the hypervisor does not update the result - it updates the
    status. This might cause VMCI doorbell not to be enabled, resulting in
    degraded performance.

    Fixes: 48e3d668b790 ("VMware balloon: Enable notification via VMCI")
    Cc: stable@vger.kernel.org
    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
     
  • If the hypervisor sets 2MB batching on while batching is cleared,
    the balloon code breaks. In this case the legacy mechanism is used with
    a 2MB page. The VM would report that a 2MB page is ballooned, and the
    hypervisor would only take the first 4KB.

    While the hypervisor should not report such settings, make the code more
    robust by not enabling 2MB support without batching.

    Fixes: 365bd7ef7ec8e ("VMware balloon: Support 2m page ballooning.")
    Cc: stable@vger.kernel.org
    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
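
The defensive rule described above (never enable 2MB support unless batching is also enabled) amounts to masking one capability bit by another. A minimal sketch follows; the bit values and `DEMO_`/`demo_` names are placeholders, not the driver's definitions.

```c
#include <stdint.h>

/* Placeholder capability bits (assumptions for this sketch). */
#define DEMO_CAP_BATCHED_CMDS    (1u << 0)
#define DEMO_CAP_BATCHED_2M_CMDS (1u << 1)

/* Drop 2MB support when the host did not also enable batching. */
static uint32_t demo_sanitize_caps(uint32_t host_caps)
{
	if (!(host_caps & DEMO_CAP_BATCHED_CMDS))
		host_caps &= ~DEMO_CAP_BATCHED_2M_CMDS;
	return host_caps;
}
```

With this in place, an inconsistent host report of "2MB without batching" degrades to plain 4KB ballooning instead of reporting 2MB pages through the legacy path.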
     
  • When balloon batching is not supported by the hypervisor, the guest
    frame number (GFN) must fit in 32 bits. However, due to a bug, this check
    was mistakenly ignored. In practice, when total RAM is greater than
    16TB, the balloon does not currently work, making this bug unlikely to
    happen.

    Fixes: ef0f8f112984 ("VMware balloon: partially inline vmballoon_reserve_page.")
    Cc: stable@vger.kernel.org
    Reviewed-by: Xavier Deguillard
    Signed-off-by: Nadav Amit
    Signed-off-by: Greg Kroah-Hartman

    Nadav Amit
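
A check of the kind described above can be written as a round trip through a 32-bit type: if truncating the frame number changes its value, it does not fit. This is a userspace sketch with a hypothetical function name and signature, not the driver's code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Without batching the frame number travels in a 32-bit field, so reject
 * any GFN that would be truncated. With batching, 64-bit entries are used
 * and no such limit applies in this sketch.
 */
static int demo_check_gfn(uint64_t gfn, bool batching)
{
	if (!batching && gfn != (uint32_t)gfn)
		return -EINVAL;
	return 0;
}
```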
     

03 Jun, 2018

1 commit

  • The balloon.page field is used for two different purposes depending on
    whether batching is on or off. If batching is on, the field points to the
    page which is used to communicate with the hypervisor. If it is off,
    balloon.page points to the page that is about to be (un)locked.

    Unfortunately, this dual-purpose of the field introduced a bug: when the
    balloon is popped (e.g., when the machine is reset or the balloon driver
    is explicitly removed), the balloon driver frees, unconditionally, the
    page that is held in balloon.page. As a result, if batching is
    disabled, this leads to double freeing the last page that is sent to the
    hypervisor.

    The following error occurs during rmmod when kernel checkers are on, and
    the balloon is not empty:

    [ 42.307653] ------------[ cut here ]------------
    [ 42.307657] Kernel BUG at ffffffffba1e4b28 [verbose debug info unavailable]
    [ 42.307720] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
    [ 42.312512] Modules linked in: vmw_vsock_vmci_transport vsock ppdev joydev vmw_balloon(-) input_leds serio_raw vmw_vmci parport_pc shpchp parport i2c_piix4 nfit mac_hid autofs4 vmwgfx drm_kms_helper hid_generic syscopyarea sysfillrect usbhid sysimgblt fb_sys_fops hid ttm mptspi scsi_transport_spi ahci mptscsih drm psmouse vmxnet3 libahci mptbase pata_acpi
    [ 42.312766] CPU: 10 PID: 1527 Comm: rmmod Not tainted 4.12.0+ #5
    [ 42.312803] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/30/2016
    [ 42.313042] task: ffff9bf9680f8000 task.stack: ffffbfefc1638000
    [ 42.313290] RIP: 0010:__free_pages+0x38/0x40
    [ 42.313510] RSP: 0018:ffffbfefc163be98 EFLAGS: 00010246
    [ 42.313731] RAX: 000000000000003e RBX: ffffffffc02b9720 RCX: 0000000000000006
    [ 42.313972] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9bf97e08e0a0
    [ 42.314201] RBP: ffffbfefc163be98 R08: 0000000000000000 R09: 0000000000000000
    [ 42.314435] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffc02b97e4
    [ 42.314505] R13: ffffffffc02b9748 R14: ffffffffc02b9728 R15: 0000000000000200
    [ 42.314550] FS: 00007f3af5fec700(0000) GS:ffff9bf97e080000(0000) knlGS:0000000000000000
    [ 42.314599] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    [ 42.314635] CR2: 00007f44f6f4ab24 CR3: 00000003a7d12000 CR4: 00000000000006e0
    [ 42.314864] Call Trace:
    [ 42.315774] vmballoon_pop+0x102/0x130 [vmw_balloon]
    [ 42.315816] vmballoon_exit+0x42/0xd64 [vmw_balloon]
    [ 42.315853] SyS_delete_module+0x1e2/0x250
    [ 42.315891] entry_SYSCALL_64_fastpath+0x23/0xc2
    [ 42.315924] RIP: 0033:0x7f3af5b0e8e7
    [ 42.315949] RSP: 002b:00007fffe6ce0148 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
    [ 42.315996] RAX: ffffffffffffffda RBX: 000055be676401e0 RCX: 00007f3af5b0e8e7
    [ 42.316951] RDX: 000000000000000a RSI: 0000000000000800 RDI: 000055be67640248
    [ 42.317887] RBP: 0000000000000003 R08: 0000000000000000 R09: 1999999999999999
    [ 42.318845] R10: 0000000000000883 R11: 0000000000000206 R12: 00007fffe6cdf130
    [ 42.319755] R13: 0000000000000000 R14: 0000000000000000 R15: 000055be676401e0
    [ 42.320606] Code: c0 74 1c f0 ff 4f 1c 74 02 5d c3 85 f6 74 07 e8 0f d8 ff ff 5d c3 31 f6 e8 c6 fb ff ff 5d c3 48 c7 c6 c8 0f c5 ba e8 58 be 02 00 0b 66 0f 1f 44 00 00 66 66 66 66 90 48 85 ff 75 01 c3 55 48
    [ 42.323462] RIP: __free_pages+0x38/0x40 RSP: ffffbfefc163be98
    [ 42.325735] ---[ end trace 872e008e33f81508 ]---

    To solve the bug, we eliminate the dual purpose of balloon.page.

    Fixes: f220a80f0c2e ("VMware balloon: add batching to the vmw_balloon.")
    Cc: stable@vger.kernel.org
    Reported-by: Oleksandr Natalenko
    Signed-off-by: Gil Kupfer
    Signed-off-by: Nadav Amit
    Reviewed-by: Xavier Deguillard
    Tested-by: Oleksandr Natalenko
    Signed-off-by: Greg Kroah-Hartman

    Gil Kupfer
     

10 Nov, 2017

1 commit

  • The x86_hyper pointer is only used for checking whether a virtual
    device is supporting the hypervisor the system is running on.

    Use an enum for that purpose instead and drop the x86_hyper pointer.

    Signed-off-by: Juergen Gross
    Acked-by: Thomas Gleixner
    Acked-by: Xavier Deguillard
    Cc: Linus Torvalds
    Cc: Peter Zijlstra
    Cc: akataria@vmware.com
    Cc: arnd@arndb.de
    Cc: boris.ostrovsky@oracle.com
    Cc: devel@linuxdriverproject.org
    Cc: dmitry.torokhov@gmail.com
    Cc: gregkh@linuxfoundation.org
    Cc: haiyangz@microsoft.com
    Cc: kvm@vger.kernel.org
    Cc: kys@microsoft.com
    Cc: linux-graphics-maintainer@vmware.com
    Cc: linux-input@vger.kernel.org
    Cc: moltmann@vmware.com
    Cc: pbonzini@redhat.com
    Cc: pv-drivers@vmware.com
    Cc: rkrcmar@redhat.com
    Cc: sthemmin@microsoft.com
    Cc: virtualization@lists.linux-foundation.org
    Cc: xen-devel@lists.xenproject.org
    Link: http://lkml.kernel.org/r/20171109132739.23465-3-jgross@suse.com
    Signed-off-by: Ingo Molnar

    Juergen Gross
     

07 Nov, 2015

1 commit

  • __GFP_WAIT was used to signal that the caller was in atomic context and
    could not sleep. Now it is possible to distinguish between true atomic
    context and callers that are not willing to sleep. The latter should
    clear __GFP_DIRECT_RECLAIM so kswapd will still wake. As clearing
    __GFP_WAIT behaves differently, there is a risk that people will clear the
    wrong flags. This patch renames __GFP_WAIT to __GFP_RECLAIM to clearly
    indicate what it does -- setting it allows all reclaim activity, clearing
    it prevents it.

    [akpm@linux-foundation.org: fix build]
    [akpm@linux-foundation.org: coding-style fixes]
    Signed-off-by: Mel Gorman
    Acked-by: Michal Hocko
    Acked-by: Vlastimil Babka
    Acked-by: Johannes Weiner
    Cc: Christoph Lameter
    Acked-by: David Rientjes
    Cc: Vitaly Wool
    Cc: Rik van Riel
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Mel Gorman
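
The flag relationship described above can be sketched with plain bit masks. The sketch assumes that __GFP_RECLAIM is the union of a direct-reclaim bit and a kswapd-reclaim bit; the bit positions and all `DEMO_`/`demo_` names are placeholders, since the kernel's actual values vary between versions.

```c
#include <stdbool.h>

/* Placeholder bit positions (assumptions for this sketch). */
#define DEMO_GFP_DIRECT_RECLAIM 0x1u
#define DEMO_GFP_KSWAPD_RECLAIM 0x2u
#define DEMO_GFP_RECLAIM (DEMO_GFP_DIRECT_RECLAIM | DEMO_GFP_KSWAPD_RECLAIM)

/* The caller may block in reclaim only when direct reclaim is allowed. */
static bool demo_may_sleep(unsigned int gfp)
{
	return (gfp & DEMO_GFP_DIRECT_RECLAIM) != 0;
}

/* Background reclaim is still kicked when the kswapd bit is set. */
static bool demo_wakes_kswapd(unsigned int gfp)
{
	return (gfp & DEMO_GFP_KSWAPD_RECLAIM) != 0;
}

/* A caller unwilling to sleep clears only the direct-reclaim bit. */
static unsigned int demo_no_sleep(unsigned int gfp)
{
	return gfp & ~DEMO_GFP_DIRECT_RECLAIM;
}
```

Clearing the whole DEMO_GFP_RECLAIM mask instead would also stop kswapd from being woken, which is the mistake the rename is meant to make harder.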
     

04 Oct, 2015

7 commits

  • Get notified immediately when a balloon target is set, instead of waiting for
    up to one second.

    The up-to 1 second gap could be long enough to cause swapping inside of the
    VM that receives the target.

    Acked-by: Andy King
    Signed-off-by: Xavier Deguillard
    Tested-by: Siva Sankar Reddy B
    Signed-off-by: Greg Kroah-Hartman

    Philip P. Moltmann
     
  • Unify the behavior of the first start of the balloon and a reset. Also on
    unload, declare that the balloon driver does not have any capabilities
    anymore.

    Acked-by: Andy King
    Signed-off-by: Xavier Deguillard
    Signed-off-by: Greg Kroah-Hartman

    Philip P. Moltmann
     
  • 2m ballooning significantly reduces the hypervisor side (and guest side)
    overhead of ballooning and unballooning.

    hypervisor only:
              balloon    unballoon
      4 KB    2 GB/s     2.6 GB/s
      2 MB    54 GB/s    767 GB/s

    Use 2 MB pages as the hypervisor is always 64bit and 2 MB is the smallest
    supported super-page size.

    The code has to run on older versions of ESX and old balloon drivers run on
    newer versions of ESX. Hence match the capabilities with the host before 2m
    page ballooning can be enabled.

    Signed-off-by: Xavier Deguillard
    Signed-off-by: Greg Kroah-Hartman

    Philip P. Moltmann
     
  • When VMware's hypervisor requests a VM to reclaim memory this is preferably done
    via ballooning. If the balloon driver does not return memory fast enough, more
    drastic methods, such as hypervisor-level swapping are needed. These other methods
    cause performance issues, e.g. hypervisor-level swapping requires the hypervisor to
    swap in a page synchronously while the virtual CPU is blocked.

    Hence it is in the interest of the VM to balloon memory as fast as possible. The
    problem with doing this is that the VM might end up doing nothing else than
    ballooning and the user might notice that the VM is stalled, esp. when the VM has
    only a single virtual CPU.

    This is less of a problem if the VM and the hypervisor perform balloon operations
    faster. Also the balloon driver yields regularly, hence on a single virtual CPU
    the Linux scheduler should be able to properly time-slice between ballooning and
    other tasks.

    Testing Done: quickly ballooned a lot of pages while watching if there are any
    perceived hiccups (periods of non-responsiveness) in the execution of the
    Linux VM. No such hiccups were seen.

    Signed-off-by: Xavier Deguillard
    Signed-off-by: Greg Kroah-Hartman

    Philip P. Moltmann
     
  • This helps with debugging vmw_balloon behavior, as it is clear what
    functionality is enabled.

    Acked-by: Andy King
    Signed-off-by: Xavier Deguillard
    Signed-off-by: Greg Kroah-Hartman

    Philip P. Moltmann
     
  • Instead of waiting for the next GET_TARGET command, we can react faster
    by exploiting the fact that each hypervisor call also returns the
    balloon target.

    Signed-off-by: Xavier Deguillard
    Acked-by: Dmitry Torokhov
    Signed-off-by: Philip P. Moltmann
    Acked-by: Andy King
    Signed-off-by: Greg Kroah-Hartman

    Xavier Deguillard
     
  • Introduce a new capability to the driver that allows sending 512 pages in
    one hypervisor call. This reduces the cost of the driver when reclaiming
    memory.

    Signed-off-by: Xavier Deguillard
    Acked-by: Dmitry Torokhov
    Signed-off-by: Philip P. Moltmann
    Signed-off-by: Greg Kroah-Hartman

    Xavier Deguillard
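
One way to arrive at 512 pages per call is a single 4KB communication page filled with 64-bit entries: 4096 / 8 = 512. That derivation is an assumption of this sketch rather than something the commit message states, and the `DEMO_`/`demo_` names are hypothetical. The helper shows how batching cuts the number of hypervisor calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u /* assumed 4KB communication page */
#define DEMO_BATCH_MAX_PAGES (DEMO_PAGE_SIZE / sizeof(uint64_t))

static_assert(DEMO_BATCH_MAX_PAGES == 512,
	      "one 4KB page holds 512 64-bit entries");

/* Number of hypervisor calls needed to report n_pages, rounding up. */
static size_t demo_calls_needed(size_t n_pages, size_t batch)
{
	assert(batch > 0);
	return (n_pages + batch - 1) / batch;
}
```

Under these assumptions, reporting 1024 pages takes 1024 calls one page at a time but only 2 calls with full batches.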