06 Apr, 2019

1 commit

  • [ Upstream commit fa13e665e02874c0a5f4d06d6967ae34a6cb3d6a ]

    If there are exported DMA buffers which are still in use and
    grant device is closed by either normal user-space close or by
    a signal this leads to the grant device context to be destroyed,
    thus making it not possible to correctly destroy those exported
    buffers when they are returned back to gntdev and makes the module
    crash:

    [ 339.617540] dmabuf_exp_ops_release+0x40/0xa8
    [ 339.617560] dma_buf_release+0x60/0x190
    [ 339.617577] __fput+0x88/0x1d0
    [ 339.617589] ____fput+0xc/0x18
    [ 339.617607] task_work_run+0x9c/0xc0
    [ 339.617622] do_notify_resume+0xfc/0x108

    Fix this by referencing gntdev on each DMA buffer export and
    unreferencing on buffer release.

    Signed-off-by: Oleksandr Andrushchenko
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Juergen Gross
    Signed-off-by: Sasha Levin

    Oleksandr Andrushchenko
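
The reference counting described in the commit above can be sketched in plain userspace C. This is a minimal model, not the gntdev code: `dev_ctx`, `exported_buf`, and the helper names are hypothetical, and the counter is a plain `int` rather than an atomic kernel refcount.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the grant device context. */
struct dev_ctx {
    int refcount;   /* one reference for the open file, one per exported buffer */
    int destroyed;  /* set when the last reference is dropped */
};

/* Hypothetical stand-in for an exported DMA buffer. */
struct exported_buf {
    struct dev_ctx *owner;
};

static void ctx_get(struct dev_ctx *ctx)
{
    ctx->refcount++;
}

static void ctx_put(struct dev_ctx *ctx)
{
    assert(ctx->refcount > 0);
    if (--ctx->refcount == 0)
        ctx->destroyed = 1;  /* real code would release resources here */
}

/* Exporting a buffer pins the context it came from. */
static void buf_export(struct exported_buf *buf, struct dev_ctx *ctx)
{
    buf->owner = ctx;
    ctx_get(ctx);
}

/* Releasing the buffer drops the pin; the context may go away now. */
static void buf_release(struct exported_buf *buf)
{
    ctx_put(buf->owner);
    buf->owner = NULL;
}
```

In this model, closing the device while a buffer is still exported only drops one reference, so the context stays valid until the buffer is released.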
     

27 Feb, 2019

7 commits

  • [ Upstream commit b4711098066f1cf808d4dc11a1a842860a3292fe ]

    static checker warning:
    drivers/xen/pvcalls-front.c:373 alloc_active_ring()
    error: we previously assumed 'map->active.ring' could be null
    (see line 357)

    drivers/xen/pvcalls-front.c
       351  static int alloc_active_ring(struct sock_mapping *map)
       352  {
       353          void *bytes;
       354
       355          map->active.ring = (struct pvcalls_data_intf *)
       356                  get_zeroed_page(GFP_KERNEL);
       357          if (!map->active.ring)
                        ^^^^^^^^^^^^^^^^^
                        Check

       358                  goto out;
       359
       360          map->active.ring->ring_order = PVCALLS_RING_ORDER;
       361          bytes = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
       362                                           PVCALLS_RING_ORDER);
       363          if (!bytes)
       364                  goto out;
       365
       366          map->active.data.in = bytes;
       367          map->active.data.out = bytes +
       368                  XEN_FLEX_RING_SIZE(PVCALLS_RING_ORDER);
       369
       370          return 0;
       371
       372  out:
    --> 373         free_active_ring(map);
                                     ^^^

    Add null check on map->active.ring before dereferencing it to avoid
    any NULL pointer dereferences.

    Fixes: 9f51c05dc41a ("pvcalls-front: Avoid get_free_pages(GFP_KERNEL) under spinlock")
    Reported-by: Dan Carpenter
    Suggested-by: Boris Ostrovsky
    Signed-off-by: Wen Yang
    Reviewed-by: Boris Ostrovsky
    CC: Boris Ostrovsky
    CC: Juergen Gross
    CC: Stefano Stabellini
    CC: Dan Carpenter
    CC: xen-devel@lists.xenproject.org
    CC: linux-kernel@vger.kernel.org
    Signed-off-by: Boris Ostrovsky
    Signed-off-by: Sasha Levin

    Wen Yang
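
The pattern behind the fix above, a cleanup helper that tolerates a partially initialised object, can be sketched in userspace C. This is an illustration only: `struct mapping`, `free_ring()`, and `alloc_ring()` are hypothetical names, and `calloc()`/`free()` stand in for the kernel page allocator.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct ring_hdr {
    int ring_order;
};

struct mapping {
    struct ring_hdr *ring;
    void *data;
};

/* Safe to call on the error path, even if nothing was allocated yet. */
static void free_ring(struct mapping *m)
{
    int order;

    if (!m->ring)
        return;  /* the guard: never dereference a NULL ring header */

    /* The header is read before it is freed, as the kernel code needs
     * the stored order to free the data pages. */
    order = m->ring->ring_order;
    (void)order;

    free(m->data);
    free(m->ring);
    m->data = NULL;
    m->ring = NULL;
}

/* fail_ring simulates a failure of the first allocation. */
static int alloc_ring(struct mapping *m, int fail_ring)
{
    m->data = NULL;
    m->ring = fail_ring ? NULL : calloc(1, sizeof(*m->ring));
    if (!m->ring)
        goto out;

    m->ring->ring_order = 2;
    m->data = calloc(1, 4096u << m->ring->ring_order);
    if (!m->data)
        goto out;

    return 0;

out:
    free_ring(m);
    return -ENOMEM;
}
```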
     
  • [ Upstream commit 9f51c05dc41a6d69423e3d03d18eb7ab22f9ec19 ]

    The problem is that we call this with a spin lock held.
    The call tree is:
    pvcalls_front_accept() holds bedata->socket_lock.
    -> create_active()
    -> __get_free_pages() uses GFP_KERNEL

    The create_active() function is only called from pvcalls_front_accept()
    with a spin lock held. The allocation must not sleep there, so
    GFP_KERNEL (which may sleep) is not suitable.

    This issue was detected by using the Coccinelle software.

    v2: Add a function doing the allocations which is called
    outside the lock and passing the allocated data to
    create_active().

    v3: Use the matching deallocators i.e., free_page()
    and free_pages(), respectively.

    v4: It would be better to pre-populate map (struct sock_mapping),
    rather than introducing one more new struct.

    v5: Since allocating the data outside of this call it should also
    be freed outside, when create_active() fails.
    Move kzalloc(sizeof(*map2), GFP_ATOMIC) outside spinlock and
    use GFP_KERNEL instead.

    v6: Drop the superfluous calls.

    Suggested-by: Juergen Gross
    Suggested-by: Boris Ostrovsky
    Suggested-by: Stefano Stabellini
    Signed-off-by: Wen Yang
    Acked-by: Stefano Stabellini
    CC: Julia Lawall
    CC: Boris Ostrovsky
    CC: Juergen Gross
    CC: Stefano Stabellini
    CC: xen-devel@lists.xenproject.org
    CC: linux-kernel@vger.kernel.org
    Signed-off-by: Boris Ostrovsky
    Signed-off-by: Sasha Levin

    Wen Yang
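
The approach the changelog above converges on (allocate before taking the lock, then hand the memory to the locked section) can be modelled in userspace C. This is a sketch under stated assumptions: `fake_lock` is a hypothetical flag that only records whether the lock is held, `sleeping_alloc()` stands in for a GFP_KERNEL allocation, and the function names merely echo the commit message.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical lock model: it only tracks whether it is held. */
struct fake_lock {
    int held;
};

static struct fake_lock socket_lock;

static void fake_lock_acquire(struct fake_lock *l)
{
    assert(!l->held);
    l->held = 1;
}

static void fake_lock_release(struct fake_lock *l)
{
    assert(l->held);
    l->held = 0;
}

/* Stand-in for an allocation that may sleep: forbidden under the lock. */
static void *sleeping_alloc(size_t n)
{
    assert(!socket_lock.held);
    return calloc(1, n);
}

struct conn {
    void *ring;
    int active;
};

/* Runs under the lock and only consumes memory allocated beforehand. */
static void create_active(struct conn *c, void *ring)
{
    assert(socket_lock.held);
    c->ring = ring;
    c->active = 1;
}

static int accept_conn(struct conn *c)
{
    void *ring = sleeping_alloc(4096);  /* allocate outside the lock */

    if (!ring)
        return -1;

    fake_lock_acquire(&socket_lock);
    create_active(c, ring);
    fake_lock_release(&socket_lock);
    return 0;
}
```

If the allocation were moved inside the locked section, the assertion in `sleeping_alloc()` would fire, which is the userspace analogue of the warning Coccinelle raised.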
     
  • [ Upstream commit 1f8ce09b36c41a026a37a24b20efa32000892a64 ]

    Fixes gcc '-Wunused-but-set-variable' warning:

    drivers/xen/pvcalls-back.c: In function 'pvcalls_sk_state_change':
    drivers/xen/pvcalls-back.c:286:28: warning:
    variable 'intf' set but not used [-Wunused-but-set-variable]

    It is not used since e6587cdbd732 ("pvcalls-back: set -ENOTCONN in
    pvcalls_conn_back_read")

    Signed-off-by: YueHaibing
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky
    Signed-off-by: Sasha Levin

    YueHaibing
     
  • [ Upstream commit e6587cdbd732eacb4c7ce592ed46f7bbcefb655f ]

    When a connection is closing we receive a pvcalls_sk_state_change
    notification. Instead of setting the connection as closed immediately
    (-ENOTCONN), let's read one more time from it: pvcalls_conn_back_read
    will set the connection as closed when necessary.

    That way, we avoid races between pvcalls_sk_state_change and
    pvcalls_back_ioworker.

    Signed-off-by: Stefano Stabellini
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky
    Signed-off-by: Sasha Levin

    Stefano Stabellini
     
  • [ Upstream commit beee1fbe8f7d57d6ebaa5188f9f4db89c2077196 ]

    Don't use kzalloc: it ends up leaving sk->sk_prot not properly
    initialized. Use sk_alloc instead and define our own trivial struct
    proto.

    Signed-off-by: Stefano Stabellini
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky
    Signed-off-by: Sasha Levin

    Stefano Stabellini
     
  • [ Upstream commit 96283f9a084e23d7cda2d3c5d1ffa6df6cf1ecec ]

    inflight_req_id is 0 when initialized. If inflight_req_id is 0, there is
    no accept_map to free. Fix the check in pvcalls_front_release.

    Signed-off-by: Stefano Stabellini
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky
    Signed-off-by: Sasha Levin

    Stefano Stabellini
     
  • [ Upstream commit b79470b64fa9266948d1ce8d825ced94c4f63293 ]

    When a connection is closing in_error is set to ENOTCONN. There could
    still be outstanding data on the ring left by the backend. Before
    closing the connection on the frontend side, drain the ring.

    Signed-off-by: Stefano Stabellini
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky
    Signed-off-by: Sasha Levin

    Stefano Stabellini
     

23 Jan, 2019

1 commit

  • commit 867cefb4cb1012f42cada1c7d1f35ac8dd276071 upstream.

    Commit f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable'
    sched_clock() interface") broke Xen guest time handling across
    migration:

    [ 187.249951] Freezing user space processes ... (elapsed 0.001 seconds) done.
    [ 187.251137] OOM killer disabled.
    [ 187.251137] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
    [ 187.252299] suspending xenstore...
    [ 187.266987] xen:grant_table: Grant tables using version 1 layout
    [18446743811.706476] OOM killer enabled.
    [18446743811.706478] Restarting tasks ... done.
    [18446743811.720505] Setting capacity to 16777216

    Fix that by setting xen_sched_clock_offset at resume time to ensure a
    monotonic clock value.

    [boris: replaced pr_info() with pr_info_once() in xen_callback_vector()
    to avoid printing with incorrect timestamp during resume (as we
    haven't re-adjusted the clock yet)]

    Fixes: f94c8d11699759 ("sched/clock, x86/tsc: Rework the x86 'unstable' sched_clock() interface")
    Cc: # 4.11
    Reported-by: Hans van Kranenburg
    Signed-off-by: Juergen Gross
    Tested-by: Hans van Kranenburg
    Signed-off-by: Boris Ostrovsky
    Signed-off-by: Greg Kroah-Hartman

    Juergen Gross
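
The timestamp jump in the log above (to about 18446743811 seconds) is close to 2^64 nanoseconds (about 18446744073.7 seconds), which is consistent with an unsigned subtraction wrapping around. A simplified model of an offset-based clock shows the effect and the resume-time correction. The function and variable names are hypothetical and the real kernel code differs in detail.

```c
#include <assert.h>
#include <stdint.h>

/* Model: sched clock = raw hypervisor clock - offset (all in ns). */
static uint64_t clock_offset_ns;
static uint64_t saved_value_ns;

static uint64_t sched_clock_ns(uint64_t raw_ns)
{
    return raw_ns - clock_offset_ns;  /* wraps if raw_ns < offset */
}

/* Remember the guest-visible clock value before suspending. */
static void on_suspend(uint64_t raw_ns)
{
    saved_value_ns = raw_ns - clock_offset_ns;
}

/* Recompute the offset so the clock continues from the saved value. */
static void on_resume(uint64_t raw_ns)
{
    clock_offset_ns = raw_ns - saved_value_ns;
}
```

After `on_resume()`, the clock continues monotonically from the saved value even when the raw clock on the new host is behind the old offset.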
     

17 Dec, 2018

3 commits

  • [ Upstream commit 975ef94a0284648fb0137bd5e949b18cef604e33 ]

    kfree() is incorrectly used to release the pages allocated by
    __get_free_page() and __get_free_pages(). Use the matching deallocators
    i.e., free_page() and free_pages(), respectively.

    Signed-off-by: Pan Bian
    Reviewed-by: Stefano Stabellini
    Signed-off-by: Juergen Gross
    Signed-off-by: Sasha Levin

    Pan Bian
     
  • [ Upstream commit 123664101aa2156d05251704fc63f9bcbf77741a ]

    This reverts commit b3cf8528bb21febb650a7ecbf080d0647be40b9f.

    That commit unintentionally broke Xen balloon memory hotplug with
    "hotplug_unpopulated" set to 1. As long as "System RAM" resource
    got assigned under a new "Unusable memory" resource in IO/Mem tree
    any attempt to online this memory would fail due to general kernel
    restrictions on having "System RAM" resources as 1st level only.

    The original issue that the commit tried to work around, fa564ad96366
    ("x86/PCI: Enable a 64bit BAR on AMD Family 15h (Models 00-1f, 30-3f,
    60-7f)"), was also amended by the following 03a551734 ("x86/PCI: Move
    and shrink AMD 64-bit window to avoid conflict"), which made the
    original fix to Xen ballooning unnecessary.

    Signed-off-by: Igor Druzhinin
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Juergen Gross
    Signed-off-by: Sasha Levin

    Igor Druzhinin
     
  • [ Upstream commit 72791ac854fea36034fa7976b748fde585008e78 ]

    Add a missing header otherwise compiler warns about missed prototype:

    drivers/xen/xlate_mmu.c:183:5: warning: no previous prototype for 'xen_xlate_unmap_gfn_range' [-Wmissing-prototypes]
    int xen_xlate_unmap_gfn_range(struct vm_area_struct *vma,
    ^~~~~~~~~~~~~~~~~~~~~~~~~

    Signed-off-by: Srikanth Boddepalli
    Reviewed-by: Boris Ostrovsky
    Reviewed-by: Joey Pabalinas
    Signed-off-by: Juergen Gross
    Signed-off-by: Sasha Levin

    Srikanth Boddepalli
     

27 Nov, 2018

1 commit

  • [ Upstream commit d9cccfa7c4d1d9ef967ec9308df7304a18609b30 ]

    If a call to xenmem_reservation_increase() in gnttab_dma_free_pages()
    fails it triggers a message "Failed to decrease reservation..." which
    should be "Failed to increase reservation..."

    Fixes: 9bdc7304f536 ('xen/grant-table: Allow allocating buffers suitable for DMA')
    Reported-by: Ross Philipson
    Signed-off-by: Liam Merwick
    Reviewed-by: Mark Kanda
    Reviewed-by: Juergen Gross
    Signed-off-by: Juergen Gross
    Signed-off-by: Sasha Levin

    Liam Merwick
     

14 Nov, 2018

3 commits

  • commit 3941552aec1e04d63999988a057ae09a1c56ebeb upstream.

    Currently the size of hypercall buffers allocated via
    /dev/xen/hypercall is limited to a default of 64 memory pages. For live
    migration of guests this might be too small as the page dirty bitmask
    needs to be sized according to the size of the guest. This means
    migrating an 8GB sized guest is already exhausting the default buffer
    size for the dirty bitmap.

    There is no sensible way to set a sane limit, so just remove it
    completely. The device node's usage is limited to root anyway, so there
    is no additional DoS scenario added by allowing unlimited buffers.

    While at it make the error path for the -ENOMEM case a little bit
    cleaner by setting n_pages to the number of successfully allocated
    pages instead of the target size.

    Fixes: c51b3c639e01f2 ("xen: add new hypercall buffer mapping device")
    Cc: #4.18
    Signed-off-by: Juergen Gross
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Juergen Gross
    Signed-off-by: Greg Kroah-Hartman

    Juergen Gross
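
The 8GB figure in the commit above can be checked with a short calculation. The sketch assumes 4 KiB pages and one dirty bit per guest page, neither of which is stated in the commit message. The helper name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL  /* assumption: 4 KiB pages */

/* Pages needed for a dirty bitmap with one bit per guest page. */
static uint64_t dirty_bitmap_pages(uint64_t guest_bytes)
{
    uint64_t guest_pages = (guest_bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
    uint64_t bitmap_bytes = (guest_pages + 7) / 8;

    return (bitmap_bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

Under these assumptions an 8 GiB guest has 2^21 pages, so the bitmap needs 2^18 bytes, which is 64 pages and therefore the whole former default limit.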
     
  • commit 3aa6c19d2f38be9c6e9a8ad5fa8e3c9d29ee3c35 upstream.

    Xend-based toolstacks don't have static-max entry in xenstore. The
    equivalent node for those toolstacks is memory_static_max.

    Fixes: 5266b8e4445c (xen: fix booting ballooned down hvm guest)
    Signed-off-by: Boris Ostrovsky
    Cc: # 4.13
    Reviewed-by: Juergen Gross
    Signed-off-by: Juergen Gross
    Signed-off-by: Greg Kroah-Hartman

    Boris Ostrovsky
     
  • commit 7250f422da0480d8512b756640f131b9b893ccda upstream.

    xen_swiotlb_{alloc,free}_coherent() allocate/free memory based on the
    order of the pages and not size argument (bytes). This is inconsistent with
    range_straddles_page_boundary and memset which use the 'size' value,
    which may lead to not exchanging memory with Xen (range_straddles_page_boundary()
    returned true). And then the call to xen_swiotlb_free_coherent() would
    actually try to exchange the memory with Xen, leading to the kernel
    hitting a BUG (as the hypercall returned an error).

    This patch fixes it by making the 'size' variable be of the same size
    as the amount of memory allocated.

    CC: stable@vger.kernel.org
    Signed-off-by: Joe Jin
    Cc: Konrad Rzeszutek Wilk
    Cc: Boris Ostrovsky
    Cc: Christoph Helwig
    Cc: Dongli Zhang
    Cc: John Sobecki
    Signed-off-by: Konrad Rzeszutek Wilk
    Signed-off-by: Greg Kroah-Hartman

    Joe Jin
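
The mismatch described in the commit above comes from rounding: an order-based allocation covers a power-of-two number of pages, which can exceed the requested byte count. A minimal sketch, assuming 4 KiB pages and using hypothetical helper names, shows how the size can be recomputed from the order:

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SHIFT 12  /* assumption: 4 KiB pages */

/* Smallest order such that (1 << order) pages cover size bytes. */
static unsigned int order_for_size(size_t size)
{
    size_t pages = (size + ((size_t)1 << SKETCH_PAGE_SHIFT) - 1) >> SKETCH_PAGE_SHIFT;
    unsigned int order = 0;

    while (((size_t)1 << order) < pages)
        order++;
    return order;
}

/* Number of bytes actually allocated for a request of size bytes. */
static size_t allocated_size(size_t size)
{
    return (size_t)1 << (order_for_size(size) + SKETCH_PAGE_SHIFT);
}
```

A 12288-byte request (three pages) rounds up to order 2, so 16384 bytes are allocated. Boundary checks made on 12288 bytes and on 16384 bytes can then disagree, which is the inconsistency the patch removes.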
     

19 Sep, 2018

1 commit

  • When a driver domain (e.g. dom0) is running out of maptrack entries it
    can't map any more foreign domain pages. Instead of silently stalling
    the affected domUs issue a rate limited warning in this case in order
    to make it easier to detect that situation.

    Signed-off-by: Juergen Gross
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky

    Juergen Gross
     

14 Sep, 2018

5 commits

  • Patch series "mmu_notifiers follow ups".

    Tetsuo has noticed some fallouts from 93065ac753e4 ("mm, oom: distinguish
    blockable mode for mmu notifiers"). One of them has been fixed and picked
    up by AMD/DRM maintainer [1]. XEN issue is fixed by patch 1. I have also
    clarified expectations about blockable semantic of invalidate_range_end.
    Finally the last patch removes MMU_INVALIDATE_DOES_NOT_BLOCK which is no
    longer used nor needed.

    [1] http://lkml.kernel.org/r/20180824135257.GU29735@dhcp22.suse.cz

    This patch (of 3):

    93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers") has
    introduced blockable parameter to all mmu_notifiers and the notifier has
    to back off when called in !blockable case and it could block down the
    road.

    The above commit implemented that for mn_invl_range_start but both
    in_range checks are done unconditionally regardless of the blockable mode
    and as such they would fail all the time for regular calls. Fix this by
    checking blockable parameter as well.

    Once we are there we can remove the stale TODO. The lock has to be
    sleepable because we wait for completion down in gnttab_unmap_refs_sync.

    Link: http://lkml.kernel.org/r/20180827112623.8992-2-mhocko@kernel.org
    Fixes: 93065ac753e4 ("mm, oom: distinguish blockable mode for mmu notifiers")
    Signed-off-by: Michal Hocko
    Cc: Boris Ostrovsky
    Cc: Juergen Gross
    Cc: David Rientjes
    Cc: Jerome Glisse
    Cc: Tetsuo Handa
    Reviewed-by: Juergen Gross
    Signed-off-by: Boris Ostrovsky

    Michal Hocko
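
The corrected check from the commit above can be sketched as follows: only a non-blockable call that actually overlaps a mapping has to back off. This is a userspace model with a hypothetical `struct grant_map` and simplified range logic, not the gntdev implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mapping covering the half-open range [start, end). */
struct grant_map {
    unsigned long start, end;
    bool mapped;
};

static bool in_range(const struct grant_map *m,
                     unsigned long start, unsigned long end)
{
    return m->mapped && m->start < end && start < m->end;
}

static int invalidate_range_start(struct grant_map *maps, size_t n,
                                  unsigned long start, unsigned long end,
                                  bool blockable)
{
    for (size_t i = 0; i < n; i++) {
        if (!in_range(&maps[i], start, end))
            continue;
        /* Unmapping may block, so back off only in !blockable mode. */
        if (!blockable)
            return -EAGAIN;
        maps[i].mapped = false;  /* stands in for the real unmap */
    }
    return 0;
}
```

Without the `!blockable` test, every overlapping call would return -EAGAIN, including regular blockable ones, which matches the failure the commit describes.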
     
  • This patch removes duplicate macro usage in events_base.c.

    It also fixes gcc warning:
    variable ‘col’ set but not used [-Wunused-but-set-variable]

    Signed-off-by: Joshua Abraham
    Reviewed-by: Juergen Gross
    Signed-off-by: Boris Ostrovsky

    Josh Abraham
     
  • The command 'xl vcpu-set 0 0', issued in dom0, will crash dom0:

    BUG: unable to handle kernel NULL pointer dereference at 00000000000002d8
    PGD 0 P4D 0
    Oops: 0000 [#1] PREEMPT SMP NOPTI
    CPU: 7 PID: 65 Comm: xenwatch Not tainted 4.19.0-rc2-1.ga9462db-default #1 openSUSE Tumbleweed (unreleased)
    Hardware name: Intel Corporation S5520UR/S5520UR, BIOS S5500.86B.01.00.0050.050620101605 05/06/2010
    RIP: e030:device_offline+0x9/0xb0
    Code: 77 24 00 e9 ce fe ff ff 48 8b 13 e9 68 ff ff ff 48 8b 13 e9 29 ff ff ff 48 8b 13 e9 ea fe ff ff 90 66 66 66 66 90 41 54 55 53 87 d8 02 00 00 01 0f 85 88 00 00 00 48 c7 c2 20 09 60 81 31 f6
    RSP: e02b:ffffc90040f27e80 EFLAGS: 00010203
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
    RDX: ffff8801f3800000 RSI: ffffc90040f27e70 RDI: 0000000000000000
    RBP: 0000000000000000 R08: ffffffff820e47b3 R09: 0000000000000000
    R10: 0000000000007ff0 R11: 0000000000000000 R12: ffffffff822e6d30
    R13: dead000000000200 R14: dead000000000100 R15: ffffffff8158b4e0
    FS: 00007ffa595158c0(0000) GS:ffff8801f39c0000(0000) knlGS:0000000000000000
    CS: e033 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00000000000002d8 CR3: 00000001d9602000 CR4: 0000000000002660
    Call Trace:
    handle_vcpu_hotplug_event+0xb5/0xc0
    xenwatch_thread+0x80/0x140
    ? wait_woken+0x80/0x80
    kthread+0x112/0x130
    ? kthread_create_worker_on_cpu+0x40/0x40
    ret_from_fork+0x3a/0x50

    This happens because handle_vcpu_hotplug_event is called twice. In the
    first iteration cpu_present is still true, in the second iteration
    cpu_present is false which causes get_cpu_device to return NULL.
    In case of cpu#0, cpu_online is apparently always true.

    Fix this crash by checking if the cpu can be hotplugged, which is false
    for a cpu that was just removed.

    Also check if the cpu was actually offlined by device_remove, otherwise
    leave the cpu_present state as it is.

    Rearrange the code to do all work with device_hotplug_lock held.

    Signed-off-by: Olaf Hering
    Reviewed-by: Juergen Gross
    Signed-off-by: Boris Ostrovsky

    Olaf Hering
     
  • Scrubbing pages on initial balloon down can take some time, especially
    in nested virtualization case (nested EPT is slow). When HVM/PVH guest is
    started with memory= significantly lower than maxmem=, all the extra
    pages will be scrubbed before returning to Xen. But since most of them
    weren't used at all at that point, Xen needs to populate them first
    (from populate-on-demand pool). In nested virt case (Xen inside KVM)
    this slows down the guest boot by 15-30s with just 1.5GB needed to be
    returned to Xen.

    Add runtime parameter to enable/disable it, to allow initially disabling
    scrubbing, then enable it back during boot (for example in initramfs).
    Such usage relies on assumption that a) most pages ballooned out during
    initial boot weren't used at all, and b) even if they were, very few
    secrets are in the guest at that time (before any serious userspace
    kicks in).
    Convert CONFIG_XEN_SCRUB_PAGES to CONFIG_XEN_SCRUB_PAGES_DEFAULT (also
    enabled by default), controlling default value for the new runtime
    switch.

    Signed-off-by: Marek Marczykowski-Górecki
    Reviewed-by: Juergen Gross
    Signed-off-by: Boris Ostrovsky

    Marek Marczykowski-Górecki
     
  • When a guest receives a sysrq request from the host, it acknowledges it by
    writing '\0' to the control/sysrq xenstore node. This, however, makes the
    xenstore watch fire again, and xenbus_scanf() fails to parse the empty
    value with the "%c" format string:

    sysrq: SysRq : Emergency Sync
    Emergency Sync complete
    xen:manage: Error -34 reading sysrq code in control/sysrq

    Ignore -ERANGE the same way we already ignore -ENOENT; an empty value in
    control/sysrq is totally legal.

    Signed-off-by: Vitaly Kuznetsov
    Reviewed-by: Wei Liu
    Signed-off-by: Boris Ostrovsky

    Vitaly Kuznetsov
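
The error handling described in the commit above can be modelled in userspace C. This is a sketch: `read_sysrq_key()` is a hypothetical stand-in for reading the xenstore node, built on `sscanf()`, and it returns -ERANGE for an empty value to mirror the error code shown in the log.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical reader: NULL models a missing node, "" an empty value. */
static int read_sysrq_key(const char *node_value, char *key)
{
    if (!node_value)
        return -ENOENT;
    if (sscanf(node_value, "%c", key) != 1)
        return -ERANGE;  /* nothing to parse */
    return 0;
}

/* Returns 1 if a key was read, 0 if there was nothing to do, <0 on error. */
static int handle_sysrq_watch(const char *node_value, char *key_out)
{
    char key = '\0';
    int err = read_sysrq_key(node_value, &key);

    /* A missing or empty node is legal: ignore it quietly. */
    if (err == -ENOENT || err == -ERANGE)
        return 0;
    if (err < 0)
        return err;

    *key_out = key;
    return 1;
}
```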
     

31 Aug, 2018

1 commit

  • Pull xen fixes from Juergen Gross:

    - minor cleanup avoiding a warning when building with new gcc

    - a patch to add a new sysfs node for Xen frontend/backend drivers to
    make it easier to obtain the state of a pv device

    - two fixes for 32-bit pv-guests to avoid intermediate L1TF vulnerable
    PTEs

    * tag 'for-linus-4.19b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    x86/xen: remove redundant variable save_pud
    xen: export device state to sysfs
    x86/pae: use 64 bit atomic xchg function in native_ptep_get_and_clear
    x86/xen: don't write ptes directly in 32-bit PV guests

    Linus Torvalds
     

29 Aug, 2018

1 commit

  • Export device state to sysfs to make it easier to get the device state.

    Signed-off-by: Joe Jin
    Reviewed-by: Boris Ostrovsky
    Cc: Boris Ostrovsky
    Cc: Juergen Gross
    Cc: Konrad Rzeszutek Wilk
    Signed-off-by: Boris Ostrovsky

    Joe Jin
     

24 Aug, 2018

1 commit

  • Pull xen fixes and cleanups from Juergen Gross:
    "Some cleanups, some minor fixes and a fix for a bug introduced in this
    merge window hitting 32-bit PV guests"

    * tag 'for-linus-4.19b-rc1b-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    x86/xen: enable early use of set_fixmap in 32-bit Xen PV guest
    xen: remove unused hypercall functions
    x86/xen: remove unused function xen_auto_xlated_memory_setup()
    xen/ACPI: don't upload Px/Cx data for disabled processors
    x86/Xen: further refine add_preferred_console() invocations
    xen/mcelog: eliminate redundant setting of interface version
    x86/Xen: mark xen_setup_gdt() __init

    Linus Torvalds
     

23 Aug, 2018

1 commit

  • There are several blockable mmu notifiers which might sleep in
    mmu_notifier_invalidate_range_start and that is a problem for the
    oom_reaper because it needs to guarantee a forward progress so it cannot
    depend on any sleepable locks.

    Currently we simply back off and mark an oom victim with blockable mmu
    notifiers as done after a short sleep. That can result in selecting a new
    oom victim prematurely because the previous one still hasn't torn its
    memory down yet.

    We can do much better though. Even if mmu notifiers use sleepable locks
    there is no reason to automatically assume those locks are held. Moreover
    majority of notifiers only care about a portion of the address space and
    there is absolutely zero reason to fail when we are unmapping an unrelated
    range. Many notifiers do really block and wait for HW which is harder to
    handle and we have to bail out though.

    This patch handles the low hanging fruit.
    __mmu_notifier_invalidate_range_start gets a blockable flag and callbacks
    are not allowed to sleep if the flag is set to false. This is achieved by
    using trylock instead of the sleepable lock for most callbacks and
    continuing as long as we do not block down the call chain.

    I think we can improve that even further because there is a common pattern
    to do a range lookup first and then do something about that. The first
    part can be done without a sleeping lock in most cases AFAICS.

    The oom_reaper end then simply retries if there is at least one notifier
    which couldn't make any progress in !blockable mode. A retry loop is
    already implemented to wait for the mmap_sem and this is basically the
    same thing.

    The simplest way for driver developers to test this code path is to wrap
    userspace code which uses these notifiers into a memcg and set the hard
    limit to hit the oom. This can be done e.g. after the test faults in all
    the mmu notifier managed memory and set the hard limit to something really
    small. Then we are looking for a proper process tear down.

    [akpm@linux-foundation.org: coding style fixes]
    [akpm@linux-foundation.org: minor code simplification]
    Link: http://lkml.kernel.org/r/20180716115058.5559-1-mhocko@kernel.org
    Signed-off-by: Michal Hocko
    Acked-by: Christian König # AMD notifiers
    Acked-by: Leon Romanovsky # mlx and umem_odp
    Reported-by: David Rientjes
    Cc: "David (ChunMing) Zhou"
    Cc: Paolo Bonzini
    Cc: Alex Deucher
    Cc: David Airlie
    Cc: Jani Nikula
    Cc: Joonas Lahtinen
    Cc: Rodrigo Vivi
    Cc: Doug Ledford
    Cc: Jason Gunthorpe
    Cc: Mike Marciniszyn
    Cc: Dennis Dalessandro
    Cc: Sudeep Dutt
    Cc: Ashutosh Dixit
    Cc: Dimitri Sivanich
    Cc: Boris Ostrovsky
    Cc: Juergen Gross
    Cc: "Jérôme Glisse"
    Cc: Andrea Arcangeli
    Cc: Felix Kuehling
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Michal Hocko
     

21 Aug, 2018

2 commits

  • This is unnecessary and triggers a warning in the hypervisor.

    Often systems have more processor entries in their ACPI tables than are
    actually installed/active. The ACPI_STA_DEVICE_PRESENT bit cannot be
    reliably used, but the ACPI_MADT_ENABLED bit can. In order to not
    introduce new functions in the main ACPI processor driver code, simply
    use acpi_get_phys_id(), which does more than we need, but which checks
    the MADT enabled bit in the process. Any CPU for which we can't
    determine the APIC ID is unlikely to work properly anyway, so the extra
    checks done by acpi_get_phys_id() should do no harm.

    Signed-off-by: Jan Beulich
    Reviewed-by: Juergen Gross
    Acked-by: Rafael J. Wysocki
    Signed-off-by: Boris Ostrovsky

    Jan Beulich
     
  • This already gets done in HYPERVISOR_mca().

    Signed-off-by: Jan Beulich
    Reviewed-by: Juergen Gross
    Signed-off-by: Boris Ostrovsky

    Jan Beulich
     

16 Aug, 2018

2 commits

  • Pull SCSI updates from James Bottomley:
    "This is mostly updates to the usual drivers: mpt3sas, lpfc, qla2xxx,
    hisi_sas, smartpqi, megaraid_sas, arcmsr.

    In addition, with the continuing absence of Nic we have target updates
    for tcmu and target core (all with reviews and acks).

    The biggest observable change is going to be that we're (again) trying
    to switch to multiqueue as the default (a user can still override the
    setting on the kernel command line).

    Other major core stuff is the removal of the remaining Microchannel
    drivers, an update of the internal timers and some reworks of
    completion and result handling"

    * tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (203 commits)
    scsi: core: use blk_mq_run_hw_queues in scsi_kick_queue
    scsi: ufs: remove unnecessary query(DM) UPIU trace
    scsi: qla2xxx: Fix issue reported by static checker for qla2x00_els_dcmd2_sp_done()
    scsi: aacraid: Spelling fix in comment
    scsi: mpt3sas: Fix calltrace observed while running IO & reset
    scsi: aic94xx: fix an error code in aic94xx_init()
    scsi: st: remove redundant pointer STbuffer
    scsi: qla2xxx: Update driver version to 10.00.00.08-k
    scsi: qla2xxx: Migrate NVME N2N handling into state machine
    scsi: qla2xxx: Save frame payload size from ICB
    scsi: qla2xxx: Fix stalled relogin
    scsi: qla2xxx: Fix race between switch cmd completion and timeout
    scsi: qla2xxx: Fix Management Server NPort handle reservation logic
    scsi: qla2xxx: Flush mailbox commands on chip reset
    scsi: qla2xxx: Fix unintended Logout
    scsi: qla2xxx: Fix session state stuck in Get Port DB
    scsi: qla2xxx: Fix redundant fc_rport registration
    scsi: qla2xxx: Silent erroneous message
    scsi: qla2xxx: Prevent sysfs access when chip is down
    scsi: qla2xxx: Add longer window for chip reset
    ...

    Linus Torvalds
     
  • Pull drm updates from Dave Airlie:
    "This is the main drm pull request for 4.19.

    Rob has some new hardware support for new qualcomm hw that I'll send
    along separately. This has the display part of it, the remaining pull
    is for the acceleration engine.

    This also contains a wound-wait/wait-die mutex rework, Peter has acked
    it for merging via my tree.

    Otherwise mostly the usual level of activity. Summary:

    core:
    - Wound-wait/wait-die mutex rework
    - Add writeback connector type
    - Add "content type" property for HDMI
    - Move GEM bo to drm_framebuffer
    - Initial gpu scheduler documentation
    - GPU scheduler fixes for dying processes
    - Console deferred fbcon takeover support
    - Displayport support for CEC tunneling over AUX

    panel:
    - otm8009a panel driver fixes
    - Innolux TV123WAM and G070Y2-L01 panel driver
    - Ilitek ILI9881c panel driver
    - Rocktech RK070ER9427 LCD
    - EDT ETM0700G0EDH6 and EDT ETM0700G0BDH6
    - DLC DLC0700YZG-1
    - BOE HV070WSA-100
    - newhaven, nhd-4.3-480272ef-atxl LCD
    - DataImage SCF0700C48GGU18
    - Sharp LQ035Q7DB03
    - p079zca: Refactor to support multiple panels

    tinydrm:
    - ILI9341 display panel

    New driver:
    - vkms - virtual kms driver for testing.

    i915:
    - Icelake:
    Display enablement
    DSI support
    IRQ support
    Powerwell support
    - GPU reset fixes and improvements
    - Full ppgtt support refactoring
    - PSR fixes and improvements
    - Execlist improvements
    - GuC related fixes

    amdgpu:
    - Initial amdgpu documentation
    - JPEG engine support on VCN
    - CIK uses powerplay by default
    - Move to using core PCIE functionality for gens/lanes
    - DC/Powerplay interface rework
    - Stutter mode support for RV
    - Vega12 Powerplay updates
    - GFXOFF fixes
    - GPUVM fault debugging
    - Vega12 GFXOFF
    - DC improvements
    - DC i2c/aux changes
    - UVD 7.2 fixes
    - Powerplay fixes for Polaris12, CZ/ST
    - command submission bo_list fixes

    amdkfd:
    - Raven support
    - Power management fixes

    udl:
    - Cleanups and fixes

    nouveau:
    - misc fixes and cleanups.

    msm:
    - DPU1 support display controller in sdm845
    - GPU coredump support.

    vmwgfx:
    - Atomic modesetting validation fixes
    - Support for multisample surfaces

    armada:
    - Atomic modesetting support completed.

    exynos:
    - IPPv2 fixes
    - Move g2d to component framework
    - Suspend/resume support cleanups
    - Driver cleanups

    imx:
    - CSI configuration improvements
    - Driver cleanups
    - Use atomic suspend/resume helpers
    - ipu-v3 V4L2 XRGB32/XBGR32 support

    pl111:
    - Add Nomadik LCDC variant

    v3d:
    - GPU scheduler jobs management

    sun4i:
    - R40 display engine support
    - TCON TOP driver

    mediatek:
    - MT2712 SoC support

    rockchip:
    - vop fixes

    omapdrm:
    - Workaround for DRA7 errata i932
    - Fix mm_list locking

    mali-dp:
    - Writeback implementation
    - PM improvements
    - Internal error reporting debugfs

    tilcdc:
    - Single fix for deferred probing

    hdlcd:
    - Teardown fixes

    tda998x:
    - Converted to a bridge driver.

    etnaviv:
    - Misc fixes"

    * tag 'drm-next-2018-08-15' of git://anongit.freedesktop.org/drm/drm: (1506 commits)
    drm/amdgpu/sriov: give 8s for recover vram under RUNTIME
    drm/scheduler: fix param documentation
    drm/i2c: tda998x: correct PLL divider calculation
    drm/i2c: tda998x: get rid of private fill_modes function
    drm/i2c: tda998x: move mode_valid() to bridge
    drm/i2c: tda998x: register bridge outside of component helper
    drm/i2c: tda998x: cleanup from previous changes
    drm/i2c: tda998x: allocate tda998x_priv inside tda998x_create()
    drm/i2c: tda998x: convert to bridge driver
    drm/scheduler: fix timeout worker setup for out of order job completions
    drm/amd/display: display connected to dp-1 does not light up
    drm/amd/display: update clk for various HDMI color depths
    drm/amd/display: program display clock on cache match
    drm/amd/display: Add NULL check for enabling dp ss
    drm/amd/display: add vbios table check for enabling dp ss
    drm/amd/display: Don't share clk source between DP and HDMI
    drm/amd/display: Fix DP HBR2 Eye Diagram Pattern on Carrizo
    drm/amd/display: Use calculated disp_clk_khz value for dce110
    drm/amd/display: Implement custom degamma lut on dcn
    drm/amd/display: Destroy aux_engines only once
    ...

    Linus Torvalds
     

08 Aug, 2018

1 commit

  • The current balloon code tries to calculate a delta factor for the
    balloon target when running in HVM mode in order to account for memory
    used by the firmware.

    This workaround for memory accounting doesn't work properly on a PVH
    Dom0, that has a static-max value different from the target value even
    at startup. Note that this is not a problem for DomUs because guests are
    started with a static-max value that matches the amount of RAM in the
    memory map.

    Fix this by forcefully setting target_diff for Dom0, regardless of
    its mode.

    Reported-by: Gabriel Bercarug
    Signed-off-by: Roger Pau Monné
    Reviewed-by: Juergen Gross
    Signed-off-by: Boris Ostrovsky

    Roger Pau Monne
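
The accounting described in the commit above can be sketched as follows. This is only an illustration, not the driver's code: the function names are hypothetical, the assumption that the delta is `static-max` minus the startup target is ours, and so is the assumption that the value forced for Dom0 is zero.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: delta between static-max and the startup target.
 * Assumption: PV guests and Dom0 (in any mode) get a forced delta of 0,
 * so only HVM/PVH DomUs compensate for firmware-owned memory.
 */
static long long balloon_target_diff(bool is_pv, bool is_dom0,
                                     long long static_max_pages,
                                     long long target_pages)
{
    if (is_pv || is_dom0)
        return 0;
    return static_max_pages - target_pages;
}

/* Hypothetical helper: apply the delta to a newly requested target. */
static long long balloon_effective_target(long long new_target_pages,
                                          long long target_diff)
{
    return new_target_pages - target_diff;
}
```

With a forced delta of zero, a Dom0 whose static-max differs from its target at startup has later target requests applied unchanged.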
     

06 Aug, 2018

1 commit


03 Aug, 2018

2 commits

  • This converts drivers that were only calling transport_deregister_session
    to use target_remove_session. The calling of
    transport_deregister_session_configfs via target_remove_session for these
    types of drivers is ok, because they were not exporting info from fields
    like sess_acl_list, sess->se_tpg and sess->fabric_sess_ptr from configfs
    accessible functions, so they will see no difference.

    Signed-off-by: Mike Christie
    Reviewed-by: Bart Van Assche
    Reviewed-by: Christoph Hellwig
    Cc: Felipe Balbi
    Cc: Sebastian Andrzej Siewior
    Cc: Andrzej Pietrasiewicz
    Cc: Michael S. Tsirkin
    Cc: Juergen Gross
    Signed-off-by: Martin K. Petersen

    Mike Christie
     
  • Rename target_alloc_session to target_setup_session, both to avoid
    confusion with the other transport session allocation function, which
    only allocates the session, and because target_alloc_session does much
    more than allocate: it allocates the session, sets up the nacl and
    registers the session.

    The next patch will then add a remove function to match the setup in this
    one, so it should make sense for all drivers, except iscsi, to just call
    those 2 functions to set up and remove a session.

    iscsi will continue to be the odd driver.

    Signed-off-by: Mike Christie
    Reviewed-by: Bart Van Assche
    Reviewed-by: Christoph Hellwig
    Cc: Chris Boot
    Cc: Bryant G. Ly
    Cc: Michael Cyr
    Cc: Johannes Thumshirn
    Cc: Felipe Balbi
    Cc: Sebastian Andrzej Siewior
    Cc: Andrzej Pietrasiewicz
    Cc: Michael S. Tsirkin
    Cc: Juergen Gross
    Signed-off-by: Martin K. Petersen

    Mike Christie
     

01 Aug, 2018

1 commit

  • Currently, when the allocation of gntdev_dmabuf fails, the error exit
    path calls dmabuf_imp_free_storage, which causes a null pointer
    dereference on gntdev_dmabuf. Fix this by adding an error exit path
    that won't free gntdev_dmabuf.

    Detected by CoverityScan, CID#1472124 ("Dereference after null check")

    Fixes: bf8dc55b1358 ("xen/gntdev: Implement dma-buf import functionality")
    Signed-off-by: Colin Ian King
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky

    Colin Ian King
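
The fix pattern described above, a second error label that skips the free helper when the object itself was never allocated, can be sketched in user-space C. This is a minimal illustration, not the gntdev code: `demo_buf`, `demo_free_storage`, `demo_alloc_storage` and the `fail_step` parameter are hypothetical names introduced here to simulate allocation failures.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's buffer object. */
struct demo_buf {
    int nr_pages;
    int *refs;
};

/* Frees everything owned by buf; dereferences buf, so it must not be NULL. */
static void demo_free_storage(struct demo_buf *buf)
{
    free(buf->refs);
    free(buf);
}

/*
 * fail_step: 0 = succeed, 1 = simulate failure of the first allocation,
 * 2 = simulate failure of the second allocation.
 */
static int demo_alloc_storage(int nr_pages, int fail_step,
                              struct demo_buf **out)
{
    struct demo_buf *buf;

    *out = NULL;

    buf = (fail_step == 1) ? NULL : calloc(1, sizeof(*buf));
    if (!buf)
        goto fail_no_free;  /* nothing allocated yet: skip the free helper */

    buf->refs = (fail_step == 2) ? NULL
                                 : calloc(nr_pages, sizeof(*buf->refs));
    if (!buf->refs)
        goto fail;          /* buf is valid here, so the helper is safe */

    buf->nr_pages = nr_pages;
    *out = buf;
    return 0;

fail:
    demo_free_storage(buf);
fail_no_free:
    return -ENOMEM;
}
```

Routing the first failure to `fail` instead would pass a NULL pointer to `demo_free_storage`, which is the kind of dereference the commit removes.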
     

27 Jul, 2018

5 commits

  • 1. Import a dma-buf with the file descriptor provided and export
    granted references to the pages of that dma-buf into the array
    of grant references.

    2. Add API to close all references to an imported buffer, so it can be
    released by the owner. This is only valid for buffers created with
    IOCTL_GNTDEV_DMABUF_IMP_TO_REFS.

    Signed-off-by: Oleksandr Andrushchenko
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky

    Oleksandr Andrushchenko
     
  • 1. Create a dma-buf from grant references provided by the foreign
    domain. By default dma-buf is backed by system memory pages, but
    by providing GNTDEV_DMA_FLAG_XXX flags it can also be created
    as a DMA write-combine/coherent buffer, e.g. allocated with
    corresponding dma_alloc_xxx API.
    Export the resulting buffer as a new dma-buf.

    2. Implement waiting for the dma-buf to be released: block until the
    dma-buf with the file descriptor provided is released.
    If within the time-out provided the buffer is not released then
    -ETIMEDOUT error is returned. If the buffer with the file descriptor
    does not exist or has already been released, then -ENOENT is
    returned. For valid file descriptors this must not be treated as an
    error.

    3. Make gntdev's common code and structures available to dma-buf.

    [boris: added 'args.fd = -1' to dmabuf_exp_from_refs() to avoid an
    unnecessary warning about it not being initialized on i386 with gcc 8.1.1]

    Signed-off-by: Oleksandr Andrushchenko
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky

    Oleksandr Andrushchenko
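
The return-value semantics of the wait operation described in item 2 above might be handled by a caller roughly as follows. This is a sketch under stated assumptions: `classify_wait_result` and the `wait_outcome` enum are hypothetical, a return of 0 is assumed to mean the buffer was released within the time-out, and the caller is assumed to know that the file descriptor it passed was valid.

```c
#include <errno.h>

/* Hypothetical classification of the wait result, for illustration only. */
enum wait_outcome {
    WAIT_RELEASED,      /* buffer is gone; safe to proceed */
    WAIT_STILL_IN_USE,  /* time-out expired before release */
    WAIT_FAILED         /* any other error */
};

static enum wait_outcome classify_wait_result(int err)
{
    switch (err) {
    case 0:             /* assumption: released within the time-out */
    case -ENOENT:       /* for a valid fd: already released, not an error */
        return WAIT_RELEASED;
    case -ETIMEDOUT:    /* still held by some user when the time-out hit */
        return WAIT_STILL_IN_USE;
    default:
        return WAIT_FAILED;
    }
}
```

Treating `-ENOENT` as success only makes sense when the descriptor is known to have referred to an exported buffer; for an arbitrary descriptor it simply means no such buffer exists.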
     
  • Add UAPI and IOCTLs for dma-buf grant device driver extension:
    the extension allows userspace processes and kernel modules to
    use a Xen-backed dma-buf implementation. With this extension grant
    references to the pages of an imported dma-buf can be exported
    for other domain use and grant references coming from a foreign
    domain can be converted into a local dma-buf for local export.
    Implement basic initialization and stubs for Xen DMA buffers'
    support.

    Signed-off-by: Oleksandr Andrushchenko
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky

    Oleksandr Andrushchenko
     
  • This is in preparation for adding support of DMA buffer
    functionality: make map/unmap related code and structures, used
    privately by gntdev, ready for dma-buf extension, which will re-use
    these. Rename corresponding structures as those become non-private
    to gntdev now.

    Signed-off-by: Oleksandr Andrushchenko
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky

    Oleksandr Andrushchenko
     
  • Allow mappings for DMA-backed buffers if the grant table module
    supports them: this extends the grant device to map not only buffers
    made of balloon pages, but also buffers allocated with
    dma_alloc_xxx.

    Signed-off-by: Oleksandr Andrushchenko
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Boris Ostrovsky

    Oleksandr Andrushchenko