25 Dec, 2016

1 commit

  • When the state names got added a script was used to add the extra argument
    to the calls. The script basically converted the state constant to a
    string, but the cleanup to convert these strings into meaningful ones did
    not happen.

    Replace all the useless strings with 'subsys/xxx/yyy:state' strings which
    are used in all the other places already.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Sebastian Siewior
    Link: http://lkml.kernel.org/r/20161221192112.085444152@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

15 Dec, 2016

1 commit

  • Every single user of vmf->virtual_address typed that entry to unsigned
    long before doing anything with it so the type of virtual_address does
    not really provide us any additional safety. Just use masked
    vmf->address which already has the appropriate type.

    Link: http://lkml.kernel.org/r/1479460644-25076-3-git-send-email-jack@suse.cz
    Signed-off-by: Jan Kara
    Acked-by: Kirill A. Shutemov
    Cc: Dan Williams
    Cc: Ross Zwisler
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Jan Kara
     

14 Dec, 2016

2 commits

  • Pull xen updates from Juergen Gross:
    "Xen features and fixes for 4.10

    These are some fixes, a move of some arm related headers to share them
    between arm and arm64 and a series introducing a helper to make code
    more readable.

    The most notable change is David stepping down as maintainer of the
    Xen hypervisor interface. This results in me sending you the pull
    requests for Xen related code from now on"

    * tag 'for-linus-4.10-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (29 commits)
    xen/balloon: Only mark a page as managed when it is released
    xenbus: fix deadlock on writes to /proc/xen/xenbus
    xen/scsifront: don't request a slot on the ring until request is ready
    xen/x86: Increase xen_e820_map to E820_X_MAX possible entries
    x86: Make E820_X_MAX unconditionally larger than E820MAX
    xen/pci: Bubble up error and fix description.
    xen: xenbus: set error code on failure
    xen: set error code on failures
    arm/xen: Use alloc_percpu rather than __alloc_percpu
    arm/arm64: xen: Move shared architecture headers to include/xen/arm
    xen/events: use xen_vcpu_id mapping for EVTCHNOP_status
    xen/gntdev: Use VM_MIXEDMAP instead of VM_IO to avoid NUMA balancing
    xen-scsifront: Add a missing call to kfree
    MAINTAINERS: update XEN HYPERVISOR INTERFACE
    xenfs: Use proc_create_mount_point() to create /proc/xen
    xen-platform: use builtin_pci_driver
    xen-netback: fix error handling output
    xen: make use of xenbus_read_unsigned() in xenbus
    xen: make use of xenbus_read_unsigned() in xen-pciback
    xen: make use of xenbus_read_unsigned() in xen-fbfront
    ...

    Linus Torvalds
     
  • Pull swiotlb updates from Konrad Rzeszutek Wilk:

    - minor fixes (rate limiting), remove certain functions

    - support for DMA_ATTR_SKIP_CPU_SYNC which is an optimization
    in the DMA API

    * 'stable/for-linus-4.9' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/swiotlb:
    swiotlb: Minor fix-ups for DMA_ATTR_SKIP_CPU_SYNC support
    swiotlb: Add support for DMA_ATTR_SKIP_CPU_SYNC
    swiotlb-xen: Enforce return of DMA_ERROR_CODE in mapping function
    swiotlb: Drop unused functions swiotlb_map_sg and swiotlb_unmap_sg
    swiotlb: Rate-limit printing when running out of SW-IOMMU space

    Linus Torvalds
     

12 Dec, 2016

2 commits

  • Only mark a page as managed when it is released back to the allocator.
    This ensures that the managed page count does not get falsely increased
    when a VM is running. Correspondingly change it so that pages are
    marked as unmanaged after getting them from the allocator.

    Signed-off-by: Ross Lagerwall
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Juergen Gross

    Ross Lagerwall
     
  • /proc/xen/xenbus does not work correctly. A read blocked waiting for
    a xenstore message holds the mutex needed for atomic file position
    updates. This blocks any writes on the same file handle, which can
    deadlock if the write is needed to unblock the read.

    Clear FMODE_ATOMIC_POS when opening this device to always get
    character-device-like semantics.

    Signed-off-by: David Vrabel
    Reviewed-by: Juergen Gross
    Signed-off-by: Juergen Gross

    David Vrabel
     

10 Dec, 2016

1 commit

  • One include less is always a good thing(tm). Good riddance.

    Signed-off-by: Thomas Gleixner
    Signed-off-by: Borislav Petkov
    Cc: Jiri Olsa
    Link: http://lkml.kernel.org/r/20161209182912.2726-6-bp@alien8.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

08 Dec, 2016

2 commits

  • Variable err is initialized with 0. As a result, the return value may
    be 0 even if get_zeroed_page() fails to allocate memory. This patch fixes
    the bug, initializing err with "-ENOMEM".

    Signed-off-by: Pan Bian
    Reviewed-by: Juergen Gross
    Signed-off-by: Juergen Gross

    Pan Bian
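
The fix pattern above, a pessimistic default for the error variable, might look like the following user-space sketch. alloc_zeroed_page() and setup_buffer() are hypothetical names standing in for the kernel code, which is not shown in this log.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for get_zeroed_page(); fails when 'fail' is nonzero. */
static void *alloc_zeroed_page(int fail)
{
    return fail ? NULL : calloc(1, 4096);
}

/* Returns 0 on success or a negative errno value on failure. */
static int setup_buffer(int fail, void **out)
{
    int err = -ENOMEM;  /* pessimistic default: early exits report failure */
    void *page;

    assert(out != NULL);

    page = alloc_zeroed_page(fail);
    if (!page)
        goto done;      /* err is still -ENOMEM here */

    *out = page;
    err = 0;            /* success is recorded only after every step passed */
done:
    return err;
}
```

With the original initializer of 0, the early exit would have returned success even though nothing was allocated.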
     
  • Variable rc is reset in the loop, and its value will be non-negative
    during the second and later iterations of the loop. If a memory
    allocation fails then, the function may return a non-negative integer,
    which indicates no error. This patch fixes the bug by assigning
    "-ENOMEM" to rc when kzalloc() or alloc_page() returns NULL, and by
    removing the initialization of rc outside of the loop.

    Signed-off-by: Pan Bian
    Reviewed-by: Juergen Gross
    Signed-off-by: Juergen Gross

    Pan Bian
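
A rough user-space sketch of the loop problem follows. process_item() and process_all() are hypothetical; unlike the patch, the sketch keeps an initializer for rc, but only so that an empty loop returns 0.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical per-item step that succeeds with a non-negative value. */
static int process_item(int i)
{
    return i;
}

/*
 * Allocates and processes n items. The allocation for item 'fail_at'
 * is forced to fail (pass -1 for no forced failure).
 */
static int process_all(int n, int fail_at)
{
    int rc = 0;
    int i;

    assert(n >= 0);

    for (i = 0; i < n; i++) {
        void *buf = (i == fail_at) ? NULL : malloc(16);

        if (!buf) {
            rc = -ENOMEM;   /* set explicitly: rc may hold a stale value >= 0 */
            break;
        }
        rc = process_item(i);
        free(buf);
        if (rc < 0)
            break;
    }
    return rc < 0 ? rc : 0;
}
```

Without the explicit assignment, a failure in a later iteration would leave rc at the non-negative result of the previous one.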
     

30 Nov, 2016

1 commit


28 Nov, 2016

1 commit

  • Commit 9c17d96500f7 ("xen/gntdev: Grant maps should not be subject to
    NUMA balancing") set VM_IO flag to prevent grant maps from being
    subjected to NUMA balancing.

    It was discovered recently that this flag causes get_user_pages() to
    always fail with -EFAULT.

    check_vma_flags
    __get_user_pages
    __get_user_pages_locked
    __get_user_pages_unlocked
    get_user_pages_fast
    iov_iter_get_pages
    dio_refill_pages
    do_direct_IO
    do_blockdev_direct_IO
    do_blockdev_direct_IO
    ext4_direct_IO_read
    generic_file_read_iter
    aio_run_iocb

    (which can happen if guest's vdisk has direct-io-safe option).

    To avoid this let's use VM_MIXEDMAP flag instead --- it prevents
    NUMA balancing just as VM_IO does and has no effect on
    check_vma_flags().

    Cc: stable@vger.kernel.org

    Reported-by: Olaf Hering
    Suggested-by: Hugh Dickins
    Signed-off-by: Boris Ostrovsky
    Acked-by: Hugh Dickins
    Tested-by: Olaf Hering
    Signed-off-by: Juergen Gross

    Boris Ostrovsky
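
The effect of swapping the flag can be modelled in a few lines of C. This is a simplified model under stated assumptions: the SKETCH_ bit values are illustrative, and both functions only imitate the behaviour the commit message attributes to check_vma_flags() and to NUMA balancing.

```c
#include <assert.h>
#include <errno.h>

/* Illustrative flag bits; the real kernel values may differ. */
#define SKETCH_VM_IO       (1UL << 0)
#define SKETCH_VM_MIXEDMAP (1UL << 1)

/* Model of the rejection described above: VM_IO makes page pinning fail. */
static int check_flags_for_gup(unsigned long vm_flags)
{
    if (vm_flags & SKETCH_VM_IO)
        return -EFAULT;
    return 0;
}

/* Assumed model: either flag keeps the mapping out of NUMA balancing. */
static int numa_balancing_skips(unsigned long vm_flags)
{
    return (vm_flags & (SKETCH_VM_IO | SKETCH_VM_MIXEDMAP)) != 0;
}

/* Flags chosen for a grant mapping after the change. */
static unsigned long grant_map_flags(void)
{
    unsigned long flags = SKETCH_VM_MIXEDMAP;

    assert(!(flags & SKETCH_VM_IO));
    return flags;
}
```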
     

18 Nov, 2016

1 commit

  • Upon removal of the is_idle flag, these routines became NOPs.

    Signed-off-by: Len Brown
    Acked-by: Ingo Molnar
    Acked-by: Peter Zijlstra (Intel)
    Link: http://lkml.kernel.org/r/822f2c22cc5890f7b8ea0eeec60277eb44505b4e.1479449716.git.len.brown@intel.com
    Signed-off-by: Thomas Gleixner

    Len Brown
     

17 Nov, 2016

2 commits

  • Mounting proc in user namespace containers fails if the xenbus
    filesystem is mounted on /proc/xen because this directory fails
    the "permanently empty" test. proc_create_mount_point() exists
    specifically to create such mountpoints in proc but is currently
    proc-internal. Export this interface to modules, then use it in
    xenbus when creating /proc/xen.

    Signed-off-by: Seth Forshee
    Signed-off-by: David Vrabel
    Signed-off-by: Juergen Gross

    Seth Forshee
     
  • Use builtin_pci_driver() helper to simplify the code.

    Signed-off-by: Geliang Tang
    Reviewed-by: Juergen Gross
    Signed-off-by: Juergen Gross

    Geliang Tang
     

11 Nov, 2016

1 commit

  • I am updating the paths so that instead of trying to pass
    "attr | DMA_ATTR_SKIP_CPU_SYNC" we instead just OR the value into attr and
    then pass it since attr will not be used after we make the unmap call.

    I realized there was one spot I had missed when I was applying the DMA
    attribute to the DMA mapping exception handling. This change corrects that.

    Finally it looks like there is a stray blank line at the end of the
    swiotlb_unmap_sg_attrs function that can be dropped.

    Signed-off-by: Alexander Duyck
    Signed-off-by: Konrad Rzeszutek Wilk

    Alexander Duyck
     

08 Nov, 2016

2 commits

  • As a first step to making DMA_ATTR_SKIP_CPU_SYNC apply to architectures
    beyond just ARM I need to make it so that the swiotlb will respect the
    flag. In order to do that I also need to update the swiotlb-xen since it
    heavily makes use of the functionality.

    Cc: Konrad Rzeszutek Wilk
    Signed-off-by: Alexander Duyck
    Signed-off-by: Konrad Rzeszutek Wilk

    Alexander Duyck
     
  • The mapping function should always return DMA_ERROR_CODE when a mapping has
    failed as this is what the DMA API expects when a DMA error has occurred.
    The current function for mapping a page in Xen was returning either
    DMA_ERROR_CODE or 0 depending on where it failed.

    On x86 DMA_ERROR_CODE is 0, but on other architectures such as ARM it is
    ~0. We need to make sure we return the same error value if either the
    mapping failed or the device is not capable of accessing the mapping.

    If we are returning DMA_ERROR_CODE as our error value we can drop the
    function for checking the error code as the default is to compare the
    return value against DMA_ERROR_CODE if no function is defined.

    Cc: Konrad Rzeszutek Wilk
    Signed-off-by: Alexander Duyck
    Signed-off-by: Konrad Rzeszutek Wilk

    Alexander Duyck
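
The single-sentinel rule can be sketched as follows. The type, the sentinel value (modelled on the ~0 cited for ARM) and both function names are assumptions made for illustration, not the swiotlb-xen code.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t dma_addr_sketch_t;

/* Assumed sentinel, modelled on the ~0 value the text cites for ARM. */
#define SKETCH_DMA_ERROR_CODE (~(dma_addr_sketch_t)0)

/* Default check: compare against the one sentinel. */
static int mapping_error(dma_addr_sketch_t addr)
{
    return addr == SKETCH_DMA_ERROR_CODE;
}

/*
 * Hypothetical mapping routine with two failure points: the bounce
 * buffer is exhausted, or the address is beyond the device's DMA mask.
 */
static dma_addr_sketch_t map_page(dma_addr_sketch_t phys, int buffer_full,
                                  dma_addr_sketch_t dma_mask)
{
    assert(dma_mask != 0);

    if (buffer_full)
        return SKETCH_DMA_ERROR_CODE;
    if (phys > dma_mask)
        return SKETCH_DMA_ERROR_CODE;   /* same sentinel, never a bare 0 */
    return phys;
}
```

Since every failure path returns the same value, the caller needs no driver-specific error check.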
     

07 Nov, 2016

3 commits


25 Oct, 2016

1 commit


24 Oct, 2016

3 commits


07 Oct, 2016

1 commit

  • Pull xen updates from David Vrabel:
    "xen features and fixes for 4.9:

    - switch to new CPU hotplug mechanism

    - support driver_override in pciback

    - require vector callback for HVM guests (the alternate mechanism via
    the platform device has been broken for ages)"

    * tag 'for-linus-4.9-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    xen/x86: Update topology map for PV VCPUs
    xen/x86: Initialize per_cpu(xen_vcpu, 0) a little earlier
    xen/pciback: support driver_override
    xen/pciback: avoid multiple entries in slot list
    xen/pciback: simplify pcistub device handling
    xen: Remove event channel notification through Xen PCI platform device
    xen/events: Convert to hotplug state machine
    xen/x86: Convert to hotplug state machine
    x86/xen: add missing \n at end of printk warning message
    xen/grant-table: Use kmalloc_array() in arch_gnttab_valloc()
    xen: Make VPMU init message look less scary
    xen: rename xen_pmu_init() in sys-hypervisor.c
    hotplug: Prevent alloc/free of irq descriptors during cpu up/down (again)
    xen/x86: Move irq allocation from Xen smp_op.cpu_up()

    Linus Torvalds
     

30 Sep, 2016

5 commits

  • Support the driver_override scheme introduced with commit 782a985d7af2
    ("PCI: Introduce new device binding path using pci_dev.driver_override")

    As pcistub_probe() is called for all devices (it has to check for a
    match based on the slot address rather than device type) it has to
    check for driver_override set to "pciback" itself.

    Up to now for assigning a pci device to pciback you need something like:

    echo 0000:07:10.0 > /sys/bus/pci/devices/0000\:07\:10.0/driver/unbind
    echo 0000:07:10.0 > /sys/bus/pci/drivers/pciback/new_slot
    echo 0000:07:10.0 > /sys/bus/pci/drivers_probe

    while with the patch you can use the same mechanism as for similar
    drivers like pci-stub and vfio-pci:

    echo pciback > /sys/bus/pci/devices/0000\:07\:10.0/driver_override
    echo 0000:07:10.0 > /sys/bus/pci/devices/0000\:07\:10.0/driver/unbind
    echo 0000:07:10.0 > /sys/bus/pci/drivers_probe

    So e.g. libvirt doesn't need special handling for pciback.

    Signed-off-by: Juergen Gross
    Signed-off-by: David Vrabel

    Juergen Gross
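
The extra check in the probe path could look roughly like this. The struct and stub_should_claim() are hypothetical and only model the decision described above; they are not the pcistub_probe() implementation.

```c
#include <assert.h>
#include <string.h>

/* Simplified device record: only the two inputs to the decision. */
struct pci_dev_sketch {
    const char *driver_override;  /* NULL when no override was written */
    int slot_listed;              /* nonzero if the slot address matched */
};

/* Nonzero if the stub driver should claim the device. */
static int stub_should_claim(const struct pci_dev_sketch *dev)
{
    assert(dev != NULL);

    if (dev->slot_listed)
        return 1;
    /* The probe sees every device, so it must test the override itself. */
    return dev->driver_override != NULL &&
           strcmp(dev->driver_override, "pciback") == 0;
}
```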
     
  • The Xen pciback driver has a list of all pci devices it is ready to
    seize. There is no check whether an entry to be added already exists.
    While this might be no problem in the common case it might confuse
    those who consume the list via sysfs.

    Modify the handling of this list by not adding an entry which already
    exists. As this will be needed later, split out the list handling into
    a separate function.

    Signed-off-by: Juergen Gross
    Signed-off-by: David Vrabel

    Juergen Gross
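
An add-if-absent list with a separate search helper might be sketched like this. All names and the singly linked list are assumptions; the driver's real data structures are not shown in this log.

```c
#include <assert.h>
#include <stdlib.h>

struct slot_id {
    int domain, bus, slot, func;
};

struct slot_entry {
    struct slot_id id;
    struct slot_entry *next;
};

static int slot_equal(const struct slot_id *a, const struct slot_id *b)
{
    return a->domain == b->domain && a->bus == b->bus &&
           a->slot == b->slot && a->func == b->func;
}

/* Search helper, split out so other callers can reuse it. */
static struct slot_entry *slot_find(struct slot_entry *head,
                                    const struct slot_id *id)
{
    for (; head; head = head->next)
        if (slot_equal(&head->id, id))
            return head;
    return NULL;
}

/* Returns 1 if added, 0 if already present, -1 on allocation failure. */
static int slot_add(struct slot_entry **head, const struct slot_id *id)
{
    struct slot_entry *e;

    assert(head != NULL && id != NULL);

    if (slot_find(*head, id))
        return 0;           /* duplicate: leave the list unchanged */
    e = malloc(sizeof(*e));
    if (!e)
        return -1;
    e->id = *id;
    e->next = *head;
    *head = e;
    return 1;
}
```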
     
  • The Xen pciback driver maintains a list of all its seized devices.
    There are two functions searching the list for a specific device with
    basically the same semantics just returning different structures in
    case of a match.

    Split out the search function.

    Signed-off-by: Juergen Gross
    Signed-off-by: David Vrabel

    Juergen Gross
     
  • Ever since commit 254d1a3f02eb ("xen/pv-on-hvm kexec: shutdown watches
    from old kernel") using the INTx interrupt from Xen PCI platform
    device for event channel notification would just lock up the guest
    during bootup. postcore_initcall now calls xs_reset_watches which
    will eventually try to read a value from XenStore and will get stuck
    on read_reply at XenBus forever since the platform driver is not
    probed yet and its INTx interrupt handler is not registered yet. That
    means that the guest can not be notified at this moment of any pending
    event channels and none of the per-event handlers will ever be invoked
    (including the XenStore one) and the reply will never be picked up by
    the kernel.

    The exact stack where things get stuck during xenbus_init:

    -xenbus_init
    -xs_init
    -xs_reset_watches
    -xenbus_scanf
    -xenbus_read
    -xs_single
    -xs_single
    -xs_talkv

    Vector callbacks have been the favoured event notification mechanism
    since their introduction in commit 38e20b07efd5 ("x86/xen: event
    channels delivery on HVM."), and Xen has advertised the vector
    callback feature for quite some time. That is why INTx could be
    broken for several years without impacting anyone.

    Luckily this also means that event channel notification through INTx
    is basically dead-code which can be safely removed without impacting
    anybody since it has been effectively disabled for more than 4 years
    with nobody complaining about it (at least as far as I'm aware of).

    This commit removes event channel notification through Xen PCI
    platform device.

    Cc: Boris Ostrovsky
    Cc: David Vrabel
    Cc: Juergen Gross
    Cc: Thomas Gleixner
    Cc: Ingo Molnar
    Cc: "H. Peter Anvin"
    Cc: x86@kernel.org
    Cc: Konrad Rzeszutek Wilk
    Cc: Bjorn Helgaas
    Cc: Stefano Stabellini
    Cc: Julien Grall
    Cc: Vitaly Kuznetsov
    Cc: Paul Gortmaker
    Cc: Ross Lagerwall
    Cc: xen-devel@lists.xenproject.org
    Cc: linux-kernel@vger.kernel.org
    Cc: linux-pci@vger.kernel.org
    Cc: Anthony Liguori
    Signed-off-by: KarimAllah Ahmed
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: David Vrabel

    KarimAllah Ahmed
     
  • Install the callbacks via the state machine.

    Signed-off-by: Sebastian Andrzej Siewior
    Signed-off-by: Boris Ostrovsky
    Signed-off-by: David Vrabel

    Sebastian Andrzej Siewior
     

25 Aug, 2016

2 commits

  • There are two functions with name xen_pmu_init() in the kernel. Rename
    the one in drivers/xen/sys-hypervisor.c to avoid shadowing the one in
    arch/x86/xen/pmu.c

    To avoid the same problem in future rename some more functions.

    Signed-off-by: Juergen Gross
    Signed-off-by: David Vrabel

    Juergen Gross
     
  • This should really only be done for XS_TRANSACTION_END messages, or
    else at least some of the xenstore-* tools don't work anymore.

    Fixes: 0beef634b8 ("xenbus: don't BUG() on user mode induced condition")
    Reported-by: Richard Schütz
    Cc:
    Signed-off-by: Jan Beulich
    Tested-by: Richard Schütz
    Signed-off-by: David Vrabel

    Jan Beulich
     

04 Aug, 2016

1 commit

  • The dma-mapping core and the implementations do not change the DMA
    attributes passed by pointer. Thus the pointer can point to const data.
    However the attributes do not have to be a bitfield. Instead unsigned
    long will do fine:

    1. This is just simpler. Both in terms of reading the code and setting
    attributes. Instead of initializing local attributes on the stack
    and passing pointer to it to dma_set_attr(), just set the bits.

    2. It brings safety and const-correctness checking because the
    attributes are passed by value.

    Semantic patches for this change (at least most of them):

    virtual patch
    virtual context

    @r@
    identifier f, attrs;

    @@
    f(...,
    - struct dma_attrs *attrs
    + unsigned long attrs
    , ...)
    {
    ...
    }

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
    )

    and

    // Options: --all-includes
    virtual patch
    virtual context

    @r@
    identifier f, attrs;
    type t;

    @@
    t f(..., struct dma_attrs *attrs);

    @@
    identifier r.f;
    @@
    f(...,
    - NULL
    + 0
    )

    Link: http://lkml.kernel.org/r/1468399300-5399-2-git-send-email-k.kozlowski@samsung.com
    Signed-off-by: Krzysztof Kozlowski
    Acked-by: Vineet Gupta
    Acked-by: Robin Murphy
    Acked-by: Hans-Christian Noren Egtvedt
    Acked-by: Mark Salter [c6x]
    Acked-by: Jesper Nilsson [cris]
    Acked-by: Daniel Vetter [drm]
    Reviewed-by: Bart Van Assche
    Acked-by: Joerg Roedel [iommu]
    Acked-by: Fabien Dessenne [bdisp]
    Reviewed-by: Marek Szyprowski [vb2-core]
    Acked-by: David Vrabel [xen]
    Acked-by: Konrad Rzeszutek Wilk [xen swiotlb]
    Acked-by: Joerg Roedel [iommu]
    Acked-by: Richard Kuo [hexagon]
    Acked-by: Geert Uytterhoeven [m68k]
    Acked-by: Gerald Schaefer [s390]
    Acked-by: Bjorn Andersson
    Acked-by: Hans-Christian Noren Egtvedt [avr32]
    Acked-by: Vineet Gupta [arc]
    Acked-by: Robin Murphy [arm64 and dma-iommu]
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Krzysztof Kozlowski
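
Passing attributes as a plain unsigned long can be illustrated with a short sketch. The SKETCH_ bit values and the function names are assumptions chosen for the example; the real kernel constants may differ.

```c
#include <assert.h>

/* Illustrative bit values; the real kernel constants may differ. */
#define SKETCH_ATTR_WRITE_COMBINE (1UL << 0)
#define SKETCH_ATTR_SKIP_CPU_SYNC (1UL << 1)

/* Consumer receives attrs by value, so it cannot alter the caller's copy. */
static int needs_cpu_sync(unsigned long attrs)
{
    return !(attrs & SKETCH_ATTR_SKIP_CPU_SYNC);
}

/* Setting an attribute is a plain OR instead of a helper call on a struct. */
static unsigned long with_skip_cpu_sync(unsigned long attrs)
{
    unsigned long merged = attrs | SKETCH_ATTR_SKIP_CPU_SYNC;

    assert(merged & SKETCH_ATTR_SKIP_CPU_SYNC);
    return merged;
}
```

An empty attribute set is simply 0, which matches the NULL to 0 rewrite in the semantic patches.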
     

28 Jul, 2016

1 commit

  • Pull xen updates from David Vrabel:
    "Features and fixes for 4.8-rc0:

    - ACPI support for guests on ARM platforms.
    - Generic steal time support for arm and x86.
    - Support cases where kernel cpu is not Xen VCPU number (e.g., if
    in-guest kexec is used).
    - Use the system workqueue instead of a custom workqueue in various
    places"

    * tag 'for-linus-4.8-rc0-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: (47 commits)
    xen: add static initialization of steal_clock op to xen_time_ops
    xen/pvhvm: run xen_vcpu_setup() for the boot CPU
    xen/evtchn: use xen_vcpu_id mapping
    xen/events: fifo: use xen_vcpu_id mapping
    xen/events: use xen_vcpu_id mapping in events_base
    x86/xen: use xen_vcpu_id mapping when pointing vcpu_info to shared_info
    x86/xen: use xen_vcpu_id mapping for HYPERVISOR_vcpu_op
    xen: introduce xen_vcpu_id mapping
    x86/acpi: store ACPI ids from MADT for future usage
    x86/xen: update cpuid.h from Xen-4.7
    xen/evtchn: add IOCTL_EVTCHN_RESTRICT
    xen-blkback: really don't leak mode property
    xen-blkback: constify instance of "struct attribute_group"
    xen-blkfront: prefer xenbus_scanf() over xenbus_gather()
    xen-blkback: prefer xenbus_scanf() over xenbus_gather()
    xen: support runqueue steal time on xen
    arm/xen: add support for vm_assist hypercall
    xen: update xen headers
    xen-pciback: drop superfluous variables
    xen-pciback: short-circuit read path used for merging write values
    ...

    Linus Torvalds
     

27 Jul, 2016

1 commit

  • I have noticed that frontswap.h first declares "frontswap_enabled" as
    extern bool variable, and then overrides it with "#define
    frontswap_enabled (1)" for CONFIG_FRONTSWAP=Y or (0) when disabled. The
    bool variable isn't actually instantiated anywhere.

    This all looks like an unfinished attempt to make frontswap_enabled
    reflect whether a backend is instantiated. But in the current state,
    all frontswap hooks call unconditionally into frontswap.c just to check
    if frontswap_ops is non-NULL. This should at least be checked inline,
    but we can further eliminate the overhead when CONFIG_FRONTSWAP is
    enabled and no backend registered, using a static key that is initially
    disabled, and gets enabled only upon first backend registration.

    Thus, checks for "frontswap_enabled" are replaced with
    "frontswap_enabled()" wrapping the static key check. There are two
    exceptions:

    - xen's selfballoon_process() was testing frontswap_enabled in code guarded
    by #ifdef CONFIG_FRONTSWAP, which was effectively always true when reachable.
    The patch just removes this check. Using frontswap_enabled() does not sound
    correct here, as this can be true even without xen's own backend being
    registered.

    - in SYSCALL_DEFINE2(swapon), change the check to IS_ENABLED(CONFIG_FRONTSWAP)
    as it seems the bitmap allocation cannot currently be postponed until a
    backend is registered. This means that frontswap will still have some
    memory overhead by being configured, but without a backend.

    After the patch, we can expect that some functions in frontswap.c are
    called only when frontswap_ops is non-NULL. Change the checks there to
    VM_BUG_ONs. While at it, convert other BUG_ONs to VM_BUG_ONs as
    frontswap has been stable for some time.

    [akpm@linux-foundation.org: coding-style fixes]
    Link: http://lkml.kernel.org/r/1463152235-9717-1-git-send-email-vbabka@suse.cz
    Signed-off-by: Vlastimil Babka
    Cc: Konrad Rzeszutek Wilk
    Cc: Boris Ostrovsky
    Cc: David Vrabel
    Cc: Juergen Gross
    Cc: "Kirill A. Shutemov"
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Vlastimil Babka
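
The gating idea can be modelled in user space with an ordinary boolean in place of the static key. Everything here is a hypothetical model: the names, the ops struct and the sample backend are assumptions, and a real static key patches code rather than testing a variable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct backend_ops {
    int (*store)(int page);
};

static bool backend_registered;             /* stands in for the static key */
static const struct backend_ops *ops_list;  /* stands in for frontswap_ops */

/* Inline gate checked by each hook; false until a backend registers. */
static inline bool swap_cache_enabled(void)
{
    return backend_registered;
}

static void register_backend(const struct backend_ops *ops)
{
    assert(ops != NULL && ops->store != NULL);
    ops_list = ops;
    backend_registered = true;  /* flipped on first registration */
}

/* Hook: returns -1 without touching the backend code when disabled. */
static int hook_store(int page)
{
    if (!swap_cache_enabled())
        return -1;
    return ops_list->store(page);
}

/* Sample backend used only to exercise the sketch. */
static int sample_store(int page)
{
    return page >= 0 ? 0 : -1;
}

static const struct backend_ops sample_ops = { .store = sample_store };
```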
     

26 Jul, 2016

1 commit


25 Jul, 2016

4 commits

  • Use the newly introduced xen_vcpu_id mapping to get Xen's idea of vCPU
    id for CPU0.

    Signed-off-by: Vitaly Kuznetsov
    Signed-off-by: David Vrabel

    Vitaly Kuznetsov
     
  • EVTCHNOP_init_control has vCPU id as a parameter and Xen's idea of
    vCPU id should be used. Use the newly introduced xen_vcpu_id mapping
    to convert it from Linux's id.

    Signed-off-by: Vitaly Kuznetsov
    Signed-off-by: David Vrabel

    Vitaly Kuznetsov
     
  • EVTCHNOP_bind_ipi and EVTCHNOP_bind_virq pass vCPU id as a parameter
    and Xen's idea of vCPU id should be used. Use the newly introduced
    xen_vcpu_id mapping to convert it from Linux's id.

    Signed-off-by: Vitaly Kuznetsov
    Signed-off-by: David Vrabel

    Vitaly Kuznetsov
     
  • HYPERVISOR_vcpu_op() passes Linux's idea of vCPU id as a parameter
    while Xen's idea is expected. In some cases these ideas diverge so we
    need to do remapping.

    Convert all callers of HYPERVISOR_vcpu_op() to use xen_vcpu_nr().

    Leave xen_fill_possible_map() and xen_filter_cpu_maps() intact as
    they're only being called by PV guests before percpu areas are
    initialized. While the issue could be solved by switching to
    early_percpu for xen_vcpu_id I think it's not worth it: PV guests will
    probably never get to the point where their idea of vCPU id diverges
    from Xen's.

    Signed-off-by: Vitaly Kuznetsov
    Signed-off-by: David Vrabel

    Vitaly Kuznetsov
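
The remapping used by these commits can be pictured as a per-CPU lookup table. The sketch below is a user-space model: vcpu_nr() is modelled on xen_vcpu_nr(), while the table size, the setter and the fake hypercall stub are assumptions.

```c
#include <assert.h>

#define SKETCH_NR_CPUS 4

/* Index is the Linux CPU number, value is Xen's vCPU id. */
static unsigned int vcpu_id_map[SKETCH_NR_CPUS];

static void set_vcpu_id(unsigned int cpu, unsigned int xen_id)
{
    assert(cpu < SKETCH_NR_CPUS);
    vcpu_id_map[cpu] = xen_id;
}

/* Translate Linux's CPU number into the id the hypervisor expects. */
static unsigned int vcpu_nr(unsigned int cpu)
{
    assert(cpu < SKETCH_NR_CPUS);
    return vcpu_id_map[cpu];
}

/* Hypothetical hypercall stub that echoes the id it was given. */
static unsigned int fake_vcpu_op(unsigned int xen_vcpu_id)
{
    return xen_vcpu_id;
}

/* Callers convert first, then issue the call. */
static unsigned int vcpu_op_for_cpu(unsigned int cpu)
{
    return fake_vcpu_op(vcpu_nr(cpu));
}
```

If the two numberings diverge, for example after an in-guest kexec, Linux CPU 0 may map to a nonzero Xen vCPU id, and the lookup keeps the call pointed at the right vCPU.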