18 Apr, 2014

1 commit

  • Pull Xen fixes from David Vrabel:
    "Xen regression and bug fixes for 3.15-rc1:

    - fix completely broken 32-bit PV guests caused by x86 refactoring
    32-bit thread_info.
    - only enable ticketlock slow path on Xen (not bare metal)
    - fix two bugs with PV guests not shutting down when requested
    - fix a minor memory leak in xen-pciback error path"

    * tag 'stable/for-linus-3.15-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    xen/manage: Poweroff forcefully if user-space is not yet up.
    xen/xenbus: Avoid synchronous wait on XenBus stalling shutdown/restart.
    xen/spinlock: Don't enable them unconditionally.
    xen-pciback: silence an unwanted debug printk
    xen: fix memory leak in __xen_pcibk_add_pci_dev()
    x86/xen: Fix 32-bit PV guests's usage of kernel_stack

    Linus Torvalds
     

16 Apr, 2014

4 commits

  • The user can launch the guest in this sequence:

    xl create -p /vm.cfg [launch, but pause it]
    xl shutdown latest [sets control/shutdown=poweroff]
    xl unpause latest
    xl console latest [and see that the guest has completely
    ignored the shutdown request]

    In reality the guest hasn't ignored it. It registers a watch
    and gets a notification that there is a value. It then calls
    the shutdown_handler which ends up calling orderly_shutdown.

    Unfortunately that is so early in the bootup that there
    is no user-space, which means that orderly_shutdown fails.
    But since the force flag was set to false, it continues on without
    reporting the failure.

    What we really want is to use 'force' when we are in the
    SYSTEM_BOOTING state and not use 'force' when in SYSTEM_RUNNING.

    However, if we are in the running state - and the shutdown command
    has been given before the user-space has been set up - there is nothing
    we can do. Worse yet, we start ignoring further 'xl shutdown' requests!

    As such, the other part of this patch is to only start ignoring
    'xl shutdown' requests when we are truly in the power off sequence.

    That means the user can do multiple 'xl shutdown' and we will try
    to act on them instead of ignoring them.
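
    The two decisions described above can be sketched in plain C. This is
    a user-space model, not the kernel code: the enum and both function
    names are hypothetical stand-ins for system_state and the logic in the
    shutdown handler.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's SYSTEM_BOOTING/SYSTEM_RUNNING. */
enum sys_state { SYS_BOOTING, SYS_RUNNING };

/*
 * While booting there is no user-space to carry out an orderly
 * shutdown, so the poweroff has to be forced. Once running,
 * user-space is given the chance to do it.
 */
static bool needs_force(enum sys_state state)
{
    return state == SYS_BOOTING;
}

/*
 * Further shutdown requests are ignored only once the power off
 * sequence has really started; until then each request is acted on.
 */
static bool should_act_on_request(bool poweroff_started)
{
    return !poweroff_started;
}
```

    With this split, repeated 'xl shutdown' requests keep being handled
    until should_act_on_request() sees that power off has begun.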

    Fixes-Bug: http://bugs.xenproject.org/xen/bug/6
    Reported-by: Alex Bligh
    Signed-off-by: Konrad Rzeszutek Wilk
    Signed-off-by: David Vrabel

    Konrad Rzeszutek Wilk
     
  • The 'read_reply' works with 'process_msg' to read a reply in XenBus.
    'process_msg' is running from within the 'xenbus' thread. Whenever
    a message shows up in XenBus it is put on a xs_state.reply_list list
    and 'read_reply' picks it up.

    The problem arises if the backend domain or the xenstored process is killed.
    In that case 'xenbus' is still waiting, and 'read_reply', if called, is
    stuck forever waiting for the reply_list to have some contents.

    This is normally not a problem - as the backend domain can come back
    or the xenstored process can be restarted. However, if the domain
    is in the process of being powered off/restarted/halted - there is no
    point in waiting for it to come back - as we are effectively being
    terminated and should not impede the progress.

    This patch solves this problem by checking whether the guest is the
    right domain. If it is an initial domain and hurtling towards death -
    there is no point in continuing the wait. All other types of guests
    continue with their behavior (as Xenstore is expected to still be
    running in another domain).
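
    The wait condition described above might look roughly like the
    following user-space sketch. The struct, the enum and poll_reply() are
    hypothetical names used for illustration only; the real code waits on
    xs_state.reply_list inside the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of what the waiter looks at. */
struct wait_state {
    bool is_initial_domain; /* guest is the initial domain */
    bool shutting_down;     /* poweroff/restart/halt in progress */
    size_t replies_queued;  /* entries currently on the reply list */
};

enum wait_result { WAIT_GOT_REPLY, WAIT_KEEP_WAITING, WAIT_ABORT };

/*
 * One iteration of the wait: take a reply if one is queued,
 * otherwise give up only for an initial domain that is going down.
 */
static enum wait_result poll_reply(const struct wait_state *s)
{
    assert(s != NULL);
    if (s->replies_queued > 0)
        return WAIT_GOT_REPLY;
    if (s->is_initial_domain && s->shutting_down)
        return WAIT_ABORT;
    return WAIT_KEEP_WAITING;
}
```

    Any other guest keeps waiting, since its Xenstore is expected to be
    running in another domain.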

    Fixes-Bug: http://bugs.xenproject.org/xen/bug/8
    Signed-off-by: Konrad Rzeszutek Wilk
    Reviewed-by: Boris Ostrovsky
    Reviewed-by: David Vrabel
    Signed-off-by: David Vrabel

    Konrad Rzeszutek Wilk
     
  • There is a missing curly brace here so we might print some extra debug
    information.

    Signed-off-by: Dan Carpenter
    Signed-off-by: David Vrabel

    Dan Carpenter
     
  • dev_entry needs to be freed when it fails to be assigned to a new
    slot on the virtual PCI bus.

    smatch says:
    drivers/xen/xen-pciback/vpci.c:142 __xen_pcibk_add_pci_dev() warn:
    possible memory leak of 'dev_entry'
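
    The shape of the fix can be shown with a small user-space sketch.
    The slot table, MAX_SLOTS and add_dev() are hypothetical; malloc/free
    stand in for the kernel allocator. The point is only that the entry
    is released on the path where no slot could be found.

```c
#include <stdlib.h>

struct dev_entry { int dev_id; };

/* Hypothetical slot table standing in for the virtual PCI bus. */
#define MAX_SLOTS 2
static struct dev_entry *slots[MAX_SLOTS];

/* Returns 0 on success, -1 on failure; the entry is never leaked. */
static int add_dev(int dev_id)
{
    struct dev_entry *entry = malloc(sizeof(*entry));

    if (!entry)
        return -1;
    entry->dev_id = dev_id;

    for (int i = 0; i < MAX_SLOTS; i++) {
        if (!slots[i]) {
            slots[i] = entry;   /* ownership moves to the table */
            return 0;
        }
    }

    free(entry);    /* no free slot: release the entry before failing */
    return -1;
}
```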

    Signed-off-by: Daeseok Youn
    Signed-off-by: David Vrabel

    Daeseok Youn
     

08 Apr, 2014

2 commits

  • Pull Xen build fix from David Vrabel:
    "Fix arm build of drivers/xen/events/

    The merge of irq-core-for-linus branch broke it"

    * tag 'stable/for-linus-3.15-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    Xen: do hv callback accounting only on x86

    Linus Torvalds
     
  • Pull CPU hotplug notifiers registration fixes from Rafael Wysocki:
    "The purpose of this single series of commits from Srivatsa S Bhat
    (with a small piece from Gautham R Shenoy) touching multiple
    subsystems that use CPU hotplug notifiers is to provide a way to
    register them that will not lead to deadlocks with CPU online/offline
    operations as described in the changelog of commit 93ae4f978ca7f ("CPU
    hotplug: Provide lockless versions of callback registration
    functions").

    The first three commits in the series introduce the API and document
    it and the rest simply goes through the users of CPU hotplug notifiers
    and converts them to using the new method"

    * tag 'cpu-hotplug-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (52 commits)
    net/iucv/iucv.c: Fix CPU hotplug callback registration
    net/core/flow.c: Fix CPU hotplug callback registration
    mm, zswap: Fix CPU hotplug callback registration
    mm, vmstat: Fix CPU hotplug callback registration
    profile: Fix CPU hotplug callback registration
    trace, ring-buffer: Fix CPU hotplug callback registration
    xen, balloon: Fix CPU hotplug callback registration
    hwmon, via-cputemp: Fix CPU hotplug callback registration
    hwmon, coretemp: Fix CPU hotplug callback registration
    thermal, x86-pkg-temp: Fix CPU hotplug callback registration
    octeon, watchdog: Fix CPU hotplug callback registration
    oprofile, nmi-timer: Fix CPU hotplug callback registration
    intel-idle: Fix CPU hotplug callback registration
    clocksource, dummy-timer: Fix CPU hotplug callback registration
    drivers/base/topology.c: Fix CPU hotplug callback registration
    acpi-cpufreq: Fix CPU hotplug callback registration
    zsmalloc: Fix CPU hotplug callback registration
    scsi, fcoe: Fix CPU hotplug callback registration
    scsi, bnx2fc: Fix CPU hotplug callback registration
    scsi, bnx2i: Fix CPU hotplug callback registration
    ...

    Linus Torvalds
     

07 Apr, 2014

2 commits

  • Patch 99c8b79d3c1 "xen: Add proper irq accounting for HYPERCALL vector"
    added a call to inc_irq_stat(irq_hv_callback_count) in common Xen code,
    however both the inc_irq_stat function and the irq_hv_callback_count
    counter are architecture specific.

    This makes the code build again on ARM by moving the call into the
    existing #ifdef CONFIG_X86. We may later want the same implementation
    on ARM that x86 has, though.
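
    A minimal sketch of the guard, assuming a plain counter in place of
    the real per-CPU statistic. handle_upcall() and events_handled are
    hypothetical names; only the placement of the #ifdef is the point.

```c
/* Arch-specific counter: only exists when building for x86. */
#ifdef CONFIG_X86
static unsigned long irq_hv_callback_count;
#endif

/* Architecture-independent bookkeeping, present everywhere. */
static unsigned long events_handled;

/* Common upcall path shared by all architectures. */
static void handle_upcall(void)
{
#ifdef CONFIG_X86
    /* Stands in for inc_irq_stat(irq_hv_callback_count). */
    irq_hv_callback_count++;
#endif
    events_handled++;
}
```

    Built without CONFIG_X86 defined, the accounting line disappears and
    the common code no longer refers to x86-only symbols.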

    Signed-off-by: Arnd Bergmann
    Signed-off-by: David Vrabel
    Cc: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Konrad Rzeszutek Wilk
    Cc: Xen

    Arnd Bergmann
     
  • This merge of the irq-core-for-linus branch broke the ARM build when
    Xen is enabled.

    Conflicts:
    drivers/xen/events/events_base.c

    David Vrabel
     

04 Apr, 2014

1 commit

  • Pull Xen features and fixes from David Vrabel:
    "Support PCI devices with multiple MSIs, performance improvement for
    kernel-based backends (by not populating m2p overrides when mapping),
    and assorted minor bug fixes and cleanups"

    * tag 'stable/for-linus-3.15-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    xen/acpi-processor: fix enabling interrupts on syscore_resume
    xen/grant-table: Refactor gnttab_[un]map_refs to avoid m2p_override
    xen: remove XEN_PRIVILEGED_GUEST
    xen: add support for MSI message groups
    xen-pciback: Use pci_enable_msix_exact() instead of pci_enable_msix()
    xen/xenbus: remove unused xenbus_bind_evtchn()
    xen/events: remove unnecessary call to bind_evtchn_to_cpu()
    xen/events: remove the unused resend_irq_on_evtchn()
    drivers:xen-selfballoon:reset 'frontswap_inertia_counter' after frontswap_shrink
    drivers: xen: Include appropriate header file in pcpu.c
    drivers: xen: Mark function as static in platform-pci.c

    Linus Torvalds
     

02 Apr, 2014

2 commits

  • Pull ACPI and power management updates from Rafael Wysocki:
    "The majority of this material spent some time in linux-next, some of
    it even several weeks. There are a few relatively fresh commits in
    it, but they are mostly fixes and simple cleanups.

    ACPI took the lead this time, both in terms of the number of commits
    and the number of modified lines of code, cpufreq follows and there
    are a few changes in the PM core and in cpuidle too.

    A new feature that already got some LWN.net's attention is the device
    PM QoS extension allowing latency tolerance requirements to be
    propagated from leaf devices to their ancestors with hardware
    interfaces for specifying latency tolerance. That should help systems
    with hardware-driven power management to avoid going too far with it
    in cases when there are latency tolerance constraints.

    There also are some significant changes in the ACPI core related to
    the way in which hotplug notifications are handled. They affect PCI
    hotplug (ACPIPHP) and the ACPI dock station code too. The bottom line
    is that all those notifications now go through the root notify handler
    and are propagated to the interested subsystems by means of callbacks
    instead of having to install a notify handler for each device object
    that we can potentially get hotplug notifications for.

    In addition to that ACPICA will now advertise "Windows 2013"
    compatibility for _OSI, because some systems out there don't work
    correctly if that is not done (some of them don't even boot).

    On the system suspend side of things, all of the device suspend and
    resume callbacks, except for ->prepare() and ->complete(), are now
    going to be executed asynchronously as that turns out to speed up
    system suspend and resume on some platforms quite significantly and we
    have a few more optimizations in that area.

    Apart from that, there are some new device IDs and fixes and cleanups
    all over. In particular, the system suspend and resume handling by
    cpufreq should be improved and the cpuidle menu governor should be a
    bit more robust now.

    Specifics:

    - Device PM QoS support for latency tolerance constraints on systems
    with hardware interfaces allowing such constraints to be specified.
    That is necessary to prevent hardware-driven power management from
    becoming overly aggressive on some systems and to prevent power
    management features leading to excessive latencies from being used
    in some cases.

    - Consolidation of the handling of ACPI hotplug notifications for
    device objects. This causes all device hotplug notifications to go
    through the root notify handler (that was executed for all of them
    anyway before) that propagates them to individual subsystems, if
    necessary, by executing callbacks provided by those subsystems
    (those callbacks are associated with struct acpi_device objects
    during device enumeration). As a result, the code in question
    becomes both smaller in size and more straightforward and all of
    those changes should not affect users.

    - ACPICA update, including fixes related to the handling of _PRT in
    cases when it is broken and the addition of "Windows 2013" to the
    list of supported "features" for _OSI (which is necessary to
    support systems that work incorrectly or don't even boot without
    it). Changes from Bob Moore and Lv Zheng.

    - Consolidation of ACPI _OST handling from Jiang Liu.

    - ACPI battery and AC fixes allowing unusual system configurations to
    be handled by that code from Alexander Mezin.

    - New device IDs for the ACPI LPSS driver from Chiau Ee Chew.

    - ACPI fan and thermal optimizations related to system suspend and
    resume from Aaron Lu.

    - Cleanups related to ACPI video from Jean Delvare.

    - Assorted ACPI fixes and cleanups from Al Stone, Hanjun Guo, Lan
    Tianyu, Paul Bolle, Tomasz Nowicki.

    - Intel RAPL (Running Average Power Limits) driver cleanups from
    Jacob Pan.

    - intel_pstate fixes and cleanups from Dirk Brandewie.

    - cpufreq fixes related to system suspend/resume handling from Viresh
    Kumar.

    - cpufreq core fixes and cleanups from Viresh Kumar, Stratos
    Karafotis, Saravana Kannan, Rashika Kheria, Joe Perches.

    - cpufreq drivers updates from Viresh Kumar, Zhuoyu Zhang, Rob
    Herring.

    - cpuidle fixes related to the menu governor from Tuukka Tikkanen.

    - cpuidle fix related to coupled CPUs handling from Paul Burton.

    - Asynchronous execution of all device suspend and resume callbacks,
    except for ->prepare and ->complete, during system suspend and
    resume from Chuansheng Liu.

    - Delayed resuming of runtime-suspended devices during system suspend
    for the PCI bus type and ACPI PM domain.

    - New set of PM helper routines to allow device runtime PM callbacks
    to be used during system suspend and resume more easily from Ulf
    Hansson.

    - Assorted fixes and cleanups in the PM core from Geert Uytterhoeven,
    Prabhakar Lad, Philipp Zabel, Rashika Kheria, Sebastian Capella.

    - devfreq fix from Saravana Kannan"

    * tag 'pm+acpi-3.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (162 commits)
    PM / devfreq: Rewrite devfreq_update_status() to fix multiple bugs
    PM / sleep: Correct whitespace errors in
    intel_pstate: Set core to min P state during core offline
    cpufreq: Add stop CPU callback to cpufreq_driver interface
    cpufreq: Remove unnecessary braces
    cpufreq: Fix checkpatch errors and warnings
    cpufreq: powerpc: add cpufreq transition latency for FSL e500mc SoCs
    MAINTAINERS: Reorder maintainer addresses for PM and ACPI
    PM / Runtime: Update runtime_idle() documentation for return value meaning
    video / output: Drop display output class support
    fujitsu-laptop: Drop unneeded include
    acer-wmi: Stop selecting VIDEO_OUTPUT_CONTROL
    ACPI / gpu / drm: Stop selecting VIDEO_OUTPUT_CONTROL
    ACPI / video: fix ACPI_VIDEO dependencies
    cpufreq: remove unused notifier: CPUFREQ_{SUSPENDCHANGE|RESUMECHANGE}
    cpufreq: Do not allow ->setpolicy drivers to provide ->target
    cpufreq: arm_big_little: set 'physical_cluster' for each CPU
    cpufreq: arm_big_little: make vexpress driver depend on bL core driver
    ACPI / button: Add ACPI Button event via netlink routine
    ACPI: Remove duplicate definitions of PREFIX
    ...

    Linus Torvalds
     
  • Pull irq code updates from Thomas Gleixner:
    "The irq department proudly presents:

    - Another tree wide sweep of irq infrastructure abuse. Clear winner
    of the trainwreck engineering contest was:
    #include "../../../kernel/irq/settings.h"

    - Tree wide update of irq_set_affinity() callbacks which miss a cpu
    online check when picking a single cpu out of the affinity mask.

    - Tree wide consolidation of interrupt statistics.

    - Updates to the threaded interrupt infrastructure to allow explicit
    wakeup of the interrupt thread and a variant of synchronize_irq()
    which synchronizes only the hard interrupt handler. Both are
    needed to replace the homebrewn thread handling in the mmc/sdhci
    code.

    - New irq chip callbacks to allow proper support for GPIO based irqs.
    The GPIO based interrupts need to request/release GPIO resources
    from request/free_irq.

    - A few new ARM interrupt chips. No revolutionary new hardware, just
    differently wreckaged variations of the scheme.

    - Small improvements, cleanups and updates all over the place"

    I was hoping that that trainwreck engineering contest was an April Fools'
    joke. But no.

    * 'irq-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (68 commits)
    irqchip: sun7i/sun6i: Disable NMI before registering the handler
    ARM: sun7i/sun6i: dts: Fix IRQ number for sun6i NMI controller
    ARM: sun7i/sun6i: irqchip: Update the documentation
    ARM: sun7i/sun6i: dts: Add NMI irqchip support
    ARM: sun7i/sun6i: irqchip: Add irqchip driver for NMI controller
    genirq: Export symbol no_action()
    arm: omap: Fix typo in ams-delta-fiq.c
    m68k: atari: Fix the last kernel_stat.h fallout
    irqchip: sun4i: Simplify sun4i_irq_ack
    irqchip: sun4i: Use handle_fasteoi_irq for all interrupts
    genirq: procfs: Make smp_affinity values go+r
    softirq: Add linux/irq.h to make it compile again
    m68k: amiga: Add linux/irq.h to make it compile again
    irqchip: sun4i: Don't ack IRQs > 0, fix acking of IRQ 0
    irqchip: sun4i: Fix a comment about mask register initialization
    irqchip: sun4i: Fix irq 0 not working
    genirq: Add a new IRQCHIP_EOI_THREADED flag
    genirq: Document IRQCHIP_ONESHOT_SAFE flag
    ARM: sunxi: dt: Convert to the new irq controller compatibles
    irqchip: sunxi: Change compatibles
    ...

    Linus Torvalds
     

25 Mar, 2014

1 commit

  • The Xen balloon driver updates the P2M entries of ballooned-out pages to
    point to a scratch page for PV guests. In 24f69373e2 ("xen/balloon: don't
    alloc page while non-preemptible"), kmap_flush_unused() was moved after the
    P2M table update. In that case, for a 32-bit PV guest we might end up with

    P2M X -----> S (S is mfn of balloon scratch page)
    M2P Y -----> X (Y is mfn in persistent kmap entry)

    kmap_flush_unused() iterates through all the PTEs in the kmap address
    space, using pte_to_page() to obtain the page. If the p2m and the m2p
    are inconsistent the incorrect page is returned. This will clear
    page->address on the wrong page which may cause subsequent oopses if
    that page is currently kmap'ed.

    Move the flush back between get_page and __set_phys_to_machine to fix
    this.

    Signed-off-by: Wei Liu
    Signed-off-by: David Vrabel
    Cc: stable@vger.kernel.org # 3.12+

    Wei Liu
     

20 Mar, 2014

1 commit

  • Subsystems that want to register CPU hotplug callbacks, as well as perform
    initialization for the CPUs that are already online, often do it as shown
    below:

    get_online_cpus();

    for_each_online_cpu(cpu)
    init_cpu(cpu);

    register_cpu_notifier(&foobar_cpu_notifier);

    put_online_cpus();

    This is wrong, since it is prone to ABBA deadlocks involving the
    cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
    with CPU hotplug operations).

    The xen balloon driver doesn't take get/put_online_cpus() around this code,
    but that is also buggy, since it can miss CPU hotplug events in between the
    initialization and callback registration:

    for_each_online_cpu(cpu)
    init_cpu(cpu);
    ^
    | Race window; Can miss CPU hotplug events here.
    v
    register_cpu_notifier(&foobar_cpu_notifier);

    Interestingly, the balloon code in xen can simply be reorganized as shown
    below, to have a race-free method to register hotplug callbacks, without even
    taking get/put_online_cpus(). This is because the initialization performed for
    already online CPUs is exactly the same as that performed for CPUs that come
    online later. Moreover, the code has checks in place to avoid double
    initialization.

    register_cpu_notifier(&foobar_cpu_notifier);

    get_online_cpus();

    for_each_online_cpu(cpu)
    init_cpu(cpu);

    put_online_cpus();

    A hotplug operation that occurs between registering the notifier and calling
    get_online_cpus(), won't disrupt anything, because the code takes care to
    perform the memory allocations only once.

    So reorganize the balloon code in xen this way to fix the issues with CPU
    hotplug callback registration.
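
    The reordering can be modelled in single-threaded user-space C. All
    names below (the arrays, alloc_count, cpu_comes_online(),
    register_and_init()) are hypothetical, and no locking is modelled;
    the sketch only shows why an idempotent init_cpu() makes the
    "register first, then initialize" order safe.

```c
#include <stdbool.h>

#define NR_CPUS 4

static bool cpu_online[NR_CPUS];
static bool cpu_initialized[NR_CPUS];
static int  alloc_count;            /* per-CPU allocations performed */
static bool notifier_registered;

/* Idempotent per-CPU setup: a second call for the same CPU does nothing. */
static void init_cpu(int cpu)
{
    if (cpu_initialized[cpu])
        return;
    cpu_initialized[cpu] = true;
    alloc_count++;
}

/* Models a CPU coming online; the notifier runs only once registered. */
static void cpu_comes_online(int cpu)
{
    cpu_online[cpu] = true;
    if (notifier_registered)
        init_cpu(cpu);
}

/* Register first, then walk the CPUs that are already online. */
static void register_and_init(void)
{
    notifier_registered = true;
    for (int cpu = 0; cpu < NR_CPUS; cpu++)
        if (cpu_online[cpu])
            init_cpu(cpu);
}
```

    A CPU that comes online after registration is handled by the
    notifier, and a CPU seen by both paths is still set up only once.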

    Cc: Konrad Rzeszutek Wilk
    Cc: David Vrabel
    Cc: Ingo Molnar
    Reviewed-by: Boris Ostrovsky
    Signed-off-by: Srivatsa S. Bhat
    Signed-off-by: Rafael J. Wysocki

    Srivatsa S. Bhat
     

18 Mar, 2014

3 commits

  • The syscore->resume() callback is expected not to enable interrupts;
    otherwise it generates a warning like the one below:

    [ 9386.365390] WARNING: CPU: 0 PID: 6733 at drivers/base/syscore.c:104 syscore_resume+0x9a/0xe0()
    [ 9386.365403] Interrupts enabled after xen_acpi_processor_resume+0x0/0x34 [xen_acpi_processor]
    ...
    [ 9386.365429] Call Trace:
    [ 9386.365434] dump_stack+0x45/0x56
    [ 9386.365437] warn_slowpath_common+0x7d/0xa0
    [ 9386.365439] warn_slowpath_fmt+0x4c/0x50
    [ 9386.365442] ? xen_upload_processor_pm_data+0x300/0x300 [xen_acpi_processor]
    [ 9386.365443] syscore_resume+0x9a/0xe0
    [ 9386.365445] suspend_devices_and_enter+0x402/0x470
    [ 9386.365447] pm_suspend+0x178/0x260

    In xen_acpi_processor_resume() we call various procedures which are
    non-atomic and can enable interrupts. To prevent the issue, introduce a
    separate resume notifier that is called after we enable interrupts on
    resume and before we call other drivers' resume callbacks.

    Signed-off-by: Stanislaw Gruszka
    Signed-off-by: Konrad Rzeszutek Wilk

    Stanislaw Gruszka
     
  • The grant mapping API does m2p_override unnecessarily: only gntdev needs it;
    for blkback and future netback patches it just causes lock contention, as
    those pages never go to userspace. Therefore this series does the following:
    - the bulk of the original function (everything after the mapping hypercall)
    is moved to arch-dependent set/clear_foreign_p2m_mapping
    - the "if (xen_feature(XENFEAT_auto_translated_physmap))" branch goes to ARM
    - therefore the ARM function could be much smaller, the m2p_override stubs
    could be also removed
    - on x86 the set_phys_to_machine calls were moved up to this new function
    from m2p_override functions
    - and m2p_override functions are only called when there is a kmap_ops param

    It also removes a stray space from arch/x86/include/asm/xen/page.h.

    Signed-off-by: Zoltan Kiss
    Suggested-by: Anthony Liguori
    Suggested-by: David Vrabel
    Suggested-by: Stefano Stabellini
    Signed-off-by: David Vrabel
    Signed-off-by: Stefano Stabellini

    Zoltan Kiss
     
  • Add support for MSI message groups for Xen Dom0 using the
    MAP_PIRQ_TYPE_MULTI_MSI pirq map type.

    In order to keep track of which pirq is the first one in the group all
    pirqs in the MSI group except for the first one have the newly
    introduced PIRQ_MSI_GROUP flag set. This prevents calling
    PHYSDEVOP_unmap_pirq on them, since the unmap must be done with the
    first pirq in the group.
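
    A small sketch of the bookkeeping, under stated assumptions: the
    flag value, struct pirq_info and both helpers are hypothetical, and
    the pirqs of a group are assumed to be numbered consecutively for
    the sake of the example.

```c
#include <stdbool.h>

/* Hypothetical flag value; the real PIRQ_MSI_GROUP constant may differ. */
#define SKETCH_PIRQ_MSI_GROUP (1u << 0)

struct pirq_info {
    unsigned int pirq;
    unsigned int flags;
};

/* Only the first pirq of a group (flag clear) may be unmapped. */
static bool should_unmap(const struct pirq_info *info)
{
    return !(info->flags & SKETCH_PIRQ_MSI_GROUP);
}

/* Flag every pirq in the group except the first one. */
static void setup_msi_group(struct pirq_info *group,
                            unsigned int first_pirq, int nvec)
{
    for (int i = 0; i < nvec; i++) {
        group[i].pirq = first_pirq + i;
        group[i].flags = (i == 0) ? 0 : SKETCH_PIRQ_MSI_GROUP;
    }
}
```

    Teardown code would then call the unmap operation only for entries
    where should_unmap() is true.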

    Signed-off-by: Roger Pau Monné
    Signed-off-by: David Vrabel
    Cc: Boris Ostrovsky

    Roger Pau Monne
     

12 Mar, 2014

1 commit

  • The user space interface does not filter out offline cpus. It merely
    verifies that the mask contains at least one online cpu. So the
    selector in the irq chip implementation needs to make sure to pick
    only an online cpu because otherwise:

    Offline Core 1
    Set affinity to 0xe
    Selector will pick first set bit, i.e. core 1
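
    The example above can be reproduced with plain 32-bit masks. Both
    function names are hypothetical; the kernel works on cpumasks rather
    than on uint32_t values.

```c
#include <stdint.h>

/* Lowest-numbered CPU set in both masks, or -1 if there is none. */
static int pick_online_cpu(uint32_t affinity, uint32_t online)
{
    uint32_t usable = affinity & online;

    for (int cpu = 0; cpu < 32; cpu++)
        if (usable & ((uint32_t)1 << cpu))
            return cpu;
    return -1;
}

/* The problematic selector: first set bit, online or not. */
static int pick_first_set(uint32_t affinity)
{
    return pick_online_cpu(affinity, UINT32_MAX);
}
```

    With four cores and core 1 offline the online mask is 0xd. For an
    affinity of 0xe, pick_first_set() chooses core 1, while
    pick_online_cpu() skips it and chooses core 2.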

    Signed-off-by: Thomas Gleixner
    Reviewed-by: David Vrabel
    Cc: Peter Zijlstra
    Cc: Konrad Rzeszutek Wilk
    Cc: Xen
    Link: http://lkml.kernel.org/r/20140304203100.978031089@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

05 Mar, 2014

3 commits

  • Signed-off-by: Thomas Gleixner
    Reviewed-by: David Vrabel
    Cc: Peter Zijlstra
    Cc: Konrad Rzeszutek Wilk
    Cc: Xen
    Link: http://lkml.kernel.org/r/20140223212738.808648133@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • Warn if any PIRQ cannot be bound to an event channel. Remove the check
    for irq_desc->action. This hypercall never fails in practice so we can
    emit a warning unconditionally.

    Remove a check for a valid irq desc. The only caller of
    xen_destroy_irq() will only do so if the irq was previously fully
    setup, which means the descriptor has been allocated as well.

    Signed-off-by: Thomas Gleixner
    Cc: Peter Zijlstra
    Cc: Konrad Rzeszutek Wilk
    Cc: Xen
    Cc: David Vrabel
    Link: http://lkml.kernel.org/r/20140223212738.579581220@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     
  • generic_handle_irq() already tests for !desc so use this instead of
    generic_handle_irq_desc().

    Use irq_get_irq_data() instead of desc->irq_data.

    Signed-off-by: Thomas Gleixner
    Reviewed-by: David Vrabel
    Cc: Peter Zijlstra
    Cc: Konrad Rzeszutek Wilk
    Cc: Xen
    Link: http://lkml.kernel.org/r/20140223212738.222412125@linutronix.de
    Signed-off-by: Thomas Gleixner

    Thomas Gleixner
     

01 Mar, 2014

7 commits

  • As a result of the deprecation of the MSI-X/MSI enablement functions
    pci_enable_msix() and pci_enable_msi_block() all drivers
    using these two interfaces need to be updated to use the
    new pci_enable_msi_range() or pci_enable_msi_exact()
    and pci_enable_msix_range() or pci_enable_msix_exact()
    interfaces.

    Signed-off-by: Alexander Gordeev
    Cc: Konrad Rzeszutek Wilk
    Cc: Boris Ostrovsky
    Cc: David Vrabel
    Cc: linux-pci@vger.kernel.org
    Signed-off-by: Konrad Rzeszutek Wilk
    Reviewed-by: Boris Ostrovsky

    Alexander Gordeev
     
  • xenbus_bind_evtchn() has no callers so remove it.

    Signed-off-by: David Vrabel
    Signed-off-by: Konrad Rzeszutek Wilk
    Reviewed-by: Boris Ostrovsky

    David Vrabel
     
  • Since bind_evtchn_to_cpu() is always called after an event channel is
    bound, there is no need to call it after closing an event channel.

    Signed-off-by: David Vrabel
    Signed-off-by: Konrad Rzeszutek Wilk
    Reviewed-by: Boris Ostrovsky

    David Vrabel
     
  • resend_irq_on_evtchn() was only used by ia64 (which no longer has Xen
    support).

    Signed-off-by: David Vrabel
    Signed-off-by: Konrad Rzeszutek Wilk
    Reviewed-by: Boris Ostrovsky

    David Vrabel
     
  • When I looked at this issue https://lkml.org/lkml/2013/11/21/158, I found that
    frontswap_selfshrink() doesn't work as expected sometimes.
    Pages are continuously added to frontswap and then taken back shortly
    afterwards. This wastes CPU time and increases memory pressure in the guest OS.

    Take an example.
    First time in frontswap_selfshrink():
    1. last_frontswap_pages = cur_frontswap_pages = 0
    2. cur_frontswap_pages = frontswap_curr_pages() = 100

    When 'frontswap_inertia_counter' decreased to 0:
    1. last_frontswap_pages = cur_frontswap_pages = 100
    2. cur_frontswap_pages = frontswap_curr_pages() = 100
    3. call frontswap_shrink() and let's assume that 10 pages are taken back
    from frontswap.
    4. now frontswap_curr_pages() is 90.

    If memory then runs short in the guest OS and 9 more pages (fewer than were
    taken back) are added to frontswap, frontswap_curr_pages() is now 99, and we
    don't expect more pages to be taken back from frontswap because the guest OS
    is under memory pressure.

    But next time in frontswap_selfshrink():
    1. last_frontswap_pages is set to the old value of cur_frontswap_pages (still
    100)
    2. cur_frontswap_pages (99) is still smaller than last_frontswap_pages.
    3. call frontswap_shrink() and continue to take back pages from frontswap!!
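
    The walk-through above can be replayed with a small user-space
    model. The struct and selfshrink_pass() are hypothetical, and the
    control flow is a simplified assumption about how the pass works;
    the reset_after_shrink switch models resetting
    'frontswap_inertia_counter' after frontswap_shrink, as the commit
    title in the pull request puts it.

```c
#include <stdbool.h>

struct selfshrink {
    unsigned long cur_pages;       /* page count sampled this pass */
    unsigned long last_pages;      /* page count sampled last pass */
    unsigned int inertia;          /* quiet passes required before shrinking */
    unsigned int inertia_counter;  /* quiet passes still to wait */
    bool reset_after_shrink;       /* behaviour added by the fix */
};

/* One pass over the sampled page count; returns true if it shrinks. */
static bool selfshrink_pass(struct selfshrink *s, unsigned long now_pages)
{
    s->last_pages = s->cur_pages;
    s->cur_pages = now_pages;

    if (s->cur_pages == 0 || s->cur_pages > s->last_pages) {
        s->inertia_counter = s->inertia;    /* usage grew: wait again */
        return false;
    }
    if (s->inertia_counter && --s->inertia_counter)
        return false;                       /* still waiting */

    if (s->reset_after_shrink)
        s->inertia_counter = s->inertia;    /* start a fresh wait */
    return true;                            /* shrink on this pass */
}
```

    Feeding the samples 100, 100, 100, 100 and then 99 with an
    (arbitrary) inertia of 3, the model without the reset shrinks again
    on the 99 sample, while the model with the reset waits.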

    Signed-off-by: Bob Liu
    Signed-off-by: Konrad Rzeszutek Wilk

    Bob Liu
     
  • Include appropriate header file in xen/pcpu.c because include/xen/acpi.h
    contains prototype declaration of functions defined in the file.

    This eliminates the following warning in xen/pcpu.c:
    drivers/xen/pcpu.c:336:6: warning: no previous prototype for ‘xen_pcpu_hotplug_sync’ [-Wmissing-prototypes]
    drivers/xen/pcpu.c:346:5: warning: no previous prototype for ‘xen_pcpu_id’ [-Wmissing-prototypes]

    Signed-off-by: Rashika Kheria
    Reviewed-by: Josh Triplett
    Signed-off-by: Konrad Rzeszutek Wilk
    Reviewed-by: David Vrabel

    Rashika Kheria
     
  • Mark function as static in xen/platform-pci.c because it is not used
    outside this file.

    This eliminates the following warning in xen/platform-pci.c:
    drivers/xen/platform-pci.c:48:15: warning: no previous prototype for ‘alloc_xen_mmio’ [-Wmissing-prototypes]

    Signed-off-by: Rashika Kheria
    Reviewed-by: Josh Triplett
    Signed-off-by: Konrad Rzeszutek Wilk
    Reviewed-by: David Vrabel

    Rashika Kheria
     

22 Feb, 2014

1 commit


21 Feb, 2014

1 commit


13 Feb, 2014

1 commit

  • Pull Xen bugfixes from Konrad Rzeszutek Wilk:
    "This has a healthy amount of code being removed - which we do not use
    anymore (the only user of it was ia64 Xen which had been removed
    already). The other bug-fixes are to make Xen ARM be able to use the
    new event channel mechanism and proper export of header files to
    user-space.

    Summary:
    - Fix ARM and Xen FIFO not working.
    - Remove more Xen ia64 vestiges.
    - Fix UAPI missing Xen files"

    * tag 'stable/for-linus-3.14-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    ia64/xen: Remove Xen support for ia64 even more
    xen: install xen/gntdev.h and xen/gntalloc.h
    xen/events: bind all new interdomain events to VCPU0

    Linus Torvalds
     

11 Feb, 2014

2 commits

  • Commit d52eefb47d4e ("ia64/xen: Remove Xen support for ia64") removed
    the Kconfig symbol XEN_XENCOMM. But it didn't remove the code depending
    on that symbol. Remove that code now.

    Signed-off-by: Paul Bolle
    Acked-by: David Vrabel
    Signed-off-by: Konrad Rzeszutek Wilk

    Paul Bolle
     
  • Commit fc087e10734a4d3e40693fc099461ec1270b3fff (xen/events: remove
    unnecessary init_evtchn_cpu_bindings()) causes a regression.

    The kernel-side VCPU binding was not being correctly set for newly
    allocated or bound interdomain events. In ARM guests where 2-level
    events were used, this would result in no interdomain events being
    handled because the kernel-side VCPU masks would all be clear.

    x86 guests would work because the irq affinity was set during irq
    setup and this would set the correct kernel-side VCPU binding.

    Fix this by properly initializing the kernel-side VCPU binding in
    bind_evtchn_to_irq().

    Reported-and-tested-by: Julien Grall
    Signed-off-by: David Vrabel
    Signed-off-by: Konrad Rzeszutek Wilk

    David Vrabel
     

06 Feb, 2014

1 commit


03 Feb, 2014

1 commit


01 Feb, 2014

1 commit

  • …inux/kernel/git/xen/tip

    Pull Xen bugfixes from Konrad Rzeszutek Wilk:
    "Bug-fixes for the new features that were added during this cycle.

    There are also two fixes for long-standing issues for which we have a
    solution: grant-table operations did extra work that was not needed,
    causing performance issues, and the self balloon code was too
    aggressive, causing OOMs.

    Details:
    - Xen ARM couldn't use the new FIFO events
    - Xen ARM couldn't use the SWIOTLB if compiled as 32-bit with 64-bit PCIe devices.
    - Grant tables were doing needless M2P operations.
    - Ratchet down the self-balloon code so it won't OOM.
    - Fix misplaced kfree in Xen PVH error code paths"

    * tag 'stable/for-linus-3.14-rc0-late-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
    xen/pvh: Fix misplaced kfree from xlated_setup_gnttab_pages
    drivers: xen: deaggressive selfballoon driver
    xen/grant-table: Avoid m2p_override during mapping
    xen/gnttab: Use phys_addr_t to describe the grant frame base address
    xen: swiotlb: handle sizeof(dma_addr_t) != sizeof(phys_addr_t)
    arm/xen: Initialize event channels earlier

    Linus Torvalds
     

31 Jan, 2014

2 commits

  • The current xen-selfballoon driver is too aggressive, which may cause OOM to
    be triggered more often, e.g. this bug reported by James:
    https://lkml.org/lkml/2013/11/21/158

    There are two main reasons:
    1) The original goal_page didn't consider some pages used by kernel space, like
    slab pages and pages used by device drivers.

    2) The balloon driver may not give back memory to the guest OS fast enough when
    the workload suddenly acquires a lot of physical memory.

    In both cases, the guest OS will suffer from memory pressure and OOM may
    be triggered.

    The fix is to make the xen-selfballoon driver less aggressive by adding an
    extra 10% of total RAM pages to goal_page.
    It is more valuable to keep the guest system reliable and responsive than to
    balloon out these 10% of pages to Xen.
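
The arithmetic of the fix can be sketched as follows. This is an illustration under assumptions, not the driver's code: `padded_goal_pages` is a hypothetical helper, and the clamp to the total page count is an added assumption that the commit message does not mention.

```c
#include <assert.h>

/*
 * Hypothetical helper: pad the ballooning target with an extra 10% of
 * total RAM pages so kernel-side users (slab, drivers) keep headroom.
 */
static unsigned long padded_goal_pages(unsigned long goal_pages,
				       unsigned long total_pages)
{
	unsigned long reserve = total_pages / 10;
	unsigned long padded = goal_pages + reserve;

	assert(goal_pages <= total_pages);
	/* Assumption: never target more pages than the guest has. */
	return padded > total_pages ? total_pages : padded;
}
```

For a guest with 100000 pages and an unpadded goal of 50000, the padded goal is 60000 pages, so 10000 fewer pages are ballooned out.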

    Signed-off-by: Bob Liu
    Signed-off-by: Konrad Rzeszutek Wilk

    Bob Liu
     
  • The grant mapping API does m2p_override unnecessarily: only gntdev needs it;
    for blkback and future netback patches it just causes lock contention, as
    those pages never go to userspace. Therefore this series does the following:
    - the original functions were renamed to __gnttab_[un]map_refs, with a new
    parameter m2p_override
    - based on m2p_override either they follow the original behaviour, or just set
    the private flag and call set_phys_to_machine
    - gnttab_[un]map_refs are now wrappers that call __gnttab_[un]map_refs with
    m2p_override false
    - a new function gnttab_[un]map_refs_userspace provides the old behaviour

    It also removes a stray space from page.h and changes ret to 0 if
    XENFEAT_auto_translated_physmap is set, as that is the only possible return
    value there.
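
The wrapper split described above can be sketched in user-space C. This is a simplified model, not the kernel implementation: `struct fake_map` and the three function names are hypothetical stand-ins, and the boolean fields only record which path was taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for one grant mapping; real code operates on pages and grant ops. */
struct fake_map {
	bool private_flag_set; /* models setting the page's private flag */
	bool p2m_updated;      /* models calling set_phys_to_machine() */
	bool m2p_overridden;   /* models taking the original m2p_override path */
};

/* Common worker: the flag selects the old path or the cheaper one. */
static int map_refs_common(struct fake_map *maps, size_t count,
			   bool m2p_override)
{
	assert(maps != NULL || count == 0);
	for (size_t i = 0; i < count; i++) {
		if (m2p_override) {
			maps[i].m2p_overridden = true;
		} else {
			maps[i].private_flag_set = true;
			maps[i].p2m_updated = true;
		}
	}
	return 0;
}

/* Kernel-only users such as blkback: skip the override. */
static int map_refs(struct fake_map *maps, size_t count)
{
	return map_refs_common(maps, count, false);
}

/* Mappings that reach userspace (gntdev): keep the old behaviour. */
static int map_refs_userspace(struct fake_map *maps, size_t count)
{
	return map_refs_common(maps, count, true);
}
```

Keeping one worker behind two thin wrappers means existing callers get the cheaper path without any call-site change, while gntdev opts in to the old behaviour explicitly.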

    v2:
    - move the storing of the old mfn in page->index to gnttab_map_refs
    - move the function header update to a separate patch

    v3:
    - a new approach to retain old behaviour where it is needed
    - squash the patches into one

    v4:
    - move out the common bits from m2p* functions, and pass pfn/mfn as parameter
    - clear page->private before doing anything with the page, so m2p_find_override
    won't race with this

    v5:
    - change return value handling in __gnttab_[un]map_refs
    - remove a stray space in page.h
    - add detail why ret = 0 now at some places

    v6:
    - don't pass pfn to m2p* functions, just get it locally

    Signed-off-by: Zoltan Kiss
    Suggested-by: David Vrabel
    Acked-by: David Vrabel
    Acked-by: Stefano Stabellini
    Signed-off-by: Konrad Rzeszutek Wilk

    Zoltan Kiss
     

30 Jan, 2014

2 commits

  • On ARM, address size can be 32 bits or 64 bits (if CONFIG_ARCH_PHYS_ADDR_T_64BIT
    is enabled).
    We can't assume that the grant frame base address will always fit in an
    unsigned long. Use phys_addr_t instead of unsigned long as argument for
    gnttab_setup_auto_xlat_frames.
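
A small user-space sketch of the truncation this avoids. The typedefs and function names are hypothetical models (a 32-bit `unsigned long` and a 64-bit `phys_addr_t`), and the example base address is made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t ulong32_t; /* models unsigned long on 32-bit ARM */
typedef uint64_t phys64_t;  /* models a 64-bit phys_addr_t */

/* Old-style signature: the caller's address is narrowed to 32 bits. */
static phys64_t frames_base_narrow(ulong32_t addr)
{
	assert((addr & 0xfffu) == 0); /* frame base is page aligned */
	return addr;
}

/* New-style signature: the full physical address survives. */
static phys64_t frames_base_wide(phys64_t addr)
{
	assert((addr & 0xfffu) == 0); /* frame base is page aligned */
	return addr;
}
```

Passing a base of 0x1B0000000 through the narrow variant yields 0xB0000000, which silently points at a different frame.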

    Signed-off-by: Julien Grall
    Signed-off-by: Stefano Stabellini
    Acked-by: Ian Campbell
    Reviewed-by: David Vrabel

    Julien Grall
     
  • The use of phys_to_machine and machine_to_phys in the physbus conversions
    causes us to lose the top bits of the DMA address if the size of a DMA
    address is not the same as the size of the physical address.

    This can happen in practice on ARM where foreign pages can be above 4GB even
    though the local kernel does not have LPAE page tables enabled (which is
    totally reasonable if the guest does not itself have >4GB of RAM). In this
    case the kernel still maps the foreign pages at a phys addr below 4G (as it
    must) but the resulting DMA address (returned by the grant map operation) is
    much higher.

    This is analogous to a hardware device which has its view of RAM mapped up
    high for some reason.

    This patch makes I/O to foreign pages (specifically blkif) work on 32-bit ARM
    systems with more than 4GB of RAM.
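
The loss of the top bits can be reproduced in a user-space sketch. Everything here is an assumption for illustration: the typedefs model a 32-bit physical address and a 64-bit DMA address, and `toy_pfn_to_mfn` is a made-up translation that places every frame 4GB higher. It is not the kernel's conversion code.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_OFFSET_MASK ((UINT32_C(1) << PAGE_SHIFT) - 1)

typedef uint32_t phys32_t; /* models a 32-bit phys_addr_t (no LPAE) */
typedef uint64_t dma64_t;  /* models a 64-bit dma_addr_t */

/* Hypothetical translation: machine frame sits 4GB above the local frame. */
static uint32_t toy_pfn_to_mfn(uint32_t pfn)
{
	assert(pfn < UINT32_C(0x100000)); /* 32-bit address space, 4K pages */
	return pfn + UINT32_C(0x100000);
}

/* Lossy: the shift is done in 32 bits, so bit 32 and above are dropped. */
static dma64_t phys_to_bus_lossy(phys32_t paddr)
{
	phys32_t maddr = toy_pfn_to_mfn(paddr >> PAGE_SHIFT) << PAGE_SHIFT;

	return maddr | (paddr & PAGE_OFFSET_MASK);
}

/* Widen to the DMA type before shifting, then add the page offset. */
static dma64_t phys_to_bus_wide(phys32_t paddr)
{
	dma64_t dma = (dma64_t)toy_pfn_to_mfn(paddr >> PAGE_SHIFT) << PAGE_SHIFT;

	return dma | (paddr & PAGE_OFFSET_MASK);
}
```

The only difference between the two functions is the width of the value at the time of the shift, which is the essence of the sizeof(dma_addr_t) != sizeof(phys_addr_t) problem.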

    Signed-off-by: Ian Campbell
    Signed-off-by: Stefano Stabellini

    Ian Campbell