15 Sep, 2011

1 commit

  • When we allocate/change the IRQ information, we do not
    need to use spinlocks. We can use a mutex (which is
    what the generic IRQ code does for allocations/changes).
    Fixes a slew of:

    BUG: sleeping function called from invalid context at /linux/kernel/mutex.c:271
    in_atomic(): 1, irqs_disabled(): 0, pid: 3216, name: xenstored
    2 locks held by xenstored/3216:
    #0: (&u->bind_mutex){......}, at: [] evtchn_ioctl+0x30/0x3a0 [xen_evtchn]
    #1: (irq_mapping_update_lock){......}, at: [] bind_evtchn_to_irq+0x24/0x90
    Pid: 3216, comm: xenstored Not tainted 3.1.0-rc6-00021-g437a3d1 #2
    Call Trace:
    [] __might_sleep+0x100/0x130
    [] mutex_lock_nested+0x2f/0x50
    [] __irq_alloc_descs+0x49/0x200
    [] ? evtchn_ioctl+0x30/0x3a0 [xen_evtchn]
    [] xen_allocate_irq_dynamic+0x34/0x70
    [] bind_evtchn_to_irq+0x5d/0x90
    [] ? evtchn_bind_to_user+0x60/0x60 [xen_evtchn]
    [] bind_evtchn_to_irqhandler+0x32/0x80
    [] evtchn_bind_to_user+0x49/0x60 [xen_evtchn]
    [] evtchn_ioctl+0x144/0x3a0 [xen_evtchn]
    [] ? vfsmount_lock_local_unlock+0x50/0x80
    [] do_vfs_ioctl+0x9a/0x5e0
    [] ? mntput+0x1f/0x30
    [] ? fput+0x199/0x240
    [] sys_ioctl+0xa1/0xb0
    [] system_call_fastpath+0x16/0x1b

    Reported-by: Jim Burns
    Acked-by: Ian Campbell
    Signed-off-by: Konrad Rzeszutek Wilk

    Konrad Rzeszutek Wilk
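
The locking change above can be illustrated with a minimal userspace sketch. It is not the kernel code: pthread_mutex_t stands in for the kernel mutex, malloc() stands in for an allocation that may sleep, and the table size, the starting IRQ number and the name bind_evtchn_to_irq_sketch are all assumptions made for illustration.

```c
#include <pthread.h>
#include <stdlib.h>

#define NR_EVTCHNS 8    /* hypothetical table size */

/* Userspace stand-in for irq_mapping_update_lock, taken as a mutex. */
static pthread_mutex_t mapping_update_lock = PTHREAD_MUTEX_INITIALIZER;
static int *evtchn_to_irq[NR_EVTCHNS];  /* NULL means "not bound yet" */
static int next_free_irq = 16;          /* arbitrary first IRQ number */

/* Bind an event channel to an IRQ, allocating the mapping on first use. */
int bind_evtchn_to_irq_sketch(unsigned int evtchn)
{
    int irq = -1;

    if (evtchn >= NR_EVTCHNS)
        return -1;

    pthread_mutex_lock(&mapping_update_lock);
    if (evtchn_to_irq[evtchn] == NULL) {
        /* The allocation may block; that is allowed while holding a mutex. */
        int *slot = malloc(sizeof(*slot));

        if (slot != NULL) {
            *slot = next_free_irq++;
            evtchn_to_irq[evtchn] = slot;
        }
    }
    if (evtchn_to_irq[evtchn] != NULL)
        irq = *evtchn_to_irq[evtchn];
    pthread_mutex_unlock(&mapping_update_lock);

    return irq;
}
```

Binding the same channel twice returns the same IRQ. The point of the sketch is only that the allocation happens inside the locked region, which is what a spinlock forbids in the kernel and what triggered the "sleeping function called from invalid context" report.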
     

11 Aug, 2011

1 commit

  • Fix build errors (found when CONFIG_SYSFS is not enabled):

    drivers/xen/xen-selfballoon.c:446: warning: data definition has no type or storage class
    drivers/xen/xen-selfballoon.c:446: warning: type defaults to 'int' in declaration of 'EXPORT_SYMBOL'
    drivers/xen/xen-selfballoon.c:446: warning: parameter names (without types) in function declaration
    drivers/xen/xen-selfballoon.c:485: error: expected declaration specifiers or '...' before string constant
    drivers/xen/xen-selfballoon.c:485: warning: data definition has no type or storage class
    drivers/xen/xen-selfballoon.c:485: warning: type defaults to 'int' in declaration of 'MODULE_LICENSE'
    drivers/xen/xen-selfballoon.c:485: warning: function declaration isn't a prototype

    Signed-off-by: Randy Dunlap
    Signed-off-by: Konrad Rzeszutek Wilk

    Randy Dunlap
     

04 Aug, 2011

4 commits


26 Jul, 2011

1 commit

  • Memory hotplug support for Xen balloon driver. Note that hotplugged
    memory is not onlined automatically; the user must online it through
    the standard sysfs interface.

    Memory can be hotplugged in the following steps:

    1) dom0: xl mem-max <domain> <maxmem>
    where <maxmem> is >= requested memory size,

    2) dom0: xl mem-set <domain> <memory>
    where <memory> is requested memory size; alternatively memory
    could be added by writing proper value to
    /sys/devices/system/xen_memory/xen_memory0/target or
    /sys/devices/system/xen_memory/xen_memory0/target_kb on domU,

    3) domU: for i in /sys/devices/system/memory/memory*/state; do \
    [ "`cat "$i"`" = offline ] && echo online > "$i"; done

    Memory can be onlined automatically on domU by adding the following line to
    the udev rules:

    SUBSYSTEM=="memory", ACTION=="add", RUN+="/bin/sh -c '[ -f /sys$devpath/state ] && echo online > /sys$devpath/state'"

    In that case step 3 should be omitted.

    Signed-off-by: Daniel Kiper
    Acked-by: Konrad Rzeszutek Wilk
    Cc: Ian Campbell
    Cc: Jeremy Fitzhardinge
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Daniel Kiper
     

23 Jul, 2011

1 commit

  • * 'stable/drivers' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
    xen/pciback: Have 'passthrough' option instead of XEN_PCIDEV_BACKEND_PASS and XEN_PCIDEV_BACKEND_VPCI
    xen/pciback: Remove the DEBUG option.
    xen/pciback: Drop two backends, squash and cleanup some code.
    xen/pciback: Print out the MSI/MSI-X (PIRQ) values
    xen/pciback: Don't setup an fake IRQ handler for SR-IOV devices.
    xen: rename pciback module to xen-pciback.
    xen/pciback: Fine-grain the spinlocks and fix BUG: scheduling while atomic cases.
    xen/pciback: Allocate IRQ handler for device that is shared with guest.
    xen/pciback: Disable MSI/MSI-X when reseting a device
    xen/pciback: guest SR-IOV support for PV guest
    xen/pciback: Register the owner (domain) of the PCI device.
    xen/pciback: Cleanup the driver based on checkpatch warnings and errors.
    xen/pciback: xen pci backend driver.
    xen: tmem: self-ballooning and frontswap-selfshrinking
    xen: Add module alias to autoload backend drivers
    xen: Populate xenbus device attributes
    xen: Add __attribute__((format(printf... where appropriate
    xen: prepare tmem shim to handle frontswap
    xen: allow enable use of VGA console on dom0

    Linus Torvalds
     

21 Jul, 2011

1 commit

  • * stable/xen-pciback-0.6.3:
    xen/pciback: Have 'passthrough' option instead of XEN_PCIDEV_BACKEND_PASS and XEN_PCIDEV_BACKEND_VPCI
    xen/pciback: Remove the DEBUG option.
    xen/pciback: Drop two backends, squash and cleanup some code.
    xen/pciback: Print out the MSI/MSI-X (PIRQ) values
    xen/pciback: Don't setup an fake IRQ handler for SR-IOV devices.
    xen: rename pciback module to xen-pciback.
    xen/pciback: Fine-grain the spinlocks and fix BUG: scheduling while atomic cases.
    xen/pciback: Allocate IRQ handler for device that is shared with guest.
    xen/pciback: Disable MSI/MSI-X when reseting a device
    xen/pciback: guest SR-IOV support for PV guest
    xen/pciback: Register the owner (domain) of the PCI device.
    xen/pciback: Cleanup the driver based on checkpatch warnings and errors.
    xen/pciback: xen pci backend driver.

    Conflicts:
    drivers/xen/Kconfig

    Konrad Rzeszutek Wilk
     

20 Jul, 2011

13 commits

  • …N_PCIDEV_BACKEND_VPCI

    .. compile options. This way the user can decide at runtime whether they
    want the default 'vpci' (virtual pci passthrough) or a mode where the PCI
    devices are passed in without any BDF renumbering. The option 'passthrough'
    allows the user to toggle it from 0 (vpci) to 1 (passthrough).

    Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>

    Konrad Rzeszutek Wilk
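
A rough sketch of the difference between the two modes, under stated assumptions: in 'vpci' mode the guest is assumed to see devices on virtual bus 0 with slots handed out sequentially, while in 'passthrough' mode the host BDF is kept as-is. The struct, the function name and the slot bookkeeping are hypothetical, and the sketch ignores details such as grouping the functions of one device into a single virtual slot.

```c
#include <stdbool.h>

/* Bus/device(slot)/function address of a PCI device. */
struct bdf {
    unsigned int bus;
    unsigned int slot;
    unsigned int func;
};

/*
 * Return the address the guest would see for a host device.
 * next_virtual_slot is hypothetical bookkeeping: the next unused slot
 * on the virtual bus, advanced only in vpci mode.
 */
struct bdf guest_visible_bdf(struct bdf host, bool passthrough,
                             unsigned int *next_virtual_slot)
{
    struct bdf guest;

    if (passthrough)
        return host;            /* no BDF renumbering */

    guest.bus = 0;              /* assumed: single virtual bus */
    guest.slot = (*next_virtual_slot)++;
    guest.func = host.func;
    return guest;
}
```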
     
  • The latter is easily fixed - by the developer compiling the
    module with -DDEBUG. And during runtime - the loglvl provides
    quite a lot of useful data.

    Signed-off-by: Konrad Rzeszutek Wilk

    Konrad Rzeszutek Wilk
     
  • - Remove the slot and controller backends as they
    are not used.
    - Document the pciback_[read|write]_config_[byte|word|dword]
    functions to make them easier to find.
    - Collapse the code from conf_space_capability_msi into pciback_ops.c
    - Collapse conf_space_capability_[pm|vpd].c in conf_space_capability.c
    [and remove the conf_space_capability.h file]
    - Rename all visible functions from pciback to xen_pcibk.
    - Rename all the printk/pr_info, etc that use the "pciback" to say
    "xen-pciback".
    - Convert functions that are not referenced outside the code to be
    static to save on name space.
    - Do the same thing for structures that are internal to the driver.
    - Run checkpatch.pl after the renames and fixup its warnings and
    fix any compile errors caused by the variable rename
    - Cleanup any structs that checkpatch.pl commented about or that just
    look odd.

    Signed-off-by: Konrad Rzeszutek Wilk

    Konrad Rzeszutek Wilk
     
  • If the verbose_request is set (and loglevel high enough), print out
    the MSI/MSI-X values that are sent to the guest. This should aid in
    debugging issues.

    Signed-off-by: Konrad Rzeszutek Wilk

    Konrad Rzeszutek Wilk
     
  • If we try to setup a fake IRQ handler for legacy interrupts
    for devices that only have MSI-X (most if not all SR-IOV cards),
    we will fail with this:

    pciback[0000:01:10.0]: failed to install fake IRQ handler for IRQ 0! (rc:-38)

    Since those cards don't have anything in dev->irq.

    Signed-off-by: Konrad Rzeszutek Wilk

    Konrad Rzeszutek Wilk
     
  • pciback is rather generic for a modular distro style kernel.

    Signed-off-by: Ian Campbell
    Cc: Jeremy Fitzhardinge
    Cc: Konrad Rzeszutek Wilk
    Signed-off-by: Jeremy Fitzhardinge

    Ian Campbell
     
  • We were using coarse spinlocks that could end up with a deadlock.
    This patch fixes that and makes the spinlocks much more fine-grained.

    We also drop be->watching state spinlocks as they are already
    guarded by the xenwatch_thread against multiple customers. Without
    that we would trigger the BUG: scheduling while atomic.

    Signed-off-by: Konrad Rzeszutek Wilk

    Konrad Rzeszutek Wilk
     
  • If the device that is to be shared with a guest is a level device and
    the IRQ is shared with the initial domain we need to take actions.
    Mainly we install a dummy IRQ handler that will ACK on the interrupt
    line so as to not have the initial domain disable the interrupt line.

    This dummy IRQ handler is not enabled when the device MSI/MSI-X lines
    are set, nor for edge interrupts. And also not for level interrupts
    that are not shared amongst devices. Lastly, if the user passes
    to the guest all of the PCI devices on the shared line then we won't
    install the dummy handler either.

    There is also SysFS instrumentation to check its state and turn
    IRQ ACKing on/off if necessary.

    Signed-off-by: Konrad Rzeszutek Wilk

    Konrad Rzeszutek Wilk
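
The conditions listed above can be condensed into a single predicate. This is a sketch of the decision as described in the commit message, not the driver's code; the struct and its field names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical summary of the interrupt state of a passed-through device. */
struct passthrough_irq_state {
    bool msi_or_msix_enabled;         /* device uses MSI/MSI-X lines */
    bool level_triggered;             /* false means edge triggered */
    bool line_shared;                 /* IRQ line shared with other devices */
    bool all_sharers_passed_to_guest; /* every device on the line is passed through */
};

/* Decide whether the dummy, ACK-only IRQ handler should be installed. */
bool needs_dummy_irq_handler(const struct passthrough_irq_state *s)
{
    if (s->msi_or_msix_enabled)
        return false;   /* not enabled when MSI/MSI-X lines are set */
    if (!s->level_triggered)
        return false;   /* not for edge interrupts */
    if (!s->line_shared)
        return false;   /* not for unshared level interrupts */
    if (s->all_sharers_passed_to_guest)
        return false;   /* the initial domain has no device left on the line */
    return true;
}
```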
     
  • In cases where the guest is abruptly killed and has not disabled
    MSI/MSI-X interrupts we want to do it for it.

    Otherwise when the guest is started up and enables MSI, we would
    get a WARN() that the device already had been enabled.

    Signed-off-by: Konrad Rzeszutek Wilk

    Konrad Rzeszutek Wilk
     
  • These changes let a PV guest use a Virtual Function. The VF's vendor
    and device registers in cfg space are 0xffff, which is invalid and
    ignored by the PCI device scan. Values in 'struct pci_dev' are fixed up
    by SR-IOV code, and using these values will present the correct VID and
    DID to the PV guest kernel.

    The command register in the cfg space is read-only 0, which means we
    have to emulate the MMIO enable bit (a VF only uses MMIO resources) so
    the PV kernel can work properly.

    Acked-by: Jan Beulich
    Signed-off-by: Konrad Rzeszutek Wilk

    Zhao, Yu
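
A small sketch of the two emulations described above, assuming a software copy of the fixed-up IDs and of the MMIO enable bit. The struct and function names are hypothetical; the only hardware facts relied on are that 0xffff is not a valid ID and that bit 1 (0x2) of the PCI command register is the memory-space enable bit.

```c
#include <stdbool.h>
#include <stdint.h>

#define VF_INVALID_ID    0xffffu
#define CMD_MEMORY_BIT   0x0002u  /* memory-space enable in the PCI command register */

/* Hypothetical per-VF emulation state kept by the backend. */
struct vf_emulation {
    uint16_t vendor;        /* fixed-up value, as held in struct pci_dev */
    uint16_t device;        /* fixed-up value, as held in struct pci_dev */
    bool mmio_enabled;      /* emulated MMIO enable bit */
};

/* Vendor ID shown to the guest: fall back to the fixed-up value. */
uint16_t vf_read_vendor(uint16_t hw_value, const struct vf_emulation *vf)
{
    return hw_value == VF_INVALID_ID ? vf->vendor : hw_value;
}

/* Guest write to the command register: remember the MMIO enable bit. */
void vf_write_command(struct vf_emulation *vf, uint16_t value)
{
    vf->mmio_enabled = (value & CMD_MEMORY_BIT) != 0;
}

/* Guest read of the command register: merge in the emulated bit. */
uint16_t vf_read_command(uint16_t hw_value, const struct vf_emulation *vf)
{
    return vf->mmio_enabled ? (uint16_t)(hw_value | CMD_MEMORY_BIT) : hw_value;
}
```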
     
  • When the front-end and back-end start negotiating we register
    the domain that will use the PCI device. Furthermore during shutdown
    of guest or unbinding of the PCI device (and unloading of module)
    from pciback we unregister the domain owner.

    Signed-off-by: Konrad Rzeszutek Wilk
    Signed-off-by: Jeremy Fitzhardinge

    Konrad Rzeszutek Wilk
     
  • Checkpatch found some extra warnings and errors. This mega
    patch fixes them all in one big swoop. We also spruce
    up the pcistub_ids to use DEFINE_PCI_DEVICE_TABLE macro
    (suggested by Jan Beulich).

    Signed-off-by: Konrad Rzeszutek Wilk

    Konrad Rzeszutek Wilk
     
  • This is the host side counterpart to the frontend driver in
    drivers/pci/xen-pcifront.c. The PV protocol is also implemented by
    frontend drivers in other OSes, such as the BSDs.

    The PV protocol is rather simple. There is a page shared with the guest,
    which has the 'struct xen_pci_sharedinfo' embedded in it. The backend
    has a thread that is kicked every time the structure is changed and
    based on the operation field it performs specific tasks:

    XEN_PCI_OP_conf_[read|write]:
    Read/Write 0xCF8/0xCFC filtered data. (conf_space*.c)
    Based on which field is probed, we either enable/disable the PCI
    device, change power state, read VPD, etc. The major goal of this
    call is to provide a Physical IRQ (PIRQ) to the guest.

    The PIRQ is a Xen hypervisor global IRQ value, irrespective of whether
    the IRQ is tied in to the IO-APIC or is a vector. For GSI type
    interrupts, PIRQ==GSI holds. For MSI/MSI-X the
    PIRQ value != Linux IRQ number (though PIRQ==vector).

    Please note, that with Xen, all interrupts (except those level shared ones)
    are injected directly to the guest - there is no host interaction.

    XEN_PCI_OP_[enable|disable]_msi[|x] (pciback_ops.c)
    Enables/disables the MSI/MSI-X capability of the device. These operations
    setup the MSI/MSI-X vectors for the guest and pass them to the frontend.

    When the device is activated, the interrupts are directly injected in the
    guest without involving the host.

    XEN_PCI_OP_aer_[detected|resume|mmio|slotreset]: In case of failure,
    perform the appropriate AER commands on the guest. Right now that is
    a cop-out - we just kill the guest.

    Besides implementing those commands, it can also

    - hide a PCI device from the host. When booting up, the user can specify
    xen-pciback.hide=(1:0:0)(BDF..) so that host does not try to use the
    device.

    The driver was lifted from linux-2.6.18.hg tree and fixed up
    so that it could compile under v3.0. Per suggestion from Jesse Barnes
    moved the driver to drivers/xen/xen-pciback.

    Signed-off-by: Konrad Rzeszutek Wilk
    Signed-off-by: Jeremy Fitzhardinge

    Konrad Rzeszutek Wilk
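
The dispatch on the operation field can be sketched as follows. The enum below is local to the sketch: its names mirror the XEN_PCI_OP_* operations listed above, but its numeric values are not taken from the ABI header, and the grouping function is a hypothetical helper.

```c
/* Local stand-ins for the XEN_PCI_OP_* operations (values are not the ABI values). */
enum pci_op_sketch {
    OP_CONF_READ,
    OP_CONF_WRITE,
    OP_ENABLE_MSI,
    OP_DISABLE_MSI,
    OP_ENABLE_MSIX,
    OP_DISABLE_MSIX,
    OP_AER_DETECTED,
    OP_AER_RESUME,
    OP_AER_MMIO,
    OP_AER_SLOTRESET
};

/* Which part of the backend would handle a request. */
enum op_group {
    GROUP_CONF_SPACE,   /* filtered config space access (conf_space*.c) */
    GROUP_MSI,          /* MSI/MSI-X enable and disable (pciback_ops.c) */
    GROUP_AER,          /* AER handling */
    GROUP_UNKNOWN       /* anything else is rejected */
};

/* Map the operation field from the shared page to a handler group. */
enum op_group classify_op(int op)
{
    switch (op) {
    case OP_CONF_READ:
    case OP_CONF_WRITE:
        return GROUP_CONF_SPACE;
    case OP_ENABLE_MSI:
    case OP_DISABLE_MSI:
    case OP_ENABLE_MSIX:
    case OP_DISABLE_MSIX:
        return GROUP_MSI;
    case OP_AER_DETECTED:
    case OP_AER_RESUME:
    case OP_AER_MMIO:
    case OP_AER_SLOTRESET:
        return GROUP_AER;
    default:
        return GROUP_UNKNOWN;
    }
}
```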
     

12 Jul, 2011

2 commits


09 Jul, 2011

2 commits

  • …nel/git/djm/tmem into stable/drivers

    * 'xen-tmem-selfballoon-v8' of git://git.kernel.org/pub/scm/linux/kernel/git/djm/tmem:
    xen: tmem: self-ballooning and frontswap-selfshrinking

    Konrad Rzeszutek Wilk
     
  • This patch introduces two in-kernel drivers for Xen transcendent memory
    ("tmem") functionality that complement cleancache and frontswap. Both
    use control theory to dynamically adjust and optimize memory utilization.
    Selfballooning controls the in-kernel Xen balloon driver, targeting a goal
    value (vm_committed_as), thus pushing less frequently used clean
    page cache pages (through the cleancache code) into Xen tmem where
    Xen can balance needs across all VMs residing on the physical machine.
    Frontswap-selfshrinking controls the number of pages in frontswap,
    driving it towards zero (effectively doing a partial swapoff) when
    in-kernel memory pressure subsides, freeing up RAM for other VMs.

    More detail is provided in the header comment of xen-selfballoon.c.

    Signed-off-by: Dan Magenheimer

    [v8: konrad.wilk@oracle.com: set default enablement depending on frontswap]
    [v7: konrad.wilk@oracle.com: fix capitalization and punctuation in comments]
    [v6: fix frontswap-selfshrinking initialization]
    [v6: konrad.wilk@oracle.com: fix init pr_infos; add comments about swap]
    [v5: konrad.wilk@oracle.com: add NULL to attr list; move inits up to decls]
    [v4: dkiper@net-space.pl: use strict_strtoul plus a few syntactic nits]
    [v3: konrad.wilk@oracle.com: fix potential divides-by-zero]
    [v3: konrad.wilk@oracle.com: add many more comments, fix nits]
    [v2: rebased to linux-3.0-rc1]
    [v2: Ian.Campbell@citrix.com: reorganize as new file (xen-selfballoon.c)]
    [v2: dkiper@net-space.pl: proper access to vm_committed_as]
    [v2: dkiper@net-space.pl: accounting fixes]
    Cc: Jan Beulich
    Cc: Jeremy Fitzhardinge

    Dan Magenheimer
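
The feedback idea described above can be sketched as one step of a simple controller: each interval, the balloon target moves part of the way from the current size toward the goal, and the frontswap target shrinks by a fraction of its current size. The function names and the hysteresis parameters are assumptions for illustration; the actual tunables and formulas are described in the driver's header comment.

```c
/*
 * One selfballooning step: move the target from cur_pages toward
 * goal_pages by 1/hysteresis of the remaining distance. Separate
 * divisors for shrinking and growing are an assumption of this sketch.
 */
unsigned long selfballoon_next_target(unsigned long cur_pages,
                                      unsigned long goal_pages,
                                      unsigned int down_hysteresis,
                                      unsigned int up_hysteresis)
{
    if (down_hysteresis == 0)
        down_hysteresis = 1;    /* avoid dividing by zero */
    if (up_hysteresis == 0)
        up_hysteresis = 1;

    if (cur_pages > goal_pages)
        return cur_pages - (cur_pages - goal_pages) / down_hysteresis;
    if (cur_pages < goal_pages)
        return cur_pages + (goal_pages - cur_pages) / up_hysteresis;
    return cur_pages;
}

/*
 * One frontswap-selfshrinking step: drive the page count toward zero.
 * With integer division, counts smaller than the divisor do not shrink
 * further in this sketch.
 */
unsigned long frontswap_next_target(unsigned long cur_pages,
                                    unsigned int hysteresis)
{
    if (hysteresis == 0)
        hysteresis = 1;
    return cur_pages - cur_pages / hysteresis;
}
```

With a divisor of 8 for shrinking, a guest at 1000 pages with a goal of 200 would be asked to drop to 900 in the first step, so the balloon converges gradually rather than in one jump.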
     

01 Jul, 2011

2 commits

  • All the Xen backend drivers are assigned to a special bus type
    xen-backend. This patch exports xen-backend:* names through modalias and
    uevent to autoload them.

    Signed-off-by: Bastian Blank
    Acked-by: Ian Campbell
    Signed-off-by: Konrad Rzeszutek Wilk

    Bastian Blank
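
As an illustration of the exported names, the sketch below builds a "xen-backend:<type>" string from a device type. The helper name is hypothetical and the code runs in userspace; it only shows the shape of the alias that module autoloading would match against.

```c
#include <stdio.h>
#include <string.h>

/*
 * Format the modalias for a backend device type into buf.
 * Returns the string length, or -1 on bad input or truncation.
 */
int format_backend_modalias(char *buf, size_t size, const char *devicetype)
{
    int n;

    if (buf == NULL || size == 0 || devicetype == NULL ||
        strlen(devicetype) == 0)
        return -1;

    n = snprintf(buf, size, "xen-backend:%s", devicetype);
    if (n < 0 || (size_t)n >= size)
        return -1;      /* output did not fit */
    return n;
}
```

For a device type of "vbd" the result is "xen-backend:vbd".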
     
  • The xenbus bus type uses device_create_file to assign all used device
    attributes. However it does not remove them when the device goes away.

    This patch uses the dev_attrs field of the bus type to specify default
    attributes for all devices.

    Signed-off-by: Bastian Blank
    Acked-by: Ian Campbell
    Signed-off-by: Konrad Rzeszutek Wilk

    Bastian Blank
     

21 Jun, 2011

2 commits


18 Jun, 2011

1 commit


16 Jun, 2011

1 commit


10 Jun, 2011

1 commit


07 Jun, 2011

1 commit

  • By default the io_tlb_nslabs is set to zero, and gets set to
    whatever value is passed in via the swiotlb_init_with_tbl function.
    The default value passed in is 64MB. However, if the user provides
    the 'swiotlb=' option, the default value is ignored and
    the value provided by the user is used... except when the SWIOTLB
    is used under Xen - there the default value of 64MB is used and
    the Xen-SWIOTLB has no mechanism to get the 'io_tlb_nslabs' filled
    out by the setup_io_tlb_npages function. This patch provides a function
    for the Xen-SWIOTLB to call to see if the io_tlb_nslabs is set
    and if so use that value.

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Konrad Rzeszutek Wilk

    FUJITA Tomonori
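
The selection logic described above amounts to "use the user's slab count if one was given, otherwise fall back to 64MB". A minimal sketch follows, assuming 2KB slabs (a shift of 11); the macro and function names are hypothetical and are not the function this patch adds.

```c
#define SLAB_SHIFT_SKETCH     11                  /* assumed: 2KB per slab */
#define DEFAULT_BYTES_SKETCH  (64UL * 1024 * 1024) /* the 64MB default */

/* Pick the slab count: the user's value if set (non-zero), else the default. */
unsigned long xen_swiotlb_pick_nslabs(unsigned long user_nslabs)
{
    if (user_nslabs != 0)
        return user_nslabs;
    return DEFAULT_BYTES_SKETCH >> SLAB_SHIFT_SKETCH;
}

/* Convert a slab count back to a size in bytes. */
unsigned long nslabs_to_bytes(unsigned long nslabs)
{
    return nslabs << SLAB_SHIFT_SKETCH;
}
```

Under the 2KB assumption, the 64MB default corresponds to 32768 slabs.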
     

31 May, 2011

1 commit


27 May, 2011

2 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/djm/tmem:
    xen: cleancache shim to Xen Transcendent Memory
    ocfs2: add cleancache support
    ext4: add cleancache support
    btrfs: add cleancache support
    ext3: add cleancache support
    mm/fs: add hooks to support cleancache
    mm: cleancache core ops functions and config
    fs: add field to superblock to support cleancache
    mm/fs: cleancache documentation

    Fix up trivial conflict in fs/btrfs/extent_io.c due to includes

    Linus Torvalds
     
  • This patch provides a shim between the kernel-internal cleancache
    API (see Documentation/mm/cleancache.txt) and the Xen Transcendent
    Memory ABI (see http://oss.oracle.com/projects/tmem).

    Xen tmem provides "hypervisor RAM" as an ephemeral page-oriented
    pseudo-RAM store for cleancache pages, shared cleancache pages,
    and frontswap pages. Tmem provides enterprise-quality concurrency,
    full save/restore and live migration support, compression
    and deduplication.

    A presentation showing up to 8% faster performance and up to 52%
    reduction in sectors read on a kernel compile workload, despite
    aggressive in-kernel page reclamation ("self-ballooning") can be
    found at:

    http://oss.oracle.com/projects/tmem/dist/documentation/presentations/TranscendentMemoryXenSummit2010.pdf

    Signed-off-by: Dan Magenheimer
    Reviewed-by: Jeremy Fitzhardinge
    Cc: Konrad Rzeszutek Wilk
    Cc: Matthew Wilcox
    Cc: Nick Piggin
    Cc: Mel Gorman
    Cc: Rik Van Riel
    Cc: Jan Beulich
    Cc: Chris Mason
    Cc: Andreas Dilger
    Cc: Ted Ts'o
    Cc: Mark Fasheh
    Cc: Joel Becker
    Cc: Nitin Gupta

    Dan Magenheimer
     

24 May, 2011

1 commit

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (39 commits)
    b43: fix comment typo reqest -> request
    Haavard Skinnemoen has left Atmel
    cris: typo in mach-fs Makefile
    Kconfig: fix copy/paste-ism for dell-wmi-aio driver
    doc: timers-howto: fix a typo ("unsgined")
    perf: Only include annotate.h once in tools/perf/util/ui/browsers/annotate.c
    md, raid5: Fix spelling error in comment ('Ofcourse' --> 'Of course').
    treewide: fix a few typos in comments
    regulator: change debug statement be consistent with the style of the rest
    Revert "arm: mach-u300/gpio: Fix mem_region resource size miscalculations"
    audit: acquire creds selectively to reduce atomic op overhead
    rtlwifi: don't touch with treewide double semicolon removal
    treewide: cleanup continuations and remove logging message whitespace
    ath9k_hw: don't touch with treewide double semicolon removal
    include/linux/leds-regulator.h: fix syntax in example code
    tty: fix typo in descripton of tty_termios_encode_baud_rate
    xtensa: remove obsolete BKL kernel option from defconfig
    m68k: fix comment typo 'occcured'
    arch:Kconfig.locks Remove unused config option.
    treewide: remove extra semicolons
    ...

    Linus Torvalds
     

20 May, 2011

2 commits

  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/suspend-2.6: (34 commits)
    PM: Introduce generic prepare and complete callbacks for subsystems
    PM: Allow drivers to allocate memory from .prepare() callbacks safely
    PM: Remove CONFIG_PM_VERBOSE
    Revert "PM / Hibernate: Reduce autotuned default image size"
    PM / Hibernate: Add sysfs knob to control size of memory for drivers
    PM / Wakeup: Remove useless synchronize_rcu() call
    kmod: always provide usermodehelper_disable()
    PM / ACPI: Remove acpi_sleep=s4_nonvs
    PM / Wakeup: Fix build warning related to the "wakeup" sysfs file
    PM: Print a warning if firmware is requested when tasks are frozen
    PM / Runtime: Rework runtime PM handling during driver removal
    Freezer: Use SMP barriers
    PM / Suspend: Do not ignore error codes returned by suspend_enter()
    PM: Fix build issue in clock_ops.c for CONFIG_PM_RUNTIME unset
    PM: Revert "driver core: platform_bus: allow runtime override of dev_pm_ops"
    OMAP1 / PM: Use generic clock manipulation routines for runtime PM
    PM: Remove sysdev suspend, resume and shutdown operations
    PM / PowerPC: Use struct syscore_ops instead of sysdevs for PM
    PM / UNICORE32: Use struct syscore_ops instead of sysdevs for PM
    PM / AVR32: Use struct syscore_ops instead of sysdevs for PM
    ...

    Linus Torvalds
     
  • …stable/mmu.bugfixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen

    * 'stable/irq' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
    xen: do not clear and mask evtchns in __xen_evtchn_do_upcall

    * 'stable/p2m.bugfixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
    xen/p2m: Create entries in the P2M_MFN trees's to track 1-1 mappings

    * 'stable/e820.bugfixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
    xen/setup: Fix for incorrect xen_extra_mem_start initialization under 32-bit
    xen/setup: Ignore E820_UNUSABLE when setting 1-1 mappings.

    * 'stable/mmu.bugfixes' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
    xen mmu: fix a race window causing leave_mm BUG()

    Linus Torvalds