25 Jul, 2008

40 commits

  • To prepare for virtio_ring transport feature bits, hook in a call in
    all the users to manipulate them. This currently just clears all the
    bits, since it doesn't understand any features.

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • Rather than explicitly handing the features to the lower-level, we just
    hand the virtio_device and have it set the features. This makes it clear
    that it has the chance to manipulate the features of the device at this
    point (and that all feature negotiation is already done).

    Signed-off-by: Rusty Russell

    Rusty Russell
     
  • We assign feature bits as required, but it makes sense to reserve some
    for the particular transport, rather than the particular device.

    Signed-off-by: Rusty Russell

    Rusty Russell
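
    The reservation described above can be pictured as splitting the feature
    bitmap into a device range and a transport range. The following is only a
    sketch in plain C: the 32-bit feature word, the range of bits 28 to 31, and
    the helper names are assumptions made for illustration, since the commit
    text does not state the actual range. A transport that understands none of
    its bits would clear the whole range:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout, for illustration only: bits 28..31 belong to the transport. */
#define TRANSPORT_F_START 28u
#define TRANSPORT_F_END   32u

/* Return nonzero if the given bit number lies in the transport range. */
int is_transport_feature(unsigned int bit)
{
    assert(bit < 32);
    return bit >= TRANSPORT_F_START && bit < TRANSPORT_F_END;
}

/* Clear every transport-reserved bit, leaving device feature bits untouched. */
uint32_t clear_transport_features(uint32_t features)
{
    for (unsigned int bit = TRANSPORT_F_START; bit < TRANSPORT_F_END; bit++)
        features &= ~(UINT32_C(1) << bit);
    return features;
}
```

    Keeping the range in two constants means a device driver can check that
    none of its own feature bits stray into the reserved area.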
     
  • This patch enables virtio_console as the default console on kvm for
    s390. We currently use the same notify hack as lguest for early
    console output. I will try to address this for lguest and s390 later.

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Rusty Russell

    Christian Borntraeger
     
  • I also added a small Kconfig change that allows the user to specify the
    virtio console in menuconfig.

    (Fixes to export symbols from Stephen Rothwell)
    (Fixes for CONFIG_VIRTIO_CONSOLE=y vs CONFIG_VIRTIO=m from Christian himself)

    Signed-off-by: Rusty Russell
    Cc: Stephen Rothwell

    Christian Borntraeger
     
  • This patch exploits the new notifier callbacks of the hvc_console. We can
    use the virtio callbacks instead of the polling code.

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Rusty Russell

    Christian Borntraeger
     
  • This patch tries to change hvc_console to not use request_irq/free_irq if
    the backend does not use irqs. This allows virtio_console to use hvc_console
    without having a linker reference to request_irq/free_irq.

    In addition, together with patch 2/3 it improves the performance for virtio
    console input. (an earlier version of this patch was tested by Yajin on lguest)

    The irq specific code is moved to hvc_irq.c and selected by the drivers that
    use irqs (System p, System i, XEN).

    I replaced "int irq" with the opaque "int data". The request_irq and
    free_irq calls are replaced with notifier_add and notifier_del. I have also
    changed the code a bit to call the notifier_add and notifier_del inside the
    spinlock area as the callbacks are found via hp->ops.

    Changes since last version:
    o remove ifdef
    o reintroduce "irq_requested" as "notified"
    o cleanups, sparse..

    I did not move the timer based polling into a separate polling scheme. I
    played with several variants, but it seems we need to sleep/schedule in
    a thread even for irq based consoles, as there are throttling and buffer
    size constraints.

    I also kept hvc_struct defined in hvc_console.h so that hvc_irq.c can access
    the irq_requested element.

    Feedback is appreciated. virtio_console is currently the only available console
    for kvm on s390. I plan to push this change as soon as all affected parties
    agree on it. I would love to get test results from System p, Xen etc.

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Rusty Russell

    Christian Borntraeger
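
    The notifier_add/notifier_del idea from the message above can be sketched
    outside the kernel. Everything here (struct console, console_open, the
    demo backends) is a hypothetical stand-in, not the hvc_console code: the
    core stores an opaque "int data", calls the backend's hooks if they exist,
    and remembers whether a notifier is currently registered.

```c
#include <assert.h>
#include <stddef.h>

struct console;

/* Backend operations; both notifier hooks are optional (NULL means polled). */
struct console_ops {
    int  (*notifier_add)(struct console *con, int data);
    void (*notifier_del)(struct console *con, int data);
};

struct console {
    const struct console_ops *ops;
    int data;      /* opaque to the core: an irq number, a queue id, ... */
    int notified;  /* nonzero while a notifier is registered */
};

/* Register the backend's notifier on open, if it provides one. */
int console_open(struct console *con)
{
    assert(con != NULL && con->ops != NULL);
    if (con->notified || con->ops->notifier_add == NULL)
        return 0;
    int rc = con->ops->notifier_add(con, con->data);
    if (rc == 0)
        con->notified = 1;
    return rc;
}

/* Unregister the notifier on close, if one was registered. */
void console_close(struct console *con)
{
    assert(con != NULL && con->ops != NULL);
    if (!con->notified)
        return;
    if (con->ops->notifier_del != NULL)
        con->ops->notifier_del(con, con->data);
    con->notified = 0;
}

/* Demo event-driven backend that only counts registrations. */
int demo_registered;

int demo_notifier_add(struct console *con, int data)
{
    (void)con;
    (void)data;
    demo_registered++;
    return 0;
}

void demo_notifier_del(struct console *con, int data)
{
    (void)con;
    (void)data;
    demo_registered--;
}

const struct console_ops demo_event_ops = { demo_notifier_add, demo_notifier_del };
const struct console_ops demo_polled_ops = { NULL, NULL };
```

    A backend without interrupts leaves both hooks NULL, so the core never
    needs a reference to request_irq/free_irq.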
     
  • Currently virtio_blk assumes a 512 byte hard sector size. This can cause
    trouble / performance issues if the backing has a different block size
    (like a file on an ext3 file system formatted with 4k block size or a dasd).

    Let's add a feature flag that tells the guest to use a different hard sector
    size than 512 bytes.

    Signed-off-by: Christian Borntraeger
    Signed-off-by: Rusty Russell

    Christian Borntraeger
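
    A minimal sketch of how a guest might act on such a flag, assuming a
    32-bit feature word and a config structure with a blk_size field. The bit
    number, the structure, and the function name are illustrative assumptions,
    not taken from the patch itself:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEFAULT_SECTOR_SIZE 512u

/* Hypothetical bit number for the "block size is provided" feature. */
#define FEATURE_BLK_SIZE_BIT 6u

/* Device configuration as seen by the guest (only the field used here). */
struct blk_config {
    uint32_t blk_size;
};

/* Use the host-provided size only if the feature was negotiated. */
uint32_t hard_sector_size(uint32_t features, const struct blk_config *cfg)
{
    assert(cfg != NULL);
    if (features & (UINT32_C(1) << FEATURE_BLK_SIZE_BIT))
        return cfg->blk_size;
    return DEFAULT_SECTOR_SIZE;
}
```

    Guests that never see the flag keep the old 512-byte assumption, so
    existing hosts are unaffected.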
     
    Hook up to the probe() and remove() methods in bus_type
    rather than device_driver. The former has been preferred
    since 2.6.16.

    Signed-off-by: Mark McLoughlin
    Signed-off-by: Rusty Russell

    Mark McLoughlin
     
  • We force notification when the ring is full, even if the host has
    indicated it doesn't want to know. This seemed like a good idea at
    the time: if we fill the transmit ring, we should tell the host
    immediately.

    Unfortunately this logic also applies to the receiving ring, which is
    refilled constantly. We should introduce real notification thresholds
    to replace this logic. Meanwhile, removing the logic altogether breaks
    the heuristics which KVM uses, so we use a hack: only notify if there are
    outgoing parts of the new buffer.

    Here is the number of exits with lguest's crappy network implementation:
    Before:
    network xmit 7859051 recv 236420
    After:
    network xmit 7858610 recv 118136

    Signed-off-by: Rusty Russell

    Rusty Russell
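
    The hack described above reduces to one condition on the full-ring path.
    The sketch below uses a hypothetical struct ring and ring_add_buf() rather
    than the real virtio_ring code: when there is no room, the host is kicked
    only if the rejected buffer has outgoing parts, so a full receive ring no
    longer forces a notification.

```c
#include <assert.h>
#include <stddef.h>

struct ring {
    unsigned int num_free;  /* free descriptors left in the ring */
    unsigned int kicks;     /* forced host notifications so far */
};

/*
 * Try to add a buffer with `out` outgoing and `in` incoming parts.
 * Returns 0 on success, -1 if the ring has no room for it.
 */
int ring_add_buf(struct ring *r, unsigned int out, unsigned int in)
{
    assert(r != NULL);
    if (r->num_free < out + in) {
        /* Full ring: only transmit-style buffers force a notification. */
        if (out > 0)
            r->kicks++;
        return -1;
    }
    r->num_free -= out + in;
    return 0;
}
```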
     
  • We want others to implement and use virtio, so it makes sense to BSD
    license the non-__KERNEL__ parts of the headers to make this crystal
    clear.

    Signed-off-by: Rusty Russell
    Acked-by: Christian Borntraeger
    Acked-by: Mark McLoughlin
    Acked-by: Ryan Harper
    Acked-by: Eric Van Hensbergen
    Acked-by: Anthony Liguori

    Rusty Russell
     
  • If we hack the virtio_net driver to always allocate full-sized (64k+)
    skbuffs, the driver slows down (lguest numbers):

    Time to receive 1GB (small buffers): 10.85 seconds
    Time to receive 1GB (64k+ buffers): 24.75 seconds

    Of course, large buffers use up more space in the ring, so we increase
    that from 128 to 2048:

    Time to receive 1GB (64k+ buffers, 2k ring): 16.61 seconds

    If we recycle pages rather than using alloc_page/free_page:

    Time to receive 1GB (64k+ buffers, 2k ring, recycle pages): 10.81 seconds

    This demonstrates that with efficient allocation, we don't need to
    have a separate "small buffer" queue.

    Signed-off-by: Rusty Russell

    Rusty Russell
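
    Page recycling of the kind measured above can be sketched as a simple
    free list. This is a userspace illustration with malloc() standing in for
    alloc_page(); the pool structure and function names are hypothetical, and
    the driver's actual bookkeeping may differ:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define PAGE_BYTES 4096u

/* A free page stores the link to the next free page in its first bytes. */
struct free_page {
    struct free_page *next;
};

struct page_pool {
    struct free_page *head;
};

/* Reuse a recycled page if one is available, otherwise allocate a new one. */
void *pool_get_page(struct page_pool *pool)
{
    assert(pool != NULL);
    if (pool->head != NULL) {
        struct free_page *p = pool->head;
        pool->head = p->next;
        return p;
    }
    return malloc(PAGE_BYTES);
}

/* Return a page to the pool instead of freeing it. */
void pool_give_page(struct page_pool *pool, void *page)
{
    assert(pool != NULL && page != NULL);
    struct free_page *p = page;
    p->next = pool->head;
    pool->head = p;
}

/* Release every page still held by the pool. */
void pool_destroy(struct page_pool *pool)
{
    assert(pool != NULL);
    while (pool->head != NULL) {
        struct free_page *p = pool->head;
        pool->head = p->next;
        free(p);
    }
}
```

    Once the pool is warm, the receive path no longer touches the allocator,
    which is the kind of saving the 16.61 to 10.81 second improvement points at.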
     
  • Finally this patch lets virtio_net receive GSO packets in addition
    to sending them. This can definitely be optimised for the non-GSO
    case. For comparison the Xen approach stores one page in each skb
    and uses subsequent skb's pages to construct an SG skb instead of
    preallocating the maximum amount of pages per skb.

    Signed-off-by: Rusty Russell (added feature bits)

    Herbert Xu
     
  • This patch adds some basic ethtool operations to virtio_net so
    I could test SG without GSO (which was really useful because TSO
    turned out to be buggy :)

    Signed-off-by: Rusty Russell (remove MTU setting)

    Herbert Xu
     
  • On Mon, 2008-05-26 at 17:42 +1000, Rusty Russell wrote:
    > If we fail to transmit a packet, we assume the queue is full and put
    > the skb into last_xmit_skb. However, if more space frees up before we
    > xmit it, we loop, and the result can be transmitting the same skb twice.
    >
    > Fix is simple: set skb to NULL if we've used it in some way, and check
    > before sending.
    ...
    > diff -r 564237b31993 drivers/net/virtio_net.c
    > --- a/drivers/net/virtio_net.c Mon May 19 12:22:00 2008 +1000
    > +++ b/drivers/net/virtio_net.c Mon May 19 12:24:58 2008 +1000
    > @@ -287,21 +287,25 @@ again:
    >  	free_old_xmit_skbs(vi);
    >
    >  	/* If we has a buffer left over from last time, send it now. */
    > -	if (vi->last_xmit_skb) {
    > +	if (unlikely(vi->last_xmit_skb)) {
    >  		if (xmit_skb(vi, vi->last_xmit_skb) != 0) {
    >  			/* Drop this skb: we only queue one. */
    >  			vi->dev->stats.tx_dropped++;
    >  			kfree_skb(skb);
    > +			skb = NULL;
    >  			goto stop_queue;
    >  		}
    >  		vi->last_xmit_skb = NULL;

    With this, we may drop an skb and then later in the function discover that
    we could have sent it after all. Poor wee skb :)

    How about the incremental patch below?

    Cheers,
    Mark.

    Subject: [PATCH] virtio_net: Delay dropping tx skbs

    Currently we drop the skb in start_xmit() if we have a
    queued buffer and fail to transmit it.

    However, if we delay dropping it until we've stopped the
    queue and enabled the tx notification callback, then there
    is a chance space might become available for it.

    Signed-off-by: Mark McLoughlin
    Signed-off-by: Rusty Russell

    Mark McLoughlin
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/bart/ide-2.6: (76 commits)
    ide: use proper printk() KERN_* levels in ide-probe.c
    ide: fix for EATA SCSI HBA in ATA emulating mode
    ide: remove stale comments from drivers/ide/Makefile
    ide: enable local IRQs in all handlers for TASKFILE_NO_DATA data phase
    ide-scsi: remove kmalloced struct request
    ht6560b: remove old history
    ht6560b: update email address
    ide-cd: fix oops when using growisofs
    gayle: release resources on ide_host_add() failure
    palm_bk3710: add UltraDMA/100 support
    ide: trivial sparse annotations
    ide: ide-tape.c sparse annotations and unaligned access removal
    ide: drop 'name' parameter from ->init_chipset method
    ide: prefix messages from IDE PCI host drivers by driver name
    it821x: remove DECLARE_ITE_DEV() macro
    it8213: remove DECLARE_ITE_DEV() macro
    ide: include PCI device name in messages from IDE PCI host drivers
    ide: remove for some archs
    ide-generic: remove ide_default_{io_base,irq}() inlines (take 3)
    ide-generic: is no longer needed on ppc32
    ...

    Linus Torvalds
     
  • * 'release-2.6.27' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-acpi-2.6:
    acpi: fix crash in core ACPI code, triggered by CONFIG_ACPI_PCI_SLOT=y
    ACPI: thinkpad-acpi: don't misdetect in get_thinkpad_model_data() on -ENOMEM
    ACPI: thinkpad-acpi: bump up version to 0.21
    ACPI: thinkpad-acpi: add bluetooth and WWAN rfkill support
    ACPI: thinkpad-acpi: WLSW overrides other rfkill switches
    ACPI: thinkpad-acpi: prepare for bluetooth and wwan rfkill support
    ACPI: thinkpad-acpi: consolidate wlsw notification function
    ACPI: thinkpad-acpi: minor refactor on radio switch init
    Revert "ACPI: don't walk tables if ACPI was disabled"
    Revert "dock: bay: Don't call acpi_walk_namespace() when ACPI is disabled."
    Revert "Fix FADT parsing"
    ACPI : Set FAN device to correct state in boot phase
    ACPI: Ignore _BQC object when registering backlight device
    ACPI: stop complaints about interrupt link End Tags and blank IRQ descriptors

    Linus Torvalds
     
  • * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jbarnes/pci-2.6:
    PCI: fixup sparse endianness warnings in proc.c
    PCI PM: make more PCI PM core functionality available to drivers
    PCI/DMAR: don't assume presence of RMRRs
    PCI hotplug: fix error path in pci_slot's register_slot

    Linus Torvalds
     
  • While at it:

    - fixup printk() messages in save_match() and hwif_init().

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • IDE probing code used to skip devices attached to EATA SCSI HBA
    in ATA emulating mode but because of warm-plug support port I/O
    resources are no longer freed if no devices are detected on a port
    and the decision about the driver to use is left up to the user.

    Remove no longer valid EATA SCSI HBA quirk from do_identify().

    Noticed-by: Alan Cox
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • It is already done by task_no_data_intr() and there is no reason
    not to do it in other TASKFILE_NO_DATA data phase handlers.

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • This converts ide-scsi to use blk_get/put_request instead of
    kmalloc/kfree.

    Signed-off-by: FUJITA Tomonori
    Signed-off-by: Bartlomiej Zolnierkiewicz

    FUJITA Tomonori
     
  • Remove the ancient version history. Git does a better job.

    From: Jan Evert van Grootheest
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Jan Evert van Grootheest
     
  • Update email address.

    From: Jan Evert van Grootheest
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Jan Evert van Grootheest
     
  • cdrom_read_capacity() will blindly return the capacity from the device
    without sanity-checking it. This later causes code in fs/buffer.c to
    oops.

    Fix this by checking that the device is telling us sensible things.

    From: Jens Axboe
    Cc: Michael Buesch
    Cc: Jan Kara
    Cc: Arnd Bergmann
    Cc:
    Cc: Borislav Petkov
    Signed-off-by: Andrew Morton
    [bart: print device name instead of driver name]
    Signed-off-by: Bartlomiej Zolnierkiewicz
    [harvey: blocklen is a big-endian value]
    Signed-off-by: Harvey Harrison
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Jens Axboe
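
    The kind of check described above can be sketched as follows. The set of
    accepted block lengths and the 2048-byte fallback are assumptions chosen
    for illustration, as are the function names; the commit text only says
    that the value is big-endian and must be sanity-checked:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed fallback when the device reports a nonsensical block length. */
#define FALLBACK_BLOCKLEN 2048u

/* Decode a 32-bit big-endian value from four raw bytes. */
uint32_t read_be32(const uint8_t *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Accept only plausible block lengths; otherwise use the fallback. */
uint32_t sane_blocklen(const uint8_t raw[4])
{
    uint32_t blocklen = read_be32(raw);

    switch (blocklen) {
    case 512:
    case 1024:
    case 2048:
    case 4096:
        return blocklen;
    default:
        return FALLBACK_BLOCKLEN;
    }
}
```

    Decoding the bytes explicitly keeps the result independent of host
    endianness, which is the point of the "[harvey: ...]" note above.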
     
  • "gayle: reserve memory resources at once" patch temporarily removed
    freeing of resources on failure (to ease conversion to ide_host_add()
    interface). This patch fixes it.

    Thanks to Geert for noticing the issue.

    Noticed-by: Geert Uytterhoeven
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • This controller supports UltraDMA up to mode 5 but it should be clocked with
    at least twice the data strobe frequency, so enable mode 5 for 100+ MHz IDECLK.

    While at it, start passing the correct device to clk_get() -- it worked anyway
    but WTF? :-/

    Signed-off-by: Sergei Shtylyov
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Sergei Shtylyov
     
  • Signed-off-by: Harvey Harrison
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Harvey Harrison
     
  • If this is actually unaligned the access of speed/max_speed above
    is already broken and needs a get_unaligned. Otherwise it is
    aligned and they can be removed.

    Signed-off-by: Harvey Harrison
    Cc: Borislav Petkov
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Harvey Harrison
     
  • There should be no functional changes caused by this patch.

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • Prefix messages from IDE PCI host drivers by driver name instead of marketed
    chipset name (it is still possible to exactly identify the particular chipset
    based on driver messages).

    As a bonus this provides nice code savings for some drivers:

       text    data     bss     dec     hex filename
       3826     112       8    3946     f6a drivers/ide/pci/amd74xx.o.before
       2786     112       8    2906     b5a drivers/ide/pci/amd74xx.o.after
        764     108       0     872     368 drivers/ide/pci/cs5520.o.before
        680     108       0     788     314 drivers/ide/pci/cs5520.o.after
       1680     112       4    1796     704 drivers/ide/pci/generic.o.before
       1155     112       4    1271     4f7 drivers/ide/pci/generic.o.after
       7128     792       0    7920    1ef0 drivers/ide/pci/hpt366.o.before
       6984     792       0    7776    1e60 drivers/ide/pci/hpt366.o.after
       2800     148       0    2948     b84 drivers/ide/pci/pdc202xx_new.o.before
       2523     148       0    2671     a6f drivers/ide/pci/pdc202xx_new.o.after
       2831     148       0    2979     ba3 drivers/ide/pci/pdc202xx_old.o.before
       2683     148       0    2831     b0f drivers/ide/pci/pdc202xx_old.o.after
       3776     112       4    3892     f34 drivers/ide/pci/piix.o.before
       2804     112       4    2920     b68 drivers/ide/pci/piix.o.after
       4693     116       0    4809    12c9 drivers/ide/pci/siimage.o.before
       4600     116       0    4716    126c drivers/ide/pci/siimage.o.after

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • While at it:

    * it821x_chipsets[] -> it821x_chipset.

    * Fix it821x_chipset's name field (as it is used for IT8211/8212).

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • While at it:

    * it8213_chipsets[] -> it8213_chipset.

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • While at it:

    * Apply small fixes to messages (s/dma/DMA/, remove trailing '.', etc).

    * Fix printk() call in ide_setup_pci_baseregs() to use KERN_INFO.

    * Move printk() call from ide_pci_clear_simplex() to the caller.

    * Cleanup do_ide_setup_pci_device() a bit.

    * amd74xx.c: remove superfluous PCI device revision information.

    * hpt366.c: fix two printk() calls in ->init_chipset to use KERN_INFO.

    * pdc202xx_new.c: fix printk() call in ->init_chipset to use KERN_INFO.

    * pdc202xx_old.c: fix driver message in pdc202xx_init_one().

    * via82cxxx.c: fix driver warning message in via_init_one().

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • * Remove include from ( includes
    which is enough).

    * Remove for alpha/blackfin/h8300/ia64/m32r/sh/x86/xtensa
    (this leaves us with arm/frv/m68k/mips/mn10300/parisc/powerpc/sparc[64]).

    There should be no functional changes caused by this patch.

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • Replace ide_default_{io_base,irq}() inlines by legacy_{bases,irqs}[].

    v2:
    Add missing zero-ing of hws[] (caught during testing by Borislav Petkov).

    v3:
    Fix zeroing of hws[] for _real_ this time.

    There should be no functional changes caused by this patch.

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
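
    Replacing per-port inline helpers with lookup tables might look like the
    sketch below. The table contents are the conventional legacy ISA IDE port
    bases and IRQs, given here as an assumption and not copied from the
    patch; the accessor functions are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Assumed values: conventional legacy IDE I/O bases and matching IRQs. */
const unsigned short legacy_bases[] = { 0x1f0, 0x170, 0x1e8, 0x168, 0x1e0, 0x160 };
const int legacy_irqs[] = { 14, 15, 11, 10, 8, 12 };

/* Return the I/O base for a port index, or 0 if the index is out of range. */
unsigned short legacy_base_for_index(size_t idx)
{
    assert(ARRAY_SIZE(legacy_bases) == ARRAY_SIZE(legacy_irqs));
    if (idx >= ARRAY_SIZE(legacy_bases))
        return 0;
    return legacy_bases[idx];
}

/* Return the IRQ for a port index, or -1 if the index is out of range. */
int legacy_irq_for_index(size_t idx)
{
    assert(ARRAY_SIZE(legacy_bases) == ARRAY_SIZE(legacy_irqs));
    if (idx >= ARRAY_SIZE(legacy_irqs))
        return -1;
    return legacy_irqs[idx];
}
```

    Two parallel arrays keep the probe loop a plain index walk and remove the
    need for per-architecture inline functions.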
     
  • Cc: Benjamin Herrenschmidt
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • PPC_PREP has been depending on BROKEN for some time now.

    Cc: Benjamin Herrenschmidt
    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz
     
  • * Now that ide_hwif_t instances are allocated dynamically
    the difference between MAX_HWIFS == 2 and MAX_HWIFS == 10
    is ~100 bytes (x86-32) so use MAX_HWIFS == 10 on all archs
    except those that use MAX_HWIFS == 1.

    * Define MAX_HWIFS in instead of .

    [ Please note that avr32/cris/v850 have no
    and alpha/ia64/sh always define CONFIG_IDE_MAX_HWIFS. ]

    Signed-off-by: Bartlomiej Zolnierkiewicz

    Bartlomiej Zolnierkiewicz