22 Sep, 2010

1 commit


14 Sep, 2010

1 commit


06 Sep, 2010

2 commits

  • vhost should set worker to NULL on cgroups attach failure,
    so that we won't try to destroy the worker again on close.

    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
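
A minimal userspace sketch of the pattern this commit describes, assuming hypothetical stand-in types (`struct dev`, `struct worker`) and a hypothetical `attach_cgroups()` helper rather than the real vhost code. The point is only that the failure path clears the worker pointer, so a later close does not tear the worker down a second time.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel objects; not the real vhost types. */
struct worker { int running; };
struct dev { struct worker *worker; int destroy_calls; };

/* Hypothetical stand-in for the cgroup attach step: 0 on success, negative on error. */
static int attach_cgroups(int should_fail)
{
    return should_fail ? -1 : 0;
}

/* Destroy the worker at most once: a NULL pointer means "nothing to do". */
static void destroy_worker(struct dev *d)
{
    if (!d->worker)
        return;
    d->destroy_calls++;
    free(d->worker);
    d->worker = NULL;   /* the key step: close must not see a stale pointer */
}

static int dev_set_owner(struct dev *d, int attach_fails)
{
    int err;

    d->worker = calloc(1, sizeof(*d->worker));
    if (!d->worker)
        return -1;
    d->worker->running = 1;

    err = attach_cgroups(attach_fails);
    if (err) {
        destroy_worker(d);  /* frees the worker and resets d->worker to NULL */
        return err;
    }
    return 0;
}

/* Close path: safe to call even after a failed dev_set_owner(). */
static void dev_close(struct dev *d)
{
    destroy_worker(d);
}
```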
     
  • Since 2.6.36-rc1, non-root users of vhost-net fail to attach
    if they are in any cgroups.

    The reason is that when qemu uses vhost, vhost wants to attach
    its thread to all cgroups that qemu has. But we got the API backwards,
    so a non-privileged process (Qemu) tried to control
    the privileged one (vhost), which fails.

    Fix this by switching to the new cgroup_attach_task_all,
    and running it from the vhost thread.

    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     

02 Sep, 2010

1 commit


05 Aug, 2010

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1443 commits)
    phy/marvell: add 88ec048 support
    igb: Program MDICNFG register prior to PHY init
    e1000e: correct MAC-PHY interconnect register offset for 82579
    hso: Add new product ID
    can: Add driver for esd CAN-USB/2 device
    l2tp: fix export of header file for userspace
    can-raw: Fix skb_orphan_try handling
    Revert "net: remove zap_completion_queue"
    net: cleanup inclusion
    phy/marvell: add 88e1121 interface mode support
    u32: negative offset fix
    net: Fix a typo from "dev" to "ndev"
    igb: Use irq_synchronize per vector when using MSI-X
    ixgbevf: fix null pointer dereference due to filter being set for VLAN 0
    e1000e: Fix irq_synchronize in MSI-X case
    e1000e: register pm_qos request on hardware activation
    ip_fragment: fix subtracting PPPOE_SES_HLEN from mtu twice
    net: Add getsockopt support for TCP thin-streams
    cxgb4: update driver version
    cxgb4: add new PCI IDs
    ...

    Manually fix up conflicts in:
    - drivers/net/e1000e/netdev.c: due to pm_qos registration
    infrastructure changes
    - drivers/net/phy/marvell.c: conflict between adding 88ec048 support
    and cleaning up the IDs
    - drivers/net/wireless/ipw2x00/ipw2100.c: trivial ipw2100_pm_qos_req
    conflict (registration change vs marking it static)

    Linus Torvalds
     

28 Jul, 2010

3 commits

  • This adds support for mergeable buffers in vhost-net: this is needed
    for older guests without indirect buffer support, as well
    as for zero copy with some devices.

    Includes changes by Michael S. Tsirkin to make the
    patch as low risk as possible (i.e., close to no changes
    when feature is disabled).

    Signed-off-by: David Stevens
    Signed-off-by: Michael S. Tsirkin

    David Stevens
     
  • Apply the cgroup of the owner task to the created vhost worker.

    Based on patches from Sridhar Samudrala and Tejun Heo.
    Later we'll need to also apply cpumask and probably priority
    of the owner process.

    Discussion on the best way to do this is still ongoing.

    Signed-off-by: Michael S. Tsirkin
    Cc: Tejun Heo
    Cc: Sridhar Samudrala
    Cc: Li Zefan

    Michael S. Tsirkin
     
  • Replace vhost_workqueue with per-vhost kthread. Other than callback
    argument change from struct work_struct * to struct vhost_work *,
    there's no visible change to vhost_poll_*() interface.

    This conversion is to make each vhost use a dedicated kthread so that
    resource control via cgroup can be applied.

    Partially based on Sridhar Samudrala's patch.

    * Updated to use sub structure vhost_work instead of directly using
    vhost_poll at Michael's suggestion.

    * Added flusher wake_up() optimization at Michael's suggestion.

    Changes by MST:
    * Converted atomics/barrier use to a spinlock.
    * Create thread on SET_OWNER
    * Fix flushing

    Signed-off-by: Tejun Heo
    Signed-off-by: Michael S. Tsirkin
    Cc: Sridhar Samudrala

    Tejun Heo
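
The callback change described above (a handler that receives a pointer to an embedded work sub-structure and recovers its container) can be sketched in plain C as follows. The type and function names here (`struct work`, `struct poll`, `run_work`) are illustrative assumptions, not the actual vhost definitions, and `container_of` is defined locally in the usual way.

```c
#include <stddef.h>

struct work;
typedef void (*work_fn_t)(struct work *work);

/* Small work item, embedded in larger objects instead of used directly. */
struct work {
    work_fn_t fn;
};

/* Hypothetical poll object that embeds a work item as a sub-structure. */
struct poll {
    int events;
    struct work work;
    int handled;
};

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* The handler receives the work item and maps it back to its poll object. */
static void poll_handler(struct work *work)
{
    struct poll *p = container_of(work, struct poll, work);

    p->handled++;
}

/* What a dedicated worker thread would do for each queued item. */
static void run_work(struct work *work)
{
    work->fn(work);
}
```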
     

22 Jul, 2010

1 commit


21 Jul, 2010

2 commits

  • Conflicts:
    drivers/vhost/net.c
    net/bridge/br_device.c

    Fix merge conflict in drivers/vhost/net.c with guidance from
    Stephen Rothwell.

    Revert the effects of net-2.6 commit 573201f36fd9c7c6d5218cdcd9948cee700b277d
    since net-next-2.6 has fixes that make bridge netpoll work properly thus
    we don't need it disabled.

    Signed-off-by: David S. Miller

    David S. Miller
     
  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (24 commits)
    bridge: Partially disable netpoll support
    tcp: fix crash in tcp_xmit_retransmit_queue
    IPv6: fix CoA check in RH2 input handler (mip6_rthdr_input())
    ibmveth: lost IRQ while closing/opening device leads to service loss
    rt2x00: Fix lockdep warning in rt2x00lib_probe_dev()
    vhost: avoid pr_err on condition guest can trigger
    ipmr: Don't leak memory if fib lookup fails.
    vhost-net: avoid flush under lock
    net: fix problem in reading sock TX queue
    net/core: neighbour update Oops
    net: skb_tx_hash() fix relative to skb_orphan_try()
    rfs: call sock_rps_record_flow() in tcp_splice_read()
    xfrm: do not assume that template resolving always returns xfrms
    hostap_pci: set dev->base_addr during probe
    axnet_cs: use spin_lock_irqsave in ax_interrupt
    dsa: Fix Kconfig dependencies.
    act_nat: not all of the ICMP packets need an IP header payload
    r8169: incorrect identifier for a 8168dp
    Phonet: fix skb leak in pipe endpoint accept()
    Bluetooth: Update sec_level/auth_type for already existing connections
    ...

    Linus Torvalds
     

16 Jul, 2010

1 commit


15 Jul, 2010

1 commit

  • We flush under vq mutex when changing backends.
    This creates a deadlock as workqueue being flushed
    needs this lock as well.

    https://bugzilla.redhat.com/show_bug.cgi?id=612421

    Drop the vq mutex before flush: we have the device mutex
    which is sufficient to prevent another ioctl from touching
    the vq.

    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
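
The lock ordering described in this commit can be sketched in userspace with pthread mutexes. This is an illustration under assumptions: `struct vdev`, `struct vq`, `flush()` and `set_backend()` are hypothetical names, and the flush is modeled as running the pending handler synchronously. Calling `flush()` while still holding the vq mutex would block forever in `handle_work()`, which is the deadlock being fixed.

```c
#include <pthread.h>

struct vq {
    pthread_mutex_t mutex;
    int backend;
    int handled;
};

struct vdev {
    pthread_mutex_t mutex;  /* device mutex: serializes ioctls */
    struct vq vq;
};

/* Work handler: takes the vq mutex, as the flushed work does. */
static void handle_work(struct vq *vq)
{
    pthread_mutex_lock(&vq->mutex);
    vq->handled++;
    pthread_mutex_unlock(&vq->mutex);
}

/* Stand-in for a flush: waits for (here, simply runs) the pending work. */
static void flush(struct vq *vq)
{
    handle_work(vq);
}

static void set_backend(struct vdev *d, int backend)
{
    pthread_mutex_lock(&d->mutex);      /* keeps other ioctls away from the vq */

    pthread_mutex_lock(&d->vq.mutex);
    d->vq.backend = backend;
    pthread_mutex_unlock(&d->vq.mutex); /* drop the vq mutex before flushing */

    flush(&d->vq);                      /* safe: the handler can take the vq mutex */

    pthread_mutex_unlock(&d->mutex);
}
```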
     

08 Jul, 2010

2 commits

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (35 commits)
    NET: SB1250: Initialize .owner
    vxge: show startup message with KERN_INFO
    ll_temac: Fix missing iounmaps
    bridge: Clear IPCB before possible entry into IP stack
    bridge br_multicast: BUG: unable to handle kernel NULL pointer dereference
    net: Fix definition of netif_vdbg() when VERBOSE_DEBUG is defined
    net/ne: fix memory leak in ne_drv_probe()
    xfrm: fix xfrm by MARK logic
    virtio_net: fix oom handling on tx
    virtio_net: do not reschedule rx refill forever
    s2io: resolve statistics issues
    linux/net.h: fix kernel-doc warnings
    net: decreasing real_num_tx_queues needs to flush qdisc
    sched: qdisc_reset_all_tx is calling qdisc_reset without qdisc_lock
    qlge: fix a eeh handler to not add a pending timer
    qlge: Replacing add_timer() to mod_timer()
    usbnet: Set parent device early for netdev_printk()
    net: Revert "rndis_host: Poll status channel before control channel"
    netfilter: ip6t_REJECT: fix a dst leak in ipv6 REJECT
    drivers: bluetooth: bluecard_cs.c: Fixed include error, changed to linux/io.h
    ...

    Linus Torvalds
     
  • David S. Miller
     

02 Jul, 2010

1 commit


27 Jun, 2010

1 commit

  • When ring parsing fails, we currently handle this
    as ring empty condition. This means that we enable
    kicks and recheck ring empty: if it is not empty,
    we re-start polling which of course will fail again.

    Instead, let's return a negative error code and stop polling.

    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
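
A small sketch of the distinction this commit draws between "ring empty" and "ring parsing failed". The ring layout and the names `get_desc()` and `handle_ring()` are simplified assumptions for illustration, not the real vhost structures: the fetch function returns the ring size for an empty ring and a negative error code for a malformed entry, and the caller keeps polling only in the former case.

```c
#include <errno.h>

/* Simplified, hypothetical ring: heads[] holds descriptor indices. */
struct ring {
    unsigned avail_idx;   /* next slot the producer will fill */
    unsigned last_avail;  /* next slot the consumer will read */
    unsigned num;         /* ring size; valid heads are 0..num-1 */
    unsigned heads[8];
};

/* Returns a head index, ring->num if the ring is empty,
 * or a negative error code if the entry is malformed. */
static int get_desc(struct ring *r)
{
    unsigned head;

    if (r->last_avail == r->avail_idx)
        return (int)r->num;             /* nothing to do */

    head = r->heads[r->last_avail % r->num];
    if (head >= r->num)
        return -EINVAL;                 /* parse failure, not "empty" */

    r->last_avail++;
    return (int)head;
}

/* Returns the number of entries handled; *polling reports whether
 * the caller should keep polling the ring afterwards. */
static int handle_ring(struct ring *r, int *polling)
{
    int done = 0;

    for (;;) {
        int head = get_desc(r);

        if (head < 0) {                 /* error: retrying would fail again */
            *polling = 0;
            break;
        }
        if (head == (int)r->num) {      /* empty: wait for the next kick */
            *polling = 1;
            break;
        }
        done++;
    }
    return done;
}
```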
     

09 Jun, 2010

1 commit

  • 10, 233 is allocated officially to /dev/kmview which is shipping in
    Ubuntu and Debian distributions. vhost_net seems to have borrowed it
    without making a proper request and this causes regressions in the other
    distributions.

    vhost_net can use a dynamic minor so use that instead. Also update the
    file with a comment to try and avoid future misunderstandings.

    cc: stable@kernel.org
    Signed-off-by: Alan Cox
    [ We should have caught this before 2.6.34 got released. - Linus ]
    Signed-off-by: Linus Torvalds

    Alan Cox
     

02 Jun, 2010

1 commit


29 May, 2010

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6: (22 commits)
    netlink: bug fix: wrong size was calculated for vfinfo list blob
    netlink: bug fix: don't overrun skbs on vf_port dump
    xt_tee: use skb_dst_drop()
    netdev/fec: fix ifconfig eth0 down hang issue
    cnic: Fix context memory init. on 5709.
    drivers/net: Eliminate a NULL pointer dereference
    drivers/net/hamradio: Eliminate a NULL pointer dereference
    be2net: Patch removes redundant while statement in loop.
    ipv6: Add GSO support on forwarding path
    net: fix __neigh_event_send()
    vhost: fix the memory leak which will happen when memory_access_ok fails
    vhost-net: fix to check the return value of copy_to/from_user() correctly
    vhost: fix to check the return value of copy_to/from_user() correctly
    vhost: Fix host panic if ioctl called with wrong index
    net: fix lock_sock_bh/unlock_sock_bh
    net/iucv: Add missing spin_unlock
    net: ll_temac: fix checksum offload logic
    net: ll_temac: fix interrupt bug when interrupt 0 is used
    sctp: dubious bitfields in sctp_transport
    ipmr: off by one in __ipmr_fill_mroute()
    ...

    Linus Torvalds
     

27 May, 2010

7 commits


25 May, 2010

1 commit


17 May, 2010

1 commit


12 May, 2010

1 commit

  • According to memory-barriers.txt, an smp memory barrier in guest
    should always be paired with an smp memory barrier in host,
    and I quote "a lack of appropriate pairing is almost certainly an
    error". In case of vhost, failure to flush out used index
    update before looking at the interrupt disable flag
    could result in missed interrupts, resulting in
    networking hang under stress.

    This might happen when flags read bypasses used index write.
    So we see interrupts disabled and do not interrupt, at the
    same time guest writes flags value to enable interrupt,
    reads an old used index value, thinks that
    used ring is empty and waits for interrupt.

    Note: the barrier we pair with here is in
    drivers/virtio/virtio_ring.c, function
    vring_enable_cb.

    Signed-off-by: Michael S. Tsirkin
    Acked-by: Juan Quintela

    Michael S. Tsirkin
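
The pairing described above can be sketched with C11 atomics in userspace. This is a model under assumptions, not the kernel code: `struct shared`, `NO_INTERRUPT`, `host_add_used()` and `guest_enable_cb()` are hypothetical names, and `atomic_thread_fence(memory_order_seq_cst)` stands in for the smp memory barrier on each side. Each side writes its own field, issues a full fence, then reads the other side's field, so at least one of them is guaranteed to observe the other's write.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NO_INTERRUPT 1u   /* hypothetical flag: guest does not want interrupts */

struct shared {
    atomic_uint used_idx;     /* written by the host */
    atomic_uint avail_flags;  /* written by the guest */
};

/* Host side: publish the used index, then check whether to interrupt.
 * Returns true if an interrupt should be sent. */
static bool host_add_used(struct shared *s, unsigned new_idx)
{
    atomic_store_explicit(&s->used_idx, new_idx, memory_order_relaxed);
    /* Full barrier: the flags read below must not bypass the index write. */
    atomic_thread_fence(memory_order_seq_cst);
    return !(atomic_load_explicit(&s->avail_flags, memory_order_relaxed)
             & NO_INTERRUPT);
}

/* Guest side: enable interrupts, then recheck the used index.
 * Returns true if the ring still looks empty and waiting is safe. */
static bool guest_enable_cb(struct shared *s, unsigned last_seen)
{
    atomic_store_explicit(&s->avail_flags, 0, memory_order_relaxed);
    /* Paired full barrier on the guest side. */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&s->used_idx, memory_order_relaxed)
           == last_seen;
}
```

Without the host-side fence, the host could read the old flags value (interrupts disabled) while the guest reads the old used index, and both would conclude that no action is needed.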
     

15 Apr, 2010

1 commit


14 Apr, 2010

1 commit


11 Apr, 2010

1 commit


07 Apr, 2010

1 commit


30 Mar, 2010

1 commit

  • …it slab.h inclusion from percpu.h

    percpu.h is included by sched.h and module.h and thus ends up being
    included when building most .c files. percpu.h includes slab.h which
    in turn includes gfp.h making everything defined by the two files
    universally available and complicating inclusion dependencies.

    percpu.h -> slab.h dependency is about to be removed. Prepare for
    this change by updating users of gfp and slab facilities to include those
    headers directly instead of assuming availability. As this conversion
    needs to touch large number of source files, the following script is
    used as the basis of conversion.

    http://userweb.kernel.org/~tj/misc/slabh-sweep.py

    The script does the following.

    * Scan files for gfp and slab usages and update includes such that
    only the necessary includes are there. ie. if only gfp is used,
    gfp.h, if slab is used, slab.h.

    * When the script inserts a new include, it looks at the include
    blocks and tries to put the new include such that its order conforms
    to its surrounding. It's put in the include block which contains
    core kernel includes, in the same order that the rest are ordered -
    alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
    doesn't seem to be any matching order.

    * If the script can't find a place to put a new include (mostly
    because the file doesn't have fitting include block), it prints out
    an error message indicating which .h file needs to be added to the
    file.

    The conversion was done in the following steps.

    1. The initial automatic conversion of all .c files updated slightly
    over 4000 files, deleting around 700 includes and adding ~480 gfp.h
    and ~3000 slab.h inclusions. The script emitted errors for ~400
    files.

    2. Each error was manually checked. Some didn't need the inclusion,
    some needed manual addition while adding it to implementation .h or
    embedding .c file was more appropriate for others. This step added
    inclusions to around 150 files.

    3. The script was run again and the output was compared to the edits
    from #2 to make sure no file was left behind.

    4. Several build tests were done and a couple of problems were fixed.
    e.g. lib/decompress_*.c used malloc/free() wrappers around slab
    APIs requiring slab.h to be added manually.

    5. The script was run on all .h files but without automatically
    editing them as sprinkling gfp.h and slab.h inclusions around .h
    files could easily lead to inclusion dependency hell. Most gfp.h
    inclusion directives were ignored as stuff from gfp.h was usually
    widely available and often used in preprocessor macros. Each
    slab.h inclusion directive was examined and added manually as
    necessary.

    6. percpu.h was updated not to include slab.h.

    7. Build tests were done on the following configurations and failures
    were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
    distributed build env didn't work with gcov compiles) and a few
    more options had to be turned off depending on archs to make things
    build (like ipr on powerpc/64 which failed due to missing writeq).

    * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
    * powerpc and powerpc64 SMP allmodconfig
    * sparc and sparc64 SMP allmodconfig
    * ia64 SMP allmodconfig
    * s390 SMP allmodconfig
    * alpha SMP allmodconfig
    * um on x86_64 SMP allmodconfig

    8. percpu.h modifications were reverted so that it could be applied as
    a separate patch and serve as bisection point.

    Given the fact that I had only a couple of failures from tests on step
    6, I'm fairly confident about the coverage of this conversion patch.
    If there is a breakage, it's likely to be something in one of the arch
    headers which should be easily discoverable on most builds of
    the specific arch.

    Signed-off-by: Tejun Heo <tj@kernel.org>
    Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
    Cc: Ingo Molnar <mingo@redhat.com>
    Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>

    Tejun Heo
     

18 Mar, 2010

1 commit


17 Mar, 2010

1 commit


07 Mar, 2010

1 commit


01 Mar, 2010

1 commit

    Guest to remote communication with vhost net sometimes stops until the
    guest driver is restarted. This happens when we get a guest kick precisely
    when the backend send queue is full; as a result, handle_tx() returns
    without polling the backend. This patch fixes this by restarting tx poll
    on this condition.

    Signed-off-by: Sridhar Samudrala
    Signed-off-by: Michael S. Tsirkin
    Tested-by: Tom Lendacky

    Sridhar Samudrala
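
A simplified sketch of the behaviour described in this commit, assuming hypothetical `struct backend` and `struct tx_state` types and a toy queue limit in place of the real socket send buffer accounting. When the backend queue is full, the handler re-arms polling before returning, so that it runs again once space frees up instead of waiting for a guest kick that may never come.

```c
/* Hypothetical backend with a bounded send queue. */
struct backend {
    int queued;
    int limit;
};

/* Hypothetical per-device tx state. */
struct tx_state {
    int poll_active;  /* nonzero: we will be woken when the backend drains */
    int pending;      /* packets the guest has made available */
};

static void handle_tx(struct backend *b, struct tx_state *t)
{
    if (b->queued >= b->limit) {
        /* Queue full: restart polling instead of returning silently. */
        t->poll_active = 1;
        return;
    }

    t->poll_active = 0;
    while (t->pending > 0 && b->queued < b->limit) {
        b->queued++;
        t->pending--;
    }

    /* Filled up mid-way: also wait for the backend to drain. */
    if (t->pending > 0)
        t->poll_active = 1;
}
```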