11 Jul, 2013

1 commit


07 Jul, 2013

1 commit

  • Currently, vhost-net and vhost-scsi are sharing the vhost core code.
    However, vhost-scsi shares the code by including the vhost.c file
    directly.

    Making vhost a separate module makes it easier to share code with
    other vhost devices.

    Signed-off-by: Asias He
    Signed-off-by: Michael S. Tsirkin

    Asias He
     

11 Jun, 2013

1 commit


06 May, 2013

4 commits


01 May, 2013

4 commits


30 Jan, 2013

1 commit

  • Currently, polling errors are ignored, which can lead to the following issues:

    - vhost removes itself unconditionally from the waitqueue when stopping the
    poll; this may crash the kernel, since the previous attempt to start polling
    may have failed to add it to the waitqueue
    - userspace may think the backend was successfully set even when the polling
    failed.

    Solve this by:

    - checking poll->wqh before trying to remove from the waitqueue
    - reporting polling errors in vhost_poll_start() and tx_poll_start(); the
    return value is checked and returned when userspace wants to set the backend

    After this fix, there could still be a polling failure after the backend is
    set; it will be addressed by the next patch.
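
    The two changes can be illustrated with a minimal userspace sketch. It is
    not the kernel code: struct waitqueue, struct poll_sketch, the backend_ok
    flag and all function names are hypothetical stand-ins, and -EINVAL is an
    assumed error value. It only shows a start routine that reports failure and
    a stop routine that checks wqh before touching the wait queue.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a kernel wait queue head. */
struct waitqueue {
    int nr_waiters;
};

/* Hypothetical stand-in for the poll state; wqh is non-NULL only while
 * the poller is actually registered on a wait queue. */
struct poll_sketch {
    struct waitqueue *wqh;
};

/* Try to register on the wait queue and report failure to the caller
 * instead of ignoring it. backend_ok simulates whether polling the
 * backend succeeds. */
static int poll_start(struct poll_sketch *poll, struct waitqueue *wq,
                      int backend_ok)
{
    if (poll->wqh)
        return 0;           /* already polling */
    if (!backend_ok)
        return -EINVAL;     /* assumed error code; wqh stays NULL */
    poll->wqh = wq;
    wq->nr_waiters++;
    return 0;
}

/* Remove from the wait queue only if an earlier start succeeded. */
static void poll_stop(struct poll_sketch *poll)
{
    if (poll->wqh) {
        poll->wqh->nr_waiters--;
        poll->wqh = NULL;
    }
}

/* Setting the backend propagates the polling error to the caller. */
static int set_backend(struct poll_sketch *poll, struct waitqueue *wq,
                       int backend_ok)
{
    poll_stop(poll);
    return poll_start(poll, wq, backend_ok);
}
```

    With this shape, calling poll_stop() after a failed start is a no-op rather
    than a removal of an entry that was never added.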

    Signed-off-by: Jason Wang
    Acked-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Jason Wang
     

06 Dec, 2012

1 commit

  • vring changes already do a flush internally where appropriate, so we do
    not need a second flush.

    It's currently not very expensive but a follow-up patch makes flush more
    heavy-weight, so remove the extra flush here to avoid regressing
    performance if call or kick fds are changed on data path.

    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     

03 Nov, 2012

4 commits


22 Jul, 2012

2 commits

  • The vhost work queue allows processing to be done in vhost worker thread
    context, which uses the owner process mm. Access to the vring and guest
    memory is typically only possible from vhost worker context so it is
    useful to allow work to be queued directly by users.

    Currently vhost_net only uses the poll wrappers which do not expose the
    work queue functions. However, for tcm_vhost (vhost_scsi) it will be
    necessary to queue custom work.
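
    As a rough illustration of queuing custom work directly, here is a
    single-threaded userspace sketch. All names (work_sketch, dev_sketch,
    cmd_sketch and the functions) are hypothetical, and the real worker runs
    in its own thread with locking, which is omitted here.

```c
#include <stddef.h>

struct work_sketch;
typedef void (*work_fn_t)(struct work_sketch *work);

/* Hypothetical work item: a callback plus list linkage. */
struct work_sketch {
    work_fn_t fn;
    struct work_sketch *next;
    int queued;
};

/* Hypothetical device holding a FIFO of pending work. */
struct dev_sketch {
    struct work_sketch *head;
    struct work_sketch *tail;
};

static void work_init(struct work_sketch *w, work_fn_t fn)
{
    w->fn = fn;
    w->next = NULL;
    w->queued = 0;
}

/* Queue a work item; queuing an already pending item is a no-op. */
static void work_queue(struct dev_sketch *dev, struct work_sketch *w)
{
    if (w->queued)
        return;
    w->queued = 1;
    w->next = NULL;
    if (dev->tail)
        dev->tail->next = w;
    else
        dev->head = w;
    dev->tail = w;
}

/* Worker body: run every pending item, return how many ran. */
static int worker_run(struct dev_sketch *dev)
{
    int n = 0;

    while (dev->head) {
        struct work_sketch *w = dev->head;

        dev->head = w->next;
        if (!dev->head)
            dev->tail = NULL;
        w->queued = 0;
        w->fn(w);
        n++;
    }
    return n;
}

/* Custom work: embed the work item as the first member of a larger
 * object so the callback can recover it with a cast. */
struct cmd_sketch {
    struct work_sketch work;
    int completed;
};

static void complete_cmd(struct work_sketch *w)
{
    struct cmd_sketch *cmd = (struct cmd_sketch *)w;

    cmd->completed++;
}
```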

    Signed-off-by: Stefan Hajnoczi
    Cc: Zhi Yong Wu
    Cc: Michael S. Tsirkin
    Cc: Paolo Bonzini
    Signed-off-by: Nicholas Bellinger
    Signed-off-by: Michael S. Tsirkin

    Stefan Hajnoczi
     
  • In order for other vhost devices to use the VHOST_FEATURES bits the
    vhost-net specific bits need to be moved to their own VHOST_NET_FEATURES
    constant.
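
    The split can be pictured as two bitmasks, one generic and one that adds
    the net-only bits. The sketch below uses hypothetical SK_* names and
    illustrative bit positions; it is not copied from the kernel headers.

```c
#include <stdint.h>

/* Illustrative feature bit positions (assumptions for this sketch). */
enum {
    SK_NET_F_MRG_RXBUF      = 15, /* net-specific */
    SK_F_NOTIFY_ON_EMPTY    = 24, /* generic */
    SK_F_LOG_ALL            = 26, /* generic */
    SK_NET_F_VIRTIO_NET_HDR = 27, /* net-specific */
    SK_F_INDIRECT_DESC      = 28  /* generic */
};

/* Bits every vhost device could offer. */
#define SK_VHOST_FEATURES                  \
    ((1ULL << SK_F_NOTIFY_ON_EMPTY) |      \
     (1ULL << SK_F_LOG_ALL) |              \
     (1ULL << SK_F_INDIRECT_DESC))

/* The net device offers the generic bits plus its own. */
#define SK_VHOST_NET_FEATURES              \
    (SK_VHOST_FEATURES |                   \
     (1ULL << SK_NET_F_MRG_RXBUF) |        \
     (1ULL << SK_NET_F_VIRTIO_NET_HDR))

/* A request is acceptable only if it asks for nothing unsupported. */
static int features_ok(uint64_t supported, uint64_t requested)
{
    return (requested & ~supported) == 0;
}
```

    A non-net device checking against the generic mask would then reject a
    request for a net-only bit, while the net device accepts it.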

    (Asias: Update drivers/vhost/test.c to use VHOST_NET_FEATURES)

    Signed-off-by: Stefan Hajnoczi
    Cc: Zhi Yong Wu
    Cc: Michael S. Tsirkin
    Cc: Paolo Bonzini
    Cc: Asias He
    Signed-off-by: Nicholas A. Bellinger
    Signed-off-by: Michael S. Tsirkin

    Stefan Hajnoczi
     

14 Apr, 2012

1 commit

  • The skb struct ubuf_info callback gets passed struct ubuf_info
    itself, not the arg value as the field name and the function signature
    seem to imply. Rename the arg field to ctx to match usage,
    add documentation and change the callback argument type
    to make usage clear and to have compiler check correctness.
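
    A small userspace sketch of the resulting calling convention, with
    hypothetical struct and function names: the callback receives the
    descriptor itself and reaches the caller's context through its ctx field,
    so the compiler can check the argument type.

```c
#include <stddef.h>

/* Hypothetical mirror of the descriptor described above. */
struct ubuf_sketch {
    void (*callback)(struct ubuf_sketch *ubuf); /* gets the struct itself */
    void *ctx;                                  /* caller context */
    unsigned long desc;                         /* buffer identifier */
};

/* Hypothetical owner that wants to learn about completions. */
struct owner_sketch {
    unsigned long last_done;
    int completions;
};

/* Completion callback: typed argument, context recovered via ctx. */
static void zerocopy_done(struct ubuf_sketch *ubuf)
{
    struct owner_sketch *owner = ubuf->ctx;

    owner->last_done = ubuf->desc;
    owner->completions++;
}

/* Whoever releases the buffer passes the descriptor, not ctx. */
static void release_buffer(struct ubuf_sketch *ubuf)
{
    if (ubuf->callback)
        ubuf->callback(ubuf);
}
```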

    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Michael S. Tsirkin
     

28 Feb, 2012

1 commit


27 Jul, 2011

1 commit

  • This allows us to move duplicated code in
    (atomic_inc_not_zero() for now) to

    Signed-off-by: Arun Sharma
    Reviewed-by: Eric Dumazet
    Cc: Ingo Molnar
    Cc: David Miller
    Cc: Eric Dumazet
    Acked-by: Mike Frysinger
    Signed-off-by: Andrew Morton
    Signed-off-by: Linus Torvalds

    Arun Sharma
     

19 Jul, 2011

2 commits

  • Move the used ring initialization after backend was set. This
    makes it possible to disable the backend and tweak the used ring,
    then restart. This will also make it possible to log the used ring
    write correctly.

    Signed-off-by: Jason Wang
    Signed-off-by: Michael S. Tsirkin

    Jason Wang
     
  • From: Shirley Ma

    This adds experimental zero copy support in vhost-net,
    disabled by default. To enable, set
    experimental_zcopytx module option to 1.

    This patch tracks the outstanding userspace buffers in the order in
    which they are delivered to vhost. An outstanding userspace buffer is
    marked as done once the lower device has finished DMA on it. This is
    detected through the callback run when kfree_skb drops the last
    reference. Two buffer indices are used for this purpose.

    The vhost-net device passes the userspace buffer info to the lower
    device skb through message control. The DMA-done status check and guest
    notification are handled by handle_tx: in the worst case all buffers
    in the vq are in pending/done status, so we need to notify the guest to
    release DMA-done buffers first, before we get any new buffers from the
    vq.
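
    The two-index bookkeeping can be sketched in userspace as follows. The
    ring size, the state names and the index names (upend_idx, done_idx) are
    assumptions made for this illustration, not taken from the patch text.

```c
#include <stddef.h>

#define SK_ZC_RING 8

enum buf_state { BUF_FREE, BUF_PENDING, BUF_DMA_DONE };

/* Outstanding buffers kept in submission order. */
struct zc_ring {
    enum buf_state state[SK_ZC_RING];
    unsigned upend_idx; /* slot for the next submitted buffer */
    unsigned done_idx;  /* oldest buffer not yet reported to the guest */
};

/* Record a newly submitted buffer. Returns its slot, or -1 when every
 * slot is pending or done and the guest must be notified first. */
static int zc_submit(struct zc_ring *r)
{
    unsigned next = (r->upend_idx + 1) % SK_ZC_RING;
    int slot;

    if (next == r->done_idx)
        return -1;
    slot = (int)r->upend_idx;
    r->state[slot] = BUF_PENDING;
    r->upend_idx = next;
    return slot;
}

/* Called when the lower device has finished DMA on a buffer. */
static void zc_dma_done(struct zc_ring *r, unsigned slot)
{
    r->state[slot] = BUF_DMA_DONE;
}

/* Report finished buffers to the guest in submission order, stopping
 * at the first one that is still pending. Returns how many were
 * reported. */
static int zc_signal_used(struct zc_ring *r)
{
    int n = 0;

    while (r->done_idx != r->upend_idx &&
           r->state[r->done_idx] == BUF_DMA_DONE) {
        r->state[r->done_idx] = BUF_FREE;
        r->done_idx = (r->done_idx + 1) % SK_ZC_RING;
        n++;
    }
    return n;
}
```

    Because completions are reported strictly in order, a buffer whose DMA
    finished early stays unreported until every older buffer is done as well.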

    One known problem is that if the guest stops submitting
    buffers, buffers might never get used until some
    further action, e.g. device reset. This does not
    seem to affect linux guests.

    Signed-off-by: Shirley
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Michael S. Tsirkin
     

30 May, 2011

1 commit


01 Feb, 2011

1 commit

  • When built with rcu checks enabled, vhost triggers
    bogus warnings as vhost features are read without
    dev->mutex sometimes, and private pointer is read
    with our kind of rcu where work serves as a
    read side critical section.

    Fixing it properly is not trivial.
    Disable the warnings by stubbing out the checks for now.

    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     

09 Dec, 2010

1 commit


24 Oct, 2010

1 commit

  • * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6: (1699 commits)
    bnx2/bnx2x: Unsupported Ethtool operations should return -EINVAL.
    vlan: Calling vlan_hwaccel_do_receive() is always valid.
    tproxy: use the interface primary IP address as a default value for --on-ip
    tproxy: added IPv6 support to the socket match
    cxgb3: function namespace cleanup
    tproxy: added IPv6 support to the TPROXY target
    tproxy: added IPv6 socket lookup function to nf_tproxy_core
    be2net: Changes to use only priority codes allowed by f/w
    tproxy: allow non-local binds of IPv6 sockets if IP_TRANSPARENT is enabled
    tproxy: added tproxy sockopt interface in the IPV6 layer
    tproxy: added udp6_lib_lookup function
    tproxy: added const specifiers to udp lookup functions
    tproxy: split off ipv6 defragmentation to a separate module
    l2tp: small cleanup
    nf_nat: restrict ICMP translation for embedded header
    can: mcp251x: fix generation of error frames
    can: mcp251x: fix endless loop in interrupt handler if CANINTF_MERRF is set
    can-raw: add msg_flags to distinguish local traffic
    9p: client code cleanup
    rds: make local functions/variables static
    ...

    Fix up conflicts in net/core/dev.c, drivers/net/pcmcia/smc91c92_cs.c and
    drivers/net/wireless/ath/ath9k/debug.c as per David

    Linus Torvalds
     

05 Oct, 2010

1 commit

  • Qemu supports up to UIO_MAXIOV s/g so we have to match that because guest
    drivers may rely on this.

    Allocate indirect and log arrays dynamically to avoid using too much contiguous
    memory and make the length of the hdr array match the header length, since each
    iovec entry has at least one byte.
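
    A userspace sketch of the dynamic allocation, under stated assumptions:
    SK_MAXIOV stands in for UIO_MAXIOV with an assumed value of 1024, the log
    entry layout is invented for the example, and calloc()/free() replace the
    kernel allocators.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/uio.h>

#define SK_MAXIOV 1024 /* assumed stand-in for UIO_MAXIOV */

/* Hypothetical log entry layout. */
struct log_entry_sketch {
    unsigned long long addr;
    unsigned long long len;
};

/* Hypothetical virtqueue holding pointers instead of large embedded
 * arrays. */
struct vq_sketch {
    struct iovec *indirect;
    struct log_entry_sketch *log;
};

static void vq_free(struct vq_sketch *vq)
{
    free(vq->indirect);
    free(vq->log);
    vq->indirect = NULL;
    vq->log = NULL;
}

/* Allocate both arrays; on partial failure release what was obtained. */
static int vq_alloc(struct vq_sketch *vq)
{
    vq->indirect = calloc(SK_MAXIOV, sizeof(*vq->indirect));
    vq->log = calloc(SK_MAXIOV, sizeof(*vq->log));
    if (!vq->indirect || !vq->log) {
        vq_free(vq);
        return -ENOMEM;
    }
    return 0;
}
```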

    Tested by copying large files with and without migration in both Linux and
    Windows guests.

    Signed-off-by: Jason Wang
    Signed-off-by: Michael S. Tsirkin

    Jason Wang
     

22 Aug, 2010

1 commit


28 Jul, 2010

2 commits

  • This adds support for mergeable buffers in vhost-net: this is needed
    for older guests without indirect buffer support, as well
    as for zero copy with some devices.

    Includes changes by Michael S. Tsirkin to make the
    patch as low risk as possible (i.e., close to no changes
    when feature is disabled).

    Signed-off-by: David Stevens
    Signed-off-by: Michael S. Tsirkin

    David Stevens
     
  • Replace vhost_workqueue with per-vhost kthread. Other than callback
    argument change from struct work_struct * to struct vhost_work *,
    there's no visible change to vhost_poll_*() interface.

    This conversion is to make each vhost use a dedicated kthread so that
    resource control via cgroup can be applied.

    Partially based on Sridhar Samudrala's patch.

    * Updated to use sub structure vhost_work instead of directly using
    vhost_poll at Michael's suggestion.

    * Added flusher wake_up() optimization at Michael's suggestion.

    Changes by MST:
    * Converted atomics/barrier use to a spinlock.
    * Create thread on SET_OWNER
    * Fix flushing

    Signed-off-by: Tejun Heo
    Signed-off-by: Michael S. Tsirkin
    Cc: Sridhar Samudrala

    Tejun Heo
     

27 Jun, 2010

1 commit

  • When ring parsing fails, we currently handle this
    as a ring empty condition. This means that we enable
    kicks and recheck whether the ring is empty: if it is not,
    we restart polling, which of course will fail again.

    Instead, let's return a negative error code and stop polling.
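
    The distinction between an empty ring and a malformed one can be sketched
    in userspace as below. The ring layout, the names, the specific errno
    values and the convention of returning the ring size for "empty" are
    assumptions chosen for the example.

```c
#include <errno.h>

#define SK_RING_SIZE 4

/* Hypothetical ring: heads[] holds descriptor indices published by
 * the guest. */
struct ring_sketch {
    unsigned avail_idx;      /* written by the guest */
    unsigned last_avail_idx; /* consumed so far */
    unsigned heads[SK_RING_SIZE];
};

/* Returns a descriptor index, SK_RING_SIZE when the ring is empty, or
 * a negative errno when the ring contents are invalid. */
static int get_desc(struct ring_sketch *r)
{
    unsigned head;

    if (r->avail_idx - r->last_avail_idx > SK_RING_SIZE)
        return -EFAULT;      /* guest moved the index too far */
    if (r->avail_idx == r->last_avail_idx)
        return SK_RING_SIZE; /* nothing to do */
    head = r->heads[r->last_avail_idx % SK_RING_SIZE];
    if (head >= SK_RING_SIZE)
        return -EINVAL;      /* index outside the ring */
    r->last_avail_idx++;
    return (int)head;
}

enum action { KEEP_POLLING, WAIT_FOR_KICK, STOP_POLLING };

/* Errors stop polling; only a genuinely empty ring re-enables kicks. */
static enum action handle_once(struct ring_sketch *r)
{
    int head = get_desc(r);

    if (head < 0)
        return STOP_POLLING;
    if (head == SK_RING_SIZE)
        return WAIT_FOR_KICK;
    return KEEP_POLLING;
}
```

    Since a failed parse does not advance last_avail_idx, retrying would hit
    the same bad entry, which is why stopping is the safer reaction.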

    Signed-off-by: Michael S. Tsirkin

    Michael S. Tsirkin
     

15 Jan, 2010

1 commit

  • What it is: vhost net is a character device that can be used to reduce
    the number of system calls involved in virtio networking.
    Existing virtio net code is used in the guest without modification.

    There's similarity with vringfd, with some differences and reduced scope
    - uses eventfd for signalling
    - structures can be moved around in memory at any time (good for
    migration, bug work-arounds in userspace)
    - write logging is supported (good for migration)
    - support memory table and not just an offset (needed for kvm)

    common virtio related code has been put in a separate file vhost.c and
    can be made into a separate module if/when more backends appear. I used
    Rusty's lguest.c as the source for developing this part: this supplied
    me with witty comments I wouldn't be able to write myself.

    What it is not: vhost net is not a bus, and not a generic new system
    call. No assumptions are made on how guest performs hypercalls.
    Userspace hypervisors are supported as well as kvm.

    How it works: Basically, we connect virtio frontend (configured by
    userspace) to a backend. The backend could be a network device, or a tap
    device. Backend is also configured by userspace, including vlan/mac
    etc.

    Status: This works for me, and I haven't seen any crashes.
    Compared to userspace, people reported improved latency (as I save up to
    4 system calls per packet), as well as better bandwidth and CPU
    utilization.

    Features that I plan to look at in the future:
    - mergeable buffers
    - zero copy
    - scalability tuning: figure out the best threading model to use

    Note on RCU usage (this is also documented in vhost.h, near
    private_pointer which is the value protected by this variant of RCU):
    what is happening is that the rcu_dereference() is being used in a
    workqueue item. The role of rcu_read_lock() is taken on by the start of
    execution of the workqueue item, of rcu_read_unlock() by the end of
    execution of the workqueue item, and of synchronize_rcu() by
    flush_workqueue()/flush_work(). In the future we might need to apply
    some gcc attribute or sparse annotation to the function passed to
    INIT_WORK(). Paul's ack below is for this RCU usage.

    (Includes fixes by Alan Cox,
    David L Stevens,
    Chris Wright)

    Acked-by: Rusty Russell
    Acked-by: Arnd Bergmann
    Acked-by: "Paul E. McKenney"
    Signed-off-by: Michael S. Tsirkin
    Signed-off-by: David S. Miller

    Michael S. Tsirkin