<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://linux-kvm.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Kongove</id>
	<title>KVM - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://linux-kvm.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Kongove"/>
	<link rel="alternate" type="text/html" href="https://linux-kvm.org/page/Special:Contributions/Kongove"/>
	<updated>2026-04-05T23:03:08Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.39.5</generator>
	<entry>
		<id>https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=23048</id>
		<title>NetworkingTodo</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=23048"/>
		<updated>2014-09-20T02:32:42Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page should cover all networking related activity in KVM,&lt;br /&gt;
currently most info is related to virtio-net.&lt;br /&gt;
&lt;br /&gt;
TODO: add bugzilla entry links.&lt;br /&gt;
&lt;br /&gt;
=== projects in progress. contributions are still very welcome! ===&lt;br /&gt;
&lt;br /&gt;
* large-order allocations&lt;br /&gt;
   see 28d6427109d13b0f447cba5761f88d3548e83605&lt;br /&gt;
   Developer: MST&lt;br /&gt;
&lt;br /&gt;
* vhost-net scalability tuning: threading for many VMs&lt;br /&gt;
      Plan: switch to workqueue shared by many VMs&lt;br /&gt;
      http://www.mail-archive.com/kvm@vger.kernel.org/msg69868.html&lt;br /&gt;
&lt;br /&gt;
http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/479e3578ed05bfac85257b4200427735!OpenDocument&lt;br /&gt;
&lt;br /&gt;
      Developer: Bandan Das&lt;br /&gt;
      Testing: netperf guest to guest&lt;br /&gt;
&lt;br /&gt;
* support more queues&lt;br /&gt;
     We limit TUN to 8 queues, but we really want&lt;br /&gt;
     1 queue per guest CPU. The limit comes from the net&lt;br /&gt;
     core; we need to teach it to allocate an array of&lt;br /&gt;
     pointers rather than an array of queues.&lt;br /&gt;
     Jason has a draft patch to use flex array.&lt;br /&gt;
     Another thing is to move the flow caches out of tun_struct.&lt;br /&gt;
     Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* enable multiqueue by default&lt;br /&gt;
       Multiqueue causes regressions in some workloads (GSO tends to&lt;br /&gt;
       batch less when mq is enabled), so it is off by default.&lt;br /&gt;
       Detect this and enable/disable automatically so we can turn&lt;br /&gt;
       it on by default.&lt;br /&gt;
       https://patchwork.kernel.org/patch/2235191/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* rework on flow caches&lt;br /&gt;
       The current hlist implementation of flow caches has several limitations:&lt;br /&gt;
       1) worst-case lookup degenerates to a linear search&lt;br /&gt;
       2) it does not scale&lt;br /&gt;
       https://patchwork.kernel.org/patch/2025121/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
       &lt;br /&gt;
* eliminate the extra copy in virtio-net driver&lt;br /&gt;
       We currently do an extra copy of 128 bytes for every packet.&lt;br /&gt;
       This could be eliminated for small packets by:&lt;br /&gt;
       1) use build_skb() and head frag&lt;br /&gt;
       2) bigger vnet header length ( &amp;gt;= NET_SKB_PAD + NET_IP_ALIGN )&lt;br /&gt;
       Or use a dedicated queue for receiving small packets? (raises reordering concerns)&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* orphan packets less aggressively (was: make pktgen work for virtio-net (or partially orphan))&lt;br /&gt;
       virtio-net orphans all skbs during tx, this used to be optimal.&lt;br /&gt;
       Recent changes in guest networking stack and hardware advances&lt;br /&gt;
       such as APICv changed optimal behaviour for drivers.&lt;br /&gt;
       We need to revisit optimizations such as orphaning all packets early&lt;br /&gt;
       to have optimal behaviour.&lt;br /&gt;
&lt;br /&gt;
       This should also fix pktgen, which is currently broken with virtio-net:&lt;br /&gt;
       orphaning all skbs makes pktgen wait forever for the refcnt.&lt;br /&gt;
       Jason&#039;s idea: bring back tx interrupt (partially)&lt;br /&gt;
       Jason&#039;s idea: introduce a flag to tell pktgen not to wait&lt;br /&gt;
       Discussion here: https://patchwork.kernel.org/patch/1800711/&lt;br /&gt;
       MST&#039;s idea: add a .ndo_tx_polling not only for pktgen&lt;br /&gt;
       Developers: Jason Wang, MST&lt;br /&gt;
&lt;br /&gt;
* Announce self by guest driver [Done]&lt;br /&gt;
       Send gARP from the guest driver. The guest part is finished.&lt;br /&gt;
       The qemu part was merged by MST.&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* Head of line blocking issue with zerocopy&lt;br /&gt;
       zerocopy has two properties that cause a head-of-line blocking problem:&lt;br /&gt;
       - the number of pending DMAs is limited&lt;br /&gt;
       - completions are delivered in order&lt;br /&gt;
       This means that if some of the DMAs are delayed, all the others are delayed too. This can be reproduced as follows:&lt;br /&gt;
       - boot two VMs, VM1 (tap1) and VM2 (tap2), on host1 (which has eth0)&lt;br /&gt;
       - setup tbf to limit the tap2 bandwidth to 10Mbit/s&lt;br /&gt;
       - start two netperf instances: one from VM1 to VM2, another from VM1 to an external host whose traffic goes through eth0 on the host&lt;br /&gt;
       Then not only is VM1 to VM2 throttled; VM1 to the external host is throttled as well.&lt;br /&gt;
       For this issue, one solution is to orphan the frags when enqueuing to a non-work-conserving qdisc.&lt;br /&gt;
       But we have similar issues in other cases:&lt;br /&gt;
       - The card has its own priority queues&lt;br /&gt;
       - The host has two interfaces, one 1G and one 10G, so throttling the 1G one may cause traffic over the 10G one to be throttled too.&lt;br /&gt;
       The final solution is to remove receive buffering at tun, and convert it to use NAPI&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
       Reference: https://lkml.org/lkml/2014/1/17/105&lt;br /&gt;
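       The reproduction steps above can be sketched as shell commands (interface names, addresses, and the tbf parameters are illustrative assumptions):&lt;br /&gt;

```shell
# On host1: throttle tap2 (the VM2 backend) to 10 Mbit/s with tbf.
# Interface and address names are assumptions for illustration.
tc qdisc add dev tap2 root tbf rate 10mbit burst 32kb latency 400ms

# Inside VM1, run the two streams concurrently (e.g. in two shells):
netperf -H 10.0.0.2 -t TCP_STREAM -l 30       # VM1 to VM2, throttled path
netperf -H 192.168.0.50 -t TCP_STREAM -l 30   # VM1 to external host via eth0
```

       With zerocopy enabled, the VM1-to-external stream drops along with the throttled one; with the orphan-frags fix it should not.&lt;br /&gt;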
&lt;br /&gt;
* Write an ethtool selftest for virtio-net&lt;br /&gt;
        Implement the ethtool selftest method for virtio-net for regression testing, e.g. the CVEs found for tun/macvtap, qemu and vhost.&lt;br /&gt;
        Developer: CSDN summer code project student &lt;br /&gt;
&lt;br /&gt;
* guest programmable mac/vlan filtering with macvtap&lt;br /&gt;
        Developer: Amos Kong&lt;br /&gt;
        qemu: https://bugzilla.redhat.com/show_bug.cgi?id=848203 (applied by upstream)&lt;br /&gt;
        libvirt: https://bugzilla.redhat.com/show_bug.cgi?id=848199&lt;br /&gt;
        http://git.qemu.org/?p=qemu.git;a=commit;h=b1be42803b31a913bab65bab563a8760ad2e7f7f&lt;br /&gt;
        Status: [[GuestProgrammableMacVlanFiltering]]&lt;br /&gt;
&lt;br /&gt;
* bridge without promisc mode in NIC&lt;br /&gt;
  given hardware support, teach bridge&lt;br /&gt;
  to program mac/vlan filtering in NIC&lt;br /&gt;
  Helps performance and security on noisy LANs&lt;br /&gt;
  http://comments.gmane.org/gmane.linux.network/266546&lt;br /&gt;
  Developer: Vlad Yasevich&lt;br /&gt;
&lt;br /&gt;
* reduce networking latency:&lt;br /&gt;
  allow handling short packets from softirq or VCPU context&lt;br /&gt;
  Plan:&lt;br /&gt;
    We are going through the scheduler 3 times&lt;br /&gt;
    (could be up to 5 if softirqd is involved)&lt;br /&gt;
    Consider RX: host irq -&amp;gt; io thread -&amp;gt; VCPU thread -&amp;gt;&lt;br /&gt;
    guest irq -&amp;gt; guest thread.&lt;br /&gt;
    This adds a lot of latency.&lt;br /&gt;
    We can cut it by some 1.5x if we do a bit of work&lt;br /&gt;
    either in the VCPU or softirq context.&lt;br /&gt;
  Testing: netperf TCP RR - should be improved drastically&lt;br /&gt;
           netperf TCP STREAM guest to host - no regression&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* Flexible buffers: put virtio header inline with packet data&lt;br /&gt;
  https://patchwork.kernel.org/patch/1540471/&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* device failover to allow migration with assigned devices&lt;br /&gt;
  https://fedoraproject.org/wiki/Features/Virt_Device_Failover&lt;br /&gt;
  Developer: Gal Hammer, Cole Robinson, Laine Stump, MST&lt;br /&gt;
&lt;br /&gt;
* Reuse vringh code for better maintainability&lt;br /&gt;
  Developer: Rusty Russell&lt;br /&gt;
&lt;br /&gt;
* Improve stats, make them more helpful for perf analysis&lt;br /&gt;
  Developer: Sriram Narasimhan&lt;br /&gt;
&lt;br /&gt;
* Enable GRO for packets coming to bridge from a tap interface&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* Better support for windows LRO&lt;br /&gt;
  Extend virtio-header with statistics for GRO packets:&lt;br /&gt;
  number of packets coalesced and number of duplicate ACKs coalesced&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* IPoIB infiniband bridging&lt;br /&gt;
  Plan: implement macvtap for ipoib and virtio-ipoib&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* netdev polling for virtio.&lt;br /&gt;
  There are two kinds of netdev polling:&lt;br /&gt;
  - netpoll - used for debugging&lt;br /&gt;
  - rx busy polling for virtio-net [DONE]&lt;br /&gt;
    see https://git.kernel.org/cgit/linux/kernel/git/davem/net-next.git/commit/?id=91815639d8804d1eee7ce2e1f7f60b36771db2c9. 1 byte netperf TCP_RR shows 127% improvement.&lt;br /&gt;
    Future work: cooperate with the host, and only do the busy polling when there is no other process on the host CPU.&lt;br /&gt;
  Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* interrupt coalescing&lt;br /&gt;
  Reduce the number of interrupts.&lt;br /&gt;
  Rx interrupt coalescing should be good for rx stream throughput.&lt;br /&gt;
  Tx interrupt coalescing will help the optimization of enabling tx interrupt conditionally.&lt;br /&gt;
  Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* enable tx interrupt conditionally&lt;br /&gt;
  Small-packet TCP stream performance is not good. This is because virtio-net orphans the packet during ndo_start_xmit(), which disables TCP small-packet optimizations like TCP Small Queues and auto-corking. The idea is to enable the tx interrupt for TCP small packets.&lt;br /&gt;
  Jason&#039;s idea: switch between poll and tx interrupt mode based on recent statistics.&lt;br /&gt;
  MST&#039;s idea: use a per descriptor flag for virtio to force interrupt for a specific packet.&lt;br /&gt;
  Developer: Jason Wang, MST&lt;br /&gt;
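  Jason&#039;s switching idea could be sketched as the following toy model (the class name, window size, and thresholds are invented for illustration, not taken from any driver):&lt;br /&gt;

```python
class TxModeSelector:
    # Toy model of switching between polling (orphan skbs, no tx
    # interrupt) and tx-interrupt mode based on recent statistics.
    SMALL_PKT = 512   # bytes; below this, TSQ/auto-corking matter
    WINDOW = 16       # number of recent packets to consider

    def __init__(self):
        self.history = []

    def record(self, pkt_len):
        self.history.append(pkt_len)
        self.history = self.history[-self.WINDOW:]

    def mode(self):
        if not self.history:
            return "poll"
        small = sum(1 for n in self.history if self.SMALL_PKT > n)
        # mostly small packets: enable the tx interrupt so TCP
        # can batch via TSQ/auto-corking
        if 2 * small > len(self.history):
            return "interrupt"
        return "poll"
```

  A real implementation would live in the driver tx path and could also combine with MST&#039;s per-descriptor interrupt flag.&lt;br /&gt;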
  &lt;br /&gt;
* use kvm eventfd support for injecting level-triggered interrupts&lt;br /&gt;
  aim: enable vhost by default for level interrupts.&lt;br /&gt;
  The benefit is security: we want to avoid using userspace&lt;br /&gt;
  virtio net so that vhost-net is always used.&lt;br /&gt;
&lt;br /&gt;
  Alex emulated (post &amp;amp; re-enable) level-triggered interrupts in KVM to&lt;br /&gt;
  skip userspace. VFIO already enjoys the performance benefit;&lt;br /&gt;
  let&#039;s do it for virtio-pci. Current virtio-pci devices still use&lt;br /&gt;
  level interrupts in userspace.&lt;br /&gt;
&lt;br /&gt;
 kernel:&lt;br /&gt;
  7a84428af [PATCH] KVM: Add resampling irqfds for level triggered interrupts&lt;br /&gt;
 qemu:&lt;br /&gt;
  68919cac [PATCH] hw/vfio: set interrupts using pci irq wrappers&lt;br /&gt;
           (virtio-pci didn&#039;t use the wrappers)&lt;br /&gt;
  e1d1e586 [PATCH] vfio-pci: Add KVM INTx acceleration&lt;br /&gt;
&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* sharing config interrupts&lt;br /&gt;
  Support more devices by sharing a single msi vector&lt;br /&gt;
  between multiple virtio devices.&lt;br /&gt;
  (Applies to virtio-blk too).&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* network traffic throttling&lt;br /&gt;
  The block layer implemented a &amp;quot;continuous leaky bucket&amp;quot; for throttling;&lt;br /&gt;
  we can apply the continuous leaky bucket to networking:&lt;br /&gt;
  IOPS/BPS * RX/TX/TOTAL&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
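  As an illustration, a continuous leaky bucket for a BPS limit could look like this (a minimal sketch under assumed semantics, not the actual block-layer code):&lt;br /&gt;

```python
import time

class LeakyBucket:
    # Continuous leaky bucket: the level drains at `rate` units/s;
    # a packet of `size` units is admitted only if the bucket does
    # not overflow `capacity`.
    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.level = 0.0
        self.last = time.monotonic()

    def allow(self, size, now=None):
        if now is None:
            now = time.monotonic()
        elapsed = now - self.last
        self.last = now
        # drain continuously since the last decision
        self.level = max(0.0, self.level - elapsed * self.rate)
        if self.level + size > self.capacity:
            return False   # over limit: throttle this packet
        self.level += size
        return True
```

  The same bucket covers IOPS (size 1 per packet) or BPS (size = packet length), per direction or in total.&lt;br /&gt;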
&lt;br /&gt;
* Allocate mac_table dynamically&lt;br /&gt;
&lt;br /&gt;
  In the future, maybe we can allocate the mac_table dynamically instead&lt;br /&gt;
  of embedding it in VirtIONet. Then we can just do a pointer swap and&lt;br /&gt;
  g_free(), saving a memcpy() here.&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* reduce conflict with VCPU thread&lt;br /&gt;
    if VCPU and networking run on same CPU,&lt;br /&gt;
    they conflict resulting in bad performance.&lt;br /&gt;
    Fix that, push vhost thread out to another CPU&lt;br /&gt;
    more aggressively.&lt;br /&gt;
    Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* rx mac filtering in tun&lt;br /&gt;
        the need for this is still not understood as we have filtering in bridge&lt;br /&gt;
        we have a small table of addresses, need to make it larger&lt;br /&gt;
        if we only need filtering for unicast (multicast is handled by IMP filtering)&lt;br /&gt;
        Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in tun&lt;br /&gt;
        the need for this is still not understood as we have filtering in bridge&lt;br /&gt;
        Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
=== projects that are not started yet - no owner ===&lt;br /&gt;
&lt;br /&gt;
* add documentation for macvlan and macvtap&lt;br /&gt;
   recent docs here:&lt;br /&gt;
   http://backreference.org/2014/03/20/some-notes-on-macvlanmacvtap/&lt;br /&gt;
   need to integrate in iproute and kernel docs.&lt;br /&gt;
&lt;br /&gt;
* receive side zero copy&lt;br /&gt;
  The ideal is a NIC with accelerated RFS support,&lt;br /&gt;
  so we can feed the virtio rx buffers into the correct NIC queue.&lt;br /&gt;
  Depends on non promisc NIC support in bridge.&lt;br /&gt;
  Search for &amp;quot;Xin Xiaohui: Provide a zero-copy method on KVM virtio-net&amp;quot;&lt;br /&gt;
  for a very old prototype&lt;br /&gt;
&lt;br /&gt;
* RDMA bridging&lt;br /&gt;
&lt;br /&gt;
* DMA engine (IOAT) use in tun&lt;br /&gt;
  Old patch here: [PATCH RFC] tun: dma engine support&lt;br /&gt;
  It does not speed things up. Need to see why and&lt;br /&gt;
  what can be done.&lt;br /&gt;
&lt;br /&gt;
* virtio API extension: improve small packet/large buffer performance:&lt;br /&gt;
  support &amp;quot;reposting&amp;quot; buffers for mergeable buffers,&lt;br /&gt;
  support pool for indirect buffers&lt;br /&gt;
&lt;br /&gt;
* more GSO type support:&lt;br /&gt;
       The kernel does not yet support other GSO types: FCOE, GRE, UDP_TUNNEL&lt;br /&gt;
&lt;br /&gt;
* ring aliasing:&lt;br /&gt;
  using vhost-net as a networking backend with virtio-net in QEMU&lt;br /&gt;
  being what&#039;s guest facing.&lt;br /&gt;
  This gives you the best of both worlds: QEMU acts as a first&lt;br /&gt;
  line of defense against a malicious guest while still getting the&lt;br /&gt;
  performance advantages of vhost-net (zero-copy).&lt;br /&gt;
  In fact a bit of complexity in vhost was put there in the vague hope to&lt;br /&gt;
  support something like this: virtio rings are not translated through&lt;br /&gt;
  regular memory tables, instead, vhost gets a pointer to ring address.&lt;br /&gt;
  This allows qemu to act as a man in the middle,&lt;br /&gt;
  verifying the descriptors but not touching the packet data.&lt;br /&gt;
&lt;br /&gt;
* non-virtio device support with vhost&lt;br /&gt;
  Use vhost interface for guests that don&#039;t use virtio-net&lt;br /&gt;
&lt;br /&gt;
* Extend sndbuf scope to int64&lt;br /&gt;
&lt;br /&gt;
  The current sndbuf limit is INT_MAX in tap_set_sndbuf().&lt;br /&gt;
  Large values (like 8388607T) are converted correctly by qapi from the qemu command line;&lt;br /&gt;
  if we want to support such large values, we should extend the sndbuf limit from &#039;int&#039; to &#039;int64&#039;.&lt;br /&gt;
&lt;br /&gt;
  Upstream discussion: https://lists.gnu.org/archive/html/qemu-devel/2014-04/msg04192.html&lt;br /&gt;
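  The scale involved can be checked with a quick sketch (the parse_size helper and suffix table are illustrative assumptions about qapi-style size parsing):&lt;br /&gt;

```python
INT_MAX = 2**31 - 1     # current tap_set_sndbuf() limit
INT64_MAX = 2**63 - 1   # proposed limit

# binary size suffixes, as qapi-style size parsing understands them
SUFFIXES = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}

def parse_size(text):
    # hypothetical helper standing in for the qapi size parser
    if text[-1] in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1]]
    return int(text)

# 8388607T parses to 2**63 - 2**40: far beyond a 32-bit int,
# but still representable with an int64 sndbuf limit.
```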
&lt;br /&gt;
=== vague ideas: path to implementation not clear ===&lt;br /&gt;
&lt;br /&gt;
* change tcp_tso_should_defer for kvm: batch more&lt;br /&gt;
  aggressively.&lt;br /&gt;
  in particular, see below&lt;br /&gt;
&lt;br /&gt;
* tcp: increase gso buffering for cubic,reno&lt;br /&gt;
    At the moment we push out an skb whenever the limit becomes&lt;br /&gt;
    large enough to send a full-sized TSO skb even if the skb,&lt;br /&gt;
    in fact, is not full-sized.&lt;br /&gt;
    The reason for this seems to be that some congestion avoidance&lt;br /&gt;
    protocols rely on the number of packets in flight to calculate&lt;br /&gt;
    CWND, so if we underuse the available CWND it shrinks&lt;br /&gt;
    which degrades performance:&lt;br /&gt;
    http://www.mail-archive.com/netdev@vger.kernel.org/msg08738.html&lt;br /&gt;
&lt;br /&gt;
    However, there seems to be no reason to do this for&lt;br /&gt;
    protocols such as reno and cubic which don&#039;t rely on packets in flight,&lt;br /&gt;
    and so will simply increase CWND a bit more to compensate for the&lt;br /&gt;
    underuse.&lt;br /&gt;
&lt;br /&gt;
* ring redesign:&lt;br /&gt;
      find a way to test raw ring performance &lt;br /&gt;
      fix cacheline bounces &lt;br /&gt;
      reduce interrupts&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* irq/numa affinity:&lt;br /&gt;
     networking goes much faster with irq pinning:&lt;br /&gt;
     both with and without numa.&lt;br /&gt;
     what can be done to make the non-pinned setup go faster?&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in bridge&lt;br /&gt;
        kernel part is done (Vlad Yasevich)&lt;br /&gt;
        teach qemu to notify libvirt to enable the filter (still to do); the existing NIC_RX_FILTER_CHANGED event contains the vlan tables&lt;br /&gt;
&lt;br /&gt;
* tx coalescing&lt;br /&gt;
        Delay several packets before kicking the device.&lt;br /&gt;
&lt;br /&gt;
* bridging on top of macvlan &lt;br /&gt;
  add code to forward LRO status from macvlan (not macvtap)&lt;br /&gt;
  back to the lowerdev, so that setting up forwarding&lt;br /&gt;
  from macvlan disables LRO on the lowerdev&lt;br /&gt;
&lt;br /&gt;
* virtio: preserve packets exactly with LRO&lt;br /&gt;
  LRO is not normally compatible with forwarding.&lt;br /&gt;
  With virtio we are getting packets from a Linux host,&lt;br /&gt;
  so we could conceivably preserve packets exactly&lt;br /&gt;
  even with LRO. I am guessing other hardware could be&lt;br /&gt;
  doing this as well.&lt;br /&gt;
&lt;br /&gt;
* vxlan&lt;br /&gt;
  What could we do here?&lt;br /&gt;
&lt;br /&gt;
* bridging without promisc mode with OVS&lt;br /&gt;
&lt;br /&gt;
=== high level issues: not clear what the project is, yet ===&lt;br /&gt;
&lt;br /&gt;
* security: iptables&lt;br /&gt;
At the moment most people disable iptables to get&lt;br /&gt;
good performance on 10G networking.&lt;br /&gt;
Any way to improve experience?&lt;br /&gt;
&lt;br /&gt;
* performance&lt;br /&gt;
Going through scheduler and full networking stack twice&lt;br /&gt;
(host+guest) adds a lot of overhead&lt;br /&gt;
Any way to allow bypassing some layers?&lt;br /&gt;
&lt;br /&gt;
* manageability&lt;br /&gt;
Still hard to figure out VM networking,&lt;br /&gt;
VM networking is through libvirt, host networking through NM&lt;br /&gt;
Any way to integrate?&lt;br /&gt;
&lt;br /&gt;
=== testing projects ===&lt;br /&gt;
Keeping networking stable is the highest priority.&lt;br /&gt;
&lt;br /&gt;
* Write some unit tests for vhost-net/vhost-scsi&lt;br /&gt;
* Run weekly test on upstream HEAD covering test matrix with autotest&lt;br /&gt;
* Measure the effect of each of the above-mentioned optimizations&lt;br /&gt;
  - Use autotest network performance regression testing (that runs netperf)&lt;br /&gt;
  - Also test any wild idea that works. Some may be useful.&lt;br /&gt;
* Migrate some of the performance regression autotest functionality into Netperf&lt;br /&gt;
  - Get the CPU-utilization of the Host and the other-party, and add them to the report. This is also true for other Host measures, such as vmexits, interrupts, ...&lt;br /&gt;
  - Run Netperf in demo-mode, and measure only the time when all the sessions are active (could be many seconds after the beginning of the tests)&lt;br /&gt;
  - Packaging of Netperf in Fedora / RHEL (exists in Fedora). Licensing could be an issue.&lt;br /&gt;
  - Make the scripts more visible&lt;br /&gt;
&lt;br /&gt;
=== non-virtio-net devices ===&lt;br /&gt;
* e1000: stabilize&lt;br /&gt;
&lt;br /&gt;
=== test matrix ===&lt;br /&gt;
&lt;br /&gt;
DOA test matrix (all combinations should work):&lt;br /&gt;
        vhost: test both on and off, obviously&lt;br /&gt;
        test: hotplug/unplug, vlan/mac filtering, netperf,&lt;br /&gt;
             file copy both ways: scp, NFS, NTFS&lt;br /&gt;
        guests: linux: release and debug kernels, windows&lt;br /&gt;
        conditions: plain run, run while under migration,&lt;br /&gt;
                vhost on/off migration&lt;br /&gt;
        networking setup: simple, qos with cgroups&lt;br /&gt;
        host configuration: host-guest, external-guest&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=20045</id>
		<title>NetworkingTodo</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=20045"/>
		<updated>2014-06-05T05:48:47Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page should cover all networking related activity in KVM,&lt;br /&gt;
currently most info is related to virtio-net.&lt;br /&gt;
&lt;br /&gt;
TODO: add bugzilla entry links.&lt;br /&gt;
&lt;br /&gt;
=== projects in progress. contributions are still very welcome! ===&lt;br /&gt;
&lt;br /&gt;
* large-order allocations&lt;br /&gt;
   see 28d6427109d13b0f447cba5761f88d3548e83605&lt;br /&gt;
   Developer: MST&lt;br /&gt;
&lt;br /&gt;
* vhost-net scalability tuning: threading for many VMs&lt;br /&gt;
      Plan: switch to workqueue shared by many VMs&lt;br /&gt;
      http://www.mail-archive.com/kvm@vger.kernel.org/msg69868.html&lt;br /&gt;
&lt;br /&gt;
http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/479e3578ed05bfac85257b4200427735!OpenDocument&lt;br /&gt;
&lt;br /&gt;
      Developer: Bandan Das&lt;br /&gt;
      Testing: netperf guest to guest&lt;br /&gt;
&lt;br /&gt;
* support more queues&lt;br /&gt;
     We limit TUN to 8 queues, but we really want&lt;br /&gt;
     1 queue per guest CPU. The limit comes from the net&lt;br /&gt;
     core; we need to teach it to allocate an array of&lt;br /&gt;
     pointers rather than an array of queues.&lt;br /&gt;
     Jason has a draft patch to use flex array.&lt;br /&gt;
     Another thing is to move the flow caches out of tun_struct.&lt;br /&gt;
     Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* enable multiqueue by default&lt;br /&gt;
       Multiqueue causes regressions in some workloads (GSO tends to&lt;br /&gt;
       batch less when mq is enabled), so it is off by default.&lt;br /&gt;
       Detect this and enable/disable automatically so we can turn&lt;br /&gt;
       it on by default.&lt;br /&gt;
       https://patchwork.kernel.org/patch/2235191/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* rework on flow caches&lt;br /&gt;
       The current hlist implementation of flow caches has several limitations:&lt;br /&gt;
       1) worst-case lookup degenerates to a linear search&lt;br /&gt;
       2) it does not scale&lt;br /&gt;
       https://patchwork.kernel.org/patch/2025121/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
       &lt;br /&gt;
* eliminate the extra copy in virtio-net driver&lt;br /&gt;
       We currently do an extra copy of 128 bytes for every packet.&lt;br /&gt;
       This could be eliminated for small packets by:&lt;br /&gt;
       1) use build_skb() and head frag&lt;br /&gt;
       2) bigger vnet header length ( &amp;gt;= NET_SKB_PAD + NET_IP_ALIGN )&lt;br /&gt;
       Or use a dedicated queue for receiving small packets? (raises reordering concerns)&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* orphan packets less aggressively (was: make pktgen work for virtio-net (or partially orphan))&lt;br /&gt;
       virtio-net orphans all skbs during tx, this used to be optimal.&lt;br /&gt;
       Recent changes in guest networking stack and hardware advances&lt;br /&gt;
       such as APICv changed optimal behaviour for drivers.&lt;br /&gt;
       We need to revisit optimizations such as orphaning all packets early&lt;br /&gt;
       to have optimal behaviour.&lt;br /&gt;
&lt;br /&gt;
       This should also fix pktgen, which is currently broken with virtio-net:&lt;br /&gt;
       orphaning all skbs makes pktgen wait forever for the refcnt.&lt;br /&gt;
       Jason&#039;s idea: bring back tx interrupt (partially)&lt;br /&gt;
       Jason&#039;s idea: introduce a flag to tell pktgen not to wait&lt;br /&gt;
       Discussion here: https://patchwork.kernel.org/patch/1800711/&lt;br /&gt;
       MST&#039;s idea: add a .ndo_tx_polling not only for pktgen&lt;br /&gt;
       Developers: Jason Wang, MST&lt;br /&gt;
&lt;br /&gt;
* Announce self by guest driver&lt;br /&gt;
       Send gARP from the guest driver. The guest part is finished.&lt;br /&gt;
       The qemu part is ongoing.&lt;br /&gt;
       V8 new RFC posted here (limits the changes to virtio-net only):&lt;br /&gt;
       https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg02648.html&lt;br /&gt;
       V7 patches are here:&lt;br /&gt;
       http://lists.nongnu.org/archive/html/qemu-devel/2013-03/msg01127.html&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* guest programmable mac/vlan filtering with macvtap&lt;br /&gt;
        Developer: Amos Kong&lt;br /&gt;
        qemu: https://bugzilla.redhat.com/show_bug.cgi?id=848203 (applied by upstream)&lt;br /&gt;
        libvirt: https://bugzilla.redhat.com/show_bug.cgi?id=848199&lt;br /&gt;
        http://git.qemu.org/?p=qemu.git;a=commit;h=b1be42803b31a913bab65bab563a8760ad2e7f7f&lt;br /&gt;
        Status: [[GuestProgrammableMacVlanFiltering]]&lt;br /&gt;
&lt;br /&gt;
* bridge without promisc mode in NIC&lt;br /&gt;
  given hardware support, teach bridge&lt;br /&gt;
  to program mac/vlan filtering in NIC&lt;br /&gt;
  Helps performance and security on noisy LANs&lt;br /&gt;
  http://comments.gmane.org/gmane.linux.network/266546&lt;br /&gt;
  Developer: Vlad Yasevich&lt;br /&gt;
&lt;br /&gt;
* reduce networking latency:&lt;br /&gt;
  allow handling short packets from softirq or VCPU context&lt;br /&gt;
  Plan:&lt;br /&gt;
    We are going through the scheduler 3 times&lt;br /&gt;
    (could be up to 5 if softirqd is involved)&lt;br /&gt;
    Consider RX: host irq -&amp;gt; io thread -&amp;gt; VCPU thread -&amp;gt;&lt;br /&gt;
    guest irq -&amp;gt; guest thread.&lt;br /&gt;
    This adds a lot of latency.&lt;br /&gt;
    We can cut it by some 1.5x if we do a bit of work&lt;br /&gt;
    either in the VCPU or softirq context.&lt;br /&gt;
  Testing: netperf TCP RR - should be improved drastically&lt;br /&gt;
           netperf TCP STREAM guest to host - no regression&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* Flexible buffers: put virtio header inline with packet data&lt;br /&gt;
  https://patchwork.kernel.org/patch/1540471/&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* device failover to allow migration with assigned devices&lt;br /&gt;
  https://fedoraproject.org/wiki/Features/Virt_Device_Failover&lt;br /&gt;
  Developer: Gal Hammer, Cole Robinson, Laine Stump, MST&lt;br /&gt;
&lt;br /&gt;
* Reuse vringh code for better maintainability&lt;br /&gt;
  Developer: Rusty Russell&lt;br /&gt;
&lt;br /&gt;
* Improve stats, make them more helpful for perf analysis&lt;br /&gt;
  Developer: Sriram Narasimhan&lt;br /&gt;
&lt;br /&gt;
* Enable GRO for packets coming to bridge from a tap interface&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* Better support for windows LRO&lt;br /&gt;
  Extend virtio-header with statistics for GRO packets:&lt;br /&gt;
  number of packets coalesced and number of duplicate ACKs coalesced&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* IPoIB infiniband bridging&lt;br /&gt;
  Plan: implement macvtap for ipoib and virtio-ipoib&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* netdev polling for virtio.&lt;br /&gt;
  There are two kinds of netdev polling:&lt;br /&gt;
  - netpoll - used for debugging&lt;br /&gt;
  - proposed low latency net polling&lt;br /&gt;
  See http://lkml.indiana.edu/hypermail/linux/kernel/1303.0/00553.html&lt;br /&gt;
  Jason has a draft patch to enable low latency polling for virtio-net.&lt;br /&gt;
  May also consider it for tun/macvtap.&lt;br /&gt;
  Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* use kvm eventfd support for injecting level-triggered interrupts&lt;br /&gt;
  aim: enable vhost by default for level interrupts.&lt;br /&gt;
  The benefit is security: we want to avoid using userspace&lt;br /&gt;
  virtio net so that vhost-net is always used.&lt;br /&gt;
&lt;br /&gt;
  Alex emulated (post &amp;amp; re-enable) level-triggered interrupts in KVM to&lt;br /&gt;
  skip userspace. VFIO already enjoys the performance benefit;&lt;br /&gt;
  let&#039;s do it for virtio-pci. Current virtio-pci devices still use&lt;br /&gt;
  level interrupts in userspace.&lt;br /&gt;
&lt;br /&gt;
 kernel:&lt;br /&gt;
  7a84428af [PATCH] KVM: Add resampling irqfds for level triggered interrupts&lt;br /&gt;
 qemu:&lt;br /&gt;
  68919cac [PATCH] hw/vfio: set interrupts using pci irq wrappers&lt;br /&gt;
           (virtio-pci didn&#039;t use the wrappers)&lt;br /&gt;
  e1d1e586 [PATCH] vfio-pci: Add KVM INTx acceleration&lt;br /&gt;
&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* sharing config interrupts&lt;br /&gt;
  Support more devices by sharing a single msi vector&lt;br /&gt;
  between multiple virtio devices.&lt;br /&gt;
  (Applies to virtio-blk too).&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* network traffic throttling&lt;br /&gt;
  The block layer implemented a &amp;quot;continuous leaky bucket&amp;quot; for throttling;&lt;br /&gt;
  we can apply the continuous leaky bucket to networking:&lt;br /&gt;
  IOPS/BPS * RX/TX/TOTAL&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* Allocate mac_table dynamically&lt;br /&gt;
&lt;br /&gt;
  In the future, maybe we can allocate the mac_table dynamically instead&lt;br /&gt;
  of embedding it in VirtIONet. Then we can just do a pointer swap and&lt;br /&gt;
  g_free(), saving a memcpy() here.&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* reduce conflict with VCPU thread&lt;br /&gt;
    if VCPU and networking run on same CPU,&lt;br /&gt;
    they conflict resulting in bad performance.&lt;br /&gt;
    Fix that, push vhost thread out to another CPU&lt;br /&gt;
    more aggressively.&lt;br /&gt;
    Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== projects that are not started yet - no owner ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* receive side zero copy&lt;br /&gt;
  The ideal is a NIC with accelerated RFS support,&lt;br /&gt;
  so we can feed the virtio rx buffers into the correct NIC queue.&lt;br /&gt;
  Depends on non promisc NIC support in bridge.&lt;br /&gt;
  Search for &amp;quot;Xin Xiaohui: Provide a zero-copy method on KVM virtio-net&amp;quot;&lt;br /&gt;
  for a very old prototype&lt;br /&gt;
&lt;br /&gt;
* RDMA bridging&lt;br /&gt;
&lt;br /&gt;
* DMA engine (IOAT) use in tun&lt;br /&gt;
  Old patch here: [PATCH RFC] tun: dma engine support&lt;br /&gt;
  It does not speed things up. Need to see why and&lt;br /&gt;
  what can be done.&lt;br /&gt;
&lt;br /&gt;
* virtio API extension: improve small packet/large buffer performance:&lt;br /&gt;
  support &amp;quot;reposting&amp;quot; buffers for mergeable buffers,&lt;br /&gt;
  support pool for indirect buffers&lt;br /&gt;
&lt;br /&gt;
* more GSO type support:&lt;br /&gt;
       The kernel does not yet support these GSO types: FCOE, GRE, UDP_TUNNEL&lt;br /&gt;
&lt;br /&gt;
* ring aliasing:&lt;br /&gt;
  using vhost-net as a networking backend with virtio-net in QEMU&lt;br /&gt;
  being what&#039;s guest facing.&lt;br /&gt;
  This gives you the best of both worlds: QEMU acts as a first&lt;br /&gt;
  line of defense against a malicious guest while still getting the&lt;br /&gt;
  performance advantages of vhost-net (zero-copy).&lt;br /&gt;
  In fact a bit of complexity in vhost was put there in the vague hope to&lt;br /&gt;
  support something like this: virtio rings are not translated through&lt;br /&gt;
  regular memory tables, instead, vhost gets a pointer to ring address.&lt;br /&gt;
  This allows qemu to act as a man in the middle,&lt;br /&gt;
  verifying the descriptors but not touching the packet data.&lt;br /&gt;
&lt;br /&gt;
* non-virtio device support with vhost&lt;br /&gt;
  Use vhost interface for guests that don&#039;t use virtio-net&lt;br /&gt;
&lt;br /&gt;
* Extend sndbuf scope to int64&lt;br /&gt;
&lt;br /&gt;
  The current sndbuf limit is INT_MAX in tap_set_sndbuf().&lt;br /&gt;
  Large values (like 8388607T) are converted correctly by qapi from the qemu command line;&lt;br /&gt;
  if we want to support such large values, we should extend the sndbuf limit from &#039;int&#039; to &#039;int64&#039;&lt;br /&gt;
&lt;br /&gt;
  Upstream discussion: https://lists.gnu.org/archive/html/qemu-devel/2014-04/msg04192.html&lt;br /&gt;
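To make the overflow concrete, a toy model of the size parsing (hypothetical helpers, not qemu's qapi code):&lt;br /&gt;

```python
INT_MAX = 2**31 - 1  # current limit of the 'int' sndbuf field

SUFFIXES = {'K': 2**10, 'M': 2**20, 'G': 2**30, 'T': 2**40}

def parse_size(s):
    # Parse a command-line size like "8388607T" into bytes (int64-style).
    if s and s[-1].upper() in SUFFIXES:
        return int(s[:-1]) * SUFFIXES[s[-1].upper()]
    return int(s)

def fits_sndbuf(value):
    # qapi can represent the value as int64, but tap_set_sndbuf()
    # stores it in an 'int', so anything above INT_MAX does not fit.
    return 0 <= value <= INT_MAX
```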
&lt;br /&gt;
=== vague ideas: path to implementation not clear ===&lt;br /&gt;
&lt;br /&gt;
* change tcp_tso_should_defer for kvm: batch more&lt;br /&gt;
  aggressively.&lt;br /&gt;
  in particular, see below&lt;br /&gt;
&lt;br /&gt;
* tcp: increase gso buffering for cubic,reno&lt;br /&gt;
    At the moment we push out an skb whenever the limit becomes&lt;br /&gt;
    large enough to send a full-sized TSO skb even if the skb,&lt;br /&gt;
    in fact, is not full-sized.&lt;br /&gt;
    The reason for this seems to be that some congestion avoidance&lt;br /&gt;
    protocols rely on the number of packets in flight to calculate&lt;br /&gt;
    CWND, so if we underuse the available CWND it shrinks&lt;br /&gt;
    which degrades performance:&lt;br /&gt;
    http://www.mail-archive.com/netdev@vger.kernel.org/msg08738.html&lt;br /&gt;
&lt;br /&gt;
    However, there seems to be no reason to do this for&lt;br /&gt;
    protocols such as reno and cubic which don&#039;t rely on packets in flight,&lt;br /&gt;
    and so will simply increase CWND a bit more to compensate for the&lt;br /&gt;
    underuse.&lt;br /&gt;
&lt;br /&gt;
* ring redesign:&lt;br /&gt;
      find a way to test raw ring performance &lt;br /&gt;
      fix cacheline bounces &lt;br /&gt;
      reduce interrupts&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* irq/numa affinity:&lt;br /&gt;
     networking goes much faster with irq pinning:&lt;br /&gt;
     both with and without numa.&lt;br /&gt;
     what can be done to make the non-pinned setup go faster?&lt;br /&gt;
&lt;br /&gt;
* rx mac filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we already have filtering in the bridge&lt;br /&gt;
        we have a small table of addresses; need to make it larger&lt;br /&gt;
        if we only need filtering for unicast (multicast is handled by IMP filtering)&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we already have filtering in the bridge&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in bridge&lt;br /&gt;
        kernel part is done (Vlad Yasevich)&lt;br /&gt;
        teach qemu to notify libvirt to enable the filter (still to do) (existed NIC_RX_FILTER_CHANGED event contains vlan-tables)&lt;br /&gt;
&lt;br /&gt;
* tx coalescing&lt;br /&gt;
        Delay several packets before kicking the device.&lt;br /&gt;
&lt;br /&gt;
* interrupt coalescing&lt;br /&gt;
        Reduce the number of interrupts.&lt;br /&gt;
&lt;br /&gt;
* bridging on top of macvlan &lt;br /&gt;
  add code to forward LRO status from macvlan (not macvtap)&lt;br /&gt;
  back to the lowerdev, so that setting up forwarding&lt;br /&gt;
  from macvlan disables LRO on the lowerdev&lt;br /&gt;
&lt;br /&gt;
* virtio: preserve packets exactly with LRO&lt;br /&gt;
  LRO is not normally compatible with forwarding.&lt;br /&gt;
  With virtio we are getting packets from a linux host,&lt;br /&gt;
  so we could conceivably preserve packets exactly&lt;br /&gt;
  even with LRO. I am guessing other hardware could be&lt;br /&gt;
  doing this as well.&lt;br /&gt;
&lt;br /&gt;
* vxlan&lt;br /&gt;
  What could we do here?&lt;br /&gt;
&lt;br /&gt;
* bridging without promisc mode with OVS&lt;br /&gt;
&lt;br /&gt;
=== high level issues: not clear what the project is, yet ===&lt;br /&gt;
&lt;br /&gt;
* security: iptables&lt;br /&gt;
At the moment most people disable iptables to get&lt;br /&gt;
good performance on 10Gb/s networking.&lt;br /&gt;
Any way to improve experience?&lt;br /&gt;
&lt;br /&gt;
* performance&lt;br /&gt;
Going through scheduler and full networking stack twice&lt;br /&gt;
(host+guest) adds a lot of overhead&lt;br /&gt;
Any way to allow bypassing some layers?&lt;br /&gt;
&lt;br /&gt;
* manageability&lt;br /&gt;
Still hard to figure out VM networking,&lt;br /&gt;
VM networking is through libvirt, host networking through NM&lt;br /&gt;
Any way to integrate?&lt;br /&gt;
&lt;br /&gt;
=== testing projects ===&lt;br /&gt;
Keeping networking stable is highest priority.&lt;br /&gt;
&lt;br /&gt;
* Write some unit tests for vhost-net/vhost-scsi&lt;br /&gt;
* Run weekly test on upstream HEAD covering test matrix with autotest&lt;br /&gt;
* Measure the effect of each of the above-mentioned optimizations&lt;br /&gt;
  - Use autotest network performance regression testing (that runs netperf)&lt;br /&gt;
  - Also test any wild idea that works. Some may be useful.&lt;br /&gt;
* Migrate some of the performance regression autotest functionality into Netperf&lt;br /&gt;
  - Get the CPU-utilization of the Host and the other-party, and add them to the report. This is also true for other Host measures, such as vmexits, interrupts, ...&lt;br /&gt;
  - Run Netperf in demo-mode, and measure only the time when all the sessions are active (could be many seconds after the beginning of the tests)&lt;br /&gt;
  - Packaging of Netperf in Fedora / RHEL (exists in Fedora). Licensing could be an issue.&lt;br /&gt;
  - Make the scripts more visible&lt;br /&gt;
&lt;br /&gt;
=== non-virtio-net devices ===&lt;br /&gt;
* e1000: stabilize&lt;br /&gt;
&lt;br /&gt;
=== test matrix ===&lt;br /&gt;
&lt;br /&gt;
DOA test matrix (all combinations should work):&lt;br /&gt;
        vhost: test both on and off, obviously&lt;br /&gt;
        test: hotplug/unplug, vlan/mac filtering, netperf,&lt;br /&gt;
             file copy both ways: scp, NFS, NTFS&lt;br /&gt;
        guests: linux: release and debug kernels, windows&lt;br /&gt;
        conditions: plain run, run while under migration,&lt;br /&gt;
                vhost on/off migration&lt;br /&gt;
        networking setup: simple, qos with cgroups&lt;br /&gt;
        host configuration: host-guest, external-guest&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=14407</id>
		<title>NetworkingTodo</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=14407"/>
		<updated>2014-05-17T22:31:21Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page should cover all networking related activity in KVM,&lt;br /&gt;
currently most info is related to virtio-net.&lt;br /&gt;
&lt;br /&gt;
TODO: add bugzilla entry links.&lt;br /&gt;
&lt;br /&gt;
=== projects in progress. contributions are still very welcome! ===&lt;br /&gt;
&lt;br /&gt;
* large-order allocations&lt;br /&gt;
   see 28d6427109d13b0f447cba5761f88d3548e83605&lt;br /&gt;
   Developer: MST&lt;br /&gt;
&lt;br /&gt;
* vhost-net scalability tuning: threading for many VMs&lt;br /&gt;
      Plan: switch to workqueue shared by many VMs&lt;br /&gt;
      http://www.mail-archive.com/kvm@vger.kernel.org/msg69868.html&lt;br /&gt;
&lt;br /&gt;
http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/479e3578ed05bfac85257b4200427735!OpenDocument&lt;br /&gt;
&lt;br /&gt;
      Developer: Bandan Das&lt;br /&gt;
      Testing: netperf guest to guest&lt;br /&gt;
&lt;br /&gt;
* support more queues&lt;br /&gt;
     We limit TUN to 8 queues, but we really want&lt;br /&gt;
     1 queue per guest CPU. The limit comes from net&lt;br /&gt;
     core, need to teach it to allocate array of&lt;br /&gt;
     pointers and not array of queues.&lt;br /&gt;
     Jason has a draft patch to use flex array.&lt;br /&gt;
     Another thing is to move the flow caches out of tun_struct.&lt;br /&gt;
     Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* enable multiqueue by default&lt;br /&gt;
       Multiqueue causes regression in some workloads, thus&lt;br /&gt;
       it is off by default. Detect and enable/disable&lt;br /&gt;
       automatically so we can make it on by default.&lt;br /&gt;
       This is because GSO tends to batch less when mq is enabled.&lt;br /&gt;
       https://patchwork.kernel.org/patch/2235191/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* rework on flow caches&lt;br /&gt;
       The current hlist implementation of flow caches has several limitations:&lt;br /&gt;
       1) in the worst case, the linear search is slow&lt;br /&gt;
       2) it does not scale&lt;br /&gt;
       https://patchwork.kernel.org/patch/2025121/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
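The limitation can be illustrated with a toy model: an O(1) hash lookup with bounded size in place of a linearly searched list (Python sketch, not the kernel code; `FlowCache` is a made-up name):&lt;br /&gt;

```python
from collections import OrderedDict

class FlowCache:
    # Toy flow cache: O(1) lookup/insert keyed by a flow hash, with
    # LRU eviction so the table stays bounded instead of degrading
    # into a long linearly-searched chain.
    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.flows = OrderedDict()   # flow_hash -> queue index

    def lookup(self, flow_hash):
        queue = self.flows.get(flow_hash)
        if queue is not None:
            self.flows.move_to_end(flow_hash)  # mark recently used
        return queue

    def record(self, flow_hash, queue):
        self.flows[flow_hash] = queue
        self.flows.move_to_end(flow_hash)
        if len(self.flows) > self.capacity:
            self.flows.popitem(last=False)     # evict least recently used
```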
       &lt;br /&gt;
* eliminate the extra copy in virtio-net driver&lt;br /&gt;
       We currently do an extra copy of 128 bytes for every packet.&lt;br /&gt;
       This could be eliminated for small packets by:&lt;br /&gt;
       1) use build_skb() and head frag&lt;br /&gt;
       2) bigger vnet header length ( &amp;gt;= NET_SKB_PAD + NET_IP_ALIGN )&lt;br /&gt;
       Or use a dedicated queue for small packet receiving? (reordering)&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* orphan packets less aggressively (was: make pktgen work for virtio-net (or partially orphan))&lt;br /&gt;
       virtio-net orphans all skbs during tx, this used to be optimal.&lt;br /&gt;
       Recent changes in guest networking stack and hardware advances&lt;br /&gt;
       such as APICv changed optimal behaviour for drivers.&lt;br /&gt;
       We need to revisit optimizations such as orphaning all packets early&lt;br /&gt;
       to have optimal behaviour.&lt;br /&gt;
&lt;br /&gt;
       this should also fix pktgen, which is currently broken with virtio-net:&lt;br /&gt;
       orphaning all skbs makes pktgen wait forever for the refcnt.&lt;br /&gt;
       Jason&#039;s idea: bring back tx interrupt (partially)&lt;br /&gt;
       Jason&#039;s idea: introduce a flag to tell pktgen not to wait&lt;br /&gt;
       Discussion here: https://patchwork.kernel.org/patch/1800711/&lt;br /&gt;
       MST&#039;s idea: add a .ndo_tx_polling not only for pktgen&lt;br /&gt;
       Developers: Jason Wang, MST&lt;br /&gt;
&lt;br /&gt;
* Announce self by guest driver&lt;br /&gt;
       Send gratuitous ARP from the guest driver. The guest part is finished;&lt;br /&gt;
       the qemu part is ongoing.&lt;br /&gt;
       V8 new RFC posted here (limit the changes to virtio-net only)&lt;br /&gt;
       https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg02648.html&lt;br /&gt;
       V7 patches are here:&lt;br /&gt;
       http://lists.nongnu.org/archive/html/qemu-devel/2013-03/msg01127.html&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* guest programmable mac/vlan filtering with macvtap&lt;br /&gt;
        Developer: Amos Kong&lt;br /&gt;
        qemu: https://bugzilla.redhat.com/show_bug.cgi?id=848203 (applied by upstream)&lt;br /&gt;
        libvirt: https://bugzilla.redhat.com/show_bug.cgi?id=848199&lt;br /&gt;
        http://git.qemu.org/?p=qemu.git;a=commit;h=b1be42803b31a913bab65bab563a8760ad2e7f7f&lt;br /&gt;
        Status: [[GuestProgrammableMacVlanFiltering]]&lt;br /&gt;
&lt;br /&gt;
* bridge without promisc mode in NIC&lt;br /&gt;
  given hardware support, teach bridge&lt;br /&gt;
  to program mac/vlan filtering in NIC&lt;br /&gt;
  Helps performance and security on noisy LANs&lt;br /&gt;
  http://comments.gmane.org/gmane.linux.network/266546&lt;br /&gt;
  Developer: Vlad Yasevich&lt;br /&gt;
&lt;br /&gt;
* reduce networking latency:&lt;br /&gt;
  allow handling short packets from softirq or VCPU context&lt;br /&gt;
  Plan:&lt;br /&gt;
    We are going through the scheduler 3 times&lt;br /&gt;
    (could be up to 5 if softirqd is involved)&lt;br /&gt;
    Consider RX: host irq -&amp;gt; io thread -&amp;gt; VCPU thread -&amp;gt;&lt;br /&gt;
    guest irq -&amp;gt; guest thread.&lt;br /&gt;
    This adds a lot of latency.&lt;br /&gt;
    We can cut it by some 1.5x if we do a bit of work&lt;br /&gt;
    either in the VCPU or softirq context.&lt;br /&gt;
  Testing: netperf TCP RR - should be improved drastically&lt;br /&gt;
           netperf TCP STREAM guest to host - no regression&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* Flexible buffers: put virtio header inline with packet data&lt;br /&gt;
  https://patchwork.kernel.org/patch/1540471/&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* device failover to allow migration with assigned devices&lt;br /&gt;
  https://fedoraproject.org/wiki/Features/Virt_Device_Failover&lt;br /&gt;
  Developer: Gal Hammer, Cole Robinson, Laine Stump, MST&lt;br /&gt;
&lt;br /&gt;
* Reuse vringh code for better maintainability&lt;br /&gt;
  Developer: Rusty Russell&lt;br /&gt;
&lt;br /&gt;
* Improve stats, make them more helpful for performance analysis&lt;br /&gt;
  Developer: Sriram Narasimhan&lt;br /&gt;
&lt;br /&gt;
* Enable GRO for packets coming to bridge from a tap interface&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* Better support for windows LRO&lt;br /&gt;
  Extend virtio-header with statistics for GRO packets:&lt;br /&gt;
  number of packets coalesced and number of duplicate ACKs coalesced&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* IPoIB infiniband bridging&lt;br /&gt;
  Plan: implement macvtap for ipoib and virtio-ipoib&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* netdev polling for virtio.&lt;br /&gt;
  There are two kinds of netdev polling:&lt;br /&gt;
  - netpoll - used for debugging&lt;br /&gt;
  - proposed low latency net polling&lt;br /&gt;
  See http://lkml.indiana.edu/hypermail/linux/kernel/1303.0/00553.html&lt;br /&gt;
  Jason has a draft patch to enable low latency polling for virtio-net.&lt;br /&gt;
  May also consider it for tun/macvtap.&lt;br /&gt;
  Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* use kvm eventfd support for injecting level-triggered interrupts&lt;br /&gt;
  aim: enable vhost by default for level interrupts.&lt;br /&gt;
  The benefit is security: we want to avoid using userspace&lt;br /&gt;
  virtio net so that vhost-net is always used.&lt;br /&gt;
&lt;br /&gt;
  Alex emulated (post &amp;amp; re-enable) level-triggered interrupts in KVM for&lt;br /&gt;
  skipping userspace. VFIO already enjoyed the performance benefit;&lt;br /&gt;
  let&#039;s do it for virtio-pci. Current virtio-pci devices still use&lt;br /&gt;
  level-interrupt in userspace.&lt;br /&gt;
&lt;br /&gt;
 kernel:&lt;br /&gt;
  7a84428af [PATCH] KVM: Add resampling irqfds for level triggered interrupts&lt;br /&gt;
 qemu:&lt;br /&gt;
  68919cac [PATCH] hw/vfio: set interrupts using pci irq wrappers&lt;br /&gt;
           (virtio-pci didn&#039;t use the wrappers)&lt;br /&gt;
  e1d1e586 [PATCH] vfio-pci: Add KVM INTx acceleration&lt;br /&gt;
&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* sharing config interrupts&lt;br /&gt;
  Support more devices by sharing a single msi vector&lt;br /&gt;
  between multiple virtio devices.&lt;br /&gt;
  (Applies to virtio-blk too).&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* network traffic throttling&lt;br /&gt;
  the block layer implemented a &amp;quot;continuous leaky bucket&amp;quot; algorithm for throttling;&lt;br /&gt;
  we can apply the same continuous leaky bucket to networking,&lt;br /&gt;
  limiting IOPS/BPS for each of RX/TX/TOTAL&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* Allocate mac_table dynamically&lt;br /&gt;
&lt;br /&gt;
  In the future, maybe we can allocate the mac_table dynamically instead&lt;br /&gt;
  of embedding it in VirtIONet. Then we could just do a pointer swap plus&lt;br /&gt;
  g_free(), saving a memcpy() here.&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* reduce conflict with VCPU thread&lt;br /&gt;
    if the VCPU and networking run on the same CPU,&lt;br /&gt;
    they conflict, resulting in bad performance.&lt;br /&gt;
    Fix this by pushing the vhost thread out to another CPU&lt;br /&gt;
    more aggressively.&lt;br /&gt;
    Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== projects that are not started yet - no owner ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* receive side zero copy&lt;br /&gt;
  The ideal is a NIC with accelerated RFS support,&lt;br /&gt;
  so we can feed the virtio rx buffers into the correct NIC queue.&lt;br /&gt;
  Depends on non-promisc NIC support in the bridge.&lt;br /&gt;
  Search for &amp;quot;Xin Xiaohui: Provide a zero-copy method on KVM virtio-net&amp;quot;&lt;br /&gt;
  for a very old prototype&lt;br /&gt;
&lt;br /&gt;
* RDMA bridging&lt;br /&gt;
&lt;br /&gt;
* DMA engine (IOAT) use in tun&lt;br /&gt;
  Old patch here: [PATCH RFC] tun: dma engine support&lt;br /&gt;
  It does not speed things up. Need to see why and&lt;br /&gt;
  what can be done.&lt;br /&gt;
&lt;br /&gt;
* virtio API extension: improve small packet/large buffer performance:&lt;br /&gt;
  support &amp;quot;reposting&amp;quot; buffers for mergeable buffers,&lt;br /&gt;
  support pool for indirect buffers&lt;br /&gt;
&lt;br /&gt;
* more GSO type support:&lt;br /&gt;
       The kernel does not yet support these GSO types: FCOE, GRE, UDP_TUNNEL&lt;br /&gt;
&lt;br /&gt;
* ring aliasing:&lt;br /&gt;
  using vhost-net as a networking backend with virtio-net in QEMU&lt;br /&gt;
  being what&#039;s guest facing.&lt;br /&gt;
  This gives you the best of both worlds: QEMU acts as a first&lt;br /&gt;
  line of defense against a malicious guest while still getting the&lt;br /&gt;
  performance advantages of vhost-net (zero-copy).&lt;br /&gt;
  In fact a bit of complexity in vhost was put there in the vague hope to&lt;br /&gt;
  support something like this: virtio rings are not translated through&lt;br /&gt;
  regular memory tables, instead, vhost gets a pointer to ring address.&lt;br /&gt;
  This allows qemu to act as a man in the middle,&lt;br /&gt;
  verifying the descriptors but not touching the packet data.&lt;br /&gt;
&lt;br /&gt;
* non-virtio device support with vhost&lt;br /&gt;
  Use vhost interface for guests that don&#039;t use virtio-net&lt;br /&gt;
&lt;br /&gt;
* Extend sndbuf scope to int64&lt;br /&gt;
&lt;br /&gt;
  The current sndbuf limit is INT_MAX in tap_set_sndbuf().&lt;br /&gt;
  Large values (like 8388607T) are converted correctly by qapi from the qemu command line;&lt;br /&gt;
  if we want to support such large values, we should extend the sndbuf limit from &#039;int&#039; to &#039;int64&#039;&lt;br /&gt;
&lt;br /&gt;
=== vague ideas: path to implementation not clear ===&lt;br /&gt;
&lt;br /&gt;
* change tcp_tso_should_defer for kvm: batch more&lt;br /&gt;
  aggressively.&lt;br /&gt;
  in particular, see below&lt;br /&gt;
&lt;br /&gt;
* tcp: increase gso buffering for cubic,reno&lt;br /&gt;
    At the moment we push out an skb whenever the limit becomes&lt;br /&gt;
    large enough to send a full-sized TSO skb even if the skb,&lt;br /&gt;
    in fact, is not full-sized.&lt;br /&gt;
    The reason for this seems to be that some congestion avoidance&lt;br /&gt;
    protocols rely on the number of packets in flight to calculate&lt;br /&gt;
    CWND, so if we underuse the available CWND it shrinks&lt;br /&gt;
    which degrades performance:&lt;br /&gt;
    http://www.mail-archive.com/netdev@vger.kernel.org/msg08738.html&lt;br /&gt;
&lt;br /&gt;
    However, there seems to be no reason to do this for&lt;br /&gt;
    protocols such as reno and cubic which don&#039;t rely on packets in flight,&lt;br /&gt;
    and so will simply increase CWND a bit more to compensate for the&lt;br /&gt;
    underuse.&lt;br /&gt;
&lt;br /&gt;
* ring redesign:&lt;br /&gt;
      find a way to test raw ring performance &lt;br /&gt;
      fix cacheline bounces &lt;br /&gt;
      reduce interrupts&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* irq/numa affinity:&lt;br /&gt;
     networking goes much faster with irq pinning:&lt;br /&gt;
     both with and without numa.&lt;br /&gt;
     what can be done to make the non-pinned setup go faster?&lt;br /&gt;
&lt;br /&gt;
* rx mac filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we already have filtering in the bridge&lt;br /&gt;
        we have a small table of addresses; need to make it larger&lt;br /&gt;
        if we only need filtering for unicast (multicast is handled by IMP filtering)&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we already have filtering in the bridge&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in bridge&lt;br /&gt;
        kernel part is done (Vlad Yasevich)&lt;br /&gt;
        teach qemu to notify libvirt to enable the filter (still to do) (existed NIC_RX_FILTER_CHANGED event contains vlan-tables)&lt;br /&gt;
&lt;br /&gt;
* tx coalescing&lt;br /&gt;
        Delay several packets before kicking the device.&lt;br /&gt;
&lt;br /&gt;
* interrupt coalescing&lt;br /&gt;
        Reduce the number of interrupts.&lt;br /&gt;
&lt;br /&gt;
* bridging on top of macvlan &lt;br /&gt;
  add code to forward LRO status from macvlan (not macvtap)&lt;br /&gt;
  back to the lowerdev, so that setting up forwarding&lt;br /&gt;
  from macvlan disables LRO on the lowerdev&lt;br /&gt;
&lt;br /&gt;
* virtio: preserve packets exactly with LRO&lt;br /&gt;
  LRO is not normally compatible with forwarding.&lt;br /&gt;
  With virtio we are getting packets from a linux host,&lt;br /&gt;
  so we could conceivably preserve packets exactly&lt;br /&gt;
  even with LRO. I am guessing other hardware could be&lt;br /&gt;
  doing this as well.&lt;br /&gt;
&lt;br /&gt;
* vxlan&lt;br /&gt;
  What could we do here?&lt;br /&gt;
&lt;br /&gt;
* bridging without promisc mode with OVS&lt;br /&gt;
&lt;br /&gt;
=== high level issues: not clear what the project is, yet ===&lt;br /&gt;
&lt;br /&gt;
* security: iptables&lt;br /&gt;
At the moment most people disable iptables to get&lt;br /&gt;
good performance on 10Gb/s networking.&lt;br /&gt;
Any way to improve experience?&lt;br /&gt;
&lt;br /&gt;
* performance&lt;br /&gt;
Going through scheduler and full networking stack twice&lt;br /&gt;
(host+guest) adds a lot of overhead&lt;br /&gt;
Any way to allow bypassing some layers?&lt;br /&gt;
&lt;br /&gt;
* manageability&lt;br /&gt;
Still hard to figure out VM networking,&lt;br /&gt;
VM networking is through libvirt, host networking through NM&lt;br /&gt;
Any way to integrate?&lt;br /&gt;
&lt;br /&gt;
=== testing projects ===&lt;br /&gt;
Keeping networking stable is highest priority.&lt;br /&gt;
&lt;br /&gt;
* Write some unit tests for vhost-net/vhost-scsi&lt;br /&gt;
* Run weekly test on upstream HEAD covering test matrix with autotest&lt;br /&gt;
* Measure the effect of each of the above-mentioned optimizations&lt;br /&gt;
  - Use autotest network performance regression testing (that runs netperf)&lt;br /&gt;
  - Also test any wild idea that works. Some may be useful.&lt;br /&gt;
* Migrate some of the performance regression autotest functionality into Netperf&lt;br /&gt;
  - Get the CPU-utilization of the Host and the other-party, and add them to the report. This is also true for other Host measures, such as vmexits, interrupts, ...&lt;br /&gt;
  - Run Netperf in demo-mode, and measure only the time when all the sessions are active (could be many seconds after the beginning of the tests)&lt;br /&gt;
  - Packaging of Netperf in Fedora / RHEL (exists in Fedora). Licensing could be an issue.&lt;br /&gt;
  - Make the scripts more visible&lt;br /&gt;
&lt;br /&gt;
=== non-virtio-net devices ===&lt;br /&gt;
* e1000: stabilize&lt;br /&gt;
&lt;br /&gt;
=== test matrix ===&lt;br /&gt;
&lt;br /&gt;
DOA test matrix (all combinations should work):&lt;br /&gt;
        vhost: test both on and off, obviously&lt;br /&gt;
        test: hotplug/unplug, vlan/mac filtering, netperf,&lt;br /&gt;
             file copy both ways: scp, NFS, NTFS&lt;br /&gt;
        guests: linux: release and debug kernels, windows&lt;br /&gt;
        conditions: plain run, run while under migration,&lt;br /&gt;
                vhost on/off migration&lt;br /&gt;
        networking setup: simple, qos with cgroups&lt;br /&gt;
        host configuration: host-guest, external-guest&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=12143</id>
		<title>NetworkingTodo</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=12143"/>
		<updated>2014-05-08T02:51:12Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page should cover all networking related activity in KVM,&lt;br /&gt;
currently most info is related to virtio-net.&lt;br /&gt;
&lt;br /&gt;
TODO: add bugzilla entry links.&lt;br /&gt;
&lt;br /&gt;
=== projects in progress. contributions are still very welcome! ===&lt;br /&gt;
&lt;br /&gt;
* large-order allocations&lt;br /&gt;
   see 28d6427109d13b0f447cba5761f88d3548e83605&lt;br /&gt;
   Developer: MST&lt;br /&gt;
&lt;br /&gt;
* vhost-net scalability tuning: threading for many VMs&lt;br /&gt;
      Plan: switch to workqueue shared by many VMs&lt;br /&gt;
      http://www.mail-archive.com/kvm@vger.kernel.org/msg69868.html&lt;br /&gt;
&lt;br /&gt;
http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/479e3578ed05bfac85257b4200427735!OpenDocument&lt;br /&gt;
&lt;br /&gt;
      Developer: Bandan Das&lt;br /&gt;
      Testing: netperf guest to guest&lt;br /&gt;
&lt;br /&gt;
* support more queues&lt;br /&gt;
     We limit TUN to 8 queues, but we really want&lt;br /&gt;
     1 queue per guest CPU. The limit comes from net&lt;br /&gt;
     core, need to teach it to allocate array of&lt;br /&gt;
     pointers and not array of queues.&lt;br /&gt;
     Jason has a draft patch to use flex array.&lt;br /&gt;
     Another thing is to move the flow caches out of tun_struct.&lt;br /&gt;
     Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* enable multiqueue by default&lt;br /&gt;
       Multiqueue causes regression in some workloads, thus&lt;br /&gt;
       it is off by default. Detect and enable/disable&lt;br /&gt;
       automatically so we can make it on by default.&lt;br /&gt;
       This is because GSO tends to batch less when mq is enabled.&lt;br /&gt;
       https://patchwork.kernel.org/patch/2235191/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* rework on flow caches&lt;br /&gt;
       The current hlist implementation of flow caches has several limitations:&lt;br /&gt;
       1) in the worst case, the linear search is slow&lt;br /&gt;
       2) it does not scale&lt;br /&gt;
       https://patchwork.kernel.org/patch/2025121/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
       &lt;br /&gt;
* eliminate the extra copy in virtio-net driver&lt;br /&gt;
       We currently do an extra copy of 128 bytes for every packet.&lt;br /&gt;
       This could be eliminated for small packets by:&lt;br /&gt;
       1) use build_skb() and head frag&lt;br /&gt;
       2) bigger vnet header length ( &amp;gt;= NET_SKB_PAD + NET_IP_ALIGN )&lt;br /&gt;
       Or use a dedicated queue for small packet receiving? (reordering)&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* orphan packets less aggressively (was: make pktgen work for virtio-net (or partially orphan))&lt;br /&gt;
       virtio-net orphans all skbs during tx, this used to be optimal.&lt;br /&gt;
       Recent changes in guest networking stack and hardware advances&lt;br /&gt;
       such as APICv changed optimal behaviour for drivers.&lt;br /&gt;
       We need to revisit optimizations such as orphaning all packets early&lt;br /&gt;
       to have optimal behaviour.&lt;br /&gt;
&lt;br /&gt;
       this should also fix pktgen, which is currently broken with virtio-net:&lt;br /&gt;
       orphaning all skbs makes pktgen wait forever for the refcnt.&lt;br /&gt;
       Jason&#039;s idea: bring back tx interrupt (partially)&lt;br /&gt;
       Jason&#039;s idea: introduce a flag to tell pktgen not to wait&lt;br /&gt;
       Discussion here: https://patchwork.kernel.org/patch/1800711/&lt;br /&gt;
       MST&#039;s idea: add a .ndo_tx_polling not only for pktgen&lt;br /&gt;
       Developers: Jason Wang, MST&lt;br /&gt;
&lt;br /&gt;
* Announce self by guest driver&lt;br /&gt;
       Send gratuitous ARP from the guest driver. The guest part is finished;&lt;br /&gt;
       the qemu part is ongoing.&lt;br /&gt;
       V8 new RFC posted here (limit the changes to virtio-net only)&lt;br /&gt;
       https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg02648.html&lt;br /&gt;
       V7 patches are here:&lt;br /&gt;
       http://lists.nongnu.org/archive/html/qemu-devel/2013-03/msg01127.html&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* guest programmable mac/vlan filtering with macvtap&lt;br /&gt;
        Developer: Amos Kong&lt;br /&gt;
        qemu: https://bugzilla.redhat.com/show_bug.cgi?id=848203 (applied by upstream)&lt;br /&gt;
        libvirt: https://bugzilla.redhat.com/show_bug.cgi?id=848199&lt;br /&gt;
        http://git.qemu.org/?p=qemu.git;a=commit;h=b1be42803b31a913bab65bab563a8760ad2e7f7f&lt;br /&gt;
        Status: [[GuestProgrammableMacVlanFiltering]]&lt;br /&gt;
&lt;br /&gt;
* bridge without promisc mode in NIC&lt;br /&gt;
  given hardware support, teach bridge&lt;br /&gt;
  to program mac/vlan filtering in NIC&lt;br /&gt;
  Helps performance and security on noisy LANs&lt;br /&gt;
  http://comments.gmane.org/gmane.linux.network/266546&lt;br /&gt;
  Developer: Vlad Yasevich&lt;br /&gt;
&lt;br /&gt;
* reduce networking latency:&lt;br /&gt;
  allow handling short packets from softirq or VCPU context&lt;br /&gt;
  Plan:&lt;br /&gt;
    We are going through the scheduler 3 times&lt;br /&gt;
    (could be up to 5 if softirqd is involved)&lt;br /&gt;
    Consider RX: host irq -&amp;gt; io thread -&amp;gt; VCPU thread -&amp;gt;&lt;br /&gt;
    guest irq -&amp;gt; guest thread.&lt;br /&gt;
    This adds a lot of latency.&lt;br /&gt;
    We can cut it by some 1.5x if we do a bit of work&lt;br /&gt;
    either in the VCPU or softirq context.&lt;br /&gt;
  Testing: netperf TCP RR - should be improved drastically&lt;br /&gt;
           netperf TCP STREAM guest to host - no regression&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* Flexible buffers: put virtio header inline with packet data&lt;br /&gt;
  https://patchwork.kernel.org/patch/1540471/&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* device failover to allow migration with assigned devices&lt;br /&gt;
  https://fedoraproject.org/wiki/Features/Virt_Device_Failover&lt;br /&gt;
  Developer: Gal Hammer, Cole Robinson, Laine Stump, MST&lt;br /&gt;
&lt;br /&gt;
* Reuse vringh code for better maintainability&lt;br /&gt;
  Developer: Rusty Russell&lt;br /&gt;
&lt;br /&gt;
* Improve stats, make them more helpful for perf analysis&lt;br /&gt;
  Developer: Sriram Narasimhan&lt;br /&gt;
&lt;br /&gt;
* Enable GRO for packets coming to bridge from a tap interface&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* Better support for windows LRO&lt;br /&gt;
  Extend virtio-header with statistics for GRO packets:&lt;br /&gt;
  number of packets coalesced and number of duplicate ACKs coalesced&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* IPoIB infiniband bridging&lt;br /&gt;
  Plan: implement macvtap for ipoib and virtio-ipoib&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* netdev polling for virtio.&lt;br /&gt;
  There are two kinds of netdev polling:&lt;br /&gt;
  - netpoll - used for debugging&lt;br /&gt;
  - proposed low latency net polling&lt;br /&gt;
  See http://lkml.indiana.edu/hypermail/linux/kernel/1303.0/00553.html&lt;br /&gt;
  Jason has a draft patch to enable low latency polling for virtio-net.&lt;br /&gt;
  May also consider it for tun/macvtap.&lt;br /&gt;
  Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* use kvm eventfd support for injecting level-triggered interrupts&lt;br /&gt;
  aim: enable vhost by default for level interrupts.&lt;br /&gt;
  The benefit is security: we want to avoid using userspace&lt;br /&gt;
  virtio net so that vhost-net is always used.&lt;br /&gt;
&lt;br /&gt;
  Alex emulated (post &amp;amp; re-enable) level-triggered interrupts in KVM to&lt;br /&gt;
  skip userspace. VFIO already enjoys the performance benefit;&lt;br /&gt;
  let&#039;s do it for virtio-pci. Current virtio-pci devices still use&lt;br /&gt;
  level-interrupt in userspace.&lt;br /&gt;
&lt;br /&gt;
 kernel:&lt;br /&gt;
  7a84428af [PATCH] KVM: Add resampling irqfds for level triggered interrupts&lt;br /&gt;
 qemu:&lt;br /&gt;
  68919cac [PATCH] hw/vfio: set interrupts using pci irq wrappers&lt;br /&gt;
           (virtio-pci didn&#039;t use the wrappers)&lt;br /&gt;
  e1d1e586 [PATCH] vfio-pci: Add KVM INTx acceleration&lt;br /&gt;
&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* sharing config interrupts&lt;br /&gt;
  Support more devices by sharing a single msi vector&lt;br /&gt;
  between multiple virtio devices.&lt;br /&gt;
  (Applies to virtio-blk too).&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* network traffic throttling&lt;br /&gt;
  The block layer implemented a &amp;quot;continuous leaky bucket&amp;quot; for throttling;&lt;br /&gt;
  we can apply the same continuous leaky bucket to networking:&lt;br /&gt;
  IOPS/BPS * RX/TX/TOTAL&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
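A continuous leaky bucket can be sketched roughly as follows (an illustrative model only, not QEMU code; all names are invented):

```python
import time

class LeakyBucket:
    """Continuous leaky bucket: the level drains at 'rate' units/sec."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)    # sustained units per second
        self.burst = float(burst)  # bucket capacity (allowed burst)
        self.level = 0.0
        self.clock = clock
        self.last = clock()

    def admit(self, cost):
        """Return True if 'cost' units may pass now, else False."""
        now = self.clock()
        # drain continuously at the configured rate since the last call
        drained = (now - self.last) * self.rate
        self.level = max(0.0, self.level - drained)
        self.last = now
        if self.level + cost > self.burst:
            return False           # over budget: throttle this packet
        self.level += cost
        return True
```

A network backend would keep separate buckets per direction (RX/TX/TOTAL) and per metric (IOPS/BPS), mirroring the block layer's scheme.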
&lt;br /&gt;
* Allocate mac_table dynamically&lt;br /&gt;
&lt;br /&gt;
  In the future, maybe we can allocate the mac_table dynamically instead&lt;br /&gt;
  of embedding it in VirtIONet. Then we can just do a pointer swap and&lt;br /&gt;
  g_free(), and save a memcpy() here.&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* Extend sndbuf scope to int64&lt;br /&gt;
&lt;br /&gt;
  The current sndbuf limit is INT_MAX in tap_set_sndbuf().&lt;br /&gt;
  Large values (like 8388607T) are converted correctly by QAPI from the QEMU command line,&lt;br /&gt;
  but to support them we should extend the sndbuf limit from &#039;int&#039; to &#039;int64&#039;&lt;br /&gt;
  Developer:&lt;br /&gt;
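To illustrate the overflow (a hypothetical model; QAPI does this parsing in C): a suffixed size such as 8388607T fits in an int64 but not in a 32-bit int:

```python
SUFFIX = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}
INT_MAX = 2**31 - 1
INT64_MAX = 2**63 - 1

def parse_size(text):
    """Parse a size like '8388607T' into bytes."""
    if text[-1] in SUFFIX:
        return int(text[:-1]) * SUFFIX[text[-1]]
    return int(text)

value = parse_size("8388607T")
assert value > INT_MAX      # overflows a 32-bit sndbuf field
assert INT64_MAX > value    # but fits in int64
```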
&lt;br /&gt;
&lt;br /&gt;
=== projects that are not started yet - no owner ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* receive side zero copy&lt;br /&gt;
  The idea is a NIC with accelerated RFS support,&lt;br /&gt;
  so we can feed the virtio rx buffers into the correct NIC queue.&lt;br /&gt;
  Depends on non-promisc NIC support in the bridge.&lt;br /&gt;
  Search for &amp;quot;Xin Xiaohui: Provide a zero-copy method on KVM virtio-net&amp;quot;&lt;br /&gt;
  for a very old prototype&lt;br /&gt;
&lt;br /&gt;
* RDMA bridging&lt;br /&gt;
&lt;br /&gt;
* DMA engine (IOAT) use in tun&lt;br /&gt;
  Old patch here: [PATCH RFC] tun: dma engine support&lt;br /&gt;
  It does not speed things up. Need to see why and&lt;br /&gt;
  what can be done.&lt;br /&gt;
&lt;br /&gt;
* virtio API extension: improve small packet/large buffer performance:&lt;br /&gt;
  support &amp;quot;reposting&amp;quot; buffers for mergeable buffers,&lt;br /&gt;
  support pool for indirect buffers&lt;br /&gt;
&lt;br /&gt;
* more GSO type support:&lt;br /&gt;
       The kernel does not yet support more GSO types: FCOE, GRE, UDP_TUNNEL&lt;br /&gt;
&lt;br /&gt;
* ring aliasing:&lt;br /&gt;
  using vhost-net as a networking backend, with virtio-net in QEMU&lt;br /&gt;
  being the guest-facing device.&lt;br /&gt;
  This gives you the best of both worlds: QEMU acts as a first&lt;br /&gt;
  line of defense against a malicious guest while still getting the&lt;br /&gt;
  performance advantages of vhost-net (zero-copy).&lt;br /&gt;
  In fact a bit of complexity in vhost was put there in the vague hope to&lt;br /&gt;
  support something like this: virtio rings are not translated through&lt;br /&gt;
  regular memory tables, instead, vhost gets a pointer to ring address.&lt;br /&gt;
  This allows qemu to act as a man in the middle,&lt;br /&gt;
  verifying the descriptors but not touching the packet data.&lt;br /&gt;
&lt;br /&gt;
* non-virtio device support with vhost&lt;br /&gt;
  Use vhost interface for guests that don&#039;t use virtio-net&lt;br /&gt;
&lt;br /&gt;
=== vague ideas: path to implementation not clear ===&lt;br /&gt;
&lt;br /&gt;
* change tcp_tso_should_defer for kvm: batch more&lt;br /&gt;
  aggressively.&lt;br /&gt;
  in particular, see below&lt;br /&gt;
&lt;br /&gt;
* tcp: increase gso buffering for cubic,reno&lt;br /&gt;
    At the moment we push out an skb whenever the limit becomes&lt;br /&gt;
    large enough to send a full-sized TSO skb even if the skb,&lt;br /&gt;
    in fact, is not full-sized.&lt;br /&gt;
    The reason for this seems to be that some congestion avoidance&lt;br /&gt;
    protocols rely on the number of packets in flight to calculate&lt;br /&gt;
    CWND, so if we underuse the available CWND it shrinks&lt;br /&gt;
    which degrades performance:&lt;br /&gt;
    http://www.mail-archive.com/netdev@vger.kernel.org/msg08738.html&lt;br /&gt;
&lt;br /&gt;
    However, there seems to be no reason to do this for&lt;br /&gt;
    protocols such as reno and cubic which don&#039;t rely on packets in flight,&lt;br /&gt;
    and so will simply increase CWND a bit more to compensate for the&lt;br /&gt;
    underuse.&lt;br /&gt;
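The proposed policy could look roughly like this (a toy model; the real decision lives in tcp_tso_should_defer and uses different inputs):

```python
def should_defer(queued_bytes, mss, gso_max_segs,
                 cwnd_limited, ca_uses_packets_in_flight):
    """Defer pushing an skb until a full-sized TSO frame accumulates.

    For congestion-avoidance schemes that count packets in flight we
    must not underuse the window, so push early as today. For reno and
    cubic, which simply grow CWND to compensate, keep batching until
    the frame is actually full-sized.
    """
    full_size = mss * gso_max_segs
    if queued_bytes >= full_size:
        return False    # frame is full: send now
    if ca_uses_packets_in_flight:
        return False    # current behaviour: push early
    if cwnd_limited:
        return False    # window exhausted anyway, nothing gained
    return True         # reno/cubic: keep batching
```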
&lt;br /&gt;
* ring redesign:&lt;br /&gt;
      find a way to test raw ring performance &lt;br /&gt;
      fix cacheline bounces &lt;br /&gt;
      reduce interrupts&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* irq/numa affinity:&lt;br /&gt;
     networking goes much faster with irq pinning:&lt;br /&gt;
     both with and without numa.&lt;br /&gt;
     what can be done to make the non-pinned setup go faster?&lt;br /&gt;
&lt;br /&gt;
* reduce conflict with VCPU thread&lt;br /&gt;
    if VCPU and networking run on same CPU,&lt;br /&gt;
    they conflict resulting in bad performance.&lt;br /&gt;
    Fix that, push vhost thread out to another CPU&lt;br /&gt;
    more aggressively.&lt;br /&gt;
&lt;br /&gt;
* rx mac filtering in tun&lt;br /&gt;
        the need for this is still not understood as we have filtering in bridge&lt;br /&gt;
        we have a small table of addresses, need to make it larger&lt;br /&gt;
        if we only need filtering for unicast (multicast is handled by IMP filtering)&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in tun&lt;br /&gt;
        the need for this is still not understood as we have filtering in bridge&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in bridge&lt;br /&gt;
        kernel part is done (Vlad Yasevich)&lt;br /&gt;
        teach qemu to notify libvirt to enable the filter (still to do); the existing NIC_RX_FILTER_CHANGED event already contains vlan tables&lt;br /&gt;
&lt;br /&gt;
* tx coalescing&lt;br /&gt;
        Delay several packets before kicking the device.&lt;br /&gt;
&lt;br /&gt;
* interrupt coalescing&lt;br /&gt;
        Reduce the number of interrupts&lt;br /&gt;
&lt;br /&gt;
* bridging on top of macvlan &lt;br /&gt;
  add code to forward LRO status from macvlan (not macvtap)&lt;br /&gt;
  back to the lowerdev, so that setting up forwarding&lt;br /&gt;
  from macvlan disables LRO on the lowerdev&lt;br /&gt;
&lt;br /&gt;
* virtio: preserve packets exactly with LRO&lt;br /&gt;
  LRO is not normally compatible with forwarding.&lt;br /&gt;
  With virtio we are getting packets from a Linux host,&lt;br /&gt;
  so we could conceivably preserve packets exactly&lt;br /&gt;
  even with LRO. I am guessing other hardware could be&lt;br /&gt;
  doing this as well.&lt;br /&gt;
&lt;br /&gt;
* vxlan&lt;br /&gt;
  What could we do here?&lt;br /&gt;
&lt;br /&gt;
* bridging without promisc mode with OVS&lt;br /&gt;
&lt;br /&gt;
=== high level issues: not clear what the project is, yet ===&lt;br /&gt;
&lt;br /&gt;
* security: iptables&lt;br /&gt;
At the moment most people disable iptables to get&lt;br /&gt;
good performance on 10Gb/s networking.&lt;br /&gt;
Any way to improve experience?&lt;br /&gt;
&lt;br /&gt;
* performance&lt;br /&gt;
Going through scheduler and full networking stack twice&lt;br /&gt;
(host+guest) adds a lot of overhead&lt;br /&gt;
Any way to allow bypassing some layers?&lt;br /&gt;
&lt;br /&gt;
* manageability&lt;br /&gt;
Still hard to figure out VM networking:&lt;br /&gt;
VM networking is managed through libvirt, host networking through NM.&lt;br /&gt;
Any way to integrate?&lt;br /&gt;
&lt;br /&gt;
=== testing projects ===&lt;br /&gt;
Keeping networking stable is the highest priority.&lt;br /&gt;
&lt;br /&gt;
* Write some unit tests for vhost-net/vhost-scsi&lt;br /&gt;
* Run weekly test on upstream HEAD covering test matrix with autotest&lt;br /&gt;
* Measure the effect of each of the above-mentioned optimizations&lt;br /&gt;
  - Use autotest network performance regression testing (that runs netperf)&lt;br /&gt;
  - Also test any wild idea that works. Some may be useful.&lt;br /&gt;
* Migrate some of the performance regression autotest functionality into Netperf&lt;br /&gt;
  - Get the CPU-utilization of the Host and the other-party, and add them to the report. This is also true for other Host measures, such as vmexits, interrupts, ...&lt;br /&gt;
  - Run Netperf in demo-mode, and measure only the time when all the sessions are active (could be many seconds after the beginning of the tests)&lt;br /&gt;
  - Packaging of Netperf in Fedora / RHEL (exists in Fedora). Licensing could be an issue.&lt;br /&gt;
  - Make the scripts more visible&lt;br /&gt;
&lt;br /&gt;
=== non-virtio-net devices ===&lt;br /&gt;
* e1000: stabilize&lt;br /&gt;
&lt;br /&gt;
=== test matrix ===&lt;br /&gt;
&lt;br /&gt;
DOA test matrix (all combinations should work):&lt;br /&gt;
        vhost: test both on and off, obviously&lt;br /&gt;
        test: hotplug/unplug, vlan/mac filtering, netperf,&lt;br /&gt;
             file copy both ways: scp, NFS, NTFS&lt;br /&gt;
        guests: linux: release and debug kernels, windows&lt;br /&gt;
        conditions: plain run, run while under migration,&lt;br /&gt;
                vhost on/off migration&lt;br /&gt;
        networking setup: simple, qos with cgroups&lt;br /&gt;
        host configuration: host-guest, external-guest&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=5937</id>
		<title>NetworkingTodo</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=5937"/>
		<updated>2014-03-29T02:29:39Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page should cover all networking related activity in KVM,&lt;br /&gt;
currently most info is related to virtio-net.&lt;br /&gt;
&lt;br /&gt;
TODO: add bugzilla entry links.&lt;br /&gt;
&lt;br /&gt;
=== projects in progress. contributions are still very welcome! ===&lt;br /&gt;
&lt;br /&gt;
* large-order allocations&lt;br /&gt;
   see 28d6427109d13b0f447cba5761f88d3548e83605&lt;br /&gt;
   Developer: MST&lt;br /&gt;
&lt;br /&gt;
* vhost-net scalability tuning: threading for many VMs&lt;br /&gt;
      Plan: switch to workqueue shared by many VMs&lt;br /&gt;
      http://www.mail-archive.com/kvm@vger.kernel.org/msg69868.html&lt;br /&gt;
&lt;br /&gt;
http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/479e3578ed05bfac85257b4200427735!OpenDocument&lt;br /&gt;
&lt;br /&gt;
      Developer: Bandan Das&lt;br /&gt;
      Testing: netperf guest to guest&lt;br /&gt;
&lt;br /&gt;
* support more queues&lt;br /&gt;
     We limit TUN to 8 queues, but we really want&lt;br /&gt;
     1 queue per guest CPU. The limit comes from net&lt;br /&gt;
     core, need to teach it to allocate array of&lt;br /&gt;
     pointers and not array of queues.&lt;br /&gt;
     Jason has a draft patch to use flex array.&lt;br /&gt;
     Another thing is to move the flow caches out of tun_struct.&lt;br /&gt;
     Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* enable multiqueue by default&lt;br /&gt;
       Multiqueue causes regression in some workloads, thus&lt;br /&gt;
       it is off by default. Detect and enable/disable&lt;br /&gt;
       automatically so we can turn it on by default.&lt;br /&gt;
       This is because GSO tends to batch less when mq is enabled.&lt;br /&gt;
       https://patchwork.kernel.org/patch/2235191/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* rework on flow caches&lt;br /&gt;
       The current hlist implementation of flow caches has several limitations:&lt;br /&gt;
       1) in the worst case, linear search is slow&lt;br /&gt;
       2) it does not scale&lt;br /&gt;
       https://patchwork.kernel.org/patch/2025121/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
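The limitation can be sketched like this (illustrative only; the kernel patch uses different structures, and all names here are invented):

```python
class HlistFlowCache:
    """Roughly the current scheme: fixed buckets, linear scan per bucket."""

    def __init__(self, nbuckets=8):
        self.buckets = [[] for _ in range(nbuckets)]

    def lookup(self, rxhash):
        bucket = self.buckets[rxhash % len(self.buckets)]
        for h, queue in bucket:   # degrades to O(n) when many flows collide
            if h == rxhash:
                return queue
        return None

    def record(self, rxhash, queue):
        bucket = self.buckets[rxhash % len(self.buckets)]
        for i, (h, _) in enumerate(bucket):
            if h == rxhash:
                bucket[i] = (rxhash, queue)
                return
        bucket.append((rxhash, queue))

class HashFlowCache:
    """The reworked direction: a resizable hash table, O(1) expected lookup."""

    def __init__(self):
        self.table = {}

    def lookup(self, rxhash):
        return self.table.get(rxhash)

    def record(self, rxhash, queue):
        self.table[rxhash] = queue
```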
       &lt;br /&gt;
* eliminate the extra copy in virtio-net driver&lt;br /&gt;
       We need to do an extra copy of 128 bytes for every packet.&lt;br /&gt;
       This could be eliminated for small packets by:&lt;br /&gt;
       1) use build_skb() and head frag&lt;br /&gt;
       2) bigger vnet header length ( &amp;gt;= NET_SKB_PAD + NET_IP_ALIGN )&lt;br /&gt;
       Or use a dedicated queue for receiving small packets? (reordering)&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* orphan packets less aggressively (was: make pktgen work for virtio-net (or partially orphan))&lt;br /&gt;
       virtio-net orphans all skbs during tx, this used to be optimal.&lt;br /&gt;
       Recent changes in guest networking stack and hardware advances&lt;br /&gt;
       such as APICv changed optimal behaviour for drivers.&lt;br /&gt;
       We need to revisit optimizations such as orphaning all packets early&lt;br /&gt;
       to have optimal behaviour.&lt;br /&gt;
&lt;br /&gt;
       this should also fix pktgen which is currently broken with virtio net:&lt;br /&gt;
       orphaning all skbs makes pktgen wait forever for the refcnt.&lt;br /&gt;
       Jason&#039;s idea: bring back tx interrupt (partially)&lt;br /&gt;
       Jason&#039;s idea: introduce a flag to tell pktgen not to wait&lt;br /&gt;
       Discussion here: https://patchwork.kernel.org/patch/1800711/&lt;br /&gt;
       MST&#039;s idea: add a .ndo_tx_polling not only for pktgen&lt;br /&gt;
       Developers: Jason Wang, MST&lt;br /&gt;
&lt;br /&gt;
* Announce self by guest driver&lt;br /&gt;
       Send gratuitous ARP (gARP) from the guest driver. The guest part is finished;&lt;br /&gt;
       the QEMU part is ongoing.&lt;br /&gt;
       V8 new RFC posted here (limit the changes to virtio-net only)&lt;br /&gt;
       https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg02648.html&lt;br /&gt;
       V7 patches are here:&lt;br /&gt;
       http://lists.nongnu.org/archive/html/qemu-devel/2013-03/msg01127.html&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* guest programmable mac/vlan filtering with macvtap&lt;br /&gt;
        Developer: Amos Kong&lt;br /&gt;
        qemu: https://bugzilla.redhat.com/show_bug.cgi?id=848203 (applied by upstream)&lt;br /&gt;
        libvirt: https://bugzilla.redhat.com/show_bug.cgi?id=848199&lt;br /&gt;
        http://git.qemu.org/?p=qemu.git;a=commit;h=b1be42803b31a913bab65bab563a8760ad2e7f7f&lt;br /&gt;
        Status: [[GuestProgrammableMacVlanFiltering]]&lt;br /&gt;
&lt;br /&gt;
* bridge without promisc mode in NIC&lt;br /&gt;
  given hardware support, teach bridge&lt;br /&gt;
  to program mac/vlan filtering in NIC&lt;br /&gt;
  Helps performance and security on noisy LANs&lt;br /&gt;
  http://comments.gmane.org/gmane.linux.network/266546&lt;br /&gt;
  Developer: Vlad Yasevich&lt;br /&gt;
&lt;br /&gt;
* reduce networking latency:&lt;br /&gt;
  allow handling short packets from softirq or VCPU context&lt;br /&gt;
  Plan:&lt;br /&gt;
    We are going through the scheduler 3 times&lt;br /&gt;
    (could be up to 5 if softirqd is involved)&lt;br /&gt;
    Consider RX: host irq -&amp;gt; io thread -&amp;gt; VCPU thread -&amp;gt;&lt;br /&gt;
    guest irq -&amp;gt; guest thread.&lt;br /&gt;
    This adds a lot of latency.&lt;br /&gt;
    We can cut it by some 1.5x if we do a bit of work&lt;br /&gt;
    either in the VCPU or softirq context.&lt;br /&gt;
  Testing: netperf TCP RR - should be improved drastically&lt;br /&gt;
           netperf TCP STREAM guest to host - no regression&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* Flexible buffers: put virtio header inline with packet data&lt;br /&gt;
  https://patchwork.kernel.org/patch/1540471/&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* device failover to allow migration with assigned devices&lt;br /&gt;
  https://fedoraproject.org/wiki/Features/Virt_Device_Failover&lt;br /&gt;
  Developer: Gal Hammer, Cole Robinson, Laine Stump, MST&lt;br /&gt;
&lt;br /&gt;
* Reuse vringh code for better maintainability&lt;br /&gt;
  Developer: Rusty Russell&lt;br /&gt;
&lt;br /&gt;
* Improve stats, make them more helpful for perf analysis&lt;br /&gt;
  Developer: Sriram Narasimhan&lt;br /&gt;
&lt;br /&gt;
* Enable GRO for packets coming to bridge from a tap interface&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* Better support for windows LRO&lt;br /&gt;
  Extend virtio-header with statistics for GRO packets:&lt;br /&gt;
  number of packets coalesced and number of duplicate ACKs coalesced&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* IPoIB infiniband bridging&lt;br /&gt;
  Plan: implement macvtap for ipoib and virtio-ipoib&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* netdev polling for virtio.&lt;br /&gt;
  There are two kinds of netdev polling:&lt;br /&gt;
  - netpoll - used for debugging&lt;br /&gt;
  - proposed low latency net polling&lt;br /&gt;
  See http://lkml.indiana.edu/hypermail/linux/kernel/1303.0/00553.html&lt;br /&gt;
  Jason has a draft patch to enable low latency polling for virtio-net.&lt;br /&gt;
  May also consider it for tun/macvtap.&lt;br /&gt;
  Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* use kvm eventfd support for injecting level-triggered interrupts&lt;br /&gt;
  aim: enable vhost by default for level interrupts.&lt;br /&gt;
  The benefit is security: we want to avoid using userspace&lt;br /&gt;
  virtio net so that vhost-net is always used.&lt;br /&gt;
&lt;br /&gt;
  Alex emulated (post &amp;amp; re-enable) level-triggered interrupts in KVM to&lt;br /&gt;
  skip userspace. VFIO already enjoys the performance benefit;&lt;br /&gt;
  let&#039;s do it for virtio-pci. Current virtio-pci devices still use&lt;br /&gt;
  level-interrupt in userspace.&lt;br /&gt;
&lt;br /&gt;
 kernel:&lt;br /&gt;
  7a84428af [PATCH] KVM: Add resampling irqfds for level triggered interrupts&lt;br /&gt;
 qemu:&lt;br /&gt;
  68919cac [PATCH] hw/vfio: set interrupts using pci irq wrappers&lt;br /&gt;
           (virtio-pci didn&#039;t use the wrappers)&lt;br /&gt;
  e1d1e586 [PATCH] vfio-pci: Add KVM INTx acceleration&lt;br /&gt;
&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* sharing config interrupts&lt;br /&gt;
  Support more devices by sharing a single msi vector&lt;br /&gt;
  between multiple virtio devices.&lt;br /&gt;
  (Applies to virtio-blk too).&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* network traffic throttling&lt;br /&gt;
  The block layer implemented a &amp;quot;continuous leaky bucket&amp;quot; for throttling;&lt;br /&gt;
  we can apply the same continuous leaky bucket to networking:&lt;br /&gt;
  IOPS/BPS * RX/TX/TOTAL&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* Allocate mac_table dynamically&lt;br /&gt;
&lt;br /&gt;
  In the future, maybe we can allocate the mac_table dynamically instead&lt;br /&gt;
  of embedding it in VirtIONet. Then we can just do a pointer swap and&lt;br /&gt;
  g_free(), and save a memcpy() here.&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== projects that are not started yet - no owner ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* receive side zero copy&lt;br /&gt;
  The idea is a NIC with accelerated RFS support,&lt;br /&gt;
  so we can feed the virtio rx buffers into the correct NIC queue.&lt;br /&gt;
  Depends on non-promisc NIC support in the bridge.&lt;br /&gt;
  Search for &amp;quot;Xin Xiaohui: Provide a zero-copy method on KVM virtio-net&amp;quot;&lt;br /&gt;
  for a very old prototype&lt;br /&gt;
&lt;br /&gt;
* RDMA bridging&lt;br /&gt;
&lt;br /&gt;
* DMA engine (IOAT) use in tun&lt;br /&gt;
  Old patch here: [PATCH RFC] tun: dma engine support&lt;br /&gt;
  It does not speed things up. Need to see why and&lt;br /&gt;
  what can be done.&lt;br /&gt;
&lt;br /&gt;
* virtio API extension: improve small packet/large buffer performance:&lt;br /&gt;
  support &amp;quot;reposting&amp;quot; buffers for mergeable buffers,&lt;br /&gt;
  support pool for indirect buffers&lt;br /&gt;
&lt;br /&gt;
* more GSO type support:&lt;br /&gt;
       The kernel does not yet support more GSO types: FCOE, GRE, UDP_TUNNEL&lt;br /&gt;
&lt;br /&gt;
* ring aliasing:&lt;br /&gt;
  using vhost-net as a networking backend, with virtio-net in QEMU&lt;br /&gt;
  being the guest-facing device.&lt;br /&gt;
  This gives you the best of both worlds: QEMU acts as a first&lt;br /&gt;
  line of defense against a malicious guest while still getting the&lt;br /&gt;
  performance advantages of vhost-net (zero-copy).&lt;br /&gt;
  In fact a bit of complexity in vhost was put there in the vague hope to&lt;br /&gt;
  support something like this: virtio rings are not translated through&lt;br /&gt;
  regular memory tables, instead, vhost gets a pointer to ring address.&lt;br /&gt;
  This allows qemu to act as a man in the middle,&lt;br /&gt;
  verifying the descriptors but not touching the packet data.&lt;br /&gt;
&lt;br /&gt;
* non-virtio device support with vhost&lt;br /&gt;
  Use vhost interface for guests that don&#039;t use virtio-net&lt;br /&gt;
&lt;br /&gt;
=== vague ideas: path to implementation not clear ===&lt;br /&gt;
&lt;br /&gt;
* change tcp_tso_should_defer for kvm: batch more&lt;br /&gt;
  aggressively.&lt;br /&gt;
  in particular, see below&lt;br /&gt;
&lt;br /&gt;
* tcp: increase gso buffering for cubic,reno&lt;br /&gt;
    At the moment we push out an skb whenever the limit becomes&lt;br /&gt;
    large enough to send a full-sized TSO skb even if the skb,&lt;br /&gt;
    in fact, is not full-sized.&lt;br /&gt;
    The reason for this seems to be that some congestion avoidance&lt;br /&gt;
    protocols rely on the number of packets in flight to calculate&lt;br /&gt;
    CWND, so if we underuse the available CWND it shrinks&lt;br /&gt;
    which degrades performance:&lt;br /&gt;
    http://www.mail-archive.com/netdev@vger.kernel.org/msg08738.html&lt;br /&gt;
&lt;br /&gt;
    However, there seems to be no reason to do this for&lt;br /&gt;
    protocols such as reno and cubic which don&#039;t rely on packets in flight,&lt;br /&gt;
    and so will simply increase CWND a bit more to compensate for the&lt;br /&gt;
    underuse.&lt;br /&gt;
&lt;br /&gt;
* ring redesign:&lt;br /&gt;
      find a way to test raw ring performance &lt;br /&gt;
      fix cacheline bounces &lt;br /&gt;
      reduce interrupts&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* irq/numa affinity:&lt;br /&gt;
     networking goes much faster with irq pinning:&lt;br /&gt;
     both with and without numa.&lt;br /&gt;
     what can be done to make the non-pinned setup go faster?&lt;br /&gt;
&lt;br /&gt;
* reduce conflict with VCPU thread&lt;br /&gt;
    if VCPU and networking run on same CPU,&lt;br /&gt;
    they conflict resulting in bad performance.&lt;br /&gt;
    Fix that, push vhost thread out to another CPU&lt;br /&gt;
    more aggressively.&lt;br /&gt;
&lt;br /&gt;
* rx mac filtering in tun&lt;br /&gt;
        the need for this is still not understood as we have filtering in bridge&lt;br /&gt;
        we have a small table of addresses, need to make it larger&lt;br /&gt;
        if we only need filtering for unicast (multicast is handled by IMP filtering)&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in tun&lt;br /&gt;
        the need for this is still not understood as we have filtering in bridge&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in bridge&lt;br /&gt;
        kernel part is done (Vlad Yasevich)&lt;br /&gt;
        teach qemu to notify libvirt to enable the filter (still to do); the existing NIC_RX_FILTER_CHANGED event already contains vlan tables&lt;br /&gt;
&lt;br /&gt;
* tx coalescing&lt;br /&gt;
        Delay several packets before kicking the device.&lt;br /&gt;
&lt;br /&gt;
* interrupt coalescing&lt;br /&gt;
        Reduce the number of interrupts&lt;br /&gt;
&lt;br /&gt;
* bridging on top of macvlan &lt;br /&gt;
  add code to forward LRO status from macvlan (not macvtap)&lt;br /&gt;
  back to the lowerdev, so that setting up forwarding&lt;br /&gt;
  from macvlan disables LRO on the lowerdev&lt;br /&gt;
&lt;br /&gt;
* virtio: preserve packets exactly with LRO&lt;br /&gt;
  LRO is not normally compatible with forwarding.&lt;br /&gt;
  With virtio we are getting packets from a Linux host,&lt;br /&gt;
  so we could conceivably preserve packets exactly&lt;br /&gt;
  even with LRO. I am guessing other hardware could be&lt;br /&gt;
  doing this as well.&lt;br /&gt;
&lt;br /&gt;
* vxlan&lt;br /&gt;
  What could we do here?&lt;br /&gt;
&lt;br /&gt;
* bridging without promisc mode with OVS&lt;br /&gt;
&lt;br /&gt;
=== high level issues: not clear what the project is, yet ===&lt;br /&gt;
&lt;br /&gt;
* security: iptables&lt;br /&gt;
At the moment most people disable iptables to get&lt;br /&gt;
good performance on 10Gb/s networking.&lt;br /&gt;
Any way to improve experience?&lt;br /&gt;
&lt;br /&gt;
* performance&lt;br /&gt;
Going through scheduler and full networking stack twice&lt;br /&gt;
(host+guest) adds a lot of overhead&lt;br /&gt;
Any way to allow bypassing some layers?&lt;br /&gt;
&lt;br /&gt;
* manageability&lt;br /&gt;
Still hard to figure out VM networking:&lt;br /&gt;
VM networking is managed through libvirt, host networking through NM.&lt;br /&gt;
Any way to integrate?&lt;br /&gt;
&lt;br /&gt;
=== testing projects ===&lt;br /&gt;
Keeping networking stable is the highest priority.&lt;br /&gt;
&lt;br /&gt;
* Write some unit tests for vhost-net/vhost-scsi&lt;br /&gt;
* Run weekly test on upstream HEAD covering test matrix with autotest&lt;br /&gt;
* Measure the effect of each of the above-mentioned optimizations&lt;br /&gt;
  - Use autotest network performance regression testing (that runs netperf)&lt;br /&gt;
  - Also test any wild idea that works. Some may be useful.&lt;br /&gt;
* Migrate some of the performance regression autotest functionality into Netperf&lt;br /&gt;
  - Get the CPU-utilization of the Host and the other-party, and add them to the report. This is also true for other Host measures, such as vmexits, interrupts, ...&lt;br /&gt;
  - Run Netperf in demo-mode, and measure only the time when all the sessions are active (could be many seconds after the beginning of the tests)&lt;br /&gt;
  - Packaging of Netperf in Fedora / RHEL (exists in Fedora). Licensing could be an issue.&lt;br /&gt;
  - Make the scripts more visible&lt;br /&gt;
&lt;br /&gt;
=== non-virtio-net devices ===&lt;br /&gt;
* e1000: stabilize&lt;br /&gt;
&lt;br /&gt;
=== test matrix ===&lt;br /&gt;
&lt;br /&gt;
DOA test matrix (all combinations should work):&lt;br /&gt;
        vhost: test both on and off, obviously&lt;br /&gt;
        test: hotplug/unplug, vlan/mac filtering, netperf,&lt;br /&gt;
             file copy both ways: scp, NFS, NTFS&lt;br /&gt;
        guests: linux: release and debug kernels, windows&lt;br /&gt;
        conditions: plain run, run while under migration,&lt;br /&gt;
                vhost on/off migration&lt;br /&gt;
        networking setup: simple, qos with cgroups&lt;br /&gt;
        host configuration: host-guest, external-guest&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=5778</id>
		<title>NetworkingTodo</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=5778"/>
		<updated>2014-03-27T13:11:04Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page should cover all networking related activity in KVM,&lt;br /&gt;
currently most info is related to virtio-net.&lt;br /&gt;
&lt;br /&gt;
TODO: add bugzilla entry links.&lt;br /&gt;
&lt;br /&gt;
=== projects in progress. contributions are still very welcome! ===&lt;br /&gt;
&lt;br /&gt;
* large-order allocations&lt;br /&gt;
   see 28d6427109d13b0f447cba5761f88d3548e83605&lt;br /&gt;
   Developer: MST&lt;br /&gt;
&lt;br /&gt;
* vhost-net scalability tuning: threading for many VMs&lt;br /&gt;
      Plan: switch to workqueue shared by many VMs&lt;br /&gt;
      http://www.mail-archive.com/kvm@vger.kernel.org/msg69868.html&lt;br /&gt;
&lt;br /&gt;
http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/479e3578ed05bfac85257b4200427735!OpenDocument&lt;br /&gt;
&lt;br /&gt;
      Developer: Bandan Das&lt;br /&gt;
      Testing: netperf guest to guest&lt;br /&gt;
&lt;br /&gt;
* support more queues&lt;br /&gt;
     We limit TUN to 8 queues, but we really want&lt;br /&gt;
     1 queue per guest CPU. The limit comes from net&lt;br /&gt;
     core, need to teach it to allocate array of&lt;br /&gt;
     pointers and not array of queues.&lt;br /&gt;
     Jason has a draft patch to use flex array.&lt;br /&gt;
     Another thing is to move the flow caches out of tun_struct.&lt;br /&gt;
     Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* enable multiqueue by default&lt;br /&gt;
       Multiqueue causes regression in some workloads, thus&lt;br /&gt;
       it is off by default. Detect and enable/disable&lt;br /&gt;
       automatically so we can turn it on by default.&lt;br /&gt;
       This is because GSO tends to batch less when mq is enabled.&lt;br /&gt;
       https://patchwork.kernel.org/patch/2235191/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* rework on flow caches&lt;br /&gt;
       The current hlist implementation of flow caches has several limitations:&lt;br /&gt;
       1) in the worst case, linear search is slow&lt;br /&gt;
       2) it does not scale&lt;br /&gt;
       https://patchwork.kernel.org/patch/2025121/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
       &lt;br /&gt;
* eliminate the extra copy in virtio-net driver&lt;br /&gt;
       We need to do an extra copy of 128 bytes for every packet.&lt;br /&gt;
       This could be eliminated for small packets by:&lt;br /&gt;
       1) use build_skb() and head frag&lt;br /&gt;
       2) bigger vnet header length ( &amp;gt;= NET_SKB_PAD + NET_IP_ALIGN )&lt;br /&gt;
       Or use a dedicated queue for receiving small packets? (reordering)&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* orphan packets less aggressively (was: make pktgen work for virtio-net (or partially orphan))&lt;br /&gt;
       virtio-net orphans all skbs during tx, this used to be optimal.&lt;br /&gt;
       Recent changes in guest networking stack and hardware advances&lt;br /&gt;
       such as APICv changed optimal behaviour for drivers.&lt;br /&gt;
       We need to revisit optimizations such as orphaning all packets early&lt;br /&gt;
       to have optimal behaviour.&lt;br /&gt;
&lt;br /&gt;
       this should also fix pktgen, which is currently broken with virtio-net:&lt;br /&gt;
       orphaning all skbs makes pktgen wait forever for the refcnt.&lt;br /&gt;
       Jason&#039;s idea: bring back the tx interrupt (partially)&lt;br /&gt;
       Jason&#039;s idea: introduce a flag to tell pktgen not to wait&lt;br /&gt;
       Discussion here: https://patchwork.kernel.org/patch/1800711/&lt;br /&gt;
       MST&#039;s idea: add a .ndo_tx_polling not only for pktgen&lt;br /&gt;
       Developers: Jason Wang, MST&lt;br /&gt;
&lt;br /&gt;
* Announce self by guest driver&lt;br /&gt;
       Send gratuitous ARP from the guest driver. The guest part is finished.&lt;br /&gt;
       The QEMU part is ongoing.&lt;br /&gt;
       V8 new RFC posted here (limits the changes to virtio-net only):&lt;br /&gt;
       https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg02648.html&lt;br /&gt;
       V7 patches are here:&lt;br /&gt;
       http://lists.nongnu.org/archive/html/qemu-devel/2013-03/msg01127.html&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* guest programmable mac/vlan filtering with macvtap&lt;br /&gt;
        Developer: Amos Kong&lt;br /&gt;
        qemu: https://bugzilla.redhat.com/show_bug.cgi?id=848203 (applied by upstream)&lt;br /&gt;
        libvirt: https://bugzilla.redhat.com/show_bug.cgi?id=848199&lt;br /&gt;
        http://git.qemu.org/?p=qemu.git;a=commit;h=b1be42803b31a913bab65bab563a8760ad2e7f7f&lt;br /&gt;
        Status: [[GuestProgrammableMacVlanFiltering]]&lt;br /&gt;
&lt;br /&gt;
* bridge without promisc mode in NIC&lt;br /&gt;
  given hardware support, teach bridge&lt;br /&gt;
  to program mac/vlan filtering in NIC&lt;br /&gt;
  Helps performance and security on noisy LANs&lt;br /&gt;
  http://comments.gmane.org/gmane.linux.network/266546&lt;br /&gt;
  Developer: Vlad Yasevich&lt;br /&gt;
&lt;br /&gt;
* reduce networking latency:&lt;br /&gt;
  allow handling short packets from softirq or VCPU context&lt;br /&gt;
  Plan:&lt;br /&gt;
    We are going through the scheduler 3 times&lt;br /&gt;
    (could be up to 5 if softirqd is involved)&lt;br /&gt;
    Consider RX: host irq -&amp;gt; io thread -&amp;gt; VCPU thread -&amp;gt;&lt;br /&gt;
    guest irq -&amp;gt; guest thread.&lt;br /&gt;
    This adds a lot of latency.&lt;br /&gt;
    We can cut it by some 1.5x if we do a bit of work&lt;br /&gt;
    either in the VCPU or softirq context.&lt;br /&gt;
  Testing: netperf TCP RR - should be improved drastically&lt;br /&gt;
           netperf TCP STREAM guest to host - no regression&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* Flexible buffers: put virtio header inline with packet data&lt;br /&gt;
  https://patchwork.kernel.org/patch/1540471/&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* device failover to allow migration with assigned devices&lt;br /&gt;
  https://fedoraproject.org/wiki/Features/Virt_Device_Failover&lt;br /&gt;
  Developer: Gal Hammer, Cole Robinson, Laine Stump, MST&lt;br /&gt;
&lt;br /&gt;
* Reuse vringh code for better maintainability&lt;br /&gt;
  Developer: Rusty Russell&lt;br /&gt;
&lt;br /&gt;
* Improve stats, make them more helpful for performance analysis&lt;br /&gt;
  Developer: Sriram Narasimhan&lt;br /&gt;
&lt;br /&gt;
* Enable GRO for packets coming to bridge from a tap interface&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* Better support for Windows LRO&lt;br /&gt;
  Extend virtio-header with statistics for GRO packets:&lt;br /&gt;
  number of packets coalesced and number of duplicate ACKs coalesced&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* IPoIB infiniband bridging&lt;br /&gt;
  Plan: implement macvtap for ipoib and virtio-ipoib&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* netdev polling for virtio.&lt;br /&gt;
  There are two kinds of netdev polling:&lt;br /&gt;
  - netpoll - used for debugging&lt;br /&gt;
  - proposed low latency net polling&lt;br /&gt;
  See http://lkml.indiana.edu/hypermail/linux/kernel/1303.0/00553.html&lt;br /&gt;
  Jason has a draft patch to enable low latency polling for virtio-net.&lt;br /&gt;
  May also consider it for tun/macvtap.&lt;br /&gt;
  Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* sharing config interrupts&lt;br /&gt;
  Support more devices by sharing a single msi vector&lt;br /&gt;
  between multiple virtio devices.&lt;br /&gt;
  (Applies to virtio-blk too).&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* use kvm eventfd support for injecting level interrupts,&lt;br /&gt;
  enable vhost by default for level interrupts.&lt;br /&gt;
  The benefit is security: we want to avoid using userspace&lt;br /&gt;
  virtio net so that vhost-net is always used.&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* network traffic throttling&lt;br /&gt;
  the block layer implemented a &amp;quot;continuous leaky bucket&amp;quot; for throttling;&lt;br /&gt;
  we can apply the same continuous leaky bucket to networking&lt;br /&gt;
  IOPS/BPS * RX/TX/TOTAL&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
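As a sketch of what a continuous leaky bucket means here (a toy model, not the QEMU block-layer implementation): the bucket drains at the configured rate at all times, and traffic is admitted only while the bucket has room.

```python
# Minimal continuous leaky bucket (illustrative only; QEMU's block-layer
# throttling is the reference, this is a toy model).
class LeakyBucket:
    def __init__(self, rate, burst):
        self.rate = float(rate)    # drain rate: bytes (or ops) per second
        self.burst = float(burst)  # bucket capacity
        self.level = 0.0

    def allow(self, cost, elapsed):
        # drain continuously for the time elapsed since the last call
        self.level = max(0.0, self.level - self.rate * elapsed)
        if self.level + cost > self.burst:
            return False           # over budget: throttle this packet
        self.level += cost
        return True
```

Separate buckets per direction would give the RX/TX/TOTAL split mentioned above, with cost measured in bytes for BPS limits or 1 per packet for IOPS limits.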
&lt;br /&gt;
* Allocate mac_table dynamically&lt;br /&gt;
&lt;br /&gt;
  In the future, maybe we can allocate the mac_table dynamically instead&lt;br /&gt;
  of embedding it in VirtIONet. Then we can just do a pointer swap and&lt;br /&gt;
  g_free() and save a memcpy() here.&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
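The pointer-swap idea can be sketched in spirit (the real code is C inside QEMU's VirtIONet and would use g_malloc/g_free; the names below are illustrative only): instead of copying new entries into an embedded table, build the new table separately and swap the reference.

```python
# Sketch of the proposed change (illustrative, not the QEMU C code).

def set_mac_table_copy(dev, new_entries):
    # today: the table is embedded, so entries are copied in (memcpy-like)
    dev["mac_table"][:] = new_entries

def set_mac_table_swap(dev, new_entries):
    # proposed: allocate a new table, swap the pointer, drop the old one
    old = dev["mac_table"]
    dev["mac_table"] = list(new_entries)  # stands in for g_malloc + fill
    del old                               # stands in for g_free(old)

dev = {"mac_table": ["52:54:00:00:00:01"]}
set_mac_table_swap(dev, ["52:54:00:00:00:02", "52:54:00:00:00:03"])
assert dev["mac_table"] == ["52:54:00:00:00:02", "52:54:00:00:00:03"]
```

The swap variant touches only one pointer regardless of table size, which is the memcpy() saving the item above refers to.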
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== projects that are not started yet - no owner ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* receive side zero copy&lt;br /&gt;
  The ideal is a NIC with accelerated RFS support,&lt;br /&gt;
  so we can feed the virtio rx buffers into the correct NIC queue.&lt;br /&gt;
  Depends on non-promisc NIC support in the bridge.&lt;br /&gt;
  Search for &amp;quot;Xin Xiaohui: Provide a zero-copy method on KVM virtio-net&amp;quot;&lt;br /&gt;
  for a very old prototype&lt;br /&gt;
&lt;br /&gt;
* RDMA bridging&lt;br /&gt;
&lt;br /&gt;
* DMA engine (IOAT) use in tun&lt;br /&gt;
  Old patch here: [PATCH RFC] tun: dma engine support&lt;br /&gt;
  It does not speed things up. Need to see why and&lt;br /&gt;
  what can be done.&lt;br /&gt;
&lt;br /&gt;
* virtio API extension: improve small packet/large buffer performance:&lt;br /&gt;
  support &amp;quot;reposting&amp;quot; buffers for mergeable buffers,&lt;br /&gt;
  support pool for indirect buffers&lt;br /&gt;
&lt;br /&gt;
* more GSO type support:&lt;br /&gt;
       The kernel does not yet support some GSO types: FCoE, GRE, UDP_TUNNEL&lt;br /&gt;
&lt;br /&gt;
* ring aliasing:&lt;br /&gt;
  use vhost-net as the networking backend, with virtio-net&lt;br /&gt;
  in QEMU as the guest-facing device.&lt;br /&gt;
  This gives you the best of both worlds: QEMU acts as a first&lt;br /&gt;
  line of defense against a malicious guest while still getting the&lt;br /&gt;
  performance advantages of vhost-net (zero-copy).&lt;br /&gt;
  In fact, a bit of complexity in vhost was put there in the vague hope of&lt;br /&gt;
  supporting something like this: virtio rings are not translated through&lt;br /&gt;
  regular memory tables; instead, vhost gets a pointer to the ring address.&lt;br /&gt;
  This allows qemu to act as a man in the middle,&lt;br /&gt;
  verifying the descriptors but not touching the packet data.&lt;br /&gt;
&lt;br /&gt;
* non-virtio device support with vhost&lt;br /&gt;
  Use vhost interface for guests that don&#039;t use virtio-net&lt;br /&gt;
&lt;br /&gt;
=== vague ideas: path to implementation not clear ===&lt;br /&gt;
&lt;br /&gt;
* change tcp_tso_should_defer for kvm: batch more&lt;br /&gt;
  aggressively.&lt;br /&gt;
  in particular, see below&lt;br /&gt;
&lt;br /&gt;
* tcp: increase gso buffering for cubic,reno&lt;br /&gt;
    At the moment we push out an skb whenever the limit becomes&lt;br /&gt;
    large enough to send a full-sized TSO skb even if the skb,&lt;br /&gt;
    in fact, is not full-sized.&lt;br /&gt;
    The reason for this seems to be that some congestion avoidance&lt;br /&gt;
    protocols rely on the number of packets in flight to calculate&lt;br /&gt;
    CWND, so if we underuse the available CWND it shrinks&lt;br /&gt;
    which degrades performance:&lt;br /&gt;
    http://www.mail-archive.com/netdev@vger.kernel.org/msg08738.html&lt;br /&gt;
&lt;br /&gt;
    However, there seems to be no reason to do this for&lt;br /&gt;
    protocols such as reno and cubic which don&#039;t rely on packets in flight,&lt;br /&gt;
    and so will simply increase CWND a bit more to compensate for the&lt;br /&gt;
    underuse.&lt;br /&gt;
&lt;br /&gt;
* ring redesign:&lt;br /&gt;
      find a way to test raw ring performance &lt;br /&gt;
      fix cacheline bounces &lt;br /&gt;
      reduce interrupts&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* irq/numa affinity:&lt;br /&gt;
     networking goes much faster with irq pinning:&lt;br /&gt;
     both with and without numa.&lt;br /&gt;
     what can be done to make the non-pinned setup go faster?&lt;br /&gt;
&lt;br /&gt;
* reduce conflict with VCPU thread&lt;br /&gt;
    if the VCPU and networking run on the same CPU,&lt;br /&gt;
    they conflict, resulting in bad performance.&lt;br /&gt;
    Fix that; push the vhost thread out to another CPU&lt;br /&gt;
    more aggressively.&lt;br /&gt;
&lt;br /&gt;
* rx mac filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we have filtering in the bridge&lt;br /&gt;
        we have a small table of addresses; it needs to be made larger&lt;br /&gt;
        if we only need filtering for unicast (multicast is handled by IMP filtering)&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we have filtering in the bridge&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in bridge&lt;br /&gt;
        kernel part is done (Vlad Yasevich)&lt;br /&gt;
        teach qemu to notify libvirt to enable the filter (still to do) (the existing NIC_RX_FILTER_CHANGED event contains vlan tables)&lt;br /&gt;
&lt;br /&gt;
* tx coalescing&lt;br /&gt;
        Delay several packets before kicking the device.&lt;br /&gt;
&lt;br /&gt;
* interrupt coalescing&lt;br /&gt;
        Reduce the number of interrupts.&lt;br /&gt;
&lt;br /&gt;
* bridging on top of macvlan &lt;br /&gt;
  add code to forward LRO status from macvlan (not macvtap)&lt;br /&gt;
  back to the lowerdev, so that setting up forwarding&lt;br /&gt;
  from macvlan disables LRO on the lowerdev&lt;br /&gt;
&lt;br /&gt;
* virtio: preserve packets exactly with LRO&lt;br /&gt;
  LRO is not normally compatible with forwarding.&lt;br /&gt;
  With virtio we are getting packets from a Linux host,&lt;br /&gt;
  so we could conceivably preserve packets exactly&lt;br /&gt;
  even with LRO. Other hardware could plausibly&lt;br /&gt;
  do this as well.&lt;br /&gt;
&lt;br /&gt;
* vxlan&lt;br /&gt;
  What could we do here?&lt;br /&gt;
&lt;br /&gt;
* bridging without promisc mode with OVS&lt;br /&gt;
&lt;br /&gt;
=== high level issues: not clear what the project is, yet ===&lt;br /&gt;
&lt;br /&gt;
* security: iptables&lt;br /&gt;
At the moment most people disable iptables to get&lt;br /&gt;
good performance on 10Gb/s networking.&lt;br /&gt;
Any way to improve experience?&lt;br /&gt;
&lt;br /&gt;
* performance&lt;br /&gt;
Going through the scheduler and the full networking stack twice&lt;br /&gt;
(host+guest) adds a lot of overhead.&lt;br /&gt;
Any way to allow bypassing some layers?&lt;br /&gt;
&lt;br /&gt;
* manageability&lt;br /&gt;
It is still hard to figure out VM networking:&lt;br /&gt;
VM networking goes through libvirt, host networking through NM.&lt;br /&gt;
Any way to integrate?&lt;br /&gt;
&lt;br /&gt;
=== testing projects ===&lt;br /&gt;
Keeping networking stable is the highest priority.&lt;br /&gt;
&lt;br /&gt;
* Write some unit tests for vhost-net/vhost-scsi&lt;br /&gt;
* Run weekly test on upstream HEAD covering test matrix with autotest&lt;br /&gt;
* Measure the effect of each of the above-mentioned optimizations&lt;br /&gt;
  - Use autotest network performance regression testing (that runs netperf)&lt;br /&gt;
  - Also test any wild idea that works. Some may be useful.&lt;br /&gt;
* Migrate some of the performance regression autotest functionality into Netperf&lt;br /&gt;
  - Get the CPU-utilization of the Host and the other-party, and add them to the report. This is also true for other Host measures, such as vmexits, interrupts, ...&lt;br /&gt;
  - Run Netperf in demo-mode, and measure only the time when all the sessions are active (could be many seconds after the beginning of the tests)&lt;br /&gt;
  - Packaging of Netperf in Fedora / RHEL (exists in Fedora). Licensing could be an issue.&lt;br /&gt;
  - Make the scripts more visible&lt;br /&gt;
&lt;br /&gt;
=== non-virtio-net devices ===&lt;br /&gt;
* e1000: stabilize&lt;br /&gt;
&lt;br /&gt;
=== test matrix ===&lt;br /&gt;
&lt;br /&gt;
DOA test matrix (all combinations should work):&lt;br /&gt;
        vhost: test both on and off, obviously&lt;br /&gt;
        test: hotplug/unplug, vlan/mac filtering, netperf,&lt;br /&gt;
             file copy both ways: scp, NFS, NTFS&lt;br /&gt;
        guests: linux: release and debug kernels, windows&lt;br /&gt;
        conditions: plain run, run while under migration,&lt;br /&gt;
                vhost on/off migration&lt;br /&gt;
        networking setup: simple, qos with cgroups&lt;br /&gt;
        host configuration: host-guest, external-guest&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=5777</id>
		<title>NetworkingTodo</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=5777"/>
		<updated>2014-03-27T13:10:03Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page should cover all networking related activity in KVM,&lt;br /&gt;
currently most info is related to virtio-net.&lt;br /&gt;
&lt;br /&gt;
TODO: add bugzilla entry links.&lt;br /&gt;
&lt;br /&gt;
=== projects in progress. contributions are still very welcome! ===&lt;br /&gt;
&lt;br /&gt;
* large-order allocations&lt;br /&gt;
   see 28d6427109d13b0f447cba5761f88d3548e83605&lt;br /&gt;
   Developer: MST&lt;br /&gt;
&lt;br /&gt;
* vhost-net scalability tuning: threading for many VMs&lt;br /&gt;
      Plan: switch to workqueue shared by many VMs&lt;br /&gt;
      http://www.mail-archive.com/kvm@vger.kernel.org/msg69868.html&lt;br /&gt;
&lt;br /&gt;
http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/479e3578ed05bfac85257b4200427735!OpenDocument&lt;br /&gt;
&lt;br /&gt;
      Developer: Bandan Das&lt;br /&gt;
      Testing: netperf guest to guest&lt;br /&gt;
&lt;br /&gt;
* support more queues&lt;br /&gt;
     We limit TUN to 8 queues, but we really want&lt;br /&gt;
     1 queue per guest CPU. The limit comes from the net&lt;br /&gt;
     core; we need to teach it to allocate an array of&lt;br /&gt;
     pointers rather than an array of queues.&lt;br /&gt;
     Jason has a draft patch to use flex array.&lt;br /&gt;
     Another task is to move the flow caches out of tun_struct.&lt;br /&gt;
     Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* enable multiqueue by default&lt;br /&gt;
       Multiqueue causes regressions in some workloads, thus&lt;br /&gt;
       it is off by default; GSO tends to batch less when&lt;br /&gt;
       mq is enabled. Detect this and enable/disable&lt;br /&gt;
       automatically so we can turn it on by default.&lt;br /&gt;
       https://patchwork.kernel.org/patch/2235191/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* rework flow caches&lt;br /&gt;
       The current hlist implementation of flow caches has several limitations:&lt;br /&gt;
       1) in the worst case, linear search is slow&lt;br /&gt;
       2) it does not scale&lt;br /&gt;
       https://patchwork.kernel.org/patch/2025121/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
       &lt;br /&gt;
* eliminate the extra copy in virtio-net driver&lt;br /&gt;
       We currently do an extra copy of 128 bytes for every packet.&lt;br /&gt;
       This could be eliminated for small packets by:&lt;br /&gt;
       1) using build_skb() and a head frag&lt;br /&gt;
       2) a bigger vnet header length ( &amp;gt;= NET_SKB_PAD + NET_IP_ALIGN )&lt;br /&gt;
       Or use a dedicated queue for receiving small packets? (reordering)&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* orphan packets less aggressively (was: make pktgen work for virtio-net (or partially orphan))&lt;br /&gt;
       virtio-net orphans all skbs during tx, this used to be optimal.&lt;br /&gt;
       Recent changes in guest networking stack and hardware advances&lt;br /&gt;
       such as APICv changed optimal behaviour for drivers.&lt;br /&gt;
       We need to revisit optimizations such as orphaning all packets early&lt;br /&gt;
       to have optimal behaviour.&lt;br /&gt;
&lt;br /&gt;
       this should also fix pktgen, which is currently broken with virtio-net:&lt;br /&gt;
       orphaning all skbs makes pktgen wait forever for the refcnt.&lt;br /&gt;
       Jason&#039;s idea: bring back the tx interrupt (partially)&lt;br /&gt;
       Jason&#039;s idea: introduce a flag to tell pktgen not to wait&lt;br /&gt;
       Discussion here: https://patchwork.kernel.org/patch/1800711/&lt;br /&gt;
       MST&#039;s idea: add a .ndo_tx_polling not only for pktgen&lt;br /&gt;
       Developers: Jason Wang, MST&lt;br /&gt;
&lt;br /&gt;
* Announce self by guest driver&lt;br /&gt;
       Send gratuitous ARP from the guest driver. The guest part is finished.&lt;br /&gt;
       The QEMU part is ongoing.&lt;br /&gt;
       V8 new RFC posted here (limits the changes to virtio-net only):&lt;br /&gt;
       https://lists.gnu.org/archive/html/qemu-devel/2014-03/msg02648.html&lt;br /&gt;
       V7 patches are here:&lt;br /&gt;
       http://lists.nongnu.org/archive/html/qemu-devel/2013-03/msg01127.html&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* guest programmable mac/vlan filtering with macvtap&lt;br /&gt;
        Developer: Amos Kong&lt;br /&gt;
        qemu: https://bugzilla.redhat.com/show_bug.cgi?id=848203 (applied by upstream)&lt;br /&gt;
        libvirt: https://bugzilla.redhat.com/show_bug.cgi?id=848199&lt;br /&gt;
        http://git.qemu.org/?p=qemu.git;a=commit;h=b1be42803b31a913bab65bab563a8760ad2e7f7f&lt;br /&gt;
        Status: [[GuestProgrammableMacVlanFiltering]]&lt;br /&gt;
&lt;br /&gt;
* bridge without promisc mode in NIC&lt;br /&gt;
  given hardware support, teach bridge&lt;br /&gt;
  to program mac/vlan filtering in NIC&lt;br /&gt;
  Helps performance and security on noisy LANs&lt;br /&gt;
  http://comments.gmane.org/gmane.linux.network/266546&lt;br /&gt;
  Developer: Vlad Yasevich&lt;br /&gt;
&lt;br /&gt;
* reduce networking latency:&lt;br /&gt;
  allow handling short packets from softirq or VCPU context&lt;br /&gt;
  Plan:&lt;br /&gt;
    We are going through the scheduler 3 times&lt;br /&gt;
    (could be up to 5 if softirqd is involved)&lt;br /&gt;
    Consider RX: host irq -&amp;gt; io thread -&amp;gt; VCPU thread -&amp;gt;&lt;br /&gt;
    guest irq -&amp;gt; guest thread.&lt;br /&gt;
    This adds a lot of latency.&lt;br /&gt;
    We can cut it by some 1.5x if we do a bit of work&lt;br /&gt;
    either in the VCPU or softirq context.&lt;br /&gt;
  Testing: netperf TCP RR - should be improved drastically&lt;br /&gt;
           netperf TCP STREAM guest to host - no regression&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* Flexible buffers: put virtio header inline with packet data&lt;br /&gt;
  https://patchwork.kernel.org/patch/1540471/&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* device failover to allow migration with assigned devices&lt;br /&gt;
  https://fedoraproject.org/wiki/Features/Virt_Device_Failover&lt;br /&gt;
  Developer: Gal Hammer, Cole Robinson, Laine Stump, MST&lt;br /&gt;
&lt;br /&gt;
* Reuse vringh code for better maintainability&lt;br /&gt;
  Developer: Rusty Russell&lt;br /&gt;
&lt;br /&gt;
* Improve stats, make them more helpful for performance analysis&lt;br /&gt;
  Developer: Sriram Narasimhan&lt;br /&gt;
&lt;br /&gt;
* Enable GRO for packets coming to bridge from a tap interface&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* Better support for Windows LRO&lt;br /&gt;
  Extend virtio-header with statistics for GRO packets:&lt;br /&gt;
  number of packets coalesced and number of duplicate ACKs coalesced&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* IPoIB infiniband bridging&lt;br /&gt;
  Plan: implement macvtap for ipoib and virtio-ipoib&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* netdev polling for virtio.&lt;br /&gt;
  There are two kinds of netdev polling:&lt;br /&gt;
  - netpoll - used for debugging&lt;br /&gt;
  - proposed low latency net polling&lt;br /&gt;
  See http://lkml.indiana.edu/hypermail/linux/kernel/1303.0/00553.html&lt;br /&gt;
  Jason has a draft patch to enable low latency polling for virtio-net.&lt;br /&gt;
  May also consider it for tun/macvtap.&lt;br /&gt;
  Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* sharing config interrupts&lt;br /&gt;
  Support more devices by sharing a single msi vector&lt;br /&gt;
  between multiple virtio devices.&lt;br /&gt;
  (Applies to virtio-blk too).&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* use kvm eventfd support for injecting level interrupts,&lt;br /&gt;
  enable vhost by default for level interrupts&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* network traffic throttling&lt;br /&gt;
  the block layer implemented a &amp;quot;continuous leaky bucket&amp;quot; for throttling;&lt;br /&gt;
  we can apply the same continuous leaky bucket to networking&lt;br /&gt;
  IOPS/BPS * RX/TX/TOTAL&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* Allocate mac_table dynamically&lt;br /&gt;
&lt;br /&gt;
  In the future, maybe we can allocate the mac_table dynamically instead&lt;br /&gt;
  of embedding it in VirtIONet. Then we can just do a pointer swap and&lt;br /&gt;
  g_free() and save a memcpy() here.&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== projects that are not started yet - no owner ===&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* receive side zero copy&lt;br /&gt;
  The ideal is a NIC with accelerated RFS support,&lt;br /&gt;
  so we can feed the virtio rx buffers into the correct NIC queue.&lt;br /&gt;
  Depends on non-promisc NIC support in the bridge.&lt;br /&gt;
  Search for &amp;quot;Xin Xiaohui: Provide a zero-copy method on KVM virtio-net&amp;quot;&lt;br /&gt;
  for a very old prototype&lt;br /&gt;
&lt;br /&gt;
* RDMA bridging&lt;br /&gt;
&lt;br /&gt;
* DMA engine (IOAT) use in tun&lt;br /&gt;
  Old patch here: [PATCH RFC] tun: dma engine support&lt;br /&gt;
  It does not speed things up. Need to see why and&lt;br /&gt;
  what can be done.&lt;br /&gt;
&lt;br /&gt;
* virtio API extension: improve small packet/large buffer performance:&lt;br /&gt;
  support &amp;quot;reposting&amp;quot; buffers for mergeable buffers,&lt;br /&gt;
  support pool for indirect buffers&lt;br /&gt;
&lt;br /&gt;
* more GSO type support:&lt;br /&gt;
       The kernel does not yet support some GSO types: FCoE, GRE, UDP_TUNNEL&lt;br /&gt;
&lt;br /&gt;
* ring aliasing:&lt;br /&gt;
  use vhost-net as the networking backend, with virtio-net&lt;br /&gt;
  in QEMU as the guest-facing device.&lt;br /&gt;
  This gives you the best of both worlds: QEMU acts as a first&lt;br /&gt;
  line of defense against a malicious guest while still getting the&lt;br /&gt;
  performance advantages of vhost-net (zero-copy).&lt;br /&gt;
  In fact, a bit of complexity in vhost was put there in the vague hope of&lt;br /&gt;
  supporting something like this: virtio rings are not translated through&lt;br /&gt;
  regular memory tables; instead, vhost gets a pointer to the ring address.&lt;br /&gt;
  This allows qemu to act as a man in the middle,&lt;br /&gt;
  verifying the descriptors but not touching the packet data.&lt;br /&gt;
&lt;br /&gt;
* non-virtio device support with vhost&lt;br /&gt;
  Use vhost interface for guests that don&#039;t use virtio-net&lt;br /&gt;
&lt;br /&gt;
=== vague ideas: path to implementation not clear ===&lt;br /&gt;
&lt;br /&gt;
* change tcp_tso_should_defer for kvm: batch more&lt;br /&gt;
  aggressively.&lt;br /&gt;
  in particular, see below&lt;br /&gt;
&lt;br /&gt;
* tcp: increase gso buffering for cubic,reno&lt;br /&gt;
    At the moment we push out an skb whenever the limit becomes&lt;br /&gt;
    large enough to send a full-sized TSO skb even if the skb,&lt;br /&gt;
    in fact, is not full-sized.&lt;br /&gt;
    The reason for this seems to be that some congestion avoidance&lt;br /&gt;
    protocols rely on the number of packets in flight to calculate&lt;br /&gt;
    CWND, so if we underuse the available CWND it shrinks&lt;br /&gt;
    which degrades performance:&lt;br /&gt;
    http://www.mail-archive.com/netdev@vger.kernel.org/msg08738.html&lt;br /&gt;
&lt;br /&gt;
    However, there seems to be no reason to do this for&lt;br /&gt;
    protocols such as reno and cubic which don&#039;t rely on packets in flight,&lt;br /&gt;
    and so will simply increase CWND a bit more to compensate for the&lt;br /&gt;
    underuse.&lt;br /&gt;
&lt;br /&gt;
* ring redesign:&lt;br /&gt;
      find a way to test raw ring performance &lt;br /&gt;
      fix cacheline bounces &lt;br /&gt;
      reduce interrupts&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* irq/numa affinity:&lt;br /&gt;
     networking goes much faster with irq pinning:&lt;br /&gt;
     both with and without numa.&lt;br /&gt;
     what can be done to make the non-pinned setup go faster?&lt;br /&gt;
&lt;br /&gt;
* reduce conflict with VCPU thread&lt;br /&gt;
    if the VCPU and networking run on the same CPU,&lt;br /&gt;
    they conflict, resulting in bad performance.&lt;br /&gt;
    Fix that; push the vhost thread out to another CPU&lt;br /&gt;
    more aggressively.&lt;br /&gt;
&lt;br /&gt;
* rx mac filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we have filtering in the bridge&lt;br /&gt;
        we have a small table of addresses; it needs to be made larger&lt;br /&gt;
        if we only need filtering for unicast (multicast is handled by IMP filtering)&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we have filtering in the bridge&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in bridge&lt;br /&gt;
        kernel part is done (Vlad Yasevich)&lt;br /&gt;
        teach qemu to notify libvirt to enable the filter (still to do) (the existing NIC_RX_FILTER_CHANGED event contains vlan tables)&lt;br /&gt;
&lt;br /&gt;
* tx coalescing&lt;br /&gt;
        Delay several packets before kicking the device.&lt;br /&gt;
&lt;br /&gt;
* interrupt coalescing&lt;br /&gt;
        Reduce the number of interrupts.&lt;br /&gt;
&lt;br /&gt;
* bridging on top of macvlan &lt;br /&gt;
  add code to forward LRO status from macvlan (not macvtap)&lt;br /&gt;
  back to the lowerdev, so that setting up forwarding&lt;br /&gt;
  from macvlan disables LRO on the lowerdev&lt;br /&gt;
&lt;br /&gt;
* virtio: preserve packets exactly with LRO&lt;br /&gt;
  LRO is not normally compatible with forwarding.&lt;br /&gt;
  With virtio we are getting packets from a Linux host,&lt;br /&gt;
  so we could conceivably preserve packets exactly&lt;br /&gt;
  even with LRO. Other hardware could plausibly&lt;br /&gt;
  do this as well.&lt;br /&gt;
&lt;br /&gt;
* vxlan&lt;br /&gt;
  What could we do here?&lt;br /&gt;
&lt;br /&gt;
* bridging without promisc mode with OVS&lt;br /&gt;
&lt;br /&gt;
=== high level issues: not clear what the project is, yet ===&lt;br /&gt;
&lt;br /&gt;
* security: iptables&lt;br /&gt;
At the moment most people disable iptables to get&lt;br /&gt;
good performance on 10Gb/s networking.&lt;br /&gt;
Any way to improve experience?&lt;br /&gt;
&lt;br /&gt;
* performance&lt;br /&gt;
Going through the scheduler and the full networking stack twice&lt;br /&gt;
(host+guest) adds a lot of overhead.&lt;br /&gt;
Any way to allow bypassing some layers?&lt;br /&gt;
&lt;br /&gt;
* manageability&lt;br /&gt;
It is still hard to figure out VM networking:&lt;br /&gt;
VM networking goes through libvirt, host networking through NM.&lt;br /&gt;
Any way to integrate?&lt;br /&gt;
&lt;br /&gt;
=== testing projects ===&lt;br /&gt;
Keeping networking stable is the highest priority.&lt;br /&gt;
&lt;br /&gt;
* Write some unit tests for vhost-net/vhost-scsi&lt;br /&gt;
* Run weekly test on upstream HEAD covering test matrix with autotest&lt;br /&gt;
* Measure the effect of each of the above-mentioned optimizations&lt;br /&gt;
  - Use autotest network performance regression testing (that runs netperf)&lt;br /&gt;
  - Also test any wild idea that works. Some may be useful.&lt;br /&gt;
* Migrate some of the performance regression autotest functionality into Netperf&lt;br /&gt;
  - Get the CPU-utilization of the Host and the other-party, and add them to the report. This is also true for other Host measures, such as vmexits, interrupts, ...&lt;br /&gt;
  - Run Netperf in demo-mode, and measure only the time when all the sessions are active (could be many seconds after the beginning of the tests)&lt;br /&gt;
  - Packaging of Netperf in Fedora / RHEL (exists in Fedora). Licensing could be an issue.&lt;br /&gt;
  - Make the scripts more visible&lt;br /&gt;
&lt;br /&gt;
=== non-virtio-net devices ===&lt;br /&gt;
* e1000: stabilize&lt;br /&gt;
&lt;br /&gt;
=== test matrix ===&lt;br /&gt;
&lt;br /&gt;
DOA test matrix (all combinations should work):&lt;br /&gt;
        vhost: test both on and off, obviously&lt;br /&gt;
        test: hotplug/unplug, vlan/mac filtering, netperf,&lt;br /&gt;
             file copy both ways: scp, NFS, NTFS&lt;br /&gt;
        guests: linux: release and debug kernels, windows&lt;br /&gt;
        conditions: plain run, run while under migration,&lt;br /&gt;
                vhost on/off migration&lt;br /&gt;
        networking setup: simple, qos with cgroups&lt;br /&gt;
        host configuration: host-guest, external-guest&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=4988</id>
		<title>NetworkingTodo</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=4988"/>
		<updated>2013-11-25T07:50:12Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page should cover all networking related activity in KVM,&lt;br /&gt;
currently most info is related to virtio-net.&lt;br /&gt;
&lt;br /&gt;
TODO: add bugzilla entry links.&lt;br /&gt;
&lt;br /&gt;
=== projects in progress. contributions are still very welcome! ===&lt;br /&gt;
&lt;br /&gt;
* large-order allocations&lt;br /&gt;
   see 28d6427109d13b0f447cba5761f88d3548e83605&lt;br /&gt;
   Developer: MST&lt;br /&gt;
&lt;br /&gt;
* vhost-net scalability tuning: threading for many VMs&lt;br /&gt;
      Plan: switch to workqueue shared by many VMs&lt;br /&gt;
      http://www.mail-archive.com/kvm@vger.kernel.org/msg69868.html&lt;br /&gt;
&lt;br /&gt;
http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/479e3578ed05bfac85257b4200427735!OpenDocument&lt;br /&gt;
&lt;br /&gt;
      Developer: Bandan Das&lt;br /&gt;
      Testing: netperf guest to guest&lt;br /&gt;
&lt;br /&gt;
* multiqueue support in macvtap&lt;br /&gt;
       multiqueue is only supported for tun.&lt;br /&gt;
       Add support for macvtap.&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* support more queues&lt;br /&gt;
     We limit TUN to 8 queues, but we really want&lt;br /&gt;
     1 queue per guest CPU. The limit comes from net&lt;br /&gt;
     core, need to teach it to allocate array of&lt;br /&gt;
     pointers and not array of queues.&lt;br /&gt;
     Jason has a draft patch to use flex array.&lt;br /&gt;
     Another thing is to move the flow caches out of tun_struct.&lt;br /&gt;
     Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* enable multiqueue by default&lt;br /&gt;
       Multiqueue causes regression in some workloads, thus&lt;br /&gt;
       it is off by default. Detect and enable/disable&lt;br /&gt;
       automatically so we can make it on by default.&lt;br /&gt;
       This is because GSO tends to batch less when mq is enabled.&lt;br /&gt;
       https://patchwork.kernel.org/patch/2235191/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* rework on flow caches&lt;br /&gt;
       Current hlist implementation of flow caches has several limitations:&lt;br /&gt;
       1) in the worst case, linear search is slow&lt;br /&gt;
       2) it does not scale&lt;br /&gt;
       https://patchwork.kernel.org/patch/2025121/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
       &lt;br /&gt;
* eliminate the extra copy in virtio-net driver&lt;br /&gt;
       We need to do an extra copy of 128 bytes for every packet.&lt;br /&gt;
       This could be eliminated for small packets by:&lt;br /&gt;
       1) use build_skb() and head frag&lt;br /&gt;
       2) bigger vnet header length ( &amp;gt;= NET_SKB_PAD + NET_IP_ALIGN )&lt;br /&gt;
       Or use a dedicated queue for small packet receiving ? (reordering)&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* make pktgen work for virtio-net (or partially orphan)&lt;br /&gt;
       virtio-net orphans the skb during tx,&lt;br /&gt;
       which makes pktgen wait forever for the refcnt.&lt;br /&gt;
       Jason&#039;s idea: introduce a flag to tell pktgen not to wait&lt;br /&gt;
       Discussion here: https://patchwork.kernel.org/patch/1800711/&lt;br /&gt;
       MST&#039;s idea: add a .ndo_tx_polling not only for pktgen&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* Add HW_VLAN_TX support for tap&lt;br /&gt;
       Eliminate the extra data moving for tagged packets&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* Announce self by guest driver&lt;br /&gt;
       Send gARP by guest driver. Guest part is finished.&lt;br /&gt;
       The Qemu part is ongoing.&lt;br /&gt;
       V7 patches are here:&lt;br /&gt;
       http://lists.nongnu.org/archive/html/qemu-devel/2013-03/msg01127.html&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* guest programmable mac/vlan filtering with macvtap&lt;br /&gt;
        Developer: Amos Kong&lt;br /&gt;
        qemu: https://bugzilla.redhat.com/show_bug.cgi?id=848203 (applied by upstream)&lt;br /&gt;
        libvirt: https://bugzilla.redhat.com/show_bug.cgi?id=848199&lt;br /&gt;
        http://git.qemu.org/?p=qemu.git;a=commit;h=b1be42803b31a913bab65bab563a8760ad2e7f7f&lt;br /&gt;
        Status: [[GuestProgrammableMacVlanFiltering]]&lt;br /&gt;
&lt;br /&gt;
* bridge without promisc mode in NIC&lt;br /&gt;
  given hardware support, teach bridge&lt;br /&gt;
  to program mac/vlan filtering in NIC&lt;br /&gt;
  Helps performance and security on noisy LANs&lt;br /&gt;
  http://comments.gmane.org/gmane.linux.network/266546&lt;br /&gt;
  Developer: Vlad Yasevich&lt;br /&gt;
&lt;br /&gt;
* reduce networking latency:&lt;br /&gt;
  allow handling short packets from softirq or VCPU context&lt;br /&gt;
  Plan:&lt;br /&gt;
    We are going through the scheduler 3 times&lt;br /&gt;
    (could be up to 5 if softirqd is involved)&lt;br /&gt;
    Consider RX: host irq -&amp;gt; io thread -&amp;gt; VCPU thread -&amp;gt;&lt;br /&gt;
    guest irq -&amp;gt; guest thread.&lt;br /&gt;
    This adds a lot of latency.&lt;br /&gt;
    We can cut it by some 1.5x if we do a bit of work&lt;br /&gt;
    either in the VCPU or softirq context.&lt;br /&gt;
  Testing: netperf TCP RR - should be improved drastically&lt;br /&gt;
           netperf TCP STREAM guest to host - no regression&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* Flexible buffers: put virtio header inline with packet data&lt;br /&gt;
  https://patchwork.kernel.org/patch/1540471/&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* device failover to allow migration with assigned devices&lt;br /&gt;
  https://fedoraproject.org/wiki/Features/Virt_Device_Failover&lt;br /&gt;
  Developer: Gal Hammer, Cole Robinson, Laine Stump, MST&lt;br /&gt;
&lt;br /&gt;
* Reuse vringh code for better maintainability&lt;br /&gt;
  Developer: Rusty Russell&lt;br /&gt;
&lt;br /&gt;
* Improve stats, make them more helpful for performance analysis&lt;br /&gt;
  Developer: Sriram Narasimhan&lt;br /&gt;
&lt;br /&gt;
* Enable GRO for packets coming to bridge from a tap interface&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* Better support for windows LRO&lt;br /&gt;
  Extend virtio-header with statistics for GRO packets:&lt;br /&gt;
  number of packets coalesced and number of duplicate ACKs coalesced&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* IPoIB infiniband bridging&lt;br /&gt;
  Plan: implement macvtap for ipoib and virtio-ipoib&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* netdev polling for virtio.&lt;br /&gt;
  There are two kinds of netdev polling:&lt;br /&gt;
  - netpoll - used for debugging&lt;br /&gt;
  - proposed low latency net polling&lt;br /&gt;
  See http://lkml.indiana.edu/hypermail/linux/kernel/1303.0/00553.html&lt;br /&gt;
  Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* sharing config interrupts&lt;br /&gt;
  Support more devices by sharing a single msi vector&lt;br /&gt;
  between multiple virtio devices.&lt;br /&gt;
  (Applies to virtio-blk too).&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* use kvm eventfd support for injecting level interrupts,&lt;br /&gt;
  enable vhost by default for level interrupts&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* network traffic throttling&lt;br /&gt;
  the block layer implemented &amp;quot;continuous leaky bucket&amp;quot; for throttling;&lt;br /&gt;
  we can apply the same continuous leaky bucket approach to networking:&lt;br /&gt;
  IOPS/BPS * RX/TX/TOTAL&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* Allocate mac_table dynamically&lt;br /&gt;
&lt;br /&gt;
  In the future, maybe we can allocate the mac_table dynamically instead&lt;br /&gt;
  of embedding it in VirtIONet. Then we can just do a pointer swap and&lt;br /&gt;
  gfree() and save a memcpy() here.&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* Bug: e1000 &amp;amp; rtl8139: Change macaddr in guest, but not update to qemu (info network)&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
  https://bugzilla.redhat.com/show_bug.cgi?id=922589&lt;br /&gt;
  Status: patches applied&lt;br /&gt;
&lt;br /&gt;
=== projects that are not started yet - no owner ===&lt;br /&gt;
&lt;br /&gt;
* receive side zero copy&lt;br /&gt;
  The ideal is a NIC with accelerated RFS support,&lt;br /&gt;
  so we can feed the virtio rx buffers into the correct NIC queue.&lt;br /&gt;
  Depends on non promisc NIC support in bridge.&lt;br /&gt;
  Search for &amp;quot;Xin Xiaohui: Provide a zero-copy method on KVM virtio-net&amp;quot;&lt;br /&gt;
  for a very old prototype&lt;br /&gt;
&lt;br /&gt;
* RDMA bridging&lt;br /&gt;
&lt;br /&gt;
* DMA engine (IOAT) use in tun&lt;br /&gt;
  Old patch here: [PATCH RFC] tun: dma engine support&lt;br /&gt;
  It does not speed things up. Need to see why and&lt;br /&gt;
  what can be done.&lt;br /&gt;
&lt;br /&gt;
* virtio API extension: improve small packet/large buffer performance:&lt;br /&gt;
  support &amp;quot;reposting&amp;quot; buffers for mergeable buffers,&lt;br /&gt;
  support pool for indirect buffers&lt;br /&gt;
&lt;br /&gt;
* more GSO type support:&lt;br /&gt;
       The kernel does not support additional GSO types: FCOE, GRE, UDP_TUNNEL&lt;br /&gt;
&lt;br /&gt;
* ring aliasing:&lt;br /&gt;
  using vhost-net as a networking backend with virtio-net in QEMU&lt;br /&gt;
  being what&#039;s guest facing.&lt;br /&gt;
  This gives you the best of both worlds: QEMU acts as a first&lt;br /&gt;
  line of defense against a malicious guest while still getting the&lt;br /&gt;
  performance advantages of vhost-net (zero-copy).&lt;br /&gt;
  In fact a bit of complexity in vhost was put there in the vague hope to&lt;br /&gt;
  support something like this: virtio rings are not translated through&lt;br /&gt;
  regular memory tables, instead, vhost gets a pointer to ring address.&lt;br /&gt;
  This allows qemu acting as a man in the middle,&lt;br /&gt;
  verifying the descriptors but not touching the packet data.&lt;br /&gt;
&lt;br /&gt;
* non-virtio device support with vhost&lt;br /&gt;
  Use vhost interface for guests that don&#039;t use virtio-net&lt;br /&gt;
&lt;br /&gt;
=== vague ideas: path to implementation not clear ===&lt;br /&gt;
&lt;br /&gt;
* ring redesign:&lt;br /&gt;
      find a way to test raw ring performance &lt;br /&gt;
      fix cacheline bounces &lt;br /&gt;
      reduce interrupts&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* irq/numa affinity:&lt;br /&gt;
     networking goes much faster with irq pinning:&lt;br /&gt;
     both with and without numa.&lt;br /&gt;
     what can be done to make the non-pinned setup go faster?&lt;br /&gt;
&lt;br /&gt;
* reduce conflict with VCPU thread&lt;br /&gt;
    if VCPU and networking run on same CPU,&lt;br /&gt;
    they conflict resulting in bad performance.&lt;br /&gt;
    Fix that, push vhost thread out to another CPU&lt;br /&gt;
    more aggressively.&lt;br /&gt;
&lt;br /&gt;
* rx mac filtering in tun&lt;br /&gt;
        the need for this is still not understood as we have filtering in bridge&lt;br /&gt;
        we have a small table of addresses, need to make it larger&lt;br /&gt;
        if we only need filtering for unicast (multicast is handled by IMP filtering)&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in tun&lt;br /&gt;
        the need for this is still not understood as we have filtering in bridge&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in bridge&lt;br /&gt;
        kernel part is done (Vlad Yasevich)&lt;br /&gt;
        teach qemu to notify libvirt to enable the filter (still to do) (existed NIC_RX_FILTER_CHANGED event contains vlan-tables)&lt;br /&gt;
&lt;br /&gt;
* tx coalescing&lt;br /&gt;
        Delay several packets before kicking the device.&lt;br /&gt;
&lt;br /&gt;
* interrupt coalescing&lt;br /&gt;
        Reduce the number of interrupts&lt;br /&gt;
&lt;br /&gt;
* bridging on top of macvlan &lt;br /&gt;
  add code to forward LRO status from macvlan (not macvtap)&lt;br /&gt;
  back to the lowerdev, so that setting up forwarding&lt;br /&gt;
  from macvlan disables LRO on the lowerdev&lt;br /&gt;
&lt;br /&gt;
* virtio: preserve packets exactly with LRO&lt;br /&gt;
  LRO is not normally compatible with forwarding.&lt;br /&gt;
  With virtio we are getting packets from a Linux host,&lt;br /&gt;
  so we could conceivably preserve packets exactly&lt;br /&gt;
  even with LRO. I am guessing other hardware could be&lt;br /&gt;
  doing this as well.&lt;br /&gt;
&lt;br /&gt;
* vxlan&lt;br /&gt;
  What could we do here?&lt;br /&gt;
&lt;br /&gt;
* bridging without promisc mode with OVS&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== testing projects ===&lt;br /&gt;
Keeping networking stable is highest priority.&lt;br /&gt;
&lt;br /&gt;
* Write some unit tests for vhost-net/vhost-scsi&lt;br /&gt;
* Run weekly test on upstream HEAD covering test matrix with autotest&lt;br /&gt;
* Measure the effect of each of the above-mentioned optimizations&lt;br /&gt;
  - Use autotest network performance regression testing (that runs netperf)&lt;br /&gt;
  - Also test any wild idea that works. Some may be useful.&lt;br /&gt;
* Migrate some of the performance regression autotest functionality into Netperf&lt;br /&gt;
  - Get the CPU-utilization of the Host and the other-party, and add them to the report. This is also true for other Host measures, such as vmexits, interrupts, ...&lt;br /&gt;
  - Run Netperf in demo-mode, and measure only the time when all the sessions are active (could be many seconds after the beginning of the tests)&lt;br /&gt;
  - Packaging of Netperf in Fedora / RHEL (exists in Fedora). Licensing could be an issue.&lt;br /&gt;
  - Make the scripts more visible&lt;br /&gt;
&lt;br /&gt;
=== non-virtio-net devices ===&lt;br /&gt;
* e1000: stabilize&lt;br /&gt;
&lt;br /&gt;
=== test matrix ===&lt;br /&gt;
&lt;br /&gt;
DOA test matrix (all combinations should work):&lt;br /&gt;
        vhost: test both on and off, obviously&lt;br /&gt;
        test: hotplug/unplug, vlan/mac filtering, netperf,&lt;br /&gt;
             file copy both ways: scp, NFS, NTFS&lt;br /&gt;
        guests: linux: release and debug kernels, windows&lt;br /&gt;
        conditions: plain run, run while under migration,&lt;br /&gt;
                vhost on/off migration&lt;br /&gt;
        networking setup: simple, qos with cgroups&lt;br /&gt;
        host configuration: host-guest, external-guest&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=4984</id>
		<title>NetworkingTodo</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=4984"/>
		<updated>2013-11-14T06:18:35Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page should cover all networking-related activity in KVM;&lt;br /&gt;
currently most info is related to virtio-net.&lt;br /&gt;
&lt;br /&gt;
TODO: add bugzilla entry links.&lt;br /&gt;
&lt;br /&gt;
=== projects in progress. contributions are still very welcome! ===&lt;br /&gt;
&lt;br /&gt;
* large-order allocations&lt;br /&gt;
   see 28d6427109d13b0f447cba5761f88d3548e83605&lt;br /&gt;
   Developer: MST&lt;br /&gt;
&lt;br /&gt;
* vhost-net scalability tuning: threading for many VMs&lt;br /&gt;
      Plan: switch to workqueue shared by many VMs&lt;br /&gt;
      http://www.mail-archive.com/kvm@vger.kernel.org/msg69868.html&lt;br /&gt;
&lt;br /&gt;
http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/479e3578ed05bfac85257b4200427735!OpenDocument&lt;br /&gt;
&lt;br /&gt;
      Developer: Bandan Das&lt;br /&gt;
      Testing: netperf guest to guest&lt;br /&gt;
&lt;br /&gt;
* multiqueue support in macvtap&lt;br /&gt;
       multiqueue is only supported for tun.&lt;br /&gt;
       Add support for macvtap.&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* support more queues&lt;br /&gt;
     We limit TUN to 8 queues, but we really want&lt;br /&gt;
     1 queue per guest CPU. The limit comes from net&lt;br /&gt;
     core, need to teach it to allocate array of&lt;br /&gt;
     pointers and not array of queues.&lt;br /&gt;
     Jason has a draft patch to use flex array.&lt;br /&gt;
     Another thing is to move the flow caches out of tun_struct.&lt;br /&gt;
     Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* enable multiqueue by default&lt;br /&gt;
       Multiqueue causes regression in some workloads, thus&lt;br /&gt;
       it is off by default. Detect and enable/disable&lt;br /&gt;
       automatically so we can make it on by default.&lt;br /&gt;
       This is because GSO tends to batch less when mq is enabled.&lt;br /&gt;
       https://patchwork.kernel.org/patch/2235191/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* rework on flow caches&lt;br /&gt;
       Current hlist implementation of flow caches has several limitations:&lt;br /&gt;
       1) in the worst case, linear search is slow&lt;br /&gt;
       2) it does not scale&lt;br /&gt;
       https://patchwork.kernel.org/patch/2025121/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
       &lt;br /&gt;
* eliminate the extra copy in virtio-net driver&lt;br /&gt;
       We need to do an extra copy of 128 bytes for every packet.&lt;br /&gt;
       This could be eliminated for small packets by:&lt;br /&gt;
       1) use build_skb() and head frag&lt;br /&gt;
       2) bigger vnet header length ( &amp;gt;= NET_SKB_PAD + NET_IP_ALIGN )&lt;br /&gt;
       Or use a dedicated queue for small packet receiving ? (reordering)&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* make pktgen work for virtio-net (or partially orphan)&lt;br /&gt;
       virtio-net orphans the skb during tx,&lt;br /&gt;
       which makes pktgen wait forever for the refcnt.&lt;br /&gt;
       Jason&#039;s idea: introduce a flag to tell pktgen not to wait&lt;br /&gt;
       Discussion here: https://patchwork.kernel.org/patch/1800711/&lt;br /&gt;
       MST&#039;s idea: add a .ndo_tx_polling not only for pktgen&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* Add HW_VLAN_TX support for tap&lt;br /&gt;
       Eliminate the extra data moving for tagged packets&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* Announce self by guest driver&lt;br /&gt;
       Send gARP by guest driver. Guest part is finished.&lt;br /&gt;
       The Qemu part is ongoing.&lt;br /&gt;
       V7 patches are here:&lt;br /&gt;
       http://lists.nongnu.org/archive/html/qemu-devel/2013-03/msg01127.html&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* guest programmable mac/vlan filtering with macvtap&lt;br /&gt;
        Developer: Amos Kong&lt;br /&gt;
        qemu: https://bugzilla.redhat.com/show_bug.cgi?id=848203 (applied by upstream)&lt;br /&gt;
        libvirt: https://bugzilla.redhat.com/show_bug.cgi?id=848199&lt;br /&gt;
        http://git.qemu.org/?p=qemu.git;a=commit;h=b1be42803b31a913bab65bab563a8760ad2e7f7f&lt;br /&gt;
        Status: [[GuestProgrammableMacVlanFiltering]]&lt;br /&gt;
&lt;br /&gt;
* bridge without promisc mode in NIC&lt;br /&gt;
  given hardware support, teach bridge&lt;br /&gt;
  to program mac/vlan filtering in NIC&lt;br /&gt;
  Helps performance and security on noisy LANs&lt;br /&gt;
  http://comments.gmane.org/gmane.linux.network/266546&lt;br /&gt;
  Developer: Vlad Yasevich&lt;br /&gt;
&lt;br /&gt;
* reduce networking latency:&lt;br /&gt;
  allow handling short packets from softirq or VCPU context&lt;br /&gt;
  Plan:&lt;br /&gt;
    We are going through the scheduler 3 times&lt;br /&gt;
    (could be up to 5 if softirqd is involved)&lt;br /&gt;
    Consider RX: host irq -&amp;gt; io thread -&amp;gt; VCPU thread -&amp;gt;&lt;br /&gt;
    guest irq -&amp;gt; guest thread.&lt;br /&gt;
    This adds a lot of latency.&lt;br /&gt;
    We can cut it by some 1.5x if we do a bit of work&lt;br /&gt;
    either in the VCPU or softirq context.&lt;br /&gt;
  Testing: netperf TCP RR - should be improved drastically&lt;br /&gt;
           netperf TCP STREAM guest to host - no regression&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* Flexible buffers: put virtio header inline with packet data&lt;br /&gt;
  https://patchwork.kernel.org/patch/1540471/&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* device failover to allow migration with assigned devices&lt;br /&gt;
  https://fedoraproject.org/wiki/Features/Virt_Device_Failover&lt;br /&gt;
  Developer: Gal Hammer, Cole Robinson, Laine Stump, MST&lt;br /&gt;
&lt;br /&gt;
* Reuse vringh code for better maintainability&lt;br /&gt;
  Developer: Rusty Russell&lt;br /&gt;
&lt;br /&gt;
* Improve stats, make them more helpful for performance analysis&lt;br /&gt;
  Developer: Sriram Narasimhan&lt;br /&gt;
&lt;br /&gt;
* Enable GRO for packets coming to bridge from a tap interface&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* Better support for windows LRO&lt;br /&gt;
  Extend virtio-header with statistics for GRO packets:&lt;br /&gt;
  number of packets coalesced and number of duplicate ACKs coalesced&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* IPoIB infiniband bridging&lt;br /&gt;
  Plan: implement macvtap for ipoib and virtio-ipoib&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* netdev polling for virtio.&lt;br /&gt;
  There are two kinds of netdev polling:&lt;br /&gt;
  - netpoll - used for debugging&lt;br /&gt;
  - proposed low latency net polling&lt;br /&gt;
  See http://lkml.indiana.edu/hypermail/linux/kernel/1303.0/00553.html&lt;br /&gt;
  Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* sharing config interrupts&lt;br /&gt;
  Support more devices by sharing a single msi vector&lt;br /&gt;
  between multiple virtio devices.&lt;br /&gt;
  (Applies to virtio-blk too).&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* use kvm eventfd support for injecting level interrupts,&lt;br /&gt;
  enable vhost by default for level interrupts&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* network traffic throttling&lt;br /&gt;
  the block layer implemented &amp;quot;continuous leaky bucket&amp;quot; for throttling;&lt;br /&gt;
  we can apply the same continuous leaky bucket approach to networking:&lt;br /&gt;
  IOPS/BPS * RX/TX/TOTAL&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* Bug: e1000 &amp;amp; rtl8139: Change macaddr in guest, but not update to qemu (info network)&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
  https://bugzilla.redhat.com/show_bug.cgi?id=922589&lt;br /&gt;
  Status: patches applied&lt;br /&gt;
&lt;br /&gt;
=== projects that are not started yet - no owner ===&lt;br /&gt;
&lt;br /&gt;
* receive side zero copy&lt;br /&gt;
  The ideal is a NIC with accelerated RFS support,&lt;br /&gt;
  so we can feed the virtio rx buffers into the correct NIC queue.&lt;br /&gt;
  Depends on non promisc NIC support in bridge.&lt;br /&gt;
  Search for &amp;quot;Xin Xiaohui: Provide a zero-copy method on KVM virtio-net&amp;quot;&lt;br /&gt;
  for a very old prototype&lt;br /&gt;
&lt;br /&gt;
* RDMA bridging&lt;br /&gt;
&lt;br /&gt;
* DMA engine (IOAT) use in tun&lt;br /&gt;
  Old patch here: [PATCH RFC] tun: dma engine support&lt;br /&gt;
  It does not speed things up. Need to see why and&lt;br /&gt;
  what can be done.&lt;br /&gt;
&lt;br /&gt;
* virtio API extension: improve small packet/large buffer performance:&lt;br /&gt;
  support &amp;quot;reposting&amp;quot; buffers for mergeable buffers,&lt;br /&gt;
  support pool for indirect buffers&lt;br /&gt;
&lt;br /&gt;
* more GSO type support:&lt;br /&gt;
       The kernel does not support additional GSO types: FCOE, GRE, UDP_TUNNEL&lt;br /&gt;
&lt;br /&gt;
* ring aliasing:&lt;br /&gt;
  using vhost-net as a networking backend with virtio-net in QEMU&lt;br /&gt;
  being what&#039;s guest facing.&lt;br /&gt;
  This gives you the best of both worlds: QEMU acts as a first&lt;br /&gt;
  line of defense against a malicious guest while still getting the&lt;br /&gt;
  performance advantages of vhost-net (zero-copy).&lt;br /&gt;
  In fact a bit of complexity in vhost was put there in the vague hope to&lt;br /&gt;
  support something like this: virtio rings are not translated through&lt;br /&gt;
  regular memory tables, instead, vhost gets a pointer to ring address.&lt;br /&gt;
  This allows qemu acting as a man in the middle,&lt;br /&gt;
  verifying the descriptors but not touching the packet data.&lt;br /&gt;
&lt;br /&gt;
* non-virtio device support with vhost&lt;br /&gt;
  Use vhost interface for guests that don&#039;t use virtio-net&lt;br /&gt;
&lt;br /&gt;
=== vague ideas: path to implementation not clear ===&lt;br /&gt;
&lt;br /&gt;
* ring redesign:&lt;br /&gt;
      find a way to test raw ring performance &lt;br /&gt;
      fix cacheline bounces &lt;br /&gt;
      reduce interrupts&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* irq/numa affinity:&lt;br /&gt;
     networking goes much faster with irq pinning:&lt;br /&gt;
     both with and without numa.&lt;br /&gt;
     what can be done to make the non-pinned setup go faster?&lt;br /&gt;
&lt;br /&gt;
* reduce conflict with VCPU thread&lt;br /&gt;
    if VCPU and networking run on same CPU,&lt;br /&gt;
    they conflict resulting in bad performance.&lt;br /&gt;
    Fix that, push vhost thread out to another CPU&lt;br /&gt;
    more aggressively.&lt;br /&gt;
&lt;br /&gt;
* rx mac filtering in tun&lt;br /&gt;
        the need for this is still not understood as we have filtering in bridge&lt;br /&gt;
        we have a small table of addresses, need to make it larger&lt;br /&gt;
        if we only need filtering for unicast (multicast is handled by IMP filtering)&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in tun&lt;br /&gt;
        the need for this is still not understood as we have filtering in bridge&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in bridge&lt;br /&gt;
        kernel part is done (Vlad Yasevich)&lt;br /&gt;
        teach qemu to notify libvirt to enable the filter (still to do) (existed NIC_RX_FILTER_CHANGED event contains vlan-tables)&lt;br /&gt;
&lt;br /&gt;
* tx coalescing&lt;br /&gt;
        Delay several packets before kicking the device.&lt;br /&gt;
&lt;br /&gt;
* interrupt coalescing&lt;br /&gt;
        Reduce the number of interrupts&lt;br /&gt;
&lt;br /&gt;
* bridging on top of macvlan &lt;br /&gt;
  add code to forward LRO status from macvlan (not macvtap)&lt;br /&gt;
  back to the lowerdev, so that setting up forwarding&lt;br /&gt;
  from macvlan disables LRO on the lowerdev&lt;br /&gt;
&lt;br /&gt;
* virtio: preserve packets exactly with LRO&lt;br /&gt;
  LRO is not normally compatible with forwarding.&lt;br /&gt;
  With virtio we are getting packets from a Linux host,&lt;br /&gt;
  so we could conceivably preserve packets exactly&lt;br /&gt;
  even with LRO. I am guessing other hardware could be&lt;br /&gt;
  doing this as well.&lt;br /&gt;
&lt;br /&gt;
* vxlan&lt;br /&gt;
  What could we do here?&lt;br /&gt;
&lt;br /&gt;
* bridging without promisc mode with OVS&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== testing projects ===&lt;br /&gt;
Keeping networking stable is highest priority.&lt;br /&gt;
&lt;br /&gt;
* Write some unit tests for vhost-net/vhost-scsi&lt;br /&gt;
* Run weekly test on upstream HEAD covering test matrix with autotest&lt;br /&gt;
* Measure the effect of each of the above-mentioned optimizations&lt;br /&gt;
  - Use autotest network performance regression testing (that runs netperf)&lt;br /&gt;
  - Also test any wild idea that works. Some may be useful.&lt;br /&gt;
* Migrate some of the performance regression autotest functionality into Netperf&lt;br /&gt;
  - Get the CPU-utilization of the Host and the other-party, and add them to the report. This is also true for other Host measures, such as vmexits, interrupts, ...&lt;br /&gt;
  - Run Netperf in demo-mode, and measure only the time when all the sessions are active (could be many seconds after the beginning of the tests)&lt;br /&gt;
  - Packaging of Netperf in Fedora / RHEL (exists in Fedora). Licensing could be an issue.&lt;br /&gt;
  - Make the scripts more visible&lt;br /&gt;
&lt;br /&gt;
=== non-virtio-net devices ===&lt;br /&gt;
* e1000: stabilize&lt;br /&gt;
&lt;br /&gt;
=== test matrix ===&lt;br /&gt;
&lt;br /&gt;
DOA test matrix (all combinations should work):&lt;br /&gt;
        vhost: test both on and off, obviously&lt;br /&gt;
        test: hotplug/unplug, vlan/mac filtering, netperf,&lt;br /&gt;
             file copy both ways: scp, NFS, NTFS&lt;br /&gt;
        guests: linux: release and debug kernels, windows&lt;br /&gt;
        conditions: plain run, run while under migration,&lt;br /&gt;
                vhost on/off migration&lt;br /&gt;
        networking setup: simple, qos with cgroups&lt;br /&gt;
        host configuration: host-guest, external-guest&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=4914</id>
		<title>NetworkingTodo</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=4914"/>
		<updated>2013-10-30T05:25:54Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page should cover all networking-related activity in KVM;&lt;br /&gt;
currently most info is related to virtio-net.&lt;br /&gt;
&lt;br /&gt;
TODO: add bugzilla entry links.&lt;br /&gt;
&lt;br /&gt;
=== projects in progress. contributions are still very welcome! ===&lt;br /&gt;
&lt;br /&gt;
* vhost-net scalability tuning: threading for many VMs&lt;br /&gt;
      Plan: switch to workqueue shared by many VMs&lt;br /&gt;
      http://www.mail-archive.com/kvm@vger.kernel.org/msg69868.html&lt;br /&gt;
&lt;br /&gt;
http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/479e3578ed05bfac85257b4200427735!OpenDocument&lt;br /&gt;
&lt;br /&gt;
      Developer: Bandan Das&lt;br /&gt;
      Testing: netperf guest to guest&lt;br /&gt;
&lt;br /&gt;
* multiqueue support in macvtap&lt;br /&gt;
       multiqueue is only supported for tun.&lt;br /&gt;
       Add support for macvtap.&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* support more queues&lt;br /&gt;
     We limit TUN to 8 queues, but we really want&lt;br /&gt;
     1 queue per guest CPU. The limit comes from net&lt;br /&gt;
     core, need to teach it to allocate array of&lt;br /&gt;
     pointers and not array of queues.&lt;br /&gt;
     Jason has a draft patch to use flex array.&lt;br /&gt;
     Another thing is to move the flow caches out of tun_struct.&lt;br /&gt;
     Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* enable multiqueue by default&lt;br /&gt;
       Multiqueue causes regression in some workloads, thus&lt;br /&gt;
       it is off by default. Detect and enable/disable&lt;br /&gt;
       automatically so we can make it on by default.&lt;br /&gt;
       This is because GSO tends to batch less when mq is enabled.&lt;br /&gt;
       https://patchwork.kernel.org/patch/2235191/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* rework on flow caches&lt;br /&gt;
       Current hlist implementation of flow caches has several limitations:&lt;br /&gt;
       1) in the worst case, linear search is slow&lt;br /&gt;
       2) it does not scale&lt;br /&gt;
       https://patchwork.kernel.org/patch/2025121/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
       &lt;br /&gt;
* eliminate the extra copy in virtio-net driver&lt;br /&gt;
       We need to do an extra copy of 128 bytes for every packet.&lt;br /&gt;
       This could be eliminated for small packets by:&lt;br /&gt;
       1) use build_skb() and head frag&lt;br /&gt;
       2) bigger vnet header length ( &amp;gt;= NET_SKB_PAD + NET_IP_ALIGN )&lt;br /&gt;
       Or use a dedicated queue for small packet receiving ? (reordering)&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* make pktgen work for virtio-net (or partially orphan)&lt;br /&gt;
       virtio-net orphans the skb during tx,&lt;br /&gt;
       which makes pktgen wait forever for the refcnt.&lt;br /&gt;
       Jason&#039;s idea: introduce a flag to tell pktgen not to wait&lt;br /&gt;
       Discussion here: https://patchwork.kernel.org/patch/1800711/&lt;br /&gt;
       MST&#039;s idea: add a .ndo_tx_polling not only for pktgen&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* Add HW_VLAN_TX support for tap&lt;br /&gt;
       Eliminate the extra data moving for tagged packets&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* Announce self by guest driver&lt;br /&gt;
       Send gARP by guest driver. Guest part is finished.&lt;br /&gt;
       The Qemu part is ongoing.&lt;br /&gt;
       V7 patches are here:&lt;br /&gt;
       http://lists.nongnu.org/archive/html/qemu-devel/2013-03/msg01127.html&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* guest programmable mac/vlan filtering with macvtap&lt;br /&gt;
        Developer: Amos Kong&lt;br /&gt;
        qemu: https://bugzilla.redhat.com/show_bug.cgi?id=848203 (applied by upstream)&lt;br /&gt;
        libvirt: https://bugzilla.redhat.com/show_bug.cgi?id=848199&lt;br /&gt;
        http://git.qemu.org/?p=qemu.git;a=commit;h=b1be42803b31a913bab65bab563a8760ad2e7f7f&lt;br /&gt;
        Status: [[GuestProgrammableMacVlanFiltering]]&lt;br /&gt;
&lt;br /&gt;
* bridge without promisc mode in NIC&lt;br /&gt;
  given hardware support, teach bridge&lt;br /&gt;
  to program mac/vlan filtering in NIC&lt;br /&gt;
  Helps performance and security on noisy LANs&lt;br /&gt;
  http://comments.gmane.org/gmane.linux.network/266546&lt;br /&gt;
  Developer: Vlad Yasevich&lt;br /&gt;
&lt;br /&gt;
* reduce networking latency:&lt;br /&gt;
  allow handling short packets from softirq or VCPU context&lt;br /&gt;
  Plan:&lt;br /&gt;
    We are going through the scheduler 3 times&lt;br /&gt;
    (could be up to 5 if softirqd is involved)&lt;br /&gt;
    Consider RX: host irq -&amp;gt; io thread -&amp;gt; VCPU thread -&amp;gt;&lt;br /&gt;
    guest irq -&amp;gt; guest thread.&lt;br /&gt;
    This adds a lot of latency.&lt;br /&gt;
    We can cut it by some 1.5x if we do a bit of work&lt;br /&gt;
    either in the VCPU or softirq context.&lt;br /&gt;
  Testing: netperf TCP RR - should be improved drastically&lt;br /&gt;
           netperf TCP STREAM guest to host - no regression&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* Flexible buffers: put virtio header inline with packet data&lt;br /&gt;
  https://patchwork.kernel.org/patch/1540471/&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* device failover to allow migration with assigned devices&lt;br /&gt;
  https://fedoraproject.org/wiki/Features/Virt_Device_Failover&lt;br /&gt;
  Developer: Gal Hammer, Cole Robinson, Laine Stump, MST&lt;br /&gt;
&lt;br /&gt;
* Reuse vringh code for better maintainability&lt;br /&gt;
  Developer: Rusty Russell&lt;br /&gt;
&lt;br /&gt;
* Improve stats, make them more helpful for performance analysis&lt;br /&gt;
  Developer: Sriram Narasimhan&lt;br /&gt;
&lt;br /&gt;
* Enable GRO for packets coming to bridge from a tap interface&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* Better support for windows LRO&lt;br /&gt;
  Extend virtio-header with statistics for GRO packets:&lt;br /&gt;
  number of packets coalesced and number of duplicate ACKs coalesced&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* IPoIB infiniband bridging&lt;br /&gt;
  Plan: implement macvtap for ipoib and virtio-ipoib&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* netdev polling for virtio.&lt;br /&gt;
  There are two kinds of netdev polling:&lt;br /&gt;
  - netpoll - used for debugging&lt;br /&gt;
  - proposed low latency net polling&lt;br /&gt;
  See http://lkml.indiana.edu/hypermail/linux/kernel/1303.0/00553.html&lt;br /&gt;
  Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* sharing config interrupts&lt;br /&gt;
  Support more devices by sharing a single MSI vector&lt;br /&gt;
  between multiple virtio devices.&lt;br /&gt;
  (Applies to virtio-blk too).&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* use kvm eventfd support for injecting level interrupts,&lt;br /&gt;
  enable vhost by default for level interrupts&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* Bug: e1000 &amp;amp; rtl8139: changing the macaddr in the guest is not reflected in qemu (info network)&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
  https://bugzilla.redhat.com/show_bug.cgi?id=922589&lt;br /&gt;
  Status: patches applied&lt;br /&gt;
&lt;br /&gt;
=== projects that are not started yet - no owner ===&lt;br /&gt;
&lt;br /&gt;
* receive side zero copy&lt;br /&gt;
  The ideal is a NIC with accelerated RFS support,&lt;br /&gt;
  so we can feed the virtio rx buffers into the correct NIC queue.&lt;br /&gt;
  Depends on non promisc NIC support in bridge.&lt;br /&gt;
  Search for &amp;quot;Xin Xiaohui: Provide a zero-copy method on KVM virtio-net&amp;quot;&lt;br /&gt;
  for a very old prototype&lt;br /&gt;
&lt;br /&gt;
* RDMA bridging&lt;br /&gt;
&lt;br /&gt;
* DMA engine (IOAT) use in tun&lt;br /&gt;
  Old patch here: [PATCH RFC] tun: dma engine support&lt;br /&gt;
  It does not speed things up. Need to see why and&lt;br /&gt;
  what can be done.&lt;br /&gt;
&lt;br /&gt;
* virtio API extension: improve small packet/large buffer performance:&lt;br /&gt;
  support &amp;quot;reposting&amp;quot; buffers for mergeable buffers,&lt;br /&gt;
  support pool for indirect buffers&lt;br /&gt;
&lt;br /&gt;
* more GSO type support:&lt;br /&gt;
       The kernel does not yet support some GSO types: FCOE, GRE, UDP_TUNNEL&lt;br /&gt;
&lt;br /&gt;
* ring aliasing:&lt;br /&gt;
  using vhost-net as a networking backend, with virtio-net in QEMU&lt;br /&gt;
  as the guest-facing device.&lt;br /&gt;
  This gives you the best of both worlds: QEMU acts as a first&lt;br /&gt;
  line of defense against a malicious guest while still getting the&lt;br /&gt;
  performance advantages of vhost-net (zero-copy).&lt;br /&gt;
  In fact, a bit of complexity in vhost was put there in the vague hope of&lt;br /&gt;
  supporting something like this: virtio rings are not translated through&lt;br /&gt;
  regular memory tables; instead, vhost gets a pointer to the ring address.&lt;br /&gt;
  This allows qemu to act as a man in the middle,&lt;br /&gt;
  verifying the descriptors but not touching the packet data.&lt;br /&gt;
&lt;br /&gt;
* non-virtio device support with vhost&lt;br /&gt;
  Use vhost interface for guests that don&#039;t use virtio-net&lt;br /&gt;
&lt;br /&gt;
=== vague ideas: path to implementation not clear ===&lt;br /&gt;
&lt;br /&gt;
* ring redesign:&lt;br /&gt;
      find a way to test raw ring performance &lt;br /&gt;
      fix cacheline bounces &lt;br /&gt;
      reduce interrupts&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* irq/numa affinity:&lt;br /&gt;
     networking goes much faster with irq pinning:&lt;br /&gt;
     both with and without numa.&lt;br /&gt;
     what can be done to make the non-pinned setup go faster?&lt;br /&gt;
&lt;br /&gt;
* reduce conflict with VCPU thread&lt;br /&gt;
    if VCPU and networking run on same CPU,&lt;br /&gt;
    they conflict resulting in bad performance.&lt;br /&gt;
    Fix that, push vhost thread out to another CPU&lt;br /&gt;
    more aggressively.&lt;br /&gt;
&lt;br /&gt;
* rx mac filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we already have filtering in the bridge&lt;br /&gt;
        we have a small table of addresses and need to make it larger&lt;br /&gt;
        if we only need filtering for unicast (multicast is handled by IGMP filtering)&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we already have filtering in the bridge&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in bridge&lt;br /&gt;
        kernel part is done (Vlad Yasevich)&lt;br /&gt;
        teach qemu to notify libvirt to enable the filter (still to do) (the existing NIC_RX_FILTER_CHANGED event contains vlan-tables)&lt;br /&gt;
&lt;br /&gt;
* tx coalescing&lt;br /&gt;
        Delay several packets before kicking the device.&lt;br /&gt;
&lt;br /&gt;
* interrupt coalescing&lt;br /&gt;
        Reduce the number of interrupts&lt;br /&gt;
&lt;br /&gt;
* bridging on top of macvlan &lt;br /&gt;
  add code to forward LRO status from macvlan (not macvtap)&lt;br /&gt;
  back to the lowerdev, so that setting up forwarding&lt;br /&gt;
  from macvlan disables LRO on the lowerdev&lt;br /&gt;
&lt;br /&gt;
* virtio: preserve packets exactly with LRO&lt;br /&gt;
  LRO is not normally compatible with forwarding.&lt;br /&gt;
  With virtio we are getting packets from a linux host,&lt;br /&gt;
  so we could conceivably preserve packets exactly&lt;br /&gt;
  even with LRO. I am guessing other hardware could be&lt;br /&gt;
  doing this as well.&lt;br /&gt;
&lt;br /&gt;
* vxlan&lt;br /&gt;
  What could we do here?&lt;br /&gt;
&lt;br /&gt;
* bridging without promisc mode with OVS&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== testing projects ===&lt;br /&gt;
Keeping networking stable is the highest priority.&lt;br /&gt;
&lt;br /&gt;
* Write some unit tests for vhost-net/vhost-scsi&lt;br /&gt;
* Run weekly test on upstream HEAD covering test matrix with autotest&lt;br /&gt;
* Measure the effect of each of the above-mentioned optimizations&lt;br /&gt;
  - Use autotest network performance regression testing (that runs netperf)&lt;br /&gt;
  - Also test any wild idea that works. Some may be useful.&lt;br /&gt;
* Migrate some of the performance regression autotest functionality into Netperf&lt;br /&gt;
  - Get the CPU-utilization of the Host and the other-party, and add them to the report. This is also true for other Host measures, such as vmexits, interrupts, ...&lt;br /&gt;
  - Run Netperf in demo-mode, and measure only the time when all the sessions are active (could be many seconds after the beginning of the tests)&lt;br /&gt;
  - Packaging of Netperf in Fedora / RHEL (exists in Fedora). Licensing could be an issue.&lt;br /&gt;
  - Make the scripts more visible&lt;br /&gt;
&lt;br /&gt;
=== non-virtio-net devices ===&lt;br /&gt;
* e1000: stabilize&lt;br /&gt;
&lt;br /&gt;
=== test matrix ===&lt;br /&gt;
&lt;br /&gt;
DOA test matrix (all combinations should work):&lt;br /&gt;
        vhost: test both on and off, obviously&lt;br /&gt;
        test: hotplug/unplug, vlan/mac filtering, netperf,&lt;br /&gt;
             file copy both ways: scp, NFS, NTFS&lt;br /&gt;
        guests: linux: release and debug kernels, windows&lt;br /&gt;
        conditions: plain run, run while under migration,&lt;br /&gt;
                vhost on/off migration&lt;br /&gt;
        networking setup: simple, qos with cgroups&lt;br /&gt;
        host configuration: host-guest, external-guest&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=GuestProgrammableMacVlanFiltering&amp;diff=4913</id>
		<title>GuestProgrammableMacVlanFiltering</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=GuestProgrammableMacVlanFiltering&amp;diff=4913"/>
		<updated>2013-10-30T05:24:45Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== guest programmable mac/vlan filtering with macvtap ==&lt;br /&gt;
&lt;br /&gt;
This would be nice to have, so that bridging or macvlan can be used inside the guest.&lt;br /&gt;
&lt;br /&gt;
We need to be able to:&lt;br /&gt;
* change mac address of the guest virtio-net interface.&lt;br /&gt;
* create a vlan device on the guest virtio-net device&lt;br /&gt;
* set promiscuous mode on guest virtio-net device&lt;br /&gt;
* make all of this controllable by the host admin&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
TODO:&lt;br /&gt;
* There&#039;s a patch [http://thread.gmane.org/gmane.comp.emulators.qemu/37714/focus=37719] proposed by Alex Williamson to do TX mac filtering in TUN. It&#039;s still in RFC state, with no recent activity in the thread. Try a rewrite based on the comments.&lt;br /&gt;
&lt;br /&gt;
* Implement filtering in macvtap. The filtering information will be received through the TUNSETTXFILTER ioctl (added by the patch above).&lt;br /&gt;
&lt;br /&gt;
* Implement promiscuous mode in guest virtio-net driver. No ideas here, yet.&lt;br /&gt;
&lt;br /&gt;
* Control should be done via qemu/virtio features. Need a way to disable access that qemu can&#039;t override unless it has net admin capability.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
QEMU:&lt;br /&gt;
&lt;br /&gt;
* Amos Kong is working on the QEMU side [http://git.qemu.org/?p=qemu.git;a=commit;h=b1be42803b31a913bab65bab563a8760ad2e7f7f] to add an event notification when the guest changes its rx-filter config (main-mac, rx-mode, mac-table, vlan-table). Libvirt will query the rx-filter config from the monitor (query-rx-filter), then sync the change to the host device.&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=4912</id>
		<title>NetworkingTodo</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=4912"/>
		<updated>2013-10-30T05:21:46Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page should cover all networking related activity in KVM,&lt;br /&gt;
currently most info is related to virtio-net.&lt;br /&gt;
&lt;br /&gt;
TODO: add bugzilla entry links.&lt;br /&gt;
&lt;br /&gt;
=== projects in progress. contributions are still very welcome! ===&lt;br /&gt;
&lt;br /&gt;
* vhost-net scalability tuning: threading for many VMs&lt;br /&gt;
      Plan: switch to workqueue shared by many VMs&lt;br /&gt;
      http://www.mail-archive.com/kvm@vger.kernel.org/msg69868.html&lt;br /&gt;
&lt;br /&gt;
http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/479e3578ed05bfac85257b4200427735!OpenDocument&lt;br /&gt;
&lt;br /&gt;
      Developer: Bandan Das&lt;br /&gt;
      Testing: netperf guest to guest&lt;br /&gt;
&lt;br /&gt;
* multiqueue support in macvtap&lt;br /&gt;
       multiqueue is only supported for tun.&lt;br /&gt;
       Add support for macvtap.&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* support more queues&lt;br /&gt;
     We limit TUN to 8 queues, but we really want&lt;br /&gt;
     1 queue per guest CPU. The limit comes from net&lt;br /&gt;
     core, need to teach it to allocate array of&lt;br /&gt;
     pointers and not array of queues.&lt;br /&gt;
     Jason has a draft patch to use a flex array.&lt;br /&gt;
     Another thing is to move the flow caches out of tun_struct.&lt;br /&gt;
     Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* enable multiqueue by default&lt;br /&gt;
       Multiqueue causes regressions in some workloads, so&lt;br /&gt;
       it is off by default. Detect this and enable/disable&lt;br /&gt;
       automatically so that we can turn it on by default.&lt;br /&gt;
       The regression occurs because GSO tends to batch less when mq is enabled.&lt;br /&gt;
       https://patchwork.kernel.org/patch/2235191/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* rework on flow caches&lt;br /&gt;
       The current hlist implementation of flow caches has several limitations:&lt;br /&gt;
       1) in the worst case, lookup degenerates to a linear search&lt;br /&gt;
       2) it does not scale&lt;br /&gt;
       https://patchwork.kernel.org/patch/2025121/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
       &lt;br /&gt;
* eliminate the extra copy in virtio-net driver&lt;br /&gt;
       We currently do an extra copy of 128 bytes for every packet.&lt;br /&gt;
       This could be eliminated for small packets by:&lt;br /&gt;
       1) using build_skb() and a head frag&lt;br /&gt;
       2) a bigger vnet header length ( &amp;gt;= NET_SKB_PAD + NET_IP_ALIGN )&lt;br /&gt;
       Or use a dedicated queue for receiving small packets? (risks reordering)&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* make pktgen work for virtio-net ( or partially orphan )&lt;br /&gt;
       virtio-net orphans the skb during tx,&lt;br /&gt;
       which makes pktgen wait forever on the refcnt.&lt;br /&gt;
       Jason&#039;s idea: introduce a flag to tell pktgen not to wait&lt;br /&gt;
       Discussion here: https://patchwork.kernel.org/patch/1800711/&lt;br /&gt;
       MST&#039;s idea: add a .ndo_tx_polling not only for pktgen&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* Add HW_VLAN_TX support for tap&lt;br /&gt;
       Eliminate the extra data moving for tagged packets&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* Announce self by guest driver&lt;br /&gt;
       Send gratuitous ARP from the guest driver. The guest part is finished;&lt;br /&gt;
       the QEMU part is ongoing.&lt;br /&gt;
       V7 patches are here:&lt;br /&gt;
       http://lists.nongnu.org/archive/html/qemu-devel/2013-03/msg01127.html&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* guest programmable mac/vlan filtering with macvtap&lt;br /&gt;
        Developer: Amos Kong&lt;br /&gt;
        qemu: https://bugzilla.redhat.com/show_bug.cgi?id=848203&lt;br /&gt;
        libvirt: https://bugzilla.redhat.com/show_bug.cgi?id=848199&lt;br /&gt;
        https://git.kernel.org/cgit/virt/kvm/mst/qemu.git/patch/?id=1c0fa6b709d02fe4f98d4ce7b55a6cc3c925791c&lt;br /&gt;
        Status: qemu patch applied, [[GuestProgrammableMacVlanFiltering]]&lt;br /&gt;
&lt;br /&gt;
* bridge without promisc mode in NIC&lt;br /&gt;
  given hardware support, teach bridge&lt;br /&gt;
  to program mac/vlan filtering in NIC&lt;br /&gt;
  Helps performance and security on noisy LANs&lt;br /&gt;
  http://comments.gmane.org/gmane.linux.network/266546&lt;br /&gt;
  Developer: Vlad Yasevich&lt;br /&gt;
&lt;br /&gt;
* reduce networking latency:&lt;br /&gt;
  allow handling short packets from softirq or VCPU context&lt;br /&gt;
  Plan:&lt;br /&gt;
    We are going through the scheduler 3 times&lt;br /&gt;
    (could be up to 5 if softirqd is involved)&lt;br /&gt;
    Consider RX: host irq -&amp;gt; io thread -&amp;gt; VCPU thread -&amp;gt;&lt;br /&gt;
    guest irq -&amp;gt; guest thread.&lt;br /&gt;
    This adds a lot of latency.&lt;br /&gt;
    We can cut it by some 1.5x if we do a bit of work&lt;br /&gt;
    either in the VCPU or softirq context.&lt;br /&gt;
  Testing: netperf TCP RR - should be improved drastically&lt;br /&gt;
           netperf TCP STREAM guest to host - no regression&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* Flexible buffers: put virtio header inline with packet data&lt;br /&gt;
  https://patchwork.kernel.org/patch/1540471/&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* device failover to allow migration with assigned devices&lt;br /&gt;
  https://fedoraproject.org/wiki/Features/Virt_Device_Failover&lt;br /&gt;
  Developer: Gal Hammer, Cole Robinson, Laine Stump, MST&lt;br /&gt;
&lt;br /&gt;
* Reuse vringh code for better maintainability&lt;br /&gt;
  Developer: Rusty Russell&lt;br /&gt;
&lt;br /&gt;
* Improve stats, make them more helpful for performance analysis&lt;br /&gt;
  Developer: Sriram Narasimhan&lt;br /&gt;
&lt;br /&gt;
* Enable GRO for packets coming to bridge from a tap interface&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* Better support for windows LRO&lt;br /&gt;
  Extend virtio-header with statistics for GRO packets:&lt;br /&gt;
  number of packets coalesced and number of duplicate ACKs coalesced&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* IPoIB infiniband bridging&lt;br /&gt;
  Plan: implement macvtap for ipoib and virtio-ipoib&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* netdev polling for virtio.&lt;br /&gt;
  There are two kinds of netdev polling:&lt;br /&gt;
  - netpoll - used for debugging&lt;br /&gt;
  - proposed low latency net polling&lt;br /&gt;
  See http://lkml.indiana.edu/hypermail/linux/kernel/1303.0/00553.html&lt;br /&gt;
  Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* sharing config interrupts&lt;br /&gt;
  Support more devices by sharing a single MSI vector&lt;br /&gt;
  between multiple virtio devices.&lt;br /&gt;
  (Applies to virtio-blk too).&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* use kvm eventfd support for injecting level interrupts,&lt;br /&gt;
  enable vhost by default for level interrupts&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* Bug: e1000 &amp;amp; rtl8139: changing the macaddr in the guest is not reflected in qemu (info network)&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
  https://bugzilla.redhat.com/show_bug.cgi?id=922589&lt;br /&gt;
  Status: patches applied&lt;br /&gt;
&lt;br /&gt;
=== projects that are not started yet - no owner ===&lt;br /&gt;
&lt;br /&gt;
* receive side zero copy&lt;br /&gt;
  The ideal is a NIC with accelerated RFS support,&lt;br /&gt;
  so we can feed the virtio rx buffers into the correct NIC queue.&lt;br /&gt;
  Depends on non promisc NIC support in bridge.&lt;br /&gt;
  Search for &amp;quot;Xin Xiaohui: Provide a zero-copy method on KVM virtio-net&amp;quot;&lt;br /&gt;
  for a very old prototype&lt;br /&gt;
&lt;br /&gt;
* RDMA bridging&lt;br /&gt;
&lt;br /&gt;
* DMA engine (IOAT) use in tun&lt;br /&gt;
  Old patch here: [PATCH RFC] tun: dma engine support&lt;br /&gt;
  It does not speed things up. Need to see why and&lt;br /&gt;
  what can be done.&lt;br /&gt;
&lt;br /&gt;
* virtio API extension: improve small packet/large buffer performance:&lt;br /&gt;
  support &amp;quot;reposting&amp;quot; buffers for mergeable buffers,&lt;br /&gt;
  support pool for indirect buffers&lt;br /&gt;
&lt;br /&gt;
* more GSO type support:&lt;br /&gt;
       The kernel does not yet support some GSO types: FCOE, GRE, UDP_TUNNEL&lt;br /&gt;
&lt;br /&gt;
* ring aliasing:&lt;br /&gt;
  using vhost-net as a networking backend, with virtio-net in QEMU&lt;br /&gt;
  as the guest-facing device.&lt;br /&gt;
  This gives you the best of both worlds: QEMU acts as a first&lt;br /&gt;
  line of defense against a malicious guest while still getting the&lt;br /&gt;
  performance advantages of vhost-net (zero-copy).&lt;br /&gt;
  In fact, a bit of complexity in vhost was put there in the vague hope of&lt;br /&gt;
  supporting something like this: virtio rings are not translated through&lt;br /&gt;
  regular memory tables; instead, vhost gets a pointer to the ring address.&lt;br /&gt;
  This allows qemu to act as a man in the middle,&lt;br /&gt;
  verifying the descriptors but not touching the packet data.&lt;br /&gt;
&lt;br /&gt;
* non-virtio device support with vhost&lt;br /&gt;
  Use vhost interface for guests that don&#039;t use virtio-net&lt;br /&gt;
&lt;br /&gt;
=== vague ideas: path to implementation not clear ===&lt;br /&gt;
&lt;br /&gt;
* ring redesign:&lt;br /&gt;
      find a way to test raw ring performance &lt;br /&gt;
      fix cacheline bounces &lt;br /&gt;
      reduce interrupts&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* irq/numa affinity:&lt;br /&gt;
     networking goes much faster with irq pinning:&lt;br /&gt;
     both with and without numa.&lt;br /&gt;
     what can be done to make the non-pinned setup go faster?&lt;br /&gt;
&lt;br /&gt;
* reduce conflict with VCPU thread&lt;br /&gt;
    if VCPU and networking run on same CPU,&lt;br /&gt;
    they conflict resulting in bad performance.&lt;br /&gt;
    Fix that, push vhost thread out to another CPU&lt;br /&gt;
    more aggressively.&lt;br /&gt;
&lt;br /&gt;
* rx mac filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we already have filtering in the bridge&lt;br /&gt;
        we have a small table of addresses and need to make it larger&lt;br /&gt;
        if we only need filtering for unicast (multicast is handled by IGMP filtering)&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we already have filtering in the bridge&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in bridge&lt;br /&gt;
        kernel part is done (Vlad Yasevich)&lt;br /&gt;
        teach qemu to notify libvirt to enable the filter (still to do) (the existing NIC_RX_FILTER_CHANGED event contains vlan-tables)&lt;br /&gt;
&lt;br /&gt;
* tx coalescing&lt;br /&gt;
        Delay several packets before kicking the device.&lt;br /&gt;
&lt;br /&gt;
* interrupt coalescing&lt;br /&gt;
        Reduce the number of interrupts&lt;br /&gt;
&lt;br /&gt;
* bridging on top of macvlan &lt;br /&gt;
  add code to forward LRO status from macvlan (not macvtap)&lt;br /&gt;
  back to the lowerdev, so that setting up forwarding&lt;br /&gt;
  from macvlan disables LRO on the lowerdev&lt;br /&gt;
&lt;br /&gt;
* virtio: preserve packets exactly with LRO&lt;br /&gt;
  LRO is not normally compatible with forwarding.&lt;br /&gt;
  With virtio we are getting packets from a linux host,&lt;br /&gt;
  so we could conceivably preserve packets exactly&lt;br /&gt;
  even with LRO. I am guessing other hardware could be&lt;br /&gt;
  doing this as well.&lt;br /&gt;
&lt;br /&gt;
* vxlan&lt;br /&gt;
  What could we do here?&lt;br /&gt;
&lt;br /&gt;
* bridging without promisc mode with OVS&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== testing projects ===&lt;br /&gt;
Keeping networking stable is the highest priority.&lt;br /&gt;
&lt;br /&gt;
* Write some unit tests for vhost-net/vhost-scsi&lt;br /&gt;
* Run weekly test on upstream HEAD covering test matrix with autotest&lt;br /&gt;
* Measure the effect of each of the above-mentioned optimizations&lt;br /&gt;
  - Use autotest network performance regression testing (that runs netperf)&lt;br /&gt;
  - Also test any wild idea that works. Some may be useful.&lt;br /&gt;
* Migrate some of the performance regression autotest functionality into Netperf&lt;br /&gt;
  - Get the CPU-utilization of the Host and the other-party, and add them to the report. This is also true for other Host measures, such as vmexits, interrupts, ...&lt;br /&gt;
  - Run Netperf in demo-mode, and measure only the time when all the sessions are active (could be many seconds after the beginning of the tests)&lt;br /&gt;
  - Packaging of Netperf in Fedora / RHEL (exists in Fedora). Licensing could be an issue.&lt;br /&gt;
  - Make the scripts more visible&lt;br /&gt;
&lt;br /&gt;
=== non-virtio-net devices ===&lt;br /&gt;
* e1000: stabilize&lt;br /&gt;
&lt;br /&gt;
=== test matrix ===&lt;br /&gt;
&lt;br /&gt;
DOA test matrix (all combinations should work):&lt;br /&gt;
        vhost: test both on and off, obviously&lt;br /&gt;
        test: hotplug/unplug, vlan/mac filtering, netperf,&lt;br /&gt;
             file copy both ways: scp, NFS, NTFS&lt;br /&gt;
        guests: linux: release and debug kernels, windows&lt;br /&gt;
        conditions: plain run, run while under migration,&lt;br /&gt;
                vhost on/off migration&lt;br /&gt;
        networking setup: simple, qos with cgroups&lt;br /&gt;
        host configuration: host-guest, external-guest&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=4871</id>
		<title>NetworkingTodo</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=4871"/>
		<updated>2013-09-11T07:57:17Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page should cover all networking related activity in KVM,&lt;br /&gt;
currently most info is related to virtio-net.&lt;br /&gt;
&lt;br /&gt;
TODO: add bugzilla entry links.&lt;br /&gt;
&lt;br /&gt;
=== projects in progress. contributions are still very welcome! ===&lt;br /&gt;
&lt;br /&gt;
* vhost-net scalability tuning: threading for many VMs&lt;br /&gt;
      Plan: switch to workqueue shared by many VMs&lt;br /&gt;
      http://www.mail-archive.com/kvm@vger.kernel.org/msg69868.html&lt;br /&gt;
&lt;br /&gt;
http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/479e3578ed05bfac85257b4200427735!OpenDocument&lt;br /&gt;
&lt;br /&gt;
      Developer: Bandan Das&lt;br /&gt;
      Testing: netperf guest to guest&lt;br /&gt;
&lt;br /&gt;
* multiqueue support in macvtap&lt;br /&gt;
       multiqueue is only supported for tun.&lt;br /&gt;
       Add support for macvtap.&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* support more queues&lt;br /&gt;
     We limit TUN to 8 queues, but we really want&lt;br /&gt;
     1 queue per guest CPU. The limit comes from net&lt;br /&gt;
     core, need to teach it to allocate array of&lt;br /&gt;
     pointers and not array of queues.&lt;br /&gt;
     Jason has a draft patch to use a flex array.&lt;br /&gt;
     Another thing is to move the flow caches out of tun_struct.&lt;br /&gt;
     Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* enable multiqueue by default&lt;br /&gt;
       Multiqueue causes regressions in some workloads, so&lt;br /&gt;
       it is off by default. Detect this and enable/disable&lt;br /&gt;
       automatically so that we can turn it on by default.&lt;br /&gt;
       The regression occurs because GSO tends to batch less when mq is enabled.&lt;br /&gt;
       https://patchwork.kernel.org/patch/2235191/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* rework on flow caches&lt;br /&gt;
       The current hlist implementation of flow caches has several limitations:&lt;br /&gt;
       1) in the worst case, lookup degenerates to a linear search&lt;br /&gt;
       2) it does not scale&lt;br /&gt;
       https://patchwork.kernel.org/patch/2025121/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
       &lt;br /&gt;
* eliminate the extra copy in virtio-net driver&lt;br /&gt;
       We currently do an extra copy of 128 bytes for every packet.&lt;br /&gt;
       This could be eliminated for small packets by:&lt;br /&gt;
       1) using build_skb() and a head frag&lt;br /&gt;
       2) a bigger vnet header length ( &amp;gt;= NET_SKB_PAD + NET_IP_ALIGN )&lt;br /&gt;
       Or use a dedicated queue for receiving small packets? (risks reordering)&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* make pktgen work for virtio-net ( or partially orphan )&lt;br /&gt;
       virtio-net orphans the skb during tx,&lt;br /&gt;
       which makes pktgen wait forever on the refcnt.&lt;br /&gt;
       Jason&#039;s idea: introduce a flag to tell pktgen not to wait&lt;br /&gt;
       Discussion here: https://patchwork.kernel.org/patch/1800711/&lt;br /&gt;
       MST&#039;s idea: add a .ndo_tx_polling not only for pktgen&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* Add HW_VLAN_TX support for tap&lt;br /&gt;
       Eliminate the extra data moving for tagged packets&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* Announce self by guest driver&lt;br /&gt;
       Send gratuitous ARP from the guest driver. The guest part is finished;&lt;br /&gt;
       the QEMU part is ongoing.&lt;br /&gt;
       V7 patches are here:&lt;br /&gt;
       http://lists.nongnu.org/archive/html/qemu-devel/2013-03/msg01127.html&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* guest programmable mac/vlan filtering with macvtap&lt;br /&gt;
        Developer: Amos Kong&lt;br /&gt;
        qemu: https://bugzilla.redhat.com/show_bug.cgi?id=848203&lt;br /&gt;
        libvirt: https://bugzilla.redhat.com/show_bug.cgi?id=848199&lt;br /&gt;
        https://git.kernel.org/cgit/virt/kvm/mst/qemu.git/patch/?id=1c0fa6b709d02fe4f98d4ce7b55a6cc3c925791c&lt;br /&gt;
        Status: [[GuestProgrammableMacVlanFiltering]]&lt;br /&gt;
&lt;br /&gt;
* bridge without promisc mode in NIC&lt;br /&gt;
  given hardware support, teach bridge&lt;br /&gt;
  to program mac/vlan filtering in NIC&lt;br /&gt;
  Helps performance and security on noisy LANs&lt;br /&gt;
  http://comments.gmane.org/gmane.linux.network/266546&lt;br /&gt;
  Developer: Vlad Yasevich&lt;br /&gt;
&lt;br /&gt;
* reduce networking latency:&lt;br /&gt;
  allow handling short packets from softirq or VCPU context&lt;br /&gt;
  Plan:&lt;br /&gt;
    We are going through the scheduler 3 times&lt;br /&gt;
    (could be up to 5 if softirqd is involved)&lt;br /&gt;
    Consider RX: host irq -&amp;gt; io thread -&amp;gt; VCPU thread -&amp;gt;&lt;br /&gt;
    guest irq -&amp;gt; guest thread.&lt;br /&gt;
    This adds a lot of latency.&lt;br /&gt;
    We can cut it by some 1.5x if we do a bit of work&lt;br /&gt;
    either in the VCPU or softirq context.&lt;br /&gt;
  Testing: netperf TCP RR - should be improved drastically&lt;br /&gt;
           netperf TCP STREAM guest to host - no regression&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* Flexible buffers: put virtio header inline with packet data&lt;br /&gt;
  https://patchwork.kernel.org/patch/1540471/&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* device failover to allow migration with assigned devices&lt;br /&gt;
  https://fedoraproject.org/wiki/Features/Virt_Device_Failover&lt;br /&gt;
  Developer: Gal Hammer, Cole Robinson, Laine Stump, MST&lt;br /&gt;
&lt;br /&gt;
* Reuse vringh code for better maintainability&lt;br /&gt;
  Developer: Rusty Russell&lt;br /&gt;
&lt;br /&gt;
* Improve stats, make them more helpful for performance analysis&lt;br /&gt;
  Developer: Sriram Narasimhan&lt;br /&gt;
&lt;br /&gt;
* Bug: e1000 &amp;amp; rtl8139: changing the macaddr in the guest is not propagated to qemu (info network)&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
  https://bugzilla.redhat.com/show_bug.cgi?id=922589&lt;br /&gt;
&lt;br /&gt;
* Enable GRO for packets coming to bridge from a tap interface&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* Better support for Windows LRO&lt;br /&gt;
  Extend virtio-header with statistics for GRO packets:&lt;br /&gt;
  number of packets coalesced and number of duplicate ACKs coalesced&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* IPoIB infiniband bridging&lt;br /&gt;
  Plan: implement macvtap for ipoib and virtio-ipoib&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* netdev polling for virtio.&lt;br /&gt;
  There are two kinds of netdev polling:&lt;br /&gt;
  - netpoll - used for debugging&lt;br /&gt;
  - proposed low latency net polling&lt;br /&gt;
  See http://lkml.indiana.edu/hypermail/linux/kernel/1303.0/00553.html&lt;br /&gt;
  Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* sharing config interrupts&lt;br /&gt;
  Support more devices by sharing a single msi vector&lt;br /&gt;
  between multiple virtio devices.&lt;br /&gt;
  (Applies to virtio-blk too).&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* use kvm eventfd support for injecting level interrupts,&lt;br /&gt;
  enable vhost by default for level interrupts&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
=== projects that are not started yet - no owner ===&lt;br /&gt;
&lt;br /&gt;
* receive side zero copy&lt;br /&gt;
  The ideal is a NIC with accelerated RFS support,&lt;br /&gt;
  so we can feed the virtio rx buffers into the correct NIC queue.&lt;br /&gt;
  Depends on non promisc NIC support in bridge.&lt;br /&gt;
&lt;br /&gt;
* RDMA bridging&lt;br /&gt;
&lt;br /&gt;
* DMA engine (IOAT) use in tun&lt;br /&gt;
  Old patch here: [PATCH RFC] tun: dma engine support&lt;br /&gt;
  It does not speed things up. Need to see why and&lt;br /&gt;
  what can be done.&lt;br /&gt;
&lt;br /&gt;
* virtio API extension: improve small packet/large buffer performance:&lt;br /&gt;
  support &amp;quot;reposting&amp;quot; buffers for mergeable buffers,&lt;br /&gt;
  support pool for indirect buffers&lt;br /&gt;
&lt;br /&gt;
* more GSO type support:&lt;br /&gt;
       The kernel does not yet support these GSO types: FCOE, GRE, UDP_TUNNEL&lt;br /&gt;
&lt;br /&gt;
* ring aliasing:&lt;br /&gt;
  using vhost-net as a networking backend with virtio-net in QEMU&lt;br /&gt;
  being what&#039;s guest facing.&lt;br /&gt;
  This gives you the best of both worlds: QEMU acts as a first&lt;br /&gt;
  line of defense against a malicious guest while still getting the&lt;br /&gt;
  performance advantages of vhost-net (zero-copy).&lt;br /&gt;
  In fact a bit of complexity in vhost was put there in the vague hope to&lt;br /&gt;
  support something like this: virtio rings are not translated through&lt;br /&gt;
  regular memory tables, instead, vhost gets a pointer to ring address.&lt;br /&gt;
  This allows qemu to act as a man in the middle,&lt;br /&gt;
  verifying the descriptors but not touching the packet data.&lt;br /&gt;
&lt;br /&gt;
* non-virtio device support with vhost&lt;br /&gt;
  Use vhost interface for guests that don&#039;t use virtio-net&lt;br /&gt;
&lt;br /&gt;
=== vague ideas: path to implementation not clear ===&lt;br /&gt;
&lt;br /&gt;
* ring redesign:&lt;br /&gt;
      find a way to test raw ring performance &lt;br /&gt;
      fix cacheline bounces &lt;br /&gt;
      reduce interrupts&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* irq/numa affinity:&lt;br /&gt;
     networking goes much faster with irq pinning:&lt;br /&gt;
     both with and without numa.&lt;br /&gt;
     what can be done to make the non-pinned setup go faster?&lt;br /&gt;
&lt;br /&gt;
* reduce conflict with VCPU thread&lt;br /&gt;
    if VCPU and networking run on same CPU,&lt;br /&gt;
    they conflict resulting in bad performance.&lt;br /&gt;
    Fix that, push vhost thread out to another CPU&lt;br /&gt;
    more aggressively.&lt;br /&gt;
&lt;br /&gt;
* rx mac filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we already have filtering in the bridge&lt;br /&gt;
        we have a small table of addresses and need to make it larger&lt;br /&gt;
        if we only need filtering for unicast (multicast is handled by IMP filtering)&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we already have filtering in the bridge&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in bridge&lt;br /&gt;
        kernel part is done (Vlad Yasevich)&lt;br /&gt;
        teach qemu to notify libvirt to enable the filter (still to do); the existing NIC_RX_FILTER_CHANGED event contains vlan-tables&lt;br /&gt;
&lt;br /&gt;
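As a rough sketch of how a management layer could consume the NIC_RX_FILTER_CHANGED event and the query-rx-filter reply mentioned in the item above (the field names are modeled on qemu's QMP schema but should be treated as illustrative, and the device name is made up):

```python
import json

# Hypothetical NIC_RX_FILTER_CHANGED event as delivered over QMP.
event = json.loads('{"event": "NIC_RX_FILTER_CHANGED",'
                   ' "data": {"name": "net0",'
                   ' "path": "/machine/peripheral/net0/virtio-backend"}}')

# Hypothetical query-rx-filter reply for the same NIC.
info = {
    'name': 'net0',
    'promiscuous': False,
    'main-mac': '52:54:00:12:34:56',
    'unicast-table': [],
    'multicast-table': ['01:00:5e:00:00:01'],
    'vlan-table': [0, 100],
}

def macs_to_program(rx_filter):
    # Union of the main MAC and the guest-programmed tables: this is the
    # set a libvirt-like manager would push down to the macvtap device.
    return ([rx_filter['main-mac']]
            + rx_filter['unicast-table']
            + rx_filter['multicast-table'])
```

The same reply carries the vlan-table, so the manager can program vlan filtering from it as well.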
* tx coalescing&lt;br /&gt;
        Delay several packets before kicking the device.&lt;br /&gt;
&lt;br /&gt;
* interrupt coalescing&lt;br /&gt;
        Reduce the number of interrupts&lt;br /&gt;
&lt;br /&gt;
* bridging on top of macvlan &lt;br /&gt;
  add code to forward LRO status from macvlan (not macvtap)&lt;br /&gt;
  back to the lowerdev, so that setting up forwarding&lt;br /&gt;
  from macvlan disables LRO on the lowerdev&lt;br /&gt;
&lt;br /&gt;
* virtio: preserve packets exactly with LRO&lt;br /&gt;
  LRO is not normally compatible with forwarding.&lt;br /&gt;
  With virtio we are getting packets from a linux host,&lt;br /&gt;
  so we could conceivably preserve packets exactly&lt;br /&gt;
  even with LRO. I am guessing other hardware could be&lt;br /&gt;
  doing this as well.&lt;br /&gt;
&lt;br /&gt;
* vxlan&lt;br /&gt;
  What could we do here?&lt;br /&gt;
&lt;br /&gt;
* bridging without promisc mode with OVS&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== testing projects ===&lt;br /&gt;
Keeping networking stable is the highest priority.&lt;br /&gt;
&lt;br /&gt;
* Write some unit tests for vhost-net/vhost-scsi&lt;br /&gt;
* Run weekly test on upstream HEAD covering test matrix with autotest&lt;br /&gt;
* Measure the effect of each of the above-mentioned optimizations&lt;br /&gt;
  - Use autotest network performance regression testing (that runs netperf)&lt;br /&gt;
  - Also test any wild idea that works. Some may be useful.&lt;br /&gt;
&lt;br /&gt;
=== non-virtio-net devices ===&lt;br /&gt;
* e1000: stabilize&lt;br /&gt;
&lt;br /&gt;
=== test matrix ===&lt;br /&gt;
&lt;br /&gt;
DOA test matrix (all combinations should work):&lt;br /&gt;
        vhost: test both on and off, obviously&lt;br /&gt;
        test: hotplug/unplug, vlan/mac filtering, netperf,&lt;br /&gt;
             file copy both ways: scp, NFS, NTFS&lt;br /&gt;
        guests: linux: release and debug kernels, windows&lt;br /&gt;
        conditions: plain run, run while under migration,&lt;br /&gt;
                vhost on/off migration&lt;br /&gt;
        networking setup: simple, qos with cgroups&lt;br /&gt;
        host configuration: host-guest, external-guest&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=4845</id>
		<title>NetworkingTodo</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=4845"/>
		<updated>2013-07-22T13:48:46Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page should cover all networking related activity in KVM,&lt;br /&gt;
currently most info is related to virtio-net.&lt;br /&gt;
&lt;br /&gt;
TODO: add bugzilla entry links.&lt;br /&gt;
&lt;br /&gt;
=== projects in progress. contributions are still very welcome! ===&lt;br /&gt;
&lt;br /&gt;
* vhost-net scalability tuning: threading for many VMs&lt;br /&gt;
      Plan: switch to workqueue shared by many VMs&lt;br /&gt;
      http://www.mail-archive.com/kvm@vger.kernel.org/msg69868.html&lt;br /&gt;
&lt;br /&gt;
http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/479e3578ed05bfac85257b4200427735!OpenDocument&lt;br /&gt;
&lt;br /&gt;
      Developer: Bandan Das&lt;br /&gt;
      Testing: netperf guest to guest&lt;br /&gt;
&lt;br /&gt;
* multiqueue support in macvtap&lt;br /&gt;
       multiqueue is only supported for tun.&lt;br /&gt;
       Add support for macvtap.&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* support more queues&lt;br /&gt;
     We limit TUN to 8 queues, but we really want&lt;br /&gt;
     1 queue per guest CPU. The limit comes from net&lt;br /&gt;
     core, need to teach it to allocate array of&lt;br /&gt;
     pointers and not array of queues.&lt;br /&gt;
     Jason has a draft patch to use flex array.&lt;br /&gt;
     Another thing is to move the flow caches out of tun_struct.&lt;br /&gt;
     Developer: Jason Wang&lt;br /&gt;
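The per-queue attach that makes one queue per guest CPU possible can be sketched from userspace as follows (constants taken from linux/if_tun.h and linux/if.h; the device name is an example, and the actual open/ioctl calls are shown only in comments since they need privileges):

```python
import struct

# Multiqueue tun/tap from userspace: opening /dev/net/tun once per queue
# and issuing TUNSETIFF with IFF_MULTI_QUEUE attaches one queue per fd.
TUNSETIFF       = 0x400454ca   # _IOW('T', 202, int)
IFF_TAP         = 0x0002
IFF_NO_PI       = 0x1000
IFF_MULTI_QUEUE = 0x0100       # ideally one queue per guest CPU

def ifreq(name, flags):
    # struct ifreq: 16-byte interface name followed by a union
    # (here the short flags field), padded to 40 bytes total.
    return struct.pack('16sH', name.encode(), flags).ljust(40, b'\0')

req = ifreq('vnet0', IFF_TAP | IFF_NO_PI | IFF_MULTI_QUEUE)
# For each queue (needs CAP_NET_ADMIN):
#   fd = os.open('/dev/net/tun', os.O_RDWR)
#   fcntl.ioctl(fd, TUNSETIFF, req)
```

The 8-queue cap discussed above is what limits how many such fds can be attached to one device today.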
&lt;br /&gt;
* enable multiqueue by default&lt;br /&gt;
       Multiqueue causes a regression in some workloads, thus&lt;br /&gt;
       it is off by default. Detect the situation and enable/disable&lt;br /&gt;
       automatically so it can be on by default.&lt;br /&gt;
       The regression occurs because GSO tends to batch less when mq is enabled.&lt;br /&gt;
       https://patchwork.kernel.org/patch/2235191/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* rework on flow caches&lt;br /&gt;
       The current hlist implementation of the flow caches has several limitations:&lt;br /&gt;
       1) worst-case linear search is slow&lt;br /&gt;
       2) it does not scale&lt;br /&gt;
       https://patchwork.kernel.org/patch/2025121/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
       &lt;br /&gt;
* eliminate the extra copy in virtio-net driver&lt;br /&gt;
       We currently do an extra copy of 128 bytes for every packet.&lt;br /&gt;
       This could be eliminated for small packets by:&lt;br /&gt;
       1) using build_skb() and head frag&lt;br /&gt;
       2) a bigger vnet header length ( &amp;gt;= NET_SKB_PAD + NET_IP_ALIGN )&lt;br /&gt;
       Or use a dedicated queue for receiving small packets? (risks reordering)&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* make pktgen work with virtio-net (or partially orphan)&lt;br /&gt;
       virtio-net orphans the skb during tx,&lt;br /&gt;
       which makes pktgen wait forever on the refcnt.&lt;br /&gt;
       Jason&#039;s idea: introduce a flag to tell pktgen not to wait&lt;br /&gt;
       Discussion here: https://patchwork.kernel.org/patch/1800711/&lt;br /&gt;
       MST&#039;s idea: add a .ndo_tx_polling not only for pktgen&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* Add HW_VLAN_TX support for tap&lt;br /&gt;
       Eliminate the extra data moving for tagged packets&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* Announce self by guest driver&lt;br /&gt;
       Send gratuitous ARP from the guest driver. The guest part is finished;&lt;br /&gt;
       the QEMU side is ongoing.&lt;br /&gt;
       V7 patches are here:&lt;br /&gt;
       http://lists.nongnu.org/archive/html/qemu-devel/2013-03/msg01127.html&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* guest programmable mac/vlan filtering with macvtap&lt;br /&gt;
        Developer: Amos Kong&lt;br /&gt;
        qemu: https://bugzilla.redhat.com/show_bug.cgi?id=848203&lt;br /&gt;
        libvirt: https://bugzilla.redhat.com/show_bug.cgi?id=848199&lt;br /&gt;
        https://git.kernel.org/cgit/virt/kvm/mst/qemu.git/patch/?id=1c0fa6b709d02fe4f98d4ce7b55a6cc3c925791c&lt;br /&gt;
        Status: [[GuestProgrammableMacVlanFiltering]]&lt;br /&gt;
&lt;br /&gt;
* bridge without promisc mode in NIC&lt;br /&gt;
  given hardware support, teach bridge&lt;br /&gt;
  to program mac/vlan filtering in NIC&lt;br /&gt;
  Helps performance and security on noisy LANs&lt;br /&gt;
  http://comments.gmane.org/gmane.linux.network/266546&lt;br /&gt;
  Developer: Vlad Yasevich&lt;br /&gt;
&lt;br /&gt;
* reduce networking latency:&lt;br /&gt;
  allow handling short packets from softirq or VCPU context&lt;br /&gt;
  Plan:&lt;br /&gt;
    We are going through the scheduler 3 times&lt;br /&gt;
    (could be up to 5 if softirqd is involved)&lt;br /&gt;
    Consider RX: host irq -&amp;gt; io thread -&amp;gt; VCPU thread -&amp;gt;&lt;br /&gt;
    guest irq -&amp;gt; guest thread.&lt;br /&gt;
    This adds a lot of latency.&lt;br /&gt;
    We can cut it by some 1.5x if we do a bit of work&lt;br /&gt;
    either in the VCPU or softirq context.&lt;br /&gt;
  Testing: netperf TCP RR - should be improved drastically&lt;br /&gt;
           netperf TCP STREAM guest to host - no regression&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* Flexible buffers: put virtio header inline with packet data&lt;br /&gt;
  https://patchwork.kernel.org/patch/1540471/&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* device failover to allow migration with assigned devices&lt;br /&gt;
  https://fedoraproject.org/wiki/Features/Virt_Device_Failover&lt;br /&gt;
  Developer: Gal Hammer, Cole Robinson, Laine Stump, MST&lt;br /&gt;
&lt;br /&gt;
* Reuse vringh code for better maintainability&lt;br /&gt;
  Developer: Rusty Russell&lt;br /&gt;
&lt;br /&gt;
* Improve stats, make them more helpful for performance analysis&lt;br /&gt;
  Developer: Sriram Narasimhan&lt;br /&gt;
&lt;br /&gt;
* Bug: e1000 &amp;amp; rtl8139: changing the macaddr in the guest is not propagated to qemu (info network)&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
  https://bugzilla.redhat.com/show_bug.cgi?id=922589&lt;br /&gt;
&lt;br /&gt;
* Enable GRO for packets coming to bridge from a tap interface&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* Better support for Windows LRO&lt;br /&gt;
  Extend virtio-header with statistics for GRO packets:&lt;br /&gt;
  number of packets coalesced and number of duplicate ACKs coalesced&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* IPoIB infiniband bridging&lt;br /&gt;
  Plan: implement macvtap for ipoib and virtio-ipoib&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
=== projects that are not started yet - no owner ===&lt;br /&gt;
* sharing config interrupts&lt;br /&gt;
  Support more devices by sharing a single msi vector&lt;br /&gt;
  between multiple virtio devices.&lt;br /&gt;
  (Applies to virtio-blk too).&lt;br /&gt;
&lt;br /&gt;
* netdev polling for virtio.&lt;br /&gt;
  There are two kinds of netdev polling:&lt;br /&gt;
  - netpoll - used for debugging&lt;br /&gt;
  - proposed low latency net polling&lt;br /&gt;
  See http://lkml.indiana.edu/hypermail/linux/kernel/1303.0/00553.html&lt;br /&gt;
&lt;br /&gt;
* receive side zero copy&lt;br /&gt;
  The ideal is a NIC with accelerated RFS support,&lt;br /&gt;
  so we can feed the virtio rx buffers into the correct NIC queue.&lt;br /&gt;
  Depends on non promisc NIC support in bridge.&lt;br /&gt;
&lt;br /&gt;
* RDMA bridging&lt;br /&gt;
&lt;br /&gt;
* DMA engine (IOAT) use in tun&lt;br /&gt;
  Old patch here: [PATCH RFC] tun: dma engine support&lt;br /&gt;
  It does not speed things up. Need to see why and&lt;br /&gt;
  what can be done.&lt;br /&gt;
&lt;br /&gt;
* use kvm eventfd support for injecting level interrupts,&lt;br /&gt;
  enable vhost by default for level interrupts&lt;br /&gt;
&lt;br /&gt;
* virtio API extension: improve small packet/large buffer performance:&lt;br /&gt;
  support &amp;quot;reposting&amp;quot; buffers for mergeable buffers,&lt;br /&gt;
  support pool for indirect buffers&lt;br /&gt;
&lt;br /&gt;
* more GSO type support:&lt;br /&gt;
       The kernel does not yet support these GSO types: FCOE, GRE, UDP_TUNNEL&lt;br /&gt;
&lt;br /&gt;
* ring aliasing:&lt;br /&gt;
  using vhost-net as a networking backend with virtio-net in QEMU&lt;br /&gt;
  being what&#039;s guest facing.&lt;br /&gt;
  This gives you the best of both worlds: QEMU acts as a first&lt;br /&gt;
  line of defense against a malicious guest while still getting the&lt;br /&gt;
  performance advantages of vhost-net (zero-copy).&lt;br /&gt;
  In fact a bit of complexity in vhost was put there in the vague hope to&lt;br /&gt;
  support something like this: virtio rings are not translated through&lt;br /&gt;
  regular memory tables, instead, vhost gets a pointer to ring address.&lt;br /&gt;
  This allows qemu to act as a man in the middle,&lt;br /&gt;
  verifying the descriptors but not touching the packet data.&lt;br /&gt;
&lt;br /&gt;
* non-virtio device support with vhost&lt;br /&gt;
  Use vhost interface for guests that don&#039;t use virtio-net&lt;br /&gt;
&lt;br /&gt;
=== vague ideas: path to implementation not clear ===&lt;br /&gt;
&lt;br /&gt;
* ring redesign:&lt;br /&gt;
      find a way to test raw ring performance &lt;br /&gt;
      fix cacheline bounces &lt;br /&gt;
      reduce interrupts&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* irq/numa affinity:&lt;br /&gt;
     networking goes much faster with irq pinning:&lt;br /&gt;
     both with and without numa.&lt;br /&gt;
     what can be done to make the non-pinned setup go faster?&lt;br /&gt;
&lt;br /&gt;
* reduce conflict with VCPU thread&lt;br /&gt;
    if VCPU and networking run on same CPU,&lt;br /&gt;
    they conflict resulting in bad performance.&lt;br /&gt;
    Fix that, push vhost thread out to another CPU&lt;br /&gt;
    more aggressively.&lt;br /&gt;
&lt;br /&gt;
* rx mac filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we already have filtering in the bridge&lt;br /&gt;
        we have a small table of addresses and need to make it larger&lt;br /&gt;
        if we only need filtering for unicast (multicast is handled by IMP filtering)&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we already have filtering in the bridge&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in bridge&lt;br /&gt;
        kernel part is done (Vlad Yasevich)&lt;br /&gt;
        teach qemu to notify libvirt to enable the filter (still to do); the existing NIC_RX_FILTER_CHANGED event contains vlan-tables&lt;br /&gt;
&lt;br /&gt;
* tx coalescing&lt;br /&gt;
        Delay several packets before kicking the device.&lt;br /&gt;
&lt;br /&gt;
* interrupt coalescing&lt;br /&gt;
        Reduce the number of interrupts&lt;br /&gt;
&lt;br /&gt;
* bridging on top of macvlan &lt;br /&gt;
  add code to forward LRO status from macvlan (not macvtap)&lt;br /&gt;
  back to the lowerdev, so that setting up forwarding&lt;br /&gt;
  from macvlan disables LRO on the lowerdev&lt;br /&gt;
&lt;br /&gt;
* preserve packets exactly with LRO&lt;br /&gt;
  LRO is not normally compatible with forwarding.&lt;br /&gt;
  With virtio we are getting packets from a linux host,&lt;br /&gt;
  so we could conceivably preserve packets exactly&lt;br /&gt;
  even with LRO. I am guessing other hardware could be&lt;br /&gt;
  doing this as well.&lt;br /&gt;
&lt;br /&gt;
* vxlan&lt;br /&gt;
  What could we do here?&lt;br /&gt;
&lt;br /&gt;
=== testing projects ===&lt;br /&gt;
Keeping networking stable is the highest priority.&lt;br /&gt;
&lt;br /&gt;
* Write some unit tests for vhost-net/vhost-scsi&lt;br /&gt;
* Run weekly test on upstream HEAD covering test matrix with autotest&lt;br /&gt;
* Measure the effect of each of the above-mentioned optimizations&lt;br /&gt;
  - Use autotest network performance regression testing (that runs netperf)&lt;br /&gt;
  - Also test any wild idea that works. Some may be useful.&lt;br /&gt;
&lt;br /&gt;
=== non-virtio-net devices ===&lt;br /&gt;
* e1000: stabilize&lt;br /&gt;
&lt;br /&gt;
=== test matrix ===&lt;br /&gt;
&lt;br /&gt;
DOA test matrix (all combinations should work):&lt;br /&gt;
        vhost: test both on and off, obviously&lt;br /&gt;
        test: hotplug/unplug, vlan/mac filtering, netperf,&lt;br /&gt;
             file copy both ways: scp, NFS, NTFS&lt;br /&gt;
        guests: linux: release and debug kernels, windows&lt;br /&gt;
        conditions: plain run, run while under migration,&lt;br /&gt;
                vhost on/off migration&lt;br /&gt;
        networking setup: simple, qos with cgroups&lt;br /&gt;
        host configuration: host-guest, external-guest&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=GuestProgrammableMacVlanFiltering&amp;diff=4830</id>
		<title>GuestProgrammableMacVlanFiltering</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=GuestProgrammableMacVlanFiltering&amp;diff=4830"/>
		<updated>2013-06-24T07:40:07Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== guest programmable mac/vlan filtering with macvtap ==&lt;br /&gt;
&lt;br /&gt;
This would be nice to have to be able to do bridging or use macvlan inside the guest.&lt;br /&gt;
&lt;br /&gt;
We need to be able to:&lt;br /&gt;
* change mac address of the guest virtio-net interface.&lt;br /&gt;
* create a vlan device on the guest virtio-net device&lt;br /&gt;
* set promiscuous mode on guest virtio-net device&lt;br /&gt;
* all this controllable by host admin&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
TODO:&lt;br /&gt;
* There&#039;s a patch [http://thread.gmane.org/gmane.comp.emulators.qemu/37714/focus=37719] proposed by Alex Williamson to do TX mac filtering in TUN. It is still in RFC state with no recent activity in the thread; try rewriting it based on the comments.&lt;br /&gt;
&lt;br /&gt;
* Implement filtering in macvtap. The filtering information will be received through TUNSETTXFILTER ioctl (by above patch).&lt;br /&gt;
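A sketch of the filter argument userspace would hand to that ioctl (the struct layout follows linux/if_tun.h; the ioctl constant is the x86-64 value and should be treated as illustrative, computed via the _IOW macro in C):

```python
import struct

# struct tun_filter from linux/if_tun.h:
#   __u16 flags; __u16 count; __u8 addr[count][6];
TUNSETTXFILTER   = 0x400454d1   # _IOW('T', 209, unsigned int) on x86-64
TUN_FLT_ALLMULTI = 0x0001       # accept all multicast besides the listed MACs

def pack_tun_filter(macs, flags=0):
    # Little-endian u16 flags and count, followed by 6 bytes per MAC.
    buf = struct.pack('=HH', flags, len(macs))
    for mac in macs:
        buf += bytes(int(octet, 16) for octet in mac.split(':'))
    return buf

flt = pack_tun_filter(['52:54:00:12:34:56'])
# ioctl(tap_fd, TUNSETTXFILTER, flt) would install this unicast filter;
# the TODO above is to give macvtap the same hook.
```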
&lt;br /&gt;
* Implement promiscuous mode in guest virtio-net driver. No ideas here, yet.&lt;br /&gt;
&lt;br /&gt;
* Control should be done via qemu/virtio features. Need a way to disable access that qemu can&#039;t override unless it has net admin capability.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
QEMU:&lt;br /&gt;
&lt;br /&gt;
* Amos Kong is working on the QEMU side [https://git.kernel.org/cgit/virt/kvm/mst/qemu.git/patch/?id=1c0fa6b709d02fe4f98d4ce7b55a6cc3c925791c] to add an event notification when the guest changes its rx-filter config (main-mac, rx-mode, mac-table, vlan-table). Libvirt will query the rx-filter config from the monitor (query-rx-filter), then sync the change to the host device.&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=4829</id>
		<title>NetworkingTodo</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=4829"/>
		<updated>2013-06-24T07:39:16Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page should cover all networking related activity in KVM,&lt;br /&gt;
currently most info is related to virtio-net.&lt;br /&gt;
&lt;br /&gt;
TODO: add bugzilla entry links.&lt;br /&gt;
&lt;br /&gt;
=== projects in progress. contributions are still very welcome! ===&lt;br /&gt;
&lt;br /&gt;
* vhost-net scalability tuning: threading for many VMs&lt;br /&gt;
      Plan: switch to workqueue shared by many VMs&lt;br /&gt;
      http://www.mail-archive.com/kvm@vger.kernel.org/msg69868.html&lt;br /&gt;
&lt;br /&gt;
http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/479e3578ed05bfac85257b4200427735!OpenDocument&lt;br /&gt;
&lt;br /&gt;
      Developer: Shirley Ma?, MST?&lt;br /&gt;
      Testing: netperf guest to guest&lt;br /&gt;
&lt;br /&gt;
* multiqueue support in macvtap&lt;br /&gt;
       multiqueue is only supported for tun.&lt;br /&gt;
       Add support for macvtap.&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* enable multiqueue by default&lt;br /&gt;
       Multiqueue causes a regression in some workloads, thus&lt;br /&gt;
       it is off by default. Detect the situation and enable/disable&lt;br /&gt;
       automatically so it can be on by default.&lt;br /&gt;
       The regression occurs because GSO tends to batch less when mq is enabled.&lt;br /&gt;
       https://patchwork.kernel.org/patch/2235191/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* rework on flow caches&lt;br /&gt;
       The current hlist implementation of the flow caches has several limitations:&lt;br /&gt;
       1) worst-case linear search is slow&lt;br /&gt;
       2) it does not scale&lt;br /&gt;
       https://patchwork.kernel.org/patch/2025121/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
       &lt;br /&gt;
* eliminate the extra copy in virtio-net driver&lt;br /&gt;
       We currently do an extra copy of 128 bytes for every packet.&lt;br /&gt;
       This could be eliminated for small packets by:&lt;br /&gt;
       1) using build_skb() and head frag&lt;br /&gt;
       2) a bigger vnet header length ( &amp;gt;= NET_SKB_PAD + NET_IP_ALIGN )&lt;br /&gt;
       Or use a dedicated queue for receiving small packets? (risks reordering)&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* make pktgen work with virtio-net (or partially orphan)&lt;br /&gt;
       virtio-net orphans the skb during tx,&lt;br /&gt;
       which makes pktgen wait forever on the refcnt.&lt;br /&gt;
       Jason&#039;s idea: introduce a flag to tell pktgen not to wait&lt;br /&gt;
       Discussion here: https://patchwork.kernel.org/patch/1800711/&lt;br /&gt;
       MST&#039;s idea: add a .ndo_tx_polling not only for pktgen&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* Add HW_VLAN_TX support for tap&lt;br /&gt;
       Eliminate the extra data moving for tagged packets&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* Announce self by guest driver&lt;br /&gt;
       Send gratuitous ARP from the guest driver. The guest part is finished;&lt;br /&gt;
       the QEMU side is ongoing.&lt;br /&gt;
       V7 patches are here:&lt;br /&gt;
       http://lists.nongnu.org/archive/html/qemu-devel/2013-03/msg01127.html&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* guest programmable mac/vlan filtering with macvtap&lt;br /&gt;
        Developer: Amos Kong&lt;br /&gt;
        qemu: https://bugzilla.redhat.com/show_bug.cgi?id=848203&lt;br /&gt;
        libvirt: https://bugzilla.redhat.com/show_bug.cgi?id=848199&lt;br /&gt;
        https://git.kernel.org/cgit/virt/kvm/mst/qemu.git/patch/?id=1c0fa6b709d02fe4f98d4ce7b55a6cc3c925791c&lt;br /&gt;
        Status: [[GuestProgrammableMacVlanFiltering]]&lt;br /&gt;
&lt;br /&gt;
* bridge without promisc mode in NIC&lt;br /&gt;
  given hardware support, teach bridge&lt;br /&gt;
  to program mac/vlan filtering in NIC&lt;br /&gt;
  Helps performance and security on noisy LANs&lt;br /&gt;
  http://comments.gmane.org/gmane.linux.network/266546&lt;br /&gt;
  Developer: Vlad Yasevich&lt;br /&gt;
&lt;br /&gt;
* reduce networking latency:&lt;br /&gt;
  allow handling short packets from softirq or VCPU context&lt;br /&gt;
  Plan:&lt;br /&gt;
    We are going through the scheduler 3 times&lt;br /&gt;
    (could be up to 5 if softirqd is involved)&lt;br /&gt;
    Consider RX: host irq -&amp;gt; io thread -&amp;gt; VCPU thread -&amp;gt;&lt;br /&gt;
    guest irq -&amp;gt; guest thread.&lt;br /&gt;
    This adds a lot of latency.&lt;br /&gt;
    We can cut it by some 1.5x if we do a bit of work&lt;br /&gt;
    either in the VCPU or softirq context.&lt;br /&gt;
  Testing: netperf TCP RR - should be improved drastically&lt;br /&gt;
           netperf TCP STREAM guest to host - no regression&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* Flexible buffers: put virtio header inline with packet data&lt;br /&gt;
  https://patchwork.kernel.org/patch/1540471/&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* device failover to allow migration with assigned devices&lt;br /&gt;
  https://fedoraproject.org/wiki/Features/Virt_Device_Failover&lt;br /&gt;
  Developer: Gal Hammer, Cole Robinson, Laine Stump, MST&lt;br /&gt;
&lt;br /&gt;
* Reuse vringh code for better maintainability&lt;br /&gt;
  Developer: Rusty Russell&lt;br /&gt;
&lt;br /&gt;
* Improve stats, make them more helpful for performance analysis&lt;br /&gt;
  Developer: Sriram Narasimhan&lt;br /&gt;
&lt;br /&gt;
* Bug: e1000 &amp;amp; rtl8139: changing the macaddr in the guest is not propagated to qemu (info network)&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
  https://bugzilla.redhat.com/show_bug.cgi?id=922589&lt;br /&gt;
&lt;br /&gt;
* Enable GRO for packets coming to bridge from a tap interface&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* Better support for Windows LRO&lt;br /&gt;
  Extend virtio-header with statistics for GRO packets:&lt;br /&gt;
  number of packets coalesced and number of duplicate ACKs coalesced&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
=== projects that are not started yet - no owner ===&lt;br /&gt;
&lt;br /&gt;
* netdev polling for virtio.&lt;br /&gt;
  There are two kinds of netdev polling:&lt;br /&gt;
  - netpoll - used for debugging&lt;br /&gt;
  - proposed low latency net polling&lt;br /&gt;
  See http://lkml.indiana.edu/hypermail/linux/kernel/1303.0/00553.html&lt;br /&gt;
&lt;br /&gt;
* receive side zero copy&lt;br /&gt;
  The ideal is a NIC with accelerated RFS support,&lt;br /&gt;
  so we can feed the virtio rx buffers into the correct NIC queue.&lt;br /&gt;
  Depends on non promisc NIC support in bridge.&lt;br /&gt;
&lt;br /&gt;
* IPoIB infiniband bridging&lt;br /&gt;
  Plan: implement macvtap for ipoib and virtio-ipoib&lt;br /&gt;
&lt;br /&gt;
* RDMA bridging&lt;br /&gt;
&lt;br /&gt;
* DMA engine (IOAT) use in tun&lt;br /&gt;
  Old patch here: [PATCH RFC] tun: dma engine support&lt;br /&gt;
  It does not speed things up. Need to see why and&lt;br /&gt;
  what can be done.&lt;br /&gt;
&lt;br /&gt;
* use kvm eventfd support for injecting level interrupts,&lt;br /&gt;
  enable vhost by default for level interrupts&lt;br /&gt;
&lt;br /&gt;
* virtio API extension: improve small packet/large buffer performance:&lt;br /&gt;
  support &amp;quot;reposting&amp;quot; buffers for mergeable buffers,&lt;br /&gt;
  support pool for indirect buffers&lt;br /&gt;
&lt;br /&gt;
* more GSO type support:&lt;br /&gt;
       The kernel does not yet support these GSO types: FCOE, GRE, UDP_TUNNEL&lt;br /&gt;
&lt;br /&gt;
* ring aliasing:&lt;br /&gt;
  using vhost-net as a networking backend with virtio-net in QEMU&lt;br /&gt;
  being what&#039;s guest facing.&lt;br /&gt;
  This gives you the best of both worlds: QEMU acts as a first&lt;br /&gt;
  line of defense against a malicious guest while still getting the&lt;br /&gt;
  performance advantages of vhost-net (zero-copy).&lt;br /&gt;
  In fact a bit of complexity in vhost was put there in the vague hope to&lt;br /&gt;
  support something like this: virtio rings are not translated through&lt;br /&gt;
  regular memory tables, instead, vhost gets a pointer to ring address.&lt;br /&gt;
  This allows qemu to act as a man in the middle,&lt;br /&gt;
  verifying the descriptors but not touching the packet data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== vague ideas: path to implementation not clear ===&lt;br /&gt;
&lt;br /&gt;
* ring redesign:&lt;br /&gt;
      find a way to test raw ring performance &lt;br /&gt;
      fix cacheline bounces &lt;br /&gt;
      reduce interrupts&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* support more queues&lt;br /&gt;
     We limit TUN to 8 queues, but we really want&lt;br /&gt;
     1 queue per guest CPU. The limit comes from net&lt;br /&gt;
     core, need to teach it to allocate array of&lt;br /&gt;
     pointers and not array of queues.&lt;br /&gt;
     Jason has a draft patch to use flex array.&lt;br /&gt;
     Another thing is to move the flow caches out of tun_struct.&lt;br /&gt;
     Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* irq/numa affinity:&lt;br /&gt;
     networking goes much faster with irq pinning:&lt;br /&gt;
     both with and without numa.&lt;br /&gt;
     what can be done to make the non-pinned setup go faster?&lt;br /&gt;
&lt;br /&gt;
* reduce conflict with VCPU thread&lt;br /&gt;
    if VCPU and networking run on same CPU,&lt;br /&gt;
    they conflict resulting in bad performance.&lt;br /&gt;
    Fix that, push vhost thread out to another CPU&lt;br /&gt;
    more aggressively.&lt;br /&gt;
&lt;br /&gt;
* rx mac filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we already have filtering in the bridge;&lt;br /&gt;
        we have a small table of addresses and need to make it larger&lt;br /&gt;
        if we only need filtering for unicast (multicast is handled by IGMP filtering)&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in tun&lt;br /&gt;
        the need for this is still not understood as we have filtering in bridge&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in bridge&lt;br /&gt;
        IGMP snooping in bridge should take vlans into account&lt;br /&gt;
&lt;br /&gt;
* tx coalescing&lt;br /&gt;
        Delay several packets before kicking the device.&lt;br /&gt;
&lt;br /&gt;
* interrupt coalescing&lt;br /&gt;
        Reduce the number of interrupts&lt;br /&gt;
&lt;br /&gt;
* bridging on top of macvlan &lt;br /&gt;
  add code to forward LRO status from macvlan (not macvtap)&lt;br /&gt;
  back to the lowerdev, so that setting up forwarding&lt;br /&gt;
  from macvlan disables LRO on the lowerdev&lt;br /&gt;
&lt;br /&gt;
* preserve packets exactly with LRO&lt;br /&gt;
  LRO is not normally compatible with forwarding.&lt;br /&gt;
  With virtio we are getting packets from a Linux host,&lt;br /&gt;
  so we could conceivably preserve packets exactly&lt;br /&gt;
  even with LRO. Other hardware could presumably&lt;br /&gt;
  be doing this as well.&lt;br /&gt;
&lt;br /&gt;
=== testing projects ===&lt;br /&gt;
Keeping networking stable is highest priority.&lt;br /&gt;
&lt;br /&gt;
* Run weekly test on upstream HEAD covering test matrix with autotest&lt;br /&gt;
* Measure the effect of each of the above-mentioned optimizations&lt;br /&gt;
  - Use autotest network performance regression testing (that runs netperf)&lt;br /&gt;
  - Also test any wild idea that works. Some may be useful.&lt;br /&gt;
&lt;br /&gt;
=== non-virtio-net devices ===&lt;br /&gt;
* e1000: stabilize&lt;br /&gt;
&lt;br /&gt;
=== test matrix ===&lt;br /&gt;
&lt;br /&gt;
DOA test matrix (all combinations should work):&lt;br /&gt;
        vhost: test both on and off, obviously&lt;br /&gt;
        test: hotplug/unplug, vlan/mac filtering, netperf,&lt;br /&gt;
             file copy both ways: scp, NFS, NTFS&lt;br /&gt;
        guests: linux: release and debug kernels, windows&lt;br /&gt;
        conditions: plain run, run while under migration,&lt;br /&gt;
                vhost on/off migration&lt;br /&gt;
        networking setup: simple, qos with cgroups&lt;br /&gt;
        host configuration: host-guest, external-guest&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=Multiqueue&amp;diff=4804</id>
		<title>Multiqueue</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=Multiqueue&amp;diff=4804"/>
		<updated>2013-06-13T05:44:16Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Multiqueue virtio-net =&lt;br /&gt;
&lt;br /&gt;
== Overview ==&lt;br /&gt;
&lt;br /&gt;
This page provides information about the design of multi-queue virtio-net, an approach that enables packet send/receive processing to scale with the number of available vcpus in the guest. It gives an overview of multiqueue virtio-net and discusses the design of the various parts involved. The page also contains some basic performance test results. This work is in progress and the design may change.&lt;br /&gt;
&lt;br /&gt;
== Contact ==&lt;br /&gt;
* Jason Wang &amp;lt;jasowang@redhat.com&amp;gt;&lt;br /&gt;
* Amos Kong &amp;lt;akong@redhat.com&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Rationale ==&lt;br /&gt;
&lt;br /&gt;
Today&#039;s high-end servers have more processors, and guests running on them tend to have an increasing number of vcpus. The scalability of the protocol stack in the guest is restricted by single-queue virtio-net:&lt;br /&gt;
&lt;br /&gt;
* Network performance does not scale as the number of vcpus increases: the guest cannot transmit or receive packets in parallel, since virtio-net has only one TX and one RX queue and the virtio-net driver must serialize access to them before sending and receiving packets. Even though there are software techniques, such as RFS, to spread the load across different processors, they only help one direction and are expensive in the guest because they depend on IPIs, which bring extra overhead in a virtualized environment.&lt;br /&gt;
* Multiqueue NICs are increasingly common and are well supported by the Linux kernel, but the current virtual NIC cannot utilize multiqueue support: the tap and virtio-net backends must serialize the concurrent transmission/receiving requests coming from different cpus.&lt;br /&gt;
&lt;br /&gt;
In order to remove those bottlenecks, we must allow parallel packet processing by introducing multiqueue support in both the back-end and the guest drivers. Ideally, packet handling can then be done by processors in parallel without interleaving, and network performance scales as the number of vcpus increases.&lt;br /&gt;
&lt;br /&gt;
== Git &amp;amp; Cmdline ==&lt;br /&gt;
&lt;br /&gt;
* kernel changes: git://github.com/jasowang/kernel-mq.git&lt;br /&gt;
* qemu-kvm changes: git://github.com/jasowang/qemu-kvm-mq.git&lt;br /&gt;
* qemu-kvm -netdev tap,id=hn0,queues=M -device virtio-net-pci,netdev=hn0,vectors=N ...&lt;br /&gt;
&lt;br /&gt;
== Status &amp;amp; Challenges ==&lt;br /&gt;
* Status&lt;br /&gt;
** Have patches for all parts, but performance tuning is needed for small packet transmission&lt;br /&gt;
** Several new vhost threading models were proposed&lt;br /&gt;
&lt;br /&gt;
* Challenges&lt;br /&gt;
** small packet transmission&lt;br /&gt;
*** Reason:&lt;br /&gt;
**** spreading the packets across different queues reduces the possibility of batching and thus hurts performance.&lt;br /&gt;
**** some, but not too much, batching may help performance&lt;br /&gt;
*** Solution and challenge:&lt;br /&gt;
***** Find an adaptive/dynamic algorithm to switch between single-queue mode and multiqueue mode&lt;br /&gt;
****** Find the threshold for doing the switch; not easy, as traffic is unpredictable in real workloads&lt;br /&gt;
****** Avoid packet re-ordering when switching modes&lt;br /&gt;
***** Current Status &amp;amp; working on:&lt;br /&gt;
****** Add ioctl to notify tap to switch to one queue mode&lt;br /&gt;
****** switch when needed&lt;br /&gt;
** vhost threading&lt;br /&gt;
*** Three models were proposed:&lt;br /&gt;
**** per vq pair: each vhost thread polls one tx/rx vq pair&lt;br /&gt;
***** simple&lt;br /&gt;
***** regression in small packet transmission&lt;br /&gt;
***** lacks NUMA affinity, as vhost does the copy&lt;br /&gt;
***** may not scale well when using multiqueue, as we may create more vhost threads than the number of host cpus&lt;br /&gt;
**** multi-workers: multiple worker threads per device&lt;br /&gt;
***** wake up all threads and let them contend for the work&lt;br /&gt;
***** improves parallelism, especially for RR tests&lt;br /&gt;
***** broadcast wakeup and contention&lt;br /&gt;
***** #vhost threads may be greater than #cpus&lt;br /&gt;
***** no NUMA consideration&lt;br /&gt;
***** regression with small packets / small #instances&lt;br /&gt;
**** per-cpu vhost thread:&lt;br /&gt;
***** pick a random thread in the same socket (except for the cpu that initiated the request) to handle the request&lt;br /&gt;
***** best performance in most conditions&lt;br /&gt;
***** schedules by itself, bypassing the host scheduler; only suitable for network load&lt;br /&gt;
***** regression with small packets / small #instances&lt;br /&gt;
*** Solution and challenge:&lt;br /&gt;
**** More testing&lt;br /&gt;
&lt;br /&gt;
== Design Goals ==&lt;br /&gt;
&lt;br /&gt;
=== Parallel send/receive processing ===&lt;br /&gt;
&lt;br /&gt;
To make sure the whole stack can work in parallel, the parallelism of not only the front-end (guest driver) but also the back-end (vhost and tap/macvtap) must be exploited. This is done by:&lt;br /&gt;
&lt;br /&gt;
* Allowing multiple sockets to be attached to tap/macvtap&lt;br /&gt;
* Using a multi-threaded vhost to serve as the backend of a multiqueue-capable virtio-net adapter&lt;br /&gt;
* Using a multiqueue-aware virtio-net driver to send and receive packets to/from each queue&lt;br /&gt;
&lt;br /&gt;
=== In order delivery ===&lt;br /&gt;
&lt;br /&gt;
Packets for a specific stream are delivered in order to the TCP/IP stack in guest.&lt;br /&gt;
&lt;br /&gt;
=== Low overhead ===&lt;br /&gt;
&lt;br /&gt;
The multiqueue implementation should have low overhead; cache locality and send-side scaling can be maintained by&lt;br /&gt;
&lt;br /&gt;
* making sure the packets from a single connection are mapped to a specific processor.&lt;br /&gt;
* delivering send completions (TCP ACKs) to the same vcpu that sent the data&lt;br /&gt;
* other considerations such as NUMA and HT&lt;br /&gt;
&lt;br /&gt;
=== No assumption about the underlying hardware ===&lt;br /&gt;
&lt;br /&gt;
The implementation should not target a specific hardware/environment. For example, we should not optimize only for host NICs with RSS or flow director support.&lt;br /&gt;
&lt;br /&gt;
=== Compatibility ===&lt;br /&gt;
* Guest ABI: Based on the virtio specification, the multiqueue implementation of virtio-net should keep compatibility with single queue. The multiqueue capability must be enabled through feature negotiation, which makes sure a single queue driver can work with a multiqueue backend, and a multiqueue driver can work with a single queue backend.&lt;br /&gt;
* Userspace ABI: As the changes may touch tun/tap, which may have non-virtualized users, the semantics of the ioctls must be kept in order not to break applications that use them. New functionality must be added through new ioctls.&lt;br /&gt;
&lt;br /&gt;
=== Management friendly ===&lt;br /&gt;
The backend (tap/macvtap) should provide an easy way to change the number of queues/sockets, and qemu with multiqueue support should also be management-software friendly: qemu should be able to accept file descriptors through the cmdline and via SCM_RIGHTS.&lt;br /&gt;
    &lt;br /&gt;
== High level Design ==&lt;br /&gt;
The main goal of multiqueue is to exploit the parallelism of each module involved in packet transmission and reception:&lt;br /&gt;
* macvtap/tap: For single queue virtio-net, one socket of macvtap/tap was abstracted as a queue for both tx and rx. We can reuse and extend this abstraction so that macvtap/tap can dequeue and enqueue packets from multiple sockets. Each socket can then be treated as a tx and rx queue, and macvtap/tap is in fact a multi-queue device in the host. The host network code can then transmit and receive packets in parallel.&lt;br /&gt;
* vhost: The parallelism could be achieved by using multiple vhost threads to handle multiple sockets. Currently, there are two choices in the design.&lt;br /&gt;
** 1:1 mapping between vhost threads and sockets. This method does not need vhost changes; it just launches the same number of vhost threads as queues. Each vhost thread handles one tx ring and one rx ring, just as for single queue virtio-net.&lt;br /&gt;
** M:N mapping between vhost threads and sockets. This method allows a single vhost thread to poll more than one tx/rx ring and socket, and uses separate threads to handle tx and rx requests.&lt;br /&gt;
* qemu: qemu is in charge of the following things&lt;br /&gt;
** allow multiple tap file descriptors to be used for a single emulated nic&lt;br /&gt;
** userspace multiqueue virtio-net implementation, which is used for maintaining compatibility, management and migration&lt;br /&gt;
** control the vhost based on the userspace multiqueue virtio-net&lt;br /&gt;
* guest driver&lt;br /&gt;
** Allocate multiple rx/tx queues&lt;br /&gt;
** Assign each queue an MSI-X vector in order to parallelize the packet processing in the guest stack&lt;br /&gt;
&lt;br /&gt;
The big picture looks like:&lt;br /&gt;
[[Image:ver1.jpg|left]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Choices and considerations ===&lt;br /&gt;
* 1:1 or M:N: 1:1 is much simpler than M:N for both coding and queue/vhost management in qemu, and in theory it could provide better parallelism than M:N. Performance tests of both implementations are needed.&lt;br /&gt;
* Whether to use per-cpu queues: modern 10gb cards (and Microsoft RSS) suggest the abstraction of per-cpu queues, which tries to allocate as many tx/rx queues as there are cpus and does a 1:1 mapping between them. This can provide better parallelism and cache locality, and it could also simplify other parts of the design, such as in-order delivery and flow director. For virtio-net, at least for guests with a small number of vcpus, per-cpu queues are a better choice.&lt;br /&gt;
&lt;br /&gt;
== Current status ==&lt;br /&gt;
&lt;br /&gt;
* macvtap/macvlan have basic multiqueue support.&lt;br /&gt;
* the bridge does not have queues, but when it uses a multiqueue tap as one of its ports, some optimization may be needed.&lt;br /&gt;
* 1:1 Implementation &lt;br /&gt;
** qemu parts: http://www.spinics.net/lists/kvm/msg52808.html&lt;br /&gt;
** tap and guest driver:  http://www.spinics.net/lists/kvm/msg59993.html&lt;br /&gt;
* M:N Implementation&lt;br /&gt;
** kk&#039;s newest series(qemu/vhost/guest drivers): http://www.spinics.net/lists/kvm/msg52094.html&lt;br /&gt;
&lt;br /&gt;
== Detail Design ==&lt;br /&gt;
&lt;br /&gt;
=== Multiqueue Macvtap ===&lt;br /&gt;
&lt;br /&gt;
==== Basic design: ====&lt;br /&gt;
* Each socket is abstracted as a queue, and the basic idea is to allow multiple sockets to be attached to a single macvtap device. &lt;br /&gt;
* Queue attaching is done by opening the named inode multiple times&lt;br /&gt;
* Each time it is opened, a new socket is attached to the device and a file descriptor is returned, which is used by the virtio-net backend (qemu or vhost-net).&lt;br /&gt;
==== Parallel processing ====&lt;br /&gt;
In order to make the tx path lockless, macvtap uses NETIF_F_LLTX to avoid tx lock contention when the host transmits packets. So it is indeed a multiqueue network device from the point of view of the host.&lt;br /&gt;
==== Queue selector ====&lt;br /&gt;
It has a simple flow director implementation; when it needs to transmit packets to the guest, the queue number is determined by:&lt;br /&gt;
# if the skb has an rx queue mapping (for example, it comes from a mq nic), use it to choose the socket/queue&lt;br /&gt;
# if we can calculate the rxhash of the skb, use it to choose the socket/queue&lt;br /&gt;
# if the above two steps fail, always pick the first available socket/queue&lt;br /&gt;
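The three selection steps above can be sketched as follows. This is an illustrative Python sketch, not the kernel code; the function name and arguments are hypothetical stand-ins for the skb fields described above.&lt;br /&gt;

```python
def select_queue(skb_queue_mapping, rxhash, numvtaps):
    """Pick a tap queue index following the three steps above.

    skb_queue_mapping: recorded rx queue + 1, or 0 if none;
    rxhash: flow hash, or 0 if it could not be computed;
    numvtaps: number of attached sockets/queues.
    """
    if numvtaps == 1:
        return 0
    # 1. reuse the rx queue mapping recorded by a multiqueue NIC
    if skb_queue_mapping:
        return (skb_queue_mapping - 1) % numvtaps
    # 2. otherwise hash the flow so a stream sticks to one queue
    if rxhash:
        return rxhash % numvtaps
    # 3. fall back to the first available queue
    return 0
```

A real implementation also has to re-check the result against the current number of attached queues, which can change as queues are attached and detached.&lt;br /&gt;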
=== Multiqueue tun/tap ===&lt;br /&gt;
==== Basic design ====&lt;br /&gt;
* Borrowing the idea from macvtap, we just allow multiple sockets to be attached to a single tap device. &lt;br /&gt;
* As there&#039;s no named inode for a tap device, new ioctls IFF_ATTACH_QUEUE/IFF_DETACH_QUEUE are introduced to attach or detach a socket to/from tun/tap; they can be used by the virtio-net backend to add or delete a queue. &lt;br /&gt;
* All socket related structures are moved to the private_data of the file and initialized during file open. &lt;br /&gt;
* In order to keep the semantics of TUNSETIFF and make the changes transparent to legacy users of tun/tap, the allocation and initialization of the network device is still done in TUNSETIFF, and the first queue is automatically attached.&lt;br /&gt;
* IFF_ATTACH_QUEUE is used to attach an unattached file/socket to a tap device. IFF_DETACH_QUEUE is used to detach a file from a tap device; it temporarily disables a queue, which is useful for maintaining backward compatibility of guests (running a single queue driver on a multiqueue device).&lt;br /&gt;
&lt;br /&gt;
Example:&lt;br /&gt;
pseudo code to create a two-queue tap device&lt;br /&gt;
# fd1 = open(&amp;quot;/dev/net/tun&amp;quot;)&lt;br /&gt;
# ioctl(fd1, TUNSETIFF, &amp;quot;tap&amp;quot;)&lt;br /&gt;
# fd2 = open(&amp;quot;/dev/net/tun&amp;quot;)&lt;br /&gt;
# ioctl(fd2, IFF_ATTACH_QUEUE, &amp;quot;tap&amp;quot;)&lt;br /&gt;
then we have a two-queue tap device, with fd1 and fd2 as its queue sockets.&lt;br /&gt;
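For comparison, the multiqueue tuntap API that was eventually merged (see Documentation/networking/tuntap.txt) attaches extra queues by repeating open() plus TUNSETIFF with the IFF_MULTI_QUEUE flag, rather than via a separate attach ioctl. A minimal Python sketch, assuming the ioctl and flag values from linux/if_tun.h on x86; actually opening the device requires CAP_NET_ADMIN:&lt;br /&gt;

```python
import fcntl
import os
import struct

# constants from linux/if_tun.h (assumed correct for x86 Linux)
TUNSETIFF = 0x400454ca
IFF_TAP = 0x0002
IFF_NO_PI = 0x1000
IFF_MULTI_QUEUE = 0x0100

def tap_ifreq(name):
    # struct ifreq: 16-byte interface name, 16-bit flags, padding to 32 bytes
    flags = IFF_TAP | IFF_NO_PI | IFF_MULTI_QUEUE
    return struct.pack("16sH14s", name.encode(), flags, b"\x00" * 14)

def open_tap_queue(name):
    # each open + TUNSETIFF with the same name attaches one more queue
    fd = os.open("/dev/net/tun", os.O_RDWR)
    fcntl.ioctl(fd, TUNSETIFF, tap_ifreq(name))
    return fd

if __name__ == "__main__":  # needs CAP_NET_ADMIN to actually run
    fd1 = open_tap_queue("tap0")
    fd2 = open_tap_queue("tap0")  # second queue of the same device
```

Per-queue enable/disable is then done with the TUNSETQUEUE ioctl and the IFF_ATTACH_QUEUE/IFF_DETACH_QUEUE flags.&lt;br /&gt;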
&lt;br /&gt;
==== Parallel processing ====&lt;br /&gt;
Just like macvtap, NETIF_F_LLTX is also used by tun/tap to avoid tx lock contention. And tun/tap is also, in fact, a multiqueue network device of the host.&lt;br /&gt;
==== Queue selector (same as macvtap)====&lt;br /&gt;
It has a simple flow director implementation; when it needs to transmit packets to the guest, the queue number is determined by:&lt;br /&gt;
# if the skb has an rx queue mapping (for example, it comes from a mq nic), use it to choose the socket/queue&lt;br /&gt;
# if we can calculate the rxhash of the skb, use it to choose the socket/queue&lt;br /&gt;
# if the above two steps fail, always pick the first available socket/queue&lt;br /&gt;
&lt;br /&gt;
==== Further Optimization? ====&lt;br /&gt;
&lt;br /&gt;
# rxhash can only be used for distributing workloads across different vcpus. The target vcpu may not be the one that is expected to do the recvmsg(). So more optimizations may be needed, such as:&lt;br /&gt;
## A simple hash-to-queue table that records the cpu/queue used by the flow. It is updated when the guest sends packets; when tun/tap transmits packets to the guest, this table can be used for queue selection.&lt;br /&gt;
## Some co-operation between the host and the guest driver to pass information such as which vcpu is issuing a recvmsg().&lt;br /&gt;
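The hash-to-queue table in the first optimization above can be sketched as follows. This is a hypothetical userspace model of the idea, not the actual tun flow cache:&lt;br /&gt;

```python
class FlowCache:
    """Remember which queue a flow last transmitted on, so that
    packets delivered to the guest go back to the same queue/vcpu."""

    def __init__(self, num_queues):
        self.num_queues = num_queues
        self.table = {}  # maps rxhash to a queue index

    def record_tx(self, rxhash, queue):
        # called when the guest sends a packet of this flow on `queue`
        self.table[rxhash] = queue

    def select_rx_queue(self, rxhash):
        # called when tun/tap transmits to the guest: prefer the
        # queue the flow last used, else fall back to plain hashing
        return self.table.get(rxhash, rxhash % self.num_queues)
```

Entries would also need aging in a real implementation, since flows migrate between vcpus.&lt;br /&gt;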
&lt;br /&gt;
=== vhost ===&lt;br /&gt;
* 1:1 without changes&lt;br /&gt;
* M:N [TBD]&lt;br /&gt;
&lt;br /&gt;
=== qemu changes ===&lt;br /&gt;
The changes in qemu contain two parts:&lt;br /&gt;
* Add generic multiqueue support to the nic layer: As the receiving function of the nic backend is only aware of VLANClientState, we must make it aware of the queue index, so&lt;br /&gt;
** Store queue_index in VLANClientState&lt;br /&gt;
** Store multiple VLANClientState in NICState&lt;br /&gt;
** Let netdev parameters accept multiple netdev ids, and link those tap based VLANClientState to their peers in NICState&lt;br /&gt;
* Userspace multiqueue support in virtio-net&lt;br /&gt;
** Allocate multiple virtqueues&lt;br /&gt;
** Expose the queue numbers through config space&lt;br /&gt;
** Enable multiqueue support in the backend only when the feature is negotiated&lt;br /&gt;
** Handle packet requests based on the queue_index of the virtqueue and VLANClientState&lt;br /&gt;
** migration handling&lt;br /&gt;
* Vhost enable/disable&lt;br /&gt;
** launch multiple vhost threads&lt;br /&gt;
** setup eventfd and control the start/stop of the vhost_net backend&lt;br /&gt;
Usage looks like:&lt;br /&gt;
qemu -netdev tap,id=hn0,fd=100 -netdev tap,id=hn1,fd=101 -device virtio-net-pci,netdev=hn0#hn1,queue=2 .....&lt;br /&gt;
&lt;br /&gt;
TODO: more user-friendly cmdline such as&lt;br /&gt;
qemu -netdev tap,id=hn0,fd=100,fd=101 -device virtio-net-pci,netdev=hn0,queues=2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== guest driver ===&lt;br /&gt;
The changes in the guest driver are mainly:&lt;br /&gt;
* Allocate the number of tx and rx queue based on the queue number in config space&lt;br /&gt;
* Assign each queue a MSI-X vector&lt;br /&gt;
* Per-queue handling of TX/RX request&lt;br /&gt;
* Simply use skb_tx_hash() to choose the queue&lt;br /&gt;
&lt;br /&gt;
==== Future Optimizations ====&lt;br /&gt;
* Per-vcpu queue: Allocate as many tx/rx queues as there are vcpus, and bind tx/rx queue pairs to a specific vcpu by:&lt;br /&gt;
** Setting the MSI-X irq affinity for tx/rx.&lt;br /&gt;
** Using smp_processor_id() to choose the tx queue.&lt;br /&gt;
* Comments: In theory, this should improve parallelism. [TBD]&lt;br /&gt;
&lt;br /&gt;
* ...&lt;br /&gt;
&lt;br /&gt;
== Enable MQ feature ==&lt;br /&gt;
* create a tap device with multiple queues; please refer to&lt;br /&gt;
  Documentation/networking/tuntap.txt (3.3 Multiqueue tuntap interface)&lt;br /&gt;
* enable mq in qemu cmdline:   -device virtio-net-pci,mq=on,...&lt;br /&gt;
* enable mq in guest by &#039;ethtool -L eth0 combined $queue_num&#039;&lt;br /&gt;
&lt;br /&gt;
== Test ==&lt;br /&gt;
* Test tool: netperf, iperf&lt;br /&gt;
* Test protocol: TCP_STREAM TCP_MAERTS TCP_RR&lt;br /&gt;
** between localhost and guest&lt;br /&gt;
** between external host and guest with a 10gb direct link&lt;br /&gt;
** regression criteria: throughput / %cpu&lt;br /&gt;
* Test method:&lt;br /&gt;
** multiple sessions of netperf: 1 2 4 8 16&lt;br /&gt;
** compare with the single queue implementation&lt;br /&gt;
* Other&lt;br /&gt;
** numactl to bind the cpu node and memory node&lt;br /&gt;
** autotest implements a performance regression test, using a T-test&lt;br /&gt;
** use netperf demo mode to get more stable results&lt;br /&gt;
== Performance Numbers ==&lt;br /&gt;
[[Multiqueue-performance-Sep-13|Performance]]&lt;br /&gt;
== TODO ==&lt;br /&gt;
== Reference ==&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=GuestProgrammableMacVlanFiltering&amp;diff=4799</id>
		<title>GuestProgrammableMacVlanFiltering</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=GuestProgrammableMacVlanFiltering&amp;diff=4799"/>
		<updated>2013-06-05T11:59:12Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;== guest programmable mac/vlan filtering with macvtap ==&lt;br /&gt;
&lt;br /&gt;
This would be nice to have to be able to do bridging or use macvlan inside the guest.&lt;br /&gt;
&lt;br /&gt;
We neet to be able to:&lt;br /&gt;
* change mac address of the guest virtio-net interface.&lt;br /&gt;
* create a vlan device on the guest virtio-net device&lt;br /&gt;
* set promiscuous mode on guest virtio-net device&lt;br /&gt;
* all this controllable by host admin&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
TODO:&lt;br /&gt;
* There&#039;s a patch [http://thread.gmane.org/gmane.comp.emulators.qemu/37714/focus=37719] proposed by Alex Williamson to do TX mac filtering in TUN. It&#039;s still in RFC state, no recent activity in thread. Try rewrite based on comments.&lt;br /&gt;
&lt;br /&gt;
* Implement filtering in macvtap. The filtering information will be received through TUNSETTXFILTER ioctl (by above patch).&lt;br /&gt;
&lt;br /&gt;
* Implement promiscuous mode in guest virtio-net driver. No ideas here, yet.&lt;br /&gt;
&lt;br /&gt;
* Control should be done via qemu/virtio features. Need a way to disable access that qemu can&#039;t override unless it has net admin capability.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
QEMU:&lt;br /&gt;
&lt;br /&gt;
* Amos Kong works on QEMU side [http://lists.nongnu.org/archive/html/qemu-devel/2013-06/msg00658.html] to add event notification when guest change rx-filter config (main-mac, rx-mode, mac-table, vlan-table). Libvirt will query the rx-filter config from monitor (query-rx-filter), then sync the change to host device.&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=4775</id>
		<title>NetworkingTodo</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=4775"/>
		<updated>2013-05-23T13:00:19Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page should cover all networking related activity in KVM,&lt;br /&gt;
currently most info is related to virtio-net.&lt;br /&gt;
&lt;br /&gt;
TODO: add bugzilla entry links.&lt;br /&gt;
&lt;br /&gt;
=== projects in progress. contributions are still very welcome! ===&lt;br /&gt;
&lt;br /&gt;
* vhost-net scalability tuning: threading for many VMs&lt;br /&gt;
      Plan: switch to workqueue shared by many VMs&lt;br /&gt;
      www.mail-archive.com/kvm@vger.kernel.org/msg69868.html&lt;br /&gt;
      Developer: Shirley Ma?, MST&lt;br /&gt;
      Testing: netperf guest to guest&lt;br /&gt;
&lt;br /&gt;
* multiqueue support in macvtap&lt;br /&gt;
       multiqueue is only supported for tun.&lt;br /&gt;
       Add support for macvtap.&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* enable multiqueue by default&lt;br /&gt;
       Multiqueue causes regression in some workloads, thus&lt;br /&gt;
       it is off by default. Detect and enable/disable&lt;br /&gt;
       automatically so we can turn it on by default&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* guest programmable mac/vlan filtering with macvtap&lt;br /&gt;
        Developer: Dragos Tatulea?, Amos Kong&lt;br /&gt;
        Status: [[GuestProgrammableMacVlanFiltering]]&lt;br /&gt;
&lt;br /&gt;
* bridge without promisc mode in NIC&lt;br /&gt;
  given hardware support, teach bridge&lt;br /&gt;
  to program mac/vlan filtering in NIC&lt;br /&gt;
  Helps performance and security on noisy LANs&lt;br /&gt;
  http://comments.gmane.org/gmane.linux.network/266546&lt;br /&gt;
  Developer: Vlad Yasevich&lt;br /&gt;
&lt;br /&gt;
* reduce networking latency:&lt;br /&gt;
  allow handling short packets from softirq or VCPU context&lt;br /&gt;
  Plan:&lt;br /&gt;
    We are going through the scheduler 3 times&lt;br /&gt;
    (could be up to 5 if softirqd is involved)&lt;br /&gt;
    Consider RX: host irq -&amp;gt; io thread -&amp;gt; VCPU thread -&amp;gt;&lt;br /&gt;
    guest irq -&amp;gt; guest thread.&lt;br /&gt;
    This adds a lot of latency.&lt;br /&gt;
    We can cut it by some 1.5x if we do a bit of work&lt;br /&gt;
    either in the VCPU or softirq context.&lt;br /&gt;
  Testing: netperf TCP RR - should be improved drastically&lt;br /&gt;
           netperf TCP STREAM guest to host - no regression&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* Flexible buffers: put virtio header inline with packet data&lt;br /&gt;
  https://patchwork.kernel.org/patch/1540471/&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* device failover to allow migration with assigned devices&lt;br /&gt;
  https://fedoraproject.org/wiki/Features/Virt_Device_Failover&lt;br /&gt;
  Developer: Gal Hammer, Cole Robinson, Laine Stump, MST&lt;br /&gt;
&lt;br /&gt;
* Reuse vringh code for better maintainability&lt;br /&gt;
  Developer: Rusty Russell&lt;br /&gt;
&lt;br /&gt;
* Improve stats, make them more helpful for performance analysis&lt;br /&gt;
  Developer: Sriram Narasimhan&lt;br /&gt;
&lt;br /&gt;
* Bug: e1000 &amp;amp; rtl8139: Change macaddr in guest, but not update to qemu (info network)&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
=== projects that are not started yet - no owner ===&lt;br /&gt;
&lt;br /&gt;
* netdev polling for virtio.&lt;br /&gt;
  See http://lkml.indiana.edu/hypermail/linux/kernel/1303.0/00553.html&lt;br /&gt;
&lt;br /&gt;
* receive side zero copy&lt;br /&gt;
  The ideal is a NIC with accelerated RFS support,&lt;br /&gt;
  so we can feed the virtio rx buffers into the correct NIC queue.&lt;br /&gt;
  Depends on non promisc NIC support in bridge.&lt;br /&gt;
&lt;br /&gt;
* IPoIB infiniband bridging&lt;br /&gt;
  Plan: implement macvtap for ipoib and virtio-ipoib&lt;br /&gt;
&lt;br /&gt;
* RDMA bridging&lt;br /&gt;
&lt;br /&gt;
* use kvm eventfd support for injecting level interrupts,&lt;br /&gt;
  enable vhost by default for level interrupts&lt;br /&gt;
&lt;br /&gt;
* DMA engine (IOAT) use in tun&lt;br /&gt;
&lt;br /&gt;
* virtio API extension: improve small packet/large buffer performance:&lt;br /&gt;
  support &amp;quot;reposting&amp;quot; buffers for mergeable buffers,&lt;br /&gt;
  support pool for indirect buffers&lt;br /&gt;
&lt;br /&gt;
=== vague ideas: path to implementation not clear ===&lt;br /&gt;
&lt;br /&gt;
* ring redesign:&lt;br /&gt;
      find a way to test raw ring performance &lt;br /&gt;
      fix cacheline bounces &lt;br /&gt;
      reduce interrupts&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* support more queues&lt;br /&gt;
     We limit TUN to 8 queues &lt;br /&gt;
&lt;br /&gt;
* irq/numa affinity:&lt;br /&gt;
     networking goes much faster with irq pinning:&lt;br /&gt;
     both with and without numa.&lt;br /&gt;
     what can be done to make the non-pinned setup go faster?&lt;br /&gt;
&lt;br /&gt;
* reduce conflict with VCPU thread&lt;br /&gt;
    if VCPU and networking run on same CPU,&lt;br /&gt;
    they conflict resulting in bad performance.&lt;br /&gt;
    Fix that, push vhost thread out to another CPU&lt;br /&gt;
    more aggressively.&lt;br /&gt;
&lt;br /&gt;
* rx mac filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we already have filtering in the bridge;&lt;br /&gt;
        we have a small table of addresses and need to make it larger&lt;br /&gt;
        if we only need filtering for unicast (multicast is handled by IGMP filtering)&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in tun&lt;br /&gt;
        the need for this is still not understood as we have filtering in bridge&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in bridge&lt;br /&gt;
        IGMP snooping in bridge should take vlans into account&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== testing projects ===&lt;br /&gt;
Keeping networking stable is highest priority.&lt;br /&gt;
&lt;br /&gt;
* Run weekly test on upstream HEAD covering test matrix with autotest&lt;br /&gt;
&lt;br /&gt;
=== non-virtio-net devices ===&lt;br /&gt;
* e1000: stabilize&lt;br /&gt;
&lt;br /&gt;
=== test matrix ===&lt;br /&gt;
&lt;br /&gt;
DOA test matrix (all combinations should work):&lt;br /&gt;
        vhost: test both on and off, obviously&lt;br /&gt;
        test: hotplug/unplug, vlan/mac filtering, netperf,&lt;br /&gt;
             file copy both ways: scp, NFS, NTFS&lt;br /&gt;
        guests: linux: release and debug kernels, windows&lt;br /&gt;
        conditions: plain run, run while under migration,&lt;br /&gt;
                vhost on/off migration&lt;br /&gt;
        networking setup: simple, qos with cgroups&lt;br /&gt;
        host configuration: host-guest, external-guest&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=Multiqueue&amp;diff=4671</id>
		<title>Multiqueue</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=Multiqueue&amp;diff=4671"/>
		<updated>2013-04-12T00:34:39Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Multiqueue virtio-net =&lt;br /&gt;
&lt;br /&gt;
== Overview ==&lt;br /&gt;
&lt;br /&gt;
This page provides information about the design of multi-queue virtio-net, an approach that enables packet send/receive processing to scale with the number of available vcpus in the guest. It gives an overview of multiqueue virtio-net and discusses the design of the various parts involved. The page also contains some basic performance test results. This work is in progress and the design may change.&lt;br /&gt;
&lt;br /&gt;
== Contact ==&lt;br /&gt;
* Jason Wang &amp;lt;jasowang@redhat.com&amp;gt;&lt;br /&gt;
* Amos Kong &amp;lt;akong@redhat.com&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Rationale ==&lt;br /&gt;
&lt;br /&gt;
Today&#039;s high-end servers have more processors, and guests running on them tend to have an increasing number of vcpus. The scalability of the protocol stack in the guest is restricted by single-queue virtio-net:&lt;br /&gt;
&lt;br /&gt;
* Network performance does not scale as the number of vcpus increases: the guest cannot transmit or receive packets in parallel, since virtio-net has only one TX and one RX queue and the virtio-net driver must serialize access to them before sending and receiving packets. Even though there are software techniques, such as RFS, to spread the load across different processors, they only help one direction and are expensive in the guest because they depend on IPIs, which bring extra overhead in a virtualized environment.&lt;br /&gt;
* Multiqueue nic were more common used and is well supported by linux kernel, but current virtual nic can not utilize the multi queue support: the tap and virtio-net backend must serialize the co-current transmission/receiving request comes from different cpus.&lt;br /&gt;
&lt;br /&gt;
In order the remove those bottlenecks, we must allow the paralleled packet processing by introducing multi queue support for both back-end and guest drivers. Ideally, we may let the packet handing be done by processors in parallel without interleaving and scale the network performance as the number of vcpus increasing.&lt;br /&gt;
&lt;br /&gt;
== Git &amp;amp; Cmdline ==&lt;br /&gt;
&lt;br /&gt;
* kernel changes: git://github.com/jasowang/kernel-mq.git&lt;br /&gt;
* qemu-kvm changes: git://github.com/jasowang/qemu-kvm-mq.git&lt;br /&gt;
* qemu-kvm -netdev tap,id=hn0,queues=M -device virtio-net-pci,netdev=hn0,vectors=N ...&lt;br /&gt;
&lt;br /&gt;
== Status &amp;amp; Challenges ==&lt;br /&gt;
* Status&lt;br /&gt;
** Patches exist for all parts, but performance tuning is needed for small packet transmission&lt;br /&gt;
** Several new vhost threading models were proposed&lt;br /&gt;
&lt;br /&gt;
* Challenges&lt;br /&gt;
** small packet transmission&lt;br /&gt;
*** Reason:&lt;br /&gt;
**** spreading the packets into different queues reduces the possibility of batching and thus damages performance&lt;br /&gt;
**** some, but not too much, batching may help performance&lt;br /&gt;
*** Solution and challenge:&lt;br /&gt;
***** Find an adaptive/dynamic algorithm to switch between single queue mode and multiqueue mode&lt;br /&gt;
****** Find the threshold for the switch; this is not easy, as traffic is unpredictable in real workloads&lt;br /&gt;
****** Avoid packet re-ordering when switching modes&lt;br /&gt;
***** Current Status &amp;amp; working on:&lt;br /&gt;
****** Add an ioctl to notify tap to switch to single queue mode&lt;br /&gt;
****** Switch when needed&lt;br /&gt;
** vhost threading&lt;br /&gt;
*** Three models were proposed:&lt;br /&gt;
**** per vq pair: each vhost thread polls one tx/rx vq pair&lt;br /&gt;
***** simple&lt;br /&gt;
***** regression in small packet transmission&lt;br /&gt;
***** lacks numa affinity, as vhost does the copy&lt;br /&gt;
***** may not scale well when using multiqueue, as we may create more vhost threads than the number of host cpus&lt;br /&gt;
**** multi-workers: multiple worker threads per device&lt;br /&gt;
***** wake up all threads and let them contend for the work&lt;br /&gt;
***** improves parallelism, especially for RR tests&lt;br /&gt;
***** broadcast wakeup and contention&lt;br /&gt;
***** #vhost threads may be greater than #cpus&lt;br /&gt;
***** no numa consideration&lt;br /&gt;
***** regression in small packet / small #instances&lt;br /&gt;
**** per-cpu vhost thread:&lt;br /&gt;
***** picks a random thread in the same socket (except for the cpu that initiated the request) to handle the request&lt;br /&gt;
***** best performance in most conditions&lt;br /&gt;
***** schedules by itself and bypasses the host scheduler; only suitable for network load&lt;br /&gt;
***** regression in small packet / small #instances&lt;br /&gt;
*** Solution and challenge:&lt;br /&gt;
**** More testing&lt;br /&gt;
&lt;br /&gt;
== Design Goals ==&lt;br /&gt;
&lt;br /&gt;
=== Parallel send/receive processing ===&lt;br /&gt;
&lt;br /&gt;
To make sure the whole stack can work in parallel, the parallelism of not only the front-end (guest driver) but also the back-end (vhost and tap/macvtap) must be exploited. This is done by:&lt;br /&gt;
&lt;br /&gt;
* Allowing multiple sockets to be attached to tap/macvtap&lt;br /&gt;
* Using a multi-threaded vhost to serve as the backend of a multiqueue capable virtio-net adapter&lt;br /&gt;
* Using a multiqueue-aware virtio-net driver to send and receive packets to/from each queue&lt;br /&gt;
&lt;br /&gt;
=== In order delivery ===&lt;br /&gt;
&lt;br /&gt;
Packets for a specific stream are delivered in order to the TCP/IP stack in guest.&lt;br /&gt;
&lt;br /&gt;
=== Low overhead ===&lt;br /&gt;
&lt;br /&gt;
The multiqueue implementation should have low overhead; cache locality and send-side scaling can be maintained by&lt;br /&gt;
&lt;br /&gt;
* making sure the packets from a single connection are mapped to a specific processor&lt;br /&gt;
* delivering the send completions (TCP ACKs) to the same vcpu that sent the data&lt;br /&gt;
* other considerations such as NUMA and HT&lt;br /&gt;
&lt;br /&gt;
=== No assumption about the underlying hardware ===&lt;br /&gt;
&lt;br /&gt;
The implementation should not target a specific hardware/environment. For example, we should not optimize only for host nics with RSS or flow director support.&lt;br /&gt;
&lt;br /&gt;
=== Compatibility ===&lt;br /&gt;
* Guest ABI: Based on the virtio specification, the multiqueue implementation of virtio-net should keep compatibility with single queue. The multiqueue capability must be enabled through feature negotiation, which makes sure that a single queue driver can work with a multiqueue backend and a multiqueue driver can work with a single queue backend.&lt;br /&gt;
* Userspace ABI: As the changes may touch tun/tap, which may have non-virtualized users, the ioctl semantics must be kept in order not to break applications that use them. New functionality must be added through new ioctls.&lt;br /&gt;
&lt;br /&gt;
=== Management friendly ===&lt;br /&gt;
The backend (tap/macvtap) should provide an easy way to change the number of queues/sockets, and qemu with multiqueue support should also be management software friendly: qemu should be able to accept file descriptors through the cmdline and via SCM_RIGHTS.&lt;br /&gt;
    &lt;br /&gt;
== High level Design ==&lt;br /&gt;
The main goal of multiqueue is to exploit the parallelism of each module involved in packet transmission and reception:&lt;br /&gt;
* macvtap/tap: For single queue virtio-net, one socket of macvtap/tap was abstracted as a queue for both tx and rx. We can reuse and extend this abstraction to allow macvtap/tap to dequeue and enqueue packets from multiple sockets. Each socket can then be treated as a tx and rx queue, and macvtap/tap is in fact a multi-queue device in the host. The host network code can then transmit and receive packets in parallel.&lt;br /&gt;
* vhost: The parallelism can be achieved by using multiple vhost threads to handle multiple sockets. Currently, there are two choices in the design:&lt;br /&gt;
** 1:1 mapping between vhost threads and sockets. This method needs no vhost changes and just launches the same number of vhost threads as queues. Each vhost thread handles one tx ring and one rx ring, just as for single queue virtio-net.&lt;br /&gt;
** M:N mapping between vhost threads and sockets. This method allows a single vhost thread to poll more than one tx/rx ring and socket, and uses separate threads to handle tx and rx requests.&lt;br /&gt;
* qemu: qemu is in charge of the following things&lt;br /&gt;
** allow multiple tap file descriptors to be used for a single emulated nic&lt;br /&gt;
** a userspace multiqueue virtio-net implementation, used to maintain compatibility and to do management and migration&lt;br /&gt;
** control vhost based on the userspace multiqueue virtio-net&lt;br /&gt;
* guest driver&lt;br /&gt;
** Allocate multiple rx/tx queues&lt;br /&gt;
** Assign each queue an MSI-X vector in order to parallelize the packet processing in the guest stack&lt;br /&gt;
&lt;br /&gt;
The big picture looks like:&lt;br /&gt;
[[Image:ver1.jpg|left]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Choices and considerations ===&lt;br /&gt;
* 1:1 or M:N: 1:1 is much simpler than M:N for both coding and queue/vhost management in qemu, and in theory it could provide better parallelism than M:N. Performance tests need to be done for both implementations.&lt;br /&gt;
* Whether to use a per-cpu queue: Modern 10gb cards (and Microsoft RSS) suggest the per-cpu queue abstraction, which tries to allocate as many tx/rx queues as there are cpus and does a 1:1 mapping between them. This can provide better parallelism and cache locality, and could also simplify other parts of the design such as in-order delivery and the flow director. For virtio-net, at least for guests with a small number of vcpus, per-cpu queues are a better choice.&lt;br /&gt;
&lt;br /&gt;
== Current status ==&lt;br /&gt;
&lt;br /&gt;
* macvtap/macvlan have basic multiqueue support.&lt;br /&gt;
* the bridge does not have queues, but when it uses a multiqueue tap as one of its ports, some optimization may be needed.&lt;br /&gt;
* 1:1 Implementation &lt;br /&gt;
** qemu parts: http://www.spinics.net/lists/kvm/msg52808.html&lt;br /&gt;
** tap and guest driver:  http://www.spinics.net/lists/kvm/msg59993.html&lt;br /&gt;
* M:N Implementation&lt;br /&gt;
** kk&#039;s newest series(qemu/vhost/guest drivers): http://www.spinics.net/lists/kvm/msg52094.html&lt;br /&gt;
&lt;br /&gt;
== Detail Design ==&lt;br /&gt;
&lt;br /&gt;
=== Multiqueue Macvtap ===&lt;br /&gt;
&lt;br /&gt;
==== Basic design: ====&lt;br /&gt;
* Each socket is abstracted as a queue, and the basic idea is to allow multiple sockets to be attached to a single macvtap device.&lt;br /&gt;
* Queue attaching is done by opening the named inode multiple times&lt;br /&gt;
* Each time it is opened, a new socket is attached to the device and a file descriptor is returned, to be used by a virtio-net backend (qemu or vhost-net).&lt;br /&gt;
==== Parallel processing ====&lt;br /&gt;
In order to make the tx path lockless, macvtap uses NETIF_F_LLTX to avoid tx lock contention when the host transmits packets. So it is indeed a multiqueue network device from the point of view of the host.&lt;br /&gt;
==== Queue selector ====&lt;br /&gt;
It has a simple flow director implementation; when it needs to transmit packets to the guest, the queue number is determined by:&lt;br /&gt;
# if the skb has an rx queue mapping (for example, it comes from a mq nic), use it to choose the socket/queue&lt;br /&gt;
# if we can calculate the rxhash of the skb, use it to choose the socket/queue&lt;br /&gt;
# if the above two steps fail, fall back to the first available socket/queue&lt;br /&gt;
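The three-step selection above can be sketched as a small Python model. This is illustrative only (the function and parameter names are invented for the example); the real logic lives in the kernel tap/macvtap code:

```python
def select_queue(skb_rx_queue, rxhash, num_queues):
    """Pick a tap/macvtap queue for a packet headed to the guest.

    skb_rx_queue: rx queue recorded in the skb, or None if absent
    rxhash:       flow hash of the skb, or None if it cannot be computed
    num_queues:   number of sockets/queues currently attached
    """
    if skb_rx_queue is not None:
        # 1. the skb came from a mq nic: reuse its rx queue mapping
        return skb_rx_queue % num_queues
    if rxhash is not None:
        # 2. hash the flow onto a queue
        return rxhash % num_queues
    # 3. fall back to the first available queue
    return 0
```

A flow keeps hitting the same queue as long as its rxhash is stable, which is what gives the per-stream in-order delivery described under Design Goals.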
=== Multiqueue tun/tap ===&lt;br /&gt;
==== Basic design ====&lt;br /&gt;
* Borrowing the idea from macvtap, we just allow multiple sockets to be attached to a single tap device.&lt;br /&gt;
* As there is no named inode for a tap device, new ioctls IFF_ATTACH_QUEUE/IFF_DETACH_QUEUE are introduced to attach or detach a socket from tun/tap; they can be used by the virtio-net backend to add or delete a queue.&lt;br /&gt;
* All socket related structures are moved to the private_data of the file and initialized during file open.&lt;br /&gt;
* In order to keep the semantics of TUNSETIFF and make the changes transparent to legacy users of tun/tap, the allocation and initialization of the network device is still done in TUNSETIFF, and the first queue is automatically attached.&lt;br /&gt;
* IFF_ATTACH_QUEUE is used to attach an unattached file/socket to a tap device. IFF_DETACH_QUEUE is used to detach a file from a tap device; it can also temporarily disable a queue, which is useful for maintaining backward compatibility with guests (running a single queue driver on a multiqueue device).&lt;br /&gt;
&lt;br /&gt;
Example:&lt;br /&gt;
pseudocode to create a two-queue tap device&lt;br /&gt;
# fd1 = open(&amp;quot;/dev/net/tun&amp;quot;)&lt;br /&gt;
# ioctl(fd1, TUNSETIFF, &amp;quot;tap&amp;quot;)&lt;br /&gt;
# fd2 = open(&amp;quot;/dev/net/tun&amp;quot;)&lt;br /&gt;
# ioctl(fd2, IFF_ATTACH_QUEUE, &amp;quot;tap&amp;quot;)&lt;br /&gt;
Then we have a two-queue tap device, with fd1 and fd2 as its queue sockets.&lt;br /&gt;
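The pseudocode above can be turned into a small runnable sketch. TUNSETIFF and its value are taken from the mainline kernel; using IFF_ATTACH_QUEUE directly as an ioctl request, and its numeric value, are assumptions based on this proposal (the kernel API as eventually merged routes queue attach/detach through a TUNSETQUEUE ioctl instead). The open/ioctl functions are injected so the flow can be exercised without a real /dev/net/tun:

```python
TUNSETIFF = 0x400454ca        # _IOW("T", 202, int), from the mainline kernel
IFF_ATTACH_QUEUE = 0x0200     # illustrative value for the proposed ioctl

def create_two_queue_tap(open_fn, ioctl_fn, name="tap0"):
    """Create a tap device with two queue sockets, per the steps above."""
    fd1 = open_fn("/dev/net/tun")
    ioctl_fn(fd1, TUNSETIFF, name)         # allocates the device, attaches queue 0
    fd2 = open_fn("/dev/net/tun")
    ioctl_fn(fd2, IFF_ATTACH_QUEUE, name)  # attaches fd2 as a second queue
    return fd1, fd2
```

Passing the real builtins (os.open plus fcntl.ioctl with a packed ifreq) would require root and a tun-capable kernel; with fakes, the call sequence itself can be checked.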
&lt;br /&gt;
==== Parallel processing ====&lt;br /&gt;
Just like macvtap, NETIF_F_LLTX is also used in tun/tap to avoid tx lock contention, and tun/tap is also in fact a multiqueue network device of the host.&lt;br /&gt;
==== Queue selector (same as macvtap) ====&lt;br /&gt;
It has a simple flow director implementation; when it needs to transmit packets to the guest, the queue number is determined by:&lt;br /&gt;
# if the skb has an rx queue mapping (for example, it comes from a mq nic), use it to choose the socket/queue&lt;br /&gt;
# if we can calculate the rxhash of the skb, use it to choose the socket/queue&lt;br /&gt;
# if the above two steps fail, fall back to the first available socket/queue&lt;br /&gt;
&lt;br /&gt;
==== Further Optimization? ====&lt;br /&gt;
&lt;br /&gt;
# rxhash can only be used for distributing workloads into different vcpus; the target vcpu may not be the one that is expected to do the recvmsg(). So more optimizations may be needed, such as:&lt;br /&gt;
## A simple hash-to-queue table that records the cpu/queue used by each flow. It is updated when the guest sends packets; when tun/tap transmits packets to the guest, this table can be used to do the queue selection.&lt;br /&gt;
## Some co-operation between the host and the guest driver to pass information such as which vcpu issued a recvmsg().&lt;br /&gt;
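The first of these optimizations, the hash-to-queue table, can be sketched as follows. This is a toy model with invented names and an assumed table size, not kernel code; it only shows the write-on-tx, read-on-rx idea:

```python
class FlowTable:
    """Toy hash-to-queue table: remember which queue a flow last used
    on the guest tx path, and reuse it when delivering rx packets."""

    TABLE_SIZE = 256  # assumed table size, for illustration

    def __init__(self, num_queues):
        self.num_queues = num_queues
        self.entries = {}

    def record_tx(self, rxhash, queue):
        # updated when the guest sends packets on a queue
        self.entries[rxhash % self.TABLE_SIZE] = queue

    def select_rx(self, rxhash):
        # used when tun/tap transmits packets to the guest;
        # fall back to plain hashing when the flow is unknown
        return self.entries.get(rxhash % self.TABLE_SIZE,
                                rxhash % self.num_queues)
```

Delivering rx packets to the queue the flow transmitted on keeps the flow on the vcpu that issues the recvmsg(), which is the cache-locality goal stated above.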
&lt;br /&gt;
=== vhost ===&lt;br /&gt;
* 1:1 without changes&lt;br /&gt;
* M:N [TBD]&lt;br /&gt;
&lt;br /&gt;
=== qemu changes ===&lt;br /&gt;
The changes in qemu consist of the following parts:&lt;br /&gt;
* Add generic multiqueue support to the nic layer: As the receiving function of the nic backend is only aware of VLANClientState, we must make it aware of the queue index, so&lt;br /&gt;
** Store queue_index in VLANClientState&lt;br /&gt;
** Store multiple VLANClientState in NICState&lt;br /&gt;
** Let netdev parameters accept multiple netdev ids, and link those tap based VLANClientState to their peers in NICState&lt;br /&gt;
* Userspace multiqueue support in virtio-net&lt;br /&gt;
** Allocate multiple virtqueues&lt;br /&gt;
** Expose the queue numbers through config space&lt;br /&gt;
** Enable the multiqueue support of the backend only when the feature is negotiated&lt;br /&gt;
** Handle packet request based on the queue_index of virtqueue and VLANClientState&lt;br /&gt;
** migration handling&lt;br /&gt;
* Vhost enable/disable&lt;br /&gt;
** launch multiple vhost threads&lt;br /&gt;
** set up eventfds and control the start/stop of the vhost_net backend&lt;br /&gt;
Usage looks like:&lt;br /&gt;
qemu -netdev tap,id=hn0,fd=100 -netdev tap,id=hn1,fd=101 -device virtio-net-pci,netdev=hn0#hn1,queue=2 .....&lt;br /&gt;
&lt;br /&gt;
TODO: more user-friendly cmdline such as&lt;br /&gt;
qemu -netdev tap,id=hn0,fd=100,fd=101 -device virtio-net-pci,netdev=hn0,queues=2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== guest driver ===&lt;br /&gt;
The changes in the guest driver are mainly:&lt;br /&gt;
* Allocate tx and rx queues based on the queue number in the config space&lt;br /&gt;
* Assign each queue a MSI-X vector&lt;br /&gt;
* Per-queue handling of TX/RX request&lt;br /&gt;
* Simply use skb_tx_hash() to choose the queue&lt;br /&gt;
&lt;br /&gt;
==== Future Optimizations ====&lt;br /&gt;
* Per-vcpu queue: Allocate as many tx/rx queues as the vcpu numbers. And bind tx/rx queue pairs to a specific vcpu by:&lt;br /&gt;
** Set the MSI-X irq affinity for tx/rx.&lt;br /&gt;
** Use smp_processor_id() to choose the tx queue.&lt;br /&gt;
* Comments: In theory, this should improve the parallelism. [TBD]&lt;br /&gt;
&lt;br /&gt;
* ...&lt;br /&gt;
&lt;br /&gt;
== Test ==&lt;br /&gt;
* Test tool: netperf, iperf&lt;br /&gt;
* Test protocols: TCP_STREAM, TCP_MAERTS, TCP_RR&lt;br /&gt;
** between localhost and guest&lt;br /&gt;
** between external host and guest with a 10gb direct link&lt;br /&gt;
** regression criteria: throughput / %cpu&lt;br /&gt;
* Test method:&lt;br /&gt;
** multiple sessions of netperf: 1 2 4 8 16&lt;br /&gt;
** compare with the single queue implementation&lt;br /&gt;
* Other&lt;br /&gt;
** numactl to bind the cpu node and memory node&lt;br /&gt;
** autotest implements a performance regression test, using a T-test&lt;br /&gt;
** use netperf demo-mode to get more stable results&lt;br /&gt;
== Performance Numbers ==&lt;br /&gt;
[[Multiqueue-performance-Sep-13|Performance]]&lt;br /&gt;
== TODO ==&lt;br /&gt;
== Reference ==&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=Multiqueue&amp;diff=4670</id>
		<title>Multiqueue</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=Multiqueue&amp;diff=4670"/>
		<updated>2013-04-12T00:25:14Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Multiqueue virtio-net =&lt;br /&gt;
&lt;br /&gt;
== Overview ==&lt;br /&gt;
&lt;br /&gt;
This page describes the design of multi-queue virtio-net, an approach that enables packet send/receive processing to scale with the number of vcpus available to the guest. It gives an overview of multiqueue virtio-net and discusses the design of the various parts involved. The page also contains some basic performance test results. This work is in progress and the design may change.&lt;br /&gt;
&lt;br /&gt;
== Contact ==&lt;br /&gt;
* Jason Wang &amp;lt;jasowang@redhat.com&amp;gt;&lt;br /&gt;
* Amos Kong &amp;lt;akong@redhat.com&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Rationale ==&lt;br /&gt;
&lt;br /&gt;
Today&#039;s high-end servers have more processors, and guests running on them tend to have an increasing number of vcpus. The scalability of the protocol stack in the guest is restricted by single queue virtio-net:&lt;br /&gt;
&lt;br /&gt;
* The network performance does not scale with the number of vcpus: the guest cannot transmit or receive packets in parallel, as virtio-net has only one TX and one RX queue and the virtio-net drivers must synchronize before sending and receiving packets. Even though there are software techniques such as RFS to spread the load across processors, they only address one direction of the traffic and are really expensive in the guest, as they depend on IPIs, which bring extra overhead in a virtualized environment.&lt;br /&gt;
* Multiqueue nics are more and more commonly used and are well supported by the linux kernel, but the current virtual nic cannot utilize the multiqueue support: the tap and virtio-net backends must serialize the concurrent transmission/receiving requests coming from different cpus.&lt;br /&gt;
&lt;br /&gt;
In order to remove those bottlenecks, we must allow parallel packet processing by introducing multiqueue support for both the backend and the guest driver. Ideally, packet handling can then be done by processors in parallel without interleaving, and the network performance scales with the number of vcpus.&lt;br /&gt;
&lt;br /&gt;
== Git &amp;amp; Cmdline ==&lt;br /&gt;
&lt;br /&gt;
* kernel changes: git://github.com/jasowang/kernel-mq.git&lt;br /&gt;
* qemu-kvm changes: git://github.com/jasowang/qemu-kvm-mq.git&lt;br /&gt;
* qemu-kvm -netdev tap,id=hn0,queues=M -device virtio-net-pci,netdev=hn0,queues=M,vectors=N ...&lt;br /&gt;
&lt;br /&gt;
== Status &amp;amp; Challenges ==&lt;br /&gt;
* Status&lt;br /&gt;
** Patches exist for all parts, but performance tuning is needed for small packet transmission&lt;br /&gt;
** Several new vhost threading models were proposed&lt;br /&gt;
&lt;br /&gt;
* Challenges&lt;br /&gt;
** small packet transmission&lt;br /&gt;
*** Reason:&lt;br /&gt;
**** spreading the packets into different queues reduces the possibility of batching and thus damages performance&lt;br /&gt;
**** some, but not too much, batching may help performance&lt;br /&gt;
*** Solution and challenge:&lt;br /&gt;
***** Find an adaptive/dynamic algorithm to switch between single queue mode and multiqueue mode&lt;br /&gt;
****** Find the threshold for the switch; this is not easy, as traffic is unpredictable in real workloads&lt;br /&gt;
****** Avoid packet re-ordering when switching modes&lt;br /&gt;
***** Current Status &amp;amp; working on:&lt;br /&gt;
****** Add an ioctl to notify tap to switch to single queue mode&lt;br /&gt;
****** Switch when needed&lt;br /&gt;
** vhost threading&lt;br /&gt;
*** Three models were proposed:&lt;br /&gt;
**** per vq pair: each vhost thread polls one tx/rx vq pair&lt;br /&gt;
***** simple&lt;br /&gt;
***** regression in small packet transmission&lt;br /&gt;
***** lacks numa affinity, as vhost does the copy&lt;br /&gt;
***** may not scale well when using multiqueue, as we may create more vhost threads than the number of host cpus&lt;br /&gt;
**** multi-workers: multiple worker threads per device&lt;br /&gt;
***** wake up all threads and let them contend for the work&lt;br /&gt;
***** improves parallelism, especially for RR tests&lt;br /&gt;
***** broadcast wakeup and contention&lt;br /&gt;
***** #vhost threads may be greater than #cpus&lt;br /&gt;
***** no numa consideration&lt;br /&gt;
***** regression in small packet / small #instances&lt;br /&gt;
**** per-cpu vhost thread:&lt;br /&gt;
***** picks a random thread in the same socket (except for the cpu that initiated the request) to handle the request&lt;br /&gt;
***** best performance in most conditions&lt;br /&gt;
***** schedules by itself and bypasses the host scheduler; only suitable for network load&lt;br /&gt;
***** regression in small packet / small #instances&lt;br /&gt;
*** Solution and challenge:&lt;br /&gt;
**** More testing&lt;br /&gt;
&lt;br /&gt;
== Design Goals ==&lt;br /&gt;
&lt;br /&gt;
=== Parallel send/receive processing ===&lt;br /&gt;
&lt;br /&gt;
To make sure the whole stack can work in parallel, the parallelism of not only the front-end (guest driver) but also the back-end (vhost and tap/macvtap) must be exploited. This is done by:&lt;br /&gt;
&lt;br /&gt;
* Allowing multiple sockets to be attached to tap/macvtap&lt;br /&gt;
* Using a multi-threaded vhost to serve as the backend of a multiqueue capable virtio-net adapter&lt;br /&gt;
* Using a multiqueue-aware virtio-net driver to send and receive packets to/from each queue&lt;br /&gt;
&lt;br /&gt;
=== In order delivery ===&lt;br /&gt;
&lt;br /&gt;
Packets for a specific stream are delivered in order to the TCP/IP stack in guest.&lt;br /&gt;
&lt;br /&gt;
=== Low overhead ===&lt;br /&gt;
&lt;br /&gt;
The multiqueue implementation should have low overhead; cache locality and send-side scaling can be maintained by&lt;br /&gt;
&lt;br /&gt;
* making sure the packets from a single connection are mapped to a specific processor&lt;br /&gt;
* delivering the send completions (TCP ACKs) to the same vcpu that sent the data&lt;br /&gt;
* other considerations such as NUMA and HT&lt;br /&gt;
&lt;br /&gt;
=== No assumption about the underlying hardware ===&lt;br /&gt;
&lt;br /&gt;
The implementation should not target a specific hardware/environment. For example, we should not optimize only for host nics with RSS or flow director support.&lt;br /&gt;
&lt;br /&gt;
=== Compatibility ===&lt;br /&gt;
* Guest ABI: Based on the virtio specification, the multiqueue implementation of virtio-net should keep compatibility with single queue. The multiqueue capability must be enabled through feature negotiation, which makes sure that a single queue driver can work with a multiqueue backend and a multiqueue driver can work with a single queue backend.&lt;br /&gt;
* Userspace ABI: As the changes may touch tun/tap, which may have non-virtualized users, the ioctl semantics must be kept in order not to break applications that use them. New functionality must be added through new ioctls.&lt;br /&gt;
&lt;br /&gt;
=== Management friendly ===&lt;br /&gt;
The backend (tap/macvtap) should provide an easy way to change the number of queues/sockets, and qemu with multiqueue support should also be management software friendly: qemu should be able to accept file descriptors through the cmdline and via SCM_RIGHTS.&lt;br /&gt;
    &lt;br /&gt;
== High level Design ==&lt;br /&gt;
The main goal of multiqueue is to exploit the parallelism of each module involved in packet transmission and reception:&lt;br /&gt;
* macvtap/tap: For single queue virtio-net, one socket of macvtap/tap was abstracted as a queue for both tx and rx. We can reuse and extend this abstraction to allow macvtap/tap to dequeue and enqueue packets from multiple sockets. Each socket can then be treated as a tx and rx queue, and macvtap/tap is in fact a multi-queue device in the host. The host network code can then transmit and receive packets in parallel.&lt;br /&gt;
* vhost: The parallelism can be achieved by using multiple vhost threads to handle multiple sockets. Currently, there are two choices in the design:&lt;br /&gt;
** 1:1 mapping between vhost threads and sockets. This method needs no vhost changes and just launches the same number of vhost threads as queues. Each vhost thread handles one tx ring and one rx ring, just as for single queue virtio-net.&lt;br /&gt;
** M:N mapping between vhost threads and sockets. This method allows a single vhost thread to poll more than one tx/rx ring and socket, and uses separate threads to handle tx and rx requests.&lt;br /&gt;
* qemu: qemu is in charge of the following things&lt;br /&gt;
** allow multiple tap file descriptors to be used for a single emulated nic&lt;br /&gt;
** a userspace multiqueue virtio-net implementation, used to maintain compatibility and to do management and migration&lt;br /&gt;
** control vhost based on the userspace multiqueue virtio-net&lt;br /&gt;
* guest driver&lt;br /&gt;
** Allocate multiple rx/tx queues&lt;br /&gt;
** Assign each queue an MSI-X vector in order to parallelize the packet processing in the guest stack&lt;br /&gt;
&lt;br /&gt;
The big picture looks like:&lt;br /&gt;
[[Image:ver1.jpg|left]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Choices and considerations ===&lt;br /&gt;
* 1:1 or M:N: 1:1 is much simpler than M:N for both coding and queue/vhost management in qemu, and in theory it could provide better parallelism than M:N. Performance tests need to be done for both implementations.&lt;br /&gt;
* Whether to use a per-cpu queue: Modern 10gb cards (and Microsoft RSS) suggest the per-cpu queue abstraction, which tries to allocate as many tx/rx queues as there are cpus and does a 1:1 mapping between them. This can provide better parallelism and cache locality, and could also simplify other parts of the design such as in-order delivery and the flow director. For virtio-net, at least for guests with a small number of vcpus, per-cpu queues are a better choice.&lt;br /&gt;
&lt;br /&gt;
== Current status ==&lt;br /&gt;
&lt;br /&gt;
* macvtap/macvlan have basic multiqueue support.&lt;br /&gt;
* the bridge does not have queues, but when it uses a multiqueue tap as one of its ports, some optimization may be needed.&lt;br /&gt;
* 1:1 Implementation &lt;br /&gt;
** qemu parts: http://www.spinics.net/lists/kvm/msg52808.html&lt;br /&gt;
** tap and guest driver:  http://www.spinics.net/lists/kvm/msg59993.html&lt;br /&gt;
* M:N Implementation&lt;br /&gt;
** kk&#039;s newest series(qemu/vhost/guest drivers): http://www.spinics.net/lists/kvm/msg52094.html&lt;br /&gt;
&lt;br /&gt;
== Detail Design ==&lt;br /&gt;
&lt;br /&gt;
=== Multiqueue Macvtap ===&lt;br /&gt;
&lt;br /&gt;
==== Basic design: ====&lt;br /&gt;
* Each socket is abstracted as a queue, and the basic idea is to allow multiple sockets to be attached to a single macvtap device.&lt;br /&gt;
* Queue attaching is done by opening the named inode multiple times&lt;br /&gt;
* Each time it is opened, a new socket is attached to the device and a file descriptor is returned, to be used by a virtio-net backend (qemu or vhost-net).&lt;br /&gt;
==== Parallel processing ====&lt;br /&gt;
In order to make the tx path lockless, macvtap uses NETIF_F_LLTX to avoid tx lock contention when the host transmits packets. So it is indeed a multiqueue network device from the point of view of the host.&lt;br /&gt;
==== Queue selector ====&lt;br /&gt;
It has a simple flow director implementation; when it needs to transmit packets to the guest, the queue number is determined by:&lt;br /&gt;
# if the skb has an rx queue mapping (for example, it comes from a mq nic), use it to choose the socket/queue&lt;br /&gt;
# if we can calculate the rxhash of the skb, use it to choose the socket/queue&lt;br /&gt;
# if the above two steps fail, fall back to the first available socket/queue&lt;br /&gt;
=== Multiqueue tun/tap ===&lt;br /&gt;
==== Basic design ====&lt;br /&gt;
* Borrowing the idea from macvtap, we just allow multiple sockets to be attached to a single tap device.&lt;br /&gt;
* As there is no named inode for a tap device, new ioctls IFF_ATTACH_QUEUE/IFF_DETACH_QUEUE are introduced to attach or detach a socket from tun/tap; they can be used by the virtio-net backend to add or delete a queue.&lt;br /&gt;
* All socket related structures are moved to the private_data of the file and initialized during file open.&lt;br /&gt;
* In order to keep the semantics of TUNSETIFF and make the changes transparent to legacy users of tun/tap, the allocation and initialization of the network device is still done in TUNSETIFF, and the first queue is automatically attached.&lt;br /&gt;
* IFF_ATTACH_QUEUE is used to attach an unattached file/socket to a tap device. IFF_DETACH_QUEUE is used to detach a file from a tap device; it can also temporarily disable a queue, which is useful for maintaining backward compatibility with guests (running a single queue driver on a multiqueue device).&lt;br /&gt;
&lt;br /&gt;
Example:&lt;br /&gt;
pseudocode to create a two-queue tap device&lt;br /&gt;
# fd1 = open(&amp;quot;/dev/net/tun&amp;quot;)&lt;br /&gt;
# ioctl(fd1, TUNSETIFF, &amp;quot;tap&amp;quot;)&lt;br /&gt;
# fd2 = open(&amp;quot;/dev/net/tun&amp;quot;)&lt;br /&gt;
# ioctl(fd2, IFF_ATTACH_QUEUE, &amp;quot;tap&amp;quot;)&lt;br /&gt;
Then we have a two-queue tap device, with fd1 and fd2 as its queue sockets.&lt;br /&gt;
&lt;br /&gt;
==== Parallel processing ====&lt;br /&gt;
Just like macvtap, NETIF_F_LLTX is also used in tun/tap to avoid tx lock contention, and tun/tap is also in fact a multiqueue network device of the host.&lt;br /&gt;
==== Queue selector (same as macvtap) ====&lt;br /&gt;
It has a simple flow director implementation; when it needs to transmit packets to the guest, the queue number is determined by:&lt;br /&gt;
# if the skb has an rx queue mapping (for example, it comes from a mq nic), use it to choose the socket/queue&lt;br /&gt;
# if we can calculate the rxhash of the skb, use it to choose the socket/queue&lt;br /&gt;
# if the above two steps fail, fall back to the first available socket/queue&lt;br /&gt;
&lt;br /&gt;
==== Further Optimization? ====&lt;br /&gt;
&lt;br /&gt;
# rxhash can only be used for distributing workloads into different vcpus; the target vcpu may not be the one that is expected to do the recvmsg(). So more optimizations may be needed, such as:&lt;br /&gt;
## A simple hash-to-queue table that records the cpu/queue used by each flow. It is updated when the guest sends packets; when tun/tap transmits packets to the guest, this table can be used to do the queue selection.&lt;br /&gt;
## Some co-operation between the host and the guest driver to pass information such as which vcpu issued a recvmsg().&lt;br /&gt;
&lt;br /&gt;
=== vhost ===&lt;br /&gt;
* 1:1 without changes&lt;br /&gt;
* M:N [TBD]&lt;br /&gt;
&lt;br /&gt;
=== qemu changes ===&lt;br /&gt;
The changes in qemu consist of the following parts:&lt;br /&gt;
* Add generic multiqueue support to the nic layer: As the receiving function of the nic backend is only aware of VLANClientState, we must make it aware of the queue index, so&lt;br /&gt;
** Store queue_index in VLANClientState&lt;br /&gt;
** Store multiple VLANClientState in NICState&lt;br /&gt;
** Let netdev parameters accept multiple netdev ids, and link those tap based VLANClientState to their peers in NICState&lt;br /&gt;
* Userspace multiqueue support in virtio-net&lt;br /&gt;
** Allocate multiple virtqueues&lt;br /&gt;
** Expose the queue numbers through config space&lt;br /&gt;
** Enable the multiqueue support of the backend only when the feature is negotiated&lt;br /&gt;
** Handle packet request based on the queue_index of virtqueue and VLANClientState&lt;br /&gt;
** migration hanling&lt;br /&gt;
* Vhost enable/disable&lt;br /&gt;
** launch multiple vhost threads&lt;br /&gt;
** Set up eventfd and control the start/stop of the vhost_net backend&lt;br /&gt;
Usage looks like this:&lt;br /&gt;
qemu -netdev tap,id=hn0,fd=100 -netdev tap,id=hn1,fd=101 -device virtio-net-pci,netdev=hn0#hn1,queue=2 .....&lt;br /&gt;
&lt;br /&gt;
TODO: a more user-friendly cmdline, such as&lt;br /&gt;
qemu -netdev tap,id=hn0,fd=100,fd=101 -device virtio-net-pci,netdev=hn0,queues=2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== guest driver ===&lt;br /&gt;
The changes in the guest driver are mainly:&lt;br /&gt;
* Allocate tx and rx queues based on the queue number in config space&lt;br /&gt;
* Assign each queue a MSI-X vector&lt;br /&gt;
* Per-queue handling of TX/RX request&lt;br /&gt;
* Simply use skb_tx_hash() to choose the queue&lt;br /&gt;
&lt;br /&gt;
==== Future Optimizations ====&lt;br /&gt;
* Per-vcpu queue: allocate as many tx/rx queues as there are vcpus, and bind tx/rx queue pairs to a specific vcpu by:&lt;br /&gt;
** Setting the MSI-X irq affinity for tx/rx.&lt;br /&gt;
** Using smp_processor_id() to choose the tx queue.&lt;br /&gt;
* Comments: in theory, this should improve parallelism. [TBD]&lt;br /&gt;
&lt;br /&gt;
* ...&lt;br /&gt;
&lt;br /&gt;
== Test ==&lt;br /&gt;
* Test tool: netperf, iperf&lt;br /&gt;
* Test protocol: TCP_STREAM TCP_MAERTS TCP_RR&lt;br /&gt;
** between localhost and guest&lt;br /&gt;
** between external host and guest with a 10gb direct link&lt;br /&gt;
** regression criterion: throughput per %cpu&lt;br /&gt;
* Test method:&lt;br /&gt;
** multiple sessions of netperf: 1 2 4 8 16&lt;br /&gt;
** compare with the single queue implementation&lt;br /&gt;
* Other&lt;br /&gt;
** numactl to bind the cpu node and memory node&lt;br /&gt;
** autotest implements a performance regression test using a T-test&lt;br /&gt;
** use netperf demo mode to get more stable results&lt;br /&gt;
== Performance Numbers ==&lt;br /&gt;
[[Multiqueue-performance-Sep-13|Performance]]&lt;br /&gt;
== TODO ==&lt;br /&gt;
== Reference ==&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=Multiqueue&amp;diff=4616</id>
		<title>Multiqueue</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=Multiqueue&amp;diff=4616"/>
		<updated>2012-12-25T13:36:06Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Multiqueue virtio-net =&lt;br /&gt;
&lt;br /&gt;
== Overview ==&lt;br /&gt;
&lt;br /&gt;
This page provides information about the design of multiqueue virtio-net, an approach that enables packet send/receive processing to scale with the number of available vcpus in the guest. It gives an overview of multiqueue virtio-net and discusses the design of the various parts involved. The page also contains some basic performance test results. This work is in progress and the design may change.&lt;br /&gt;
&lt;br /&gt;
== Contact ==&lt;br /&gt;
* Jason Wang &amp;lt;jasowang@redhat.com&amp;gt;&lt;br /&gt;
* Amos Kong &amp;lt;akong@redhat.com&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Rationale ==&lt;br /&gt;
&lt;br /&gt;
Today&#039;s high-end servers have more processors, and guests running on them tend to have an increasing number of vcpus. The scalability of the protocol stack in the guest is restricted by single queue virtio-net:&lt;br /&gt;
&lt;br /&gt;
* The network performance does not scale as the number of vcpus increases: the guest cannot transmit or receive packets in parallel, as virtio-net has only one TX and one RX queue and virtio-net drivers must synchronize before sending and receiving packets. Even though there are software techniques such as RFS to spread the load across processors, they only help one direction and are really expensive in the guest, as they depend on IPIs, which bring extra overhead in a virtualized environment.&lt;br /&gt;
* Multiqueue nics are increasingly common and are well supported by the linux kernel, but the current virtual nic cannot utilize multiqueue support: the tap and virtio-net backends must serialize the concurrent transmit/receive requests coming from different cpus.&lt;br /&gt;
&lt;br /&gt;
In order to remove those bottlenecks, we must allow parallel packet processing by introducing multiqueue support in both the back-end and the guest drivers. Ideally, packet handling can then be done by processors in parallel without interleaving, and network performance scales with the number of vcpus.&lt;br /&gt;
&lt;br /&gt;
== Git &amp;amp; Cmdline ==&lt;br /&gt;
&lt;br /&gt;
* kernel changes: git://github.com/jasowang/kernel-mq.git&lt;br /&gt;
* qemu-kvm changes: git://github.com/jasowang/qemu-kvm-mq.git&lt;br /&gt;
* qemu-kvm -netdev tap,id=hn0,queues=M -device virtio-net-pci,netdev=hn0,queues=M,vectors=N ...&lt;br /&gt;
&lt;br /&gt;
== Status &amp;amp; Challenges ==&lt;br /&gt;
* Status&lt;br /&gt;
** Patches exist for all parts, but performance tuning is needed for small packet transmission&lt;br /&gt;
** Several new vhost threading models have been proposed&lt;br /&gt;
&lt;br /&gt;
* Challenges&lt;br /&gt;
** small packet transmission&lt;br /&gt;
*** Reason:&lt;br /&gt;
**** Spreading packets across different queues reduces the chance of batching and thus hurts performance.&lt;br /&gt;
**** Some, but not too much, batching may help performance&lt;br /&gt;
*** Solution and challenge:&lt;br /&gt;
***** Find an adaptive/dynamic algorithm to switch between single queue mode and multiqueue mode&lt;br /&gt;
****** Finding the threshold for the switch is not easy, as traffic is unpredictable in real workloads&lt;br /&gt;
****** Avoid packet re-ordering when switching modes&lt;br /&gt;
***** Current Status &amp;amp; working on:&lt;br /&gt;
****** Add ioctl to notify tap to switch to one queue mode&lt;br /&gt;
****** switch when needed&lt;br /&gt;
** vhost threading&lt;br /&gt;
*** Three models were proposed:&lt;br /&gt;
**** per vq pair: each vhost thread polls one tx/rx vq pair&lt;br /&gt;
***** simple&lt;br /&gt;
***** regression in small packet transmission&lt;br /&gt;
***** lacks numa affinity, as vhost does the copy&lt;br /&gt;
***** may not scale well when using multiqueue, as we may create more vhost threads than the number of host cpus&lt;br /&gt;
**** multi-workers: multiple worker threads for a device&lt;br /&gt;
***** wake up all threads and let them contend for the work&lt;br /&gt;
***** improves parallelism, especially for RR tests&lt;br /&gt;
***** broadcast wakeup and contention&lt;br /&gt;
***** #vhost threads may be greater than #cpus&lt;br /&gt;
***** no numa consideration&lt;br /&gt;
***** regression in small packet / small #instances&lt;br /&gt;
**** per-cpu vhost thread:&lt;br /&gt;
***** pick a random thread in the same socket (except for the cpu that initiated the request) to handle the request&lt;br /&gt;
***** best performance in most conditions&lt;br /&gt;
***** schedules by itself and bypasses the host scheduler; only suitable for network load&lt;br /&gt;
***** regression in small packet / small #instances&lt;br /&gt;
*** Solution and challenge:&lt;br /&gt;
**** More testing&lt;br /&gt;
&lt;br /&gt;
== Design Goals ==&lt;br /&gt;
&lt;br /&gt;
=== Parallel send/receive processing ===&lt;br /&gt;
&lt;br /&gt;
To make sure the whole stack can work in parallel, the parallelism of not only the front-end (guest driver) but also the back-end (vhost and tap/macvtap) must be exploited. This is done by:&lt;br /&gt;
&lt;br /&gt;
* Allowing multiple sockets to be attached to tap/macvtap&lt;br /&gt;
* Using multiple vhost threads to serve as the backend of a multiqueue-capable virtio-net adapter&lt;br /&gt;
* Using a multiqueue-aware virtio-net driver to send and receive packets to/from each queue&lt;br /&gt;
&lt;br /&gt;
=== In order delivery ===&lt;br /&gt;
&lt;br /&gt;
Packets of a specific stream are delivered in order to the TCP/IP stack in the guest.&lt;br /&gt;
&lt;br /&gt;
=== Low overhead ===&lt;br /&gt;
&lt;br /&gt;
The multiqueue implementation should have low overhead; cache locality and send-side scaling can be maintained by&lt;br /&gt;
&lt;br /&gt;
* making sure the packets from a single connection are mapped to a specific processor,&lt;br /&gt;
* sending the send completion (TCP ACK) to the same vcpu that sent the data,&lt;br /&gt;
* other considerations such as NUMA and HT.&lt;br /&gt;
&lt;br /&gt;
=== No assumption about the underlying hardware ===&lt;br /&gt;
&lt;br /&gt;
The implementation should not target a specific hardware/environment. For example, we should not optimize only for host nics with RSS or flow director support.&lt;br /&gt;
&lt;br /&gt;
=== Compatibility ===&lt;br /&gt;
* Guest ABI: based on the virtio specification, the multiqueue implementation of virtio-net should keep compatibility with single queue. Multiqueue must be enabled through feature negotiation, which makes sure a single queue driver can work with a multiqueue backend, and a multiqueue driver can work with a single queue backend.&lt;br /&gt;
* Userspace ABI: as the changes may touch tun/tap, which has non-virtualized users, the semantics of the existing ioctls must be kept in order not to break applications that use them. New functionality must be added through new ioctls.&lt;br /&gt;
&lt;br /&gt;
=== Management friendly ===&lt;br /&gt;
The backend (tap/macvtap) should provide an easy way to change the number of queues/sockets, and qemu with multiqueue support should also be management-software friendly: qemu should be able to accept file descriptors through the cmdline and via SCM_RIGHTS.&lt;br /&gt;
    &lt;br /&gt;
== High level Design ==&lt;br /&gt;
The main goal of multiqueue is to exploit the parallelism of each module involved in packet transmission and reception:&lt;br /&gt;
* macvtap/tap: for single queue virtio-net, the one socket of macvtap/tap was abstracted as a queue for both tx and rx. We can reuse and extend this abstraction to allow macvtap/tap to dequeue and enqueue packets from multiple sockets. Each socket can then be treated as a tx and rx queue, and macvtap/tap is in fact a multiqueue device in the host. The host network code can then transmit and receive packets in parallel.&lt;br /&gt;
* vhost: the parallelism can be achieved by using multiple vhost threads to handle multiple sockets. Currently, there are two design choices:&lt;br /&gt;
** 1:1 mapping between vhost threads and sockets. This method needs no vhost changes and just launches the same number of vhost threads as queues. Each vhost thread handles one tx ring and one rx ring, just as for single queue virtio-net.&lt;br /&gt;
** M:N mapping between vhost threads and sockets. This method allows a single vhost thread to poll more than one tx/rx ring and socket, and uses separate threads to handle tx and rx requests.&lt;br /&gt;
* qemu: qemu is in charge of the following things&lt;br /&gt;
** allow multiple tap file descriptors to be used for a single emulated nic&lt;br /&gt;
** userspace multiqueue virtio-net implementation, used to maintain compatibility and to do management and migration&lt;br /&gt;
** control the vhost based on the userspace multiqueue virtio-net&lt;br /&gt;
* guest driver&lt;br /&gt;
** Allocate multiple rx/tx queues&lt;br /&gt;
** Assign each queue an MSI-X vector in order to parallelize the packet processing in the guest stack&lt;br /&gt;
&lt;br /&gt;
The big picture looks like:&lt;br /&gt;
[[Image:ver1.jpg|left]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Choices and considerations ===&lt;br /&gt;
* 1:1 or M:N: 1:1 is much simpler than M:N for both coding and queue/vhost management in qemu, and in theory it could provide better parallelism than M:N. Performance tests are needed for both implementations.&lt;br /&gt;
* Whether to use a per-cpu queue: modern 10gb cards (and Microsoft RSS) suggest a per-cpu queue abstraction that tries to allocate as many tx/rx queues as there are cpus and does a 1:1 mapping between them. This can provide better parallelism and cache locality, and could also simplify other parts of the design, such as in-order delivery and the flow director. For virtio-net, at least for guests with a small number of vcpus, per-cpu queues are the better choice.&lt;br /&gt;
&lt;br /&gt;
== Current status ==&lt;br /&gt;
&lt;br /&gt;
* macvtap/macvlan have basic multiqueue support.&lt;br /&gt;
* The bridge does not have queues, but when it uses a multiqueue tap as one of its ports, some optimization may be needed.&lt;br /&gt;
* 1:1 Implementation &lt;br /&gt;
** qemu parts: http://www.spinics.net/lists/kvm/msg52808.html&lt;br /&gt;
** tap and guest driver:  http://www.spinics.net/lists/kvm/msg59993.html&lt;br /&gt;
* M:N Implementation&lt;br /&gt;
** kk&#039;s newest series (qemu/vhost/guest drivers): http://www.spinics.net/lists/kvm/msg52094.html&lt;br /&gt;
&lt;br /&gt;
== Detail Design ==&lt;br /&gt;
&lt;br /&gt;
=== Multiqueue Macvtap ===&lt;br /&gt;
&lt;br /&gt;
==== Basic design: ====&lt;br /&gt;
* Each socket is abstracted as a queue, and the basic idea is to allow multiple sockets to be attached to a single macvtap device.&lt;br /&gt;
* Queue attaching is done by opening the named inode multiple times&lt;br /&gt;
* Each time it is opened, a new socket is attached to the device and a file descriptor is returned, to be used by the virtio-net backend (qemu or vhost-net).&lt;br /&gt;
==== Parallel processing ====&lt;br /&gt;
In order to make the tx path lockless, macvtap uses NETIF_F_LLTX to avoid tx lock contention when the host transmits packets, so it is indeed a multiqueue network device from the host&#039;s point of view.&lt;br /&gt;
==== Queue selector ====&lt;br /&gt;
It has a simple flow director implementation; when it needs to transmit packets to the guest, the queue number is determined as follows:&lt;br /&gt;
# if the skb has an rx queue mapping (for example, it comes from a multiqueue nic), use it to choose the socket/queue&lt;br /&gt;
# otherwise, if we can calculate the rxhash of the skb, use it to choose the socket/queue&lt;br /&gt;
# if both of the above fail, fall back to the first available socket/queue&lt;br /&gt;
=== Multiqueue tun/tap ===&lt;br /&gt;
==== Basic design ====&lt;br /&gt;
* Borrowing the idea from macvtap, we simply allow multiple sockets to be attached to a single tap device.&lt;br /&gt;
* As there is no named inode for a tap device, new ioctls TUNATTACHQUEUE/TUNDETACHQUEUE are introduced to attach or detach a socket from tun/tap; they can be used by the virtio-net backend to add or delete a queue.&lt;br /&gt;
* All socket-related structures are moved to the private_data of the file and initialized during file open.&lt;br /&gt;
* In order to keep the semantics of TUNSETIFF and make the changes transparent to legacy users of tun/tap, the allocation and initialization of the network device is still done in TUNSETIFF, and the first queue is automatically attached.&lt;br /&gt;
* TUNATTACHQUEUE is used to attach an unattached file/socket to a tap device.&lt;br /&gt;
* TUNDETACHQUEUE is used to detach a file from a tap device; it can also temporarily disable a queue, which is useful for maintaining backward compatibility with guests (running a single queue driver on a multiqueue device).&lt;br /&gt;
&lt;br /&gt;
Example:&lt;br /&gt;
pseudo code to create a two-queue tap device:&lt;br /&gt;
# fd1 = open(&amp;quot;/dev/net/tun&amp;quot;)&lt;br /&gt;
# ioctl(fd1, TUNSETIFF, &amp;quot;tap&amp;quot;)&lt;br /&gt;
# fd2 = open(&amp;quot;/dev/net/tun&amp;quot;)&lt;br /&gt;
# ioctl(fd2, TUNATTACHQUEUE, &amp;quot;tap&amp;quot;)&lt;br /&gt;
then we have a two-queue tap device, with fd1 and fd2 as its queue sockets.&lt;br /&gt;
&lt;br /&gt;
==== Parallel processing ====&lt;br /&gt;
Just like macvtap, NETIF_F_LLTX is also used in tun/tap to avoid tx lock contention, so tun/tap is in fact also a multiqueue network device from the host&#039;s point of view.&lt;br /&gt;
==== Queue selector (same as macvtap) ====&lt;br /&gt;
It has a simple flow director implementation; when it needs to transmit packets to the guest, the queue number is determined as follows:&lt;br /&gt;
# if the skb has an rx queue mapping (for example, it comes from a multiqueue nic), use it to choose the socket/queue&lt;br /&gt;
# otherwise, if we can calculate the rxhash of the skb, use it to choose the socket/queue&lt;br /&gt;
# if both of the above fail, fall back to the first available socket/queue&lt;br /&gt;
&lt;br /&gt;
==== Further Optimizations ====&lt;br /&gt;
&lt;br /&gt;
# rxhash can only be used to distribute workloads across different vcpus. The target vcpu may not be the one that is expected to do the recvmsg(), so further optimizations may be needed, such as:&lt;br /&gt;
## A simple hash-to-queue table that records the cpu/queue used by each flow. It is updated when the guest sends packets; when tun/tap transmits packets to the guest, this table can be used for queue selection.&lt;br /&gt;
## Some co-operation between the host and the guest driver to pass information such as which vcpu issues a recvmsg().&lt;br /&gt;
&lt;br /&gt;
=== vhost ===&lt;br /&gt;
* 1:1 without changes&lt;br /&gt;
* M:N [TBD]&lt;br /&gt;
&lt;br /&gt;
=== qemu changes ===&lt;br /&gt;
The changes in qemu consist of the following parts:&lt;br /&gt;
* Add generic multiqueue support to the nic layer: as the receive function of the nic backend is only aware of VLANClientState, we must make it aware of the queue index, so:&lt;br /&gt;
** Store queue_index in VLANClientState&lt;br /&gt;
** Store multiple VLANClientState in NICState&lt;br /&gt;
** Let the netdev parameter accept multiple netdev ids, and link those tap-based VLANClientState to their peers in NICState&lt;br /&gt;
* Userspace multiqueue support in virtio-net&lt;br /&gt;
** Allocate multiple virtqueues&lt;br /&gt;
** Expose the queue numbers through config space&lt;br /&gt;
** Enable multiqueue support in the backend only when the feature has been negotiated&lt;br /&gt;
** Handle packet requests based on the queue_index of the virtqueue and VLANClientState&lt;br /&gt;
** Migration handling&lt;br /&gt;
* Vhost enable/disable&lt;br /&gt;
** launch multiple vhost threads&lt;br /&gt;
** Set up eventfd and control the start/stop of the vhost_net backend&lt;br /&gt;
Usage looks like this:&lt;br /&gt;
qemu -netdev tap,id=hn0,fd=100 -netdev tap,id=hn1,fd=101 -device virtio-net-pci,netdev=hn0#hn1,queue=2 .....&lt;br /&gt;
&lt;br /&gt;
TODO: a more user-friendly cmdline, such as&lt;br /&gt;
qemu -netdev tap,id=hn0,fd=100,fd=101 -device virtio-net-pci,netdev=hn0,queues=2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== guest driver ===&lt;br /&gt;
The changes in the guest driver are mainly:&lt;br /&gt;
* Allocate tx and rx queues based on the queue number in config space&lt;br /&gt;
* Assign each queue a MSI-X vector&lt;br /&gt;
* Per-queue handling of TX/RX request&lt;br /&gt;
* Simply use skb_tx_hash() to choose the queue&lt;br /&gt;
&lt;br /&gt;
==== Future Optimizations ====&lt;br /&gt;
* Per-vcpu queue: allocate as many tx/rx queues as there are vcpus, and bind tx/rx queue pairs to a specific vcpu by:&lt;br /&gt;
** Setting the MSI-X irq affinity for tx/rx.&lt;br /&gt;
** Using smp_processor_id() to choose the tx queue.&lt;br /&gt;
* Comments: in theory, this should improve parallelism. [TBD]&lt;br /&gt;
&lt;br /&gt;
* ...&lt;br /&gt;
&lt;br /&gt;
== Test ==&lt;br /&gt;
* Test tool: netperf, iperf&lt;br /&gt;
* Test protocol: TCP_STREAM TCP_MAERTS TCP_RR&lt;br /&gt;
** between localhost and guest&lt;br /&gt;
** between external host and guest with a 10gb direct link&lt;br /&gt;
** regression criterion: throughput per %cpu&lt;br /&gt;
* Test method:&lt;br /&gt;
** multiple sessions of netperf: 1 2 4 8 16&lt;br /&gt;
** compare with the single queue implementation&lt;br /&gt;
* Other&lt;br /&gt;
** numactl to bind the cpu node and memory node&lt;br /&gt;
** autotest implements a performance regression test using a T-test&lt;br /&gt;
** use netperf demo mode to get more stable results&lt;br /&gt;
== Performance Numbers ==&lt;br /&gt;
[[Multiqueue-performance-Sep-13|Performance]]&lt;br /&gt;
== TODO ==&lt;br /&gt;
== Reference ==&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=Multiqueue&amp;diff=4575</id>
		<title>Multiqueue</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=Multiqueue&amp;diff=4575"/>
		<updated>2012-09-01T02:53:16Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Multiqueue virtio-net =&lt;br /&gt;
&lt;br /&gt;
== Overview ==&lt;br /&gt;
&lt;br /&gt;
This page provides information about the design of multiqueue virtio-net, an approach that enables packet send/receive processing to scale with the number of available vcpus in the guest. It gives an overview of multiqueue virtio-net and discusses the design of the various parts involved. The page also contains some basic performance test results. This work is in progress and the design may change.&lt;br /&gt;
&lt;br /&gt;
== Contact ==&lt;br /&gt;
* Jason Wang &amp;lt;jasowang@redhat.com&amp;gt;&lt;br /&gt;
* Amos Kong &amp;lt;akong@redhat.com&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Rationale ==&lt;br /&gt;
&lt;br /&gt;
Today&#039;s high-end servers have more processors, and guests running on them tend to have an increasing number of vcpus. The scalability of the protocol stack in the guest is restricted by single queue virtio-net:&lt;br /&gt;
&lt;br /&gt;
* The network performance does not scale as the number of vcpus increases: the guest cannot transmit or receive packets in parallel, as virtio-net has only one TX and one RX queue and virtio-net drivers must synchronize before sending and receiving packets. Even though there are software techniques such as RFS to spread the load across processors, they only help one direction and are really expensive in the guest, as they depend on IPIs, which bring extra overhead in a virtualized environment.&lt;br /&gt;
* Multiqueue nics are increasingly common and are well supported by the linux kernel, but the current virtual nic cannot utilize multiqueue support: the tap and virtio-net backends must serialize the concurrent transmit/receive requests coming from different cpus.&lt;br /&gt;
&lt;br /&gt;
In order to remove those bottlenecks, we must allow parallel packet processing by introducing multiqueue support in both the back-end and the guest drivers. Ideally, packet handling can then be done by processors in parallel without interleaving, and network performance scales with the number of vcpus.&lt;br /&gt;
&lt;br /&gt;
== Git &amp;amp; Cmdline ==&lt;br /&gt;
&lt;br /&gt;
* kernel changes: git://github.com/jasowang/kernel-mq.git&lt;br /&gt;
* qemu-kvm changes: git://github.com/jasowang/qemu-kvm-mq.git&lt;br /&gt;
* qemu-kvm -netdev tap,id=hn0,queues=M -device virtio-net-pci,netdev=hn0,queues=M,vectors=N ...&lt;br /&gt;
&lt;br /&gt;
== Status &amp;amp; Challenges ==&lt;br /&gt;
* Status&lt;br /&gt;
** Patches exist for all parts, but performance tuning is needed for small packet transmission&lt;br /&gt;
** Several new vhost threading models have been proposed&lt;br /&gt;
&lt;br /&gt;
* Challenges&lt;br /&gt;
** small packet transmission&lt;br /&gt;
*** Reason:&lt;br /&gt;
**** Spreading packets across different queues reduces the chance of batching and thus hurts performance.&lt;br /&gt;
**** Some, but not too much, batching may help performance&lt;br /&gt;
*** Solution and challenge:&lt;br /&gt;
***** Find an adaptive/dynamic algorithm to switch between single queue mode and multiqueue mode&lt;br /&gt;
****** Finding the threshold for the switch is not easy, as traffic is unpredictable in real workloads&lt;br /&gt;
****** Avoid packet re-ordering when switching modes&lt;br /&gt;
***** Current Status &amp;amp; working on:&lt;br /&gt;
****** Add ioctl to notify tap to switch to one queue mode&lt;br /&gt;
****** switch when needed&lt;br /&gt;
** vhost threading&lt;br /&gt;
*** Three models were proposed:&lt;br /&gt;
**** per vq pair: each vhost thread polls one tx/rx vq pair&lt;br /&gt;
***** simple&lt;br /&gt;
***** regression in small packet transmission&lt;br /&gt;
***** lacks numa affinity, as vhost does the copy&lt;br /&gt;
***** may not scale well when using multiqueue, as we may create more vhost threads than the number of host cpus&lt;br /&gt;
**** multi-workers: multiple worker threads for a device&lt;br /&gt;
***** wake up all threads and let them contend for the work&lt;br /&gt;
***** improves parallelism, especially for RR tests&lt;br /&gt;
***** broadcast wakeup and contention&lt;br /&gt;
***** #vhost threads may be greater than #cpus&lt;br /&gt;
***** no numa consideration&lt;br /&gt;
***** regression in small packet / small #instances&lt;br /&gt;
**** per-cpu vhost thread:&lt;br /&gt;
***** pick a random thread in the same socket (except for the cpu that initiated the request) to handle the request&lt;br /&gt;
***** best performance in most conditions&lt;br /&gt;
***** schedules by itself and bypasses the host scheduler; only suitable for network load&lt;br /&gt;
***** regression in small packet / small #instances&lt;br /&gt;
*** Solution and challenge:&lt;br /&gt;
**** More testing&lt;br /&gt;
&lt;br /&gt;
== Design Goals ==&lt;br /&gt;
&lt;br /&gt;
=== Parallel send/receive processing ===&lt;br /&gt;
&lt;br /&gt;
To make sure the whole stack can work in parallel, the parallelism of not only the front-end (guest driver) but also the back-end (vhost and tap/macvtap) must be exploited. This is done by:&lt;br /&gt;
&lt;br /&gt;
* Allowing multiple sockets to be attached to tap/macvtap&lt;br /&gt;
* Using multiple vhost threads to serve as the backend of a multiqueue-capable virtio-net adapter&lt;br /&gt;
* Using a multiqueue-aware virtio-net driver to send and receive packets to/from each queue&lt;br /&gt;
&lt;br /&gt;
=== In order delivery ===&lt;br /&gt;
&lt;br /&gt;
Packets of a specific stream are delivered in order to the TCP/IP stack in the guest.&lt;br /&gt;
&lt;br /&gt;
=== Low overhead ===&lt;br /&gt;
&lt;br /&gt;
The multiqueue implementation should have low overhead; cache locality and send-side scaling can be maintained by&lt;br /&gt;
&lt;br /&gt;
* making sure the packets from a single connection are mapped to a specific processor,&lt;br /&gt;
* sending the send completion (TCP ACK) to the same vcpu that sent the data,&lt;br /&gt;
* other considerations such as NUMA and HT.&lt;br /&gt;
&lt;br /&gt;
=== No assumption about the underlying hardware ===&lt;br /&gt;
&lt;br /&gt;
The implementation should not target a specific hardware/environment. For example, we should not optimize only for host nics with RSS or flow director support.&lt;br /&gt;
&lt;br /&gt;
=== Compatibility ===&lt;br /&gt;
* Guest ABI: based on the virtio specification, the multiqueue implementation of virtio-net should keep compatibility with single queue. Multiqueue must be enabled through feature negotiation, which makes sure a single queue driver can work with a multiqueue backend, and a multiqueue driver can work with a single queue backend.&lt;br /&gt;
* Userspace ABI: as the changes may touch tun/tap, which has non-virtualized users, the semantics of the existing ioctls must be kept in order not to break applications that use them. New functionality must be added through new ioctls.&lt;br /&gt;
&lt;br /&gt;
=== Management friendly ===&lt;br /&gt;
The backend (tap/macvtap) should provide an easy way to change the number of queues/sockets, and qemu with multiqueue support should also be management-software friendly: qemu should be able to accept file descriptors through the cmdline and via SCM_RIGHTS.&lt;br /&gt;
    &lt;br /&gt;
== High level Design ==&lt;br /&gt;
The main goal of multiqueue is to exploit the parallelism of each module involved in packet transmission and reception:&lt;br /&gt;
* macvtap/tap: for single queue virtio-net, the one socket of macvtap/tap was abstracted as a queue for both tx and rx. We can reuse and extend this abstraction to allow macvtap/tap to dequeue and enqueue packets from multiple sockets. Each socket can then be treated as a tx and rx queue, and macvtap/tap is in fact a multiqueue device in the host. The host network code can then transmit and receive packets in parallel.&lt;br /&gt;
* vhost: the parallelism can be achieved by using multiple vhost threads to handle multiple sockets. Currently, there are two design choices:&lt;br /&gt;
** 1:1 mapping between vhost threads and sockets. This method needs no vhost changes and just launches the same number of vhost threads as queues. Each vhost thread handles one tx ring and one rx ring, just as for single queue virtio-net.&lt;br /&gt;
** M:N mapping between vhost threads and sockets. This method allows a single vhost thread to poll more than one tx/rx ring and socket, and uses separate threads to handle tx and rx requests.&lt;br /&gt;
* qemu: qemu is in charge of the following things&lt;br /&gt;
** allow multiple tap file descriptors to be used for a single emulated nic&lt;br /&gt;
** userspace multiqueue virtio-net implementation, used to maintain compatibility and to do management and migration&lt;br /&gt;
** control the vhost based on the userspace multiqueue virtio-net&lt;br /&gt;
* guest driver&lt;br /&gt;
** Allocate multiple rx/tx queues&lt;br /&gt;
** Assign each queue an MSI-X vector in order to parallelize the packet processing in the guest stack&lt;br /&gt;
&lt;br /&gt;
The big picture looks like:&lt;br /&gt;
[[Image:ver1.jpg|left]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Choices and considerations ===&lt;br /&gt;
* 1:1 or M:N: 1:1 is much simpler than M:N for both coding and queue/vhost management in qemu, and in theory it could provide better parallelism than M:N. Performance tests are needed for both implementations.&lt;br /&gt;
* Whether to use a per-cpu queue: modern 10gb cards (and Microsoft RSS) suggest a per-cpu queue abstraction that tries to allocate as many tx/rx queues as there are cpus and does a 1:1 mapping between them. This can provide better parallelism and cache locality, and could also simplify other parts of the design, such as in-order delivery and the flow director. For virtio-net, at least for guests with a small number of vcpus, per-cpu queues are the better choice.&lt;br /&gt;
&lt;br /&gt;
== Current status ==&lt;br /&gt;
&lt;br /&gt;
* macvtap/macvlan have basic multiqueue support.&lt;br /&gt;
* The bridge does not have queues, but when it uses a multiqueue tap as one of its ports, some optimization may be needed.&lt;br /&gt;
* 1:1 Implementation &lt;br /&gt;
** qemu parts: http://www.spinics.net/lists/kvm/msg52808.html&lt;br /&gt;
** tap and guest driver:  http://www.spinics.net/lists/kvm/msg59993.html&lt;br /&gt;
* M:N Implementation&lt;br /&gt;
** kk&#039;s newest series (qemu/vhost/guest drivers): http://www.spinics.net/lists/kvm/msg52094.html&lt;br /&gt;
&lt;br /&gt;
== Detail Design ==&lt;br /&gt;
&lt;br /&gt;
=== Multiqueue Macvtap ===&lt;br /&gt;
&lt;br /&gt;
==== Basic design: ====&lt;br /&gt;
* Each socket is abstracted as a queue, and the basic idea is to allow multiple sockets to be attached to a single macvtap device.&lt;br /&gt;
* Queue attaching is done by opening the named inode multiple times&lt;br /&gt;
* Each time it is opened, a new socket is attached to the device and a file descriptor is returned, to be used by the virtio-net backend (qemu or vhost-net).&lt;br /&gt;
==== Parallel processing ====&lt;br /&gt;
In order to make the tx path lockless, macvtap uses NETIF_F_LLTX to avoid tx lock contention when the host transmits packets, so it is indeed a multiqueue network device from the host&#039;s point of view.&lt;br /&gt;
==== Queue selector ====&lt;br /&gt;
It has a simple flow director implementation; when it needs to transmit packets to the guest, the queue number is determined as follows:&lt;br /&gt;
# if the skb has an rx queue mapping (for example, it came from a multiqueue nic), use it to choose the socket/queue&lt;br /&gt;
# otherwise, if the rxhash of the skb can be calculated, use it to choose the socket/queue&lt;br /&gt;
# if both steps fail, fall back to the first available socket/queue&lt;br /&gt;
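The three-step selection above can be sketched as a small pure function. This is an illustration only: `struct pkt`, its fields, and `select_queue()` are made-up stand-ins for the skb fields the real selector consults, not the actual kernel code.

```c
#include <stdint.h>

#define NO_QUEUE (-1)

/* Made-up stand-in for the two skb fields the selector consults. */
struct pkt {
    int rx_queue;    /* recorded rx queue mapping, or NO_QUEUE if unset */
    uint32_t rxhash; /* computed flow hash, or 0 if it cannot be found  */
};

/* Pick a socket/queue for a packet headed to the guest:
 * 1. reuse the recorded rx queue mapping if present,
 * 2. otherwise spread by rxhash,
 * 3. otherwise fall back to the first available queue. */
static int select_queue(const struct pkt *p, int numqueues)
{
    if (numqueues <= 0)
        return NO_QUEUE;
    if (p->rx_queue != NO_QUEUE)
        return p->rx_queue % numqueues;
    if (p->rxhash)
        return (int)(p->rxhash % (uint32_t)numqueues);
    return 0; /* first available queue */
}
```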
=== Multiqueue tun/tap ===&lt;br /&gt;
==== Basic design ====&lt;br /&gt;
* Borrowing the idea from macvtap, we simply allow multiple sockets to be attached to a single tap device. &lt;br /&gt;
* As there&#039;s no named inode for a tap device, new ioctls TUNATTACHQUEUE/TUNDETACHQUEUE are introduced to attach or detach a socket from tun/tap, which can be used by the virtio-net backend to add or delete a queue. &lt;br /&gt;
* All socket related structures were moved to the file&#039;s private_data and are initialized during open. &lt;br /&gt;
* In order to keep the semantics of TUNSETIFF and make the changes transparent to legacy users of tun/tap, the allocation and initialization of the network device is still done in TUNSETIFF, and the first queue is automatically attached.&lt;br /&gt;
* TUNATTACHQUEUE is used to attach an unattached file/socket to a tap device.&lt;br /&gt;
* TUNDETACHQUEUE is used to detach a file from a tap device; it can temporarily disable a queue, which is useful for maintaining backward compatibility with guests (running a single queue driver on a multiqueue device).&lt;br /&gt;
&lt;br /&gt;
Example:&lt;br /&gt;
Pseudo code to create a two-queue tap device:&lt;br /&gt;
# fd1 = open(&amp;quot;/dev/net/tun&amp;quot;)&lt;br /&gt;
# ioctl(fd1, TUNSETIFF, &amp;quot;tap&amp;quot;)&lt;br /&gt;
# fd2 = open(&amp;quot;/dev/net/tun&amp;quot;)&lt;br /&gt;
# ioctl(fd2, TUNATTACHQUEUE, &amp;quot;tap&amp;quot;)&lt;br /&gt;
then we have a two-queue tap device with fd1 and fd2 as its queue sockets.&lt;br /&gt;
&lt;br /&gt;
==== Parallel processing ====&lt;br /&gt;
Just like macvtap, NETIF_F_LLTX is also used by tun/tap to avoid tx lock contention, and tun/tap is likewise in fact a multiqueue network device on the host.&lt;br /&gt;
==== Queue selector (same as macvtap)====&lt;br /&gt;
It has a simple flow director implementation, when it needs to transmit packets to guest, the queue number is determined by:&lt;br /&gt;
# if the skb have rx queue mapping (for example comes from a mq nic), use this to choose the socket/queue&lt;br /&gt;
# if we can calculate the rxhash of the skb, use it to choose the socket/queue&lt;br /&gt;
# if the above two steps fail, always find the first available socket/queue&lt;br /&gt;
&lt;br /&gt;
==== Further Optimization ====&lt;br /&gt;
&lt;br /&gt;
# rxhash can only be used for distributing workloads across different vcpus. The target vcpu may not be the one that is expected to do the recvmsg(), so more optimizations may be needed, such as:&lt;br /&gt;
## A simple hash-to-queue table to record the cpu/queue used by each flow. It is updated when the guest sends packets; when tun/tap transmits packets to the guest, this table can be used for queue selection.&lt;br /&gt;
## Some co-operation between host and guest driver to pass information such as which vcpu issued the recvmsg().&lt;br /&gt;
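The first optimization, the hash-to-queue table, might look like the following sketch. The table size, the function names, and the fallback policy are assumptions for illustration, not an existing implementation.

```c
#include <stdint.h>

#define FLOW_TABLE_SIZE 1024
#define NO_QUEUE (-1)

/* Hypothetical hash-to-queue table: each slot remembers the queue last
 * used by a flow on the guest->host tx path (names are illustrative). */
static int flow_table[FLOW_TABLE_SIZE];

static void flow_table_init(void)
{
    for (int i = 0; i < FLOW_TABLE_SIZE; i++)
        flow_table[i] = NO_QUEUE;
}

/* Update on guest transmit: remember which queue this flow used. */
static void flow_table_update(uint32_t rxhash, int queue)
{
    flow_table[rxhash % FLOW_TABLE_SIZE] = queue;
}

/* Lookup on host->guest transmit: prefer the recorded queue so the packet
 * lands on the vcpu already handling this flow; otherwise hash as usual. */
static int flow_table_lookup(uint32_t rxhash, int numqueues)
{
    int q = flow_table[rxhash % FLOW_TABLE_SIZE];
    if (q != NO_QUEUE && q < numqueues)
        return q;
    return (int)(rxhash % (uint32_t)numqueues);
}
```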
&lt;br /&gt;
=== vhost ===&lt;br /&gt;
* 1:1 without changes&lt;br /&gt;
* M:N [TBD]&lt;br /&gt;
&lt;br /&gt;
=== qemu changes ===&lt;br /&gt;
The changes in qemu consist of three parts:&lt;br /&gt;
* Add generic multiqueue support to the nic layer: as the receiving function of the nic backend is only aware of VLANClientState, we must make it aware of the queue index, so:&lt;br /&gt;
** Store queue_index in VLANClientState&lt;br /&gt;
** Store multiple VLANClientState in NICState&lt;br /&gt;
** Let netdev parameters accept multiple netdev ids, and link those tap based VLANClientState to their peers in NICState&lt;br /&gt;
* Userspace multiqueue support in virtio-net&lt;br /&gt;
** Allocate multiple virtqueues&lt;br /&gt;
** Expose the queue numbers through config space&lt;br /&gt;
** Enable multiqueue support in the backend only when the feature has been negotiated&lt;br /&gt;
** Handle packet requests based on the queue_index of the virtqueue and VLANClientState&lt;br /&gt;
** migration handling&lt;br /&gt;
* Vhost enable/disable&lt;br /&gt;
** launch multiple vhost threads&lt;br /&gt;
** set up eventfds and control the start/stop of the vhost_net backend&lt;br /&gt;
Usage looks like:&lt;br /&gt;
qemu -netdev tap,id=hn0,fd=100 -netdev tap,id=hn1,fd=101 -device virtio-net-pci,netdev=hn0#hn1,queue=2 .....&lt;br /&gt;
&lt;br /&gt;
TODO: more user-friendly cmdline such as&lt;br /&gt;
qemu -netdev tap,id=hn0,fd=100,fd=101 -device virtio-net-pci,netdev=hn0,queues=2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== guest driver ===&lt;br /&gt;
The changes in the guest driver are mainly:&lt;br /&gt;
* Allocate the number of tx and rx queue based on the queue number in config space&lt;br /&gt;
* Assign each queue a MSI-X vector&lt;br /&gt;
* Per-queue handling of TX/RX request&lt;br /&gt;
* Simply use skb_tx_hash() to choose the queue&lt;br /&gt;
&lt;br /&gt;
==== Future Optimizations ====&lt;br /&gt;
* Per-vcpu queue: Allocate as many tx/rx queues as there are vcpus, and bind tx/rx queue pairs to a specific vcpu by:&lt;br /&gt;
** Setting the MSI-X irq affinity for tx/rx.&lt;br /&gt;
** Using smp_processor_id() to choose the tx queue.&lt;br /&gt;
* Comments: In theory, this should improve parallelism. [TBD]&lt;br /&gt;
&lt;br /&gt;
* ...&lt;br /&gt;
&lt;br /&gt;
== Test ==&lt;br /&gt;
* Test tool: netperf, iperf&lt;br /&gt;
* Test protocol: TCP_STREAM TCP_MAERTS TCP_RR&lt;br /&gt;
** between localhost and guest&lt;br /&gt;
** between external host and guest with a 10gb direct link&lt;br /&gt;
** regression criteria: throughput / %cpu&lt;br /&gt;
* Test method:&lt;br /&gt;
** multiple sessions of netperf: 1 2 4 8 16&lt;br /&gt;
** compare with the single queue implementation&lt;br /&gt;
* Other&lt;br /&gt;
** numactl to bind the cpu node and memory node&lt;br /&gt;
** autotest implements a performance regression test using a T-test&lt;br /&gt;
== Performance Numbers ==&lt;br /&gt;
[[Multiqueue-performance-Sep-13|Performance]]&lt;br /&gt;
== TODO ==&lt;br /&gt;
== Reference ==&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=Multiqueue&amp;diff=4559</id>
		<title>Multiqueue</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=Multiqueue&amp;diff=4559"/>
		<updated>2012-07-05T11:50:36Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Multiqueue virtio-net =&lt;br /&gt;
&lt;br /&gt;
== Overview ==&lt;br /&gt;
&lt;br /&gt;
This page provides information about the design of multi-queue virtio-net, an approach that enables packet send/receive processing to scale with the number of vcpus available to the guest. It gives an overview of multiqueue virtio-net and discusses the design of the various parts involved. The page also contains some basic performance test results. This work is in progress and the design may change.&lt;br /&gt;
&lt;br /&gt;
== Contact ==&lt;br /&gt;
* Jason Wang &amp;lt;jasowang@redhat.com&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Rationale ==&lt;br /&gt;
&lt;br /&gt;
Today&#039;s high-end servers have more processors, and guests running on them tend to have an increasing number of vcpus. The scalability of the protocol stack in the guest is restricted by single queue virtio-net:&lt;br /&gt;
&lt;br /&gt;
* Network performance does not scale as the number of vcpus increases: the guest cannot transmit or receive packets in parallel, as virtio-net has only one TX and one RX queue and virtio-net drivers must synchronize before sending and receiving packets. Even though there are software techniques to spread the load across processors, such as RFS, they only help one direction and are expensive in the guest, as they depend on IPIs, which bring extra overhead in a virtualized environment.&lt;br /&gt;
* Multiqueue nics are increasingly common and well supported by the linux kernel, but the current virtual nic cannot utilize multiqueue support: the tap and virtio-net backends must serialize concurrent transmission/reception requests coming from different cpus.&lt;br /&gt;
&lt;br /&gt;
In order to remove those bottlenecks, we must allow parallel packet processing by introducing multiqueue support in both the back-end and the guest driver. Ideally, packet handling can then be done by processors in parallel without interleaving, and network performance scales as the number of vcpus increases.&lt;br /&gt;
&lt;br /&gt;
== Git &amp;amp; Cmdline ==&lt;br /&gt;
&lt;br /&gt;
* kernel changes: git://github.com/jasowang/kernel-mq.git&lt;br /&gt;
* qemu-kvm changes: git://github.com/jasowang/qemu-kvm-mq.git&lt;br /&gt;
* qemu-kvm -netdev tap,id=hn0,queues=M -device virtio-net-pci,netdev=hn0,queues=M,vectors=N ...&lt;br /&gt;
&lt;br /&gt;
== Status &amp;amp; Challenges ==&lt;br /&gt;
* Status&lt;br /&gt;
** Patches exist for all parts, but performance tuning is needed for small packet transmission&lt;br /&gt;
** Several new vhost threading models were proposed&lt;br /&gt;
&lt;br /&gt;
* Challenges&lt;br /&gt;
** small packet transmission&lt;br /&gt;
*** Reason:&lt;br /&gt;
**** spreading packets across different queues reduces the possibility of batching and thus hurts performance.&lt;br /&gt;
**** some, but not too much, batching may help performance&lt;br /&gt;
*** Solution and challenge:&lt;br /&gt;
***** Find an adaptive/dynamic algorithm to switch between single queue mode and multiqueue mode&lt;br /&gt;
****** Finding the threshold for the switch is not easy, as traffic is unpredictable in real workloads&lt;br /&gt;
****** Avoid packet re-ordering when switching modes&lt;br /&gt;
***** Current Status &amp;amp; working on:&lt;br /&gt;
****** Add ioctl to notify tap to switch to one queue mode&lt;br /&gt;
****** switch when needed&lt;br /&gt;
** vhost threading&lt;br /&gt;
*** Three models were proposed:&lt;br /&gt;
**** per vq pair: each vhost thread polls one tx/rx vq pair&lt;br /&gt;
***** simple&lt;br /&gt;
***** regression in small packet transmission&lt;br /&gt;
***** lacks numa affinity, as vhost does the copy&lt;br /&gt;
***** may not scale well when using multiqueue, as we may create more vhost threads than the number of host cpus&lt;br /&gt;
**** multi-workers: multiple worker threads per device&lt;br /&gt;
***** wake up all threads and let them contend for the work&lt;br /&gt;
***** improves parallelism, especially for RR tests&lt;br /&gt;
***** broadcast wakeup and contention&lt;br /&gt;
***** #vhost threads may be greater than #cpus&lt;br /&gt;
***** no numa consideration&lt;br /&gt;
***** regression with small packets / a small number of instances&lt;br /&gt;
**** per-cpu vhost thread:&lt;br /&gt;
***** pick a random thread in the same socket (except for the cpu that initiated the request) to handle the request&lt;br /&gt;
***** best performance in most conditions&lt;br /&gt;
***** schedules by itself and bypasses the host scheduler; only suitable for network load&lt;br /&gt;
***** regression with small packets / a small number of instances&lt;br /&gt;
*** Solution and challenge:&lt;br /&gt;
**** More testing&lt;br /&gt;
&lt;br /&gt;
== Design Goals ==&lt;br /&gt;
&lt;br /&gt;
=== Parallel send/receive processing ===&lt;br /&gt;
&lt;br /&gt;
To make sure the whole stack can work in parallel, the parallelism of not only the front-end (guest driver) but also the back-end (vhost and tap/macvtap) must be exploited. This is done by:&lt;br /&gt;
&lt;br /&gt;
* Allowing multiple sockets to be attached to tap/macvtap&lt;br /&gt;
* Using a multi-threaded vhost to serve as the backend of a multiqueue capable virtio-net adapter&lt;br /&gt;
* Using a multi-queue aware virtio-net driver to send and receive packets to/from each queue&lt;br /&gt;
&lt;br /&gt;
=== In order delivery ===&lt;br /&gt;
&lt;br /&gt;
Packets for a specific stream are delivered in order to the TCP/IP stack in guest.&lt;br /&gt;
&lt;br /&gt;
=== Low overhead ===&lt;br /&gt;
&lt;br /&gt;
The multiqueue implementation should have low overhead; cache locality and send-side scaling can be maintained by&lt;br /&gt;
&lt;br /&gt;
* making sure the packets from a single connection are mapped to a specific processor.&lt;br /&gt;
* delivering the send completion (TCP ACK) to the same vcpu that sent the data&lt;br /&gt;
* other considerations such as NUMA and HT&lt;br /&gt;
&lt;br /&gt;
=== No assumption about the underlying hardware ===&lt;br /&gt;
&lt;br /&gt;
The implementation should not target specific hardware or environments. For example, we should not optimize only for host nics with RSS or flow director support.&lt;br /&gt;
&lt;br /&gt;
=== Compatibility ===&lt;br /&gt;
* Guest ABI: Based on the virtio specification, the multiqueue implementation of virtio-net should keep compatibility with single queue. The multiqueue capability must be enabled through feature negotiation, which makes sure a single queue driver can work with a multiqueue backend, and a multiqueue driver can work with a single queue backend.&lt;br /&gt;
* Userspace ABI: As the changes may touch tun/tap, which may have non-virtualized users, the semantics of the existing ioctls must be kept in order not to break applications that use them. New functionality must be added through new ioctls.&lt;br /&gt;
&lt;br /&gt;
=== Management friendly ===&lt;br /&gt;
The backend (tap/macvtap) should provide an easy way to change the number of queues/sockets, and qemu with multiqueue support should also be management-software friendly: qemu should be able to accept file descriptors through the cmdline and via SCM_RIGHTS.&lt;br /&gt;
    &lt;br /&gt;
== High level Design ==&lt;br /&gt;
The main goal of multiqueue is to exploit the parallelism of each module involved in packet transmission and reception:&lt;br /&gt;
* macvtap/tap: For single queue virtio-net, one socket of macvtap/tap was abstracted as a queue for both tx and rx. We can reuse and extend this abstraction to allow macvtap/tap to dequeue and enqueue packets from multiple sockets. Each socket can then be treated as a tx and rx queue, and macvtap/tap is in fact a multi-queue device in the host. The host network code can then transmit and receive packets in parallel.&lt;br /&gt;
* vhost: Parallelism can be achieved by using multiple vhost threads to handle multiple sockets. Currently, there are two design choices:&lt;br /&gt;
** 1:1 mapping between vhost threads and sockets. This method needs no vhost changes and just launches the same number of vhost threads as queues. Each vhost thread handles one tx ring and one rx ring, just as for single queue virtio-net.&lt;br /&gt;
** M:N mapping between vhost threads and sockets. This method allows a single vhost thread to poll more than one tx/rx ring and socket, and uses separate threads to handle tx and rx requests.&lt;br /&gt;
* qemu: qemu is in charge of the following things&lt;br /&gt;
** allow multiple tap file descriptors to be used for a single emulated nic&lt;br /&gt;
** a userspace multiqueue virtio-net implementation, which is used to maintain compatibility and to do management and migration&lt;br /&gt;
** control the vhost backend based on the userspace multiqueue virtio-net&lt;br /&gt;
* guest driver&lt;br /&gt;
** Allocate multiple rx/tx queues&lt;br /&gt;
** Assign each queue an MSI-X vector in order to parallelize packet processing in the guest stack&lt;br /&gt;
&lt;br /&gt;
The big picture looks like:&lt;br /&gt;
[[Image:ver1.jpg|left]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Choices and considerations ===&lt;br /&gt;
* 1:1 or M:N: 1:1 is much simpler than M:N for both coding and queue/vhost management in qemu, and in theory it can provide better parallelism than M:N. Performance tests are needed for both implementations.&lt;br /&gt;
*  Whether to use a per-cpu queue: Modern 10gb cards (and Microsoft RSS) suggest a per-cpu queue abstraction that allocates as many tx/rx queues as there are cpus and maps them 1:1. This provides better parallelism and cache locality, and could also simplify other parts of the design such as in-order delivery and the flow director. For virtio-net, at least for guests with a small number of vcpus, per-cpu queues are the better choice.  &lt;br /&gt;
The big picture is shown as.&lt;br /&gt;
&lt;br /&gt;
== Current status ==&lt;br /&gt;
&lt;br /&gt;
* macvtap/macvlan have basic multiqueue support.&lt;br /&gt;
* the bridge does not have queues, but when it uses a multiqueue tap as one of its ports, some optimization may be needed.&lt;br /&gt;
* 1:1 Implementation &lt;br /&gt;
** qemu parts: http://www.spinics.net/lists/kvm/msg52808.html&lt;br /&gt;
** tap and guest driver:  http://www.spinics.net/lists/kvm/msg59993.html&lt;br /&gt;
* M:N Implementation&lt;br /&gt;
** kk&#039;s newest series(qemu/vhost/guest drivers): http://www.spinics.net/lists/kvm/msg52094.html&lt;br /&gt;
&lt;br /&gt;
== Detail Design ==&lt;br /&gt;
&lt;br /&gt;
=== Multiqueue Macvtap ===&lt;br /&gt;
&lt;br /&gt;
==== Basic design: ====&lt;br /&gt;
* Each socket is abstracted as a queue, and the basic idea is to allow multiple sockets to be attached to a single macvtap device. &lt;br /&gt;
* Queue attaching is done by opening the named inode multiple times&lt;br /&gt;
* Each time it is opened, a new socket is attached to the device and a file descriptor is returned, which is then used by the virtio-net backend (qemu or vhost-net).&lt;br /&gt;
==== Parallel processing ====&lt;br /&gt;
In order to make the tx path lockless, macvtap uses NETIF_F_LLTX to avoid tx lock contention when the host transmits packets, so it is indeed a multiqueue network device from the host&#039;s point of view.&lt;br /&gt;
==== Queue selector ====&lt;br /&gt;
It has a simple flow director implementation; when it needs to transmit packets to the guest, the queue number is determined as follows:&lt;br /&gt;
# if the skb has an rx queue mapping (for example, it came from a multiqueue nic), use it to choose the socket/queue&lt;br /&gt;
# otherwise, if the rxhash of the skb can be calculated, use it to choose the socket/queue&lt;br /&gt;
# if both steps fail, fall back to the first available socket/queue&lt;br /&gt;
=== Multiqueue tun/tap ===&lt;br /&gt;
==== Basic design ====&lt;br /&gt;
* Borrowing the idea from macvtap, we simply allow multiple sockets to be attached to a single tap device. &lt;br /&gt;
* As there&#039;s no named inode for a tap device, new ioctls TUNATTACHQUEUE/TUNDETACHQUEUE are introduced to attach or detach a socket from tun/tap, which can be used by the virtio-net backend to add or delete a queue. &lt;br /&gt;
* All socket related structures were moved to the file&#039;s private_data and are initialized during open. &lt;br /&gt;
* In order to keep the semantics of TUNSETIFF and make the changes transparent to legacy users of tun/tap, the allocation and initialization of the network device is still done in TUNSETIFF, and the first queue is automatically attached.&lt;br /&gt;
* TUNATTACHQUEUE is used to attach an unattached file/socket to a tap device.&lt;br /&gt;
* TUNDETACHQUEUE is used to detach a file from a tap device; it can temporarily disable a queue, which is useful for maintaining backward compatibility with guests (running a single queue driver on a multiqueue device).&lt;br /&gt;
&lt;br /&gt;
Example:&lt;br /&gt;
Pseudo code to create a two-queue tap device:&lt;br /&gt;
# fd1 = open(&amp;quot;/dev/net/tun&amp;quot;)&lt;br /&gt;
# ioctl(fd1, TUNSETIFF, &amp;quot;tap&amp;quot;)&lt;br /&gt;
# fd2 = open(&amp;quot;/dev/net/tun&amp;quot;)&lt;br /&gt;
# ioctl(fd2, TUNATTACHQUEUE, &amp;quot;tap&amp;quot;)&lt;br /&gt;
then we have a two-queue tap device with fd1 and fd2 as its queue sockets.&lt;br /&gt;
&lt;br /&gt;
==== Parallel processing ====&lt;br /&gt;
Just like macvtap, NETIF_F_LLTX is also used by tun/tap to avoid tx lock contention, and tun/tap is likewise in fact a multiqueue network device on the host.&lt;br /&gt;
==== Queue selector (same as macvtap)====&lt;br /&gt;
It has a simple flow director implementation, when it needs to transmit packets to guest, the queue number is determined by:&lt;br /&gt;
# if the skb have rx queue mapping (for example comes from a mq nic), use this to choose the socket/queue&lt;br /&gt;
# if we can calculate the rxhash of the skb, use it to choose the socket/queue&lt;br /&gt;
# if the above two steps fail, always find the first available socket/queue&lt;br /&gt;
&lt;br /&gt;
==== Further Optimization ====&lt;br /&gt;
&lt;br /&gt;
# rxhash can only be used for distributing workloads across different vcpus. The target vcpu may not be the one that is expected to do the recvmsg(), so more optimizations may be needed, such as:&lt;br /&gt;
## A simple hash-to-queue table to record the cpu/queue used by each flow. It is updated when the guest sends packets; when tun/tap transmits packets to the guest, this table can be used for queue selection.&lt;br /&gt;
## Some co-operation between host and guest driver to pass information such as which vcpu issued the recvmsg().&lt;br /&gt;
&lt;br /&gt;
=== vhost ===&lt;br /&gt;
* 1:1 without changes&lt;br /&gt;
* M:N [TBD]&lt;br /&gt;
&lt;br /&gt;
=== qemu changes ===&lt;br /&gt;
The changes in qemu consist of three parts:&lt;br /&gt;
* Add generic multiqueue support to the nic layer: as the receiving function of the nic backend is only aware of VLANClientState, we must make it aware of the queue index, so:&lt;br /&gt;
** Store queue_index in VLANClientState&lt;br /&gt;
** Store multiple VLANClientState in NICState&lt;br /&gt;
** Let netdev parameters accept multiple netdev ids, and link those tap based VLANClientState to their peers in NICState&lt;br /&gt;
* Userspace multiqueue support in virtio-net&lt;br /&gt;
** Allocate multiple virtqueues&lt;br /&gt;
** Expose the queue numbers through config space&lt;br /&gt;
** Enable multiqueue support in the backend only when the feature has been negotiated&lt;br /&gt;
** Handle packet requests based on the queue_index of the virtqueue and VLANClientState&lt;br /&gt;
** migration handling&lt;br /&gt;
* Vhost enable/disable&lt;br /&gt;
** launch multiple vhost threads&lt;br /&gt;
** set up eventfds and control the start/stop of the vhost_net backend&lt;br /&gt;
Usage looks like:&lt;br /&gt;
qemu -netdev tap,id=hn0,fd=100 -netdev tap,id=hn1,fd=101 -device virtio-net-pci,netdev=hn0#hn1,queue=2 .....&lt;br /&gt;
&lt;br /&gt;
TODO: more user-friendly cmdline such as&lt;br /&gt;
qemu -netdev tap,id=hn0,fd=100,fd=101 -device virtio-net-pci,netdev=hn0,queues=2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== guest driver ===&lt;br /&gt;
The changes in the guest driver are mainly:&lt;br /&gt;
* Allocate the number of tx and rx queue based on the queue number in config space&lt;br /&gt;
* Assign each queue a MSI-X vector&lt;br /&gt;
* Per-queue handling of TX/RX request&lt;br /&gt;
* Simply use skb_tx_hash() to choose the queue&lt;br /&gt;
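The last step can be sketched as follows. The real driver simply calls the kernel's skb_tx_hash(); this stand-alone function only mirrors the multiply-shift idea of mapping a 32-bit flow hash onto the queue range, and is not the driver code itself.

```c
#include <stdint.h>

/* Stand-in for skb_tx_hash(): map a 32-bit flow hash onto
 * [0, num_tx_queues) with a multiply-shift instead of a division. */
static uint16_t tx_hash_to_queue(uint32_t hash, uint16_t num_tx_queues)
{
    return (uint16_t)(((uint64_t)hash * num_tx_queues) >> 32);
}
```

The per-vcpu variant mentioned under Future Optimizations would instead index by smp_processor_id() modulo the queue count.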
&lt;br /&gt;
==== Future Optimizations ====&lt;br /&gt;
* Per-vcpu queue: Allocate as many tx/rx queues as there are vcpus, and bind tx/rx queue pairs to a specific vcpu by:&lt;br /&gt;
** Setting the MSI-X irq affinity for tx/rx.&lt;br /&gt;
** Using smp_processor_id() to choose the tx queue.&lt;br /&gt;
* Comments: In theory, this should improve parallelism. [TBD]&lt;br /&gt;
&lt;br /&gt;
* ...&lt;br /&gt;
&lt;br /&gt;
== Test ==&lt;br /&gt;
* Test tool: netperf, iperf&lt;br /&gt;
* Test protocol: TCP_STREAM TCP_MAERTS TCP_RR&lt;br /&gt;
** between localhost and guest&lt;br /&gt;
** between external host and guest with a 10gb direct link&lt;br /&gt;
** regression criteria: throughput / %cpu&lt;br /&gt;
* Test method:&lt;br /&gt;
** multiple sessions of netperf: 1 2 4 8 16&lt;br /&gt;
** compare with the single queue implementation&lt;br /&gt;
* Other&lt;br /&gt;
** numactl to bind the cpu node and memory node&lt;br /&gt;
** autotest implements a performance regression test using a T-test&lt;br /&gt;
== Performance Numbers ==&lt;br /&gt;
[[Multiqueue-performance-Sep-13|Performance]]&lt;br /&gt;
== TODO ==&lt;br /&gt;
== Reference ==&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=Multiqueue&amp;diff=4544</id>
		<title>Multiqueue</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=Multiqueue&amp;diff=4544"/>
		<updated>2012-05-20T15:44:28Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= Multiqueue virtio-net =&lt;br /&gt;
&lt;br /&gt;
== Overview ==&lt;br /&gt;
&lt;br /&gt;
This page provides information about the design of multi-queue virtio-net, an approach that enables packet send/receive processing to scale with the number of vcpus available to the guest. It gives an overview of multiqueue virtio-net and discusses the design of the various parts involved. The page also contains some basic performance test results. This work is in progress and the design may change.&lt;br /&gt;
&lt;br /&gt;
== Contact ==&lt;br /&gt;
* Jason Wang &amp;lt;jasowang@redhat.com&amp;gt;&lt;br /&gt;
* Amos Kong &amp;lt;akong@redhat.com&amp;gt;&lt;br /&gt;
&lt;br /&gt;
== Rationale ==&lt;br /&gt;
&lt;br /&gt;
Today&#039;s high-end servers have more processors, and guests running on them tend to have an increasing number of vcpus. The scalability of the protocol stack in the guest is restricted by single queue virtio-net:&lt;br /&gt;
&lt;br /&gt;
* Network performance does not scale as the number of vcpus increases: the guest cannot transmit or receive packets in parallel, as virtio-net has only one TX and one RX queue and virtio-net drivers must synchronize before sending and receiving packets. Even though there are software techniques to spread the load across processors, such as RFS, they only help one direction and are expensive in the guest, as they depend on IPIs, which bring extra overhead in a virtualized environment.&lt;br /&gt;
* Multiqueue nics are increasingly common and well supported by the linux kernel, but the current virtual nic cannot utilize multiqueue support: the tap and virtio-net backends must serialize concurrent transmission/reception requests coming from different cpus.&lt;br /&gt;
&lt;br /&gt;
In order to remove those bottlenecks, we must allow parallel packet processing by introducing multiqueue support in both the back-end and the guest driver. Ideally, packet handling can then be done by processors in parallel without interleaving, and network performance scales as the number of vcpus increases.&lt;br /&gt;
&lt;br /&gt;
== Status &amp;amp; Challenges ==&lt;br /&gt;
* Status&lt;br /&gt;
** Patches exist for all parts, but performance tuning is needed for small packet transmission&lt;br /&gt;
** Several new vhost threading models were proposed&lt;br /&gt;
&lt;br /&gt;
* Challenges&lt;br /&gt;
** small packet transmission&lt;br /&gt;
*** Reason:&lt;br /&gt;
**** spreading packets across different queues reduces the possibility of batching and thus hurts performance.&lt;br /&gt;
**** some, but not too much, batching may help performance&lt;br /&gt;
*** Solution and challenge:&lt;br /&gt;
***** Find an adaptive/dynamic algorithm to switch between single queue mode and multiqueue mode&lt;br /&gt;
****** Finding the threshold for the switch is not easy, as traffic is unpredictable in real workloads&lt;br /&gt;
****** Avoid packet re-ordering when switching modes&lt;br /&gt;
***** Current Status &amp;amp; working on:&lt;br /&gt;
****** Add ioctl to notify tap to switch to one queue mode&lt;br /&gt;
****** switch when needed&lt;br /&gt;
** vhost threading&lt;br /&gt;
*** Three models were proposed:&lt;br /&gt;
**** per vq pair: each vhost thread polls one tx/rx vq pair&lt;br /&gt;
***** simple&lt;br /&gt;
***** regression in small packet transmission&lt;br /&gt;
***** lacks numa affinity, as vhost does the copy&lt;br /&gt;
***** may not scale well when using multiqueue, as we may create more vhost threads than the number of host cpus&lt;br /&gt;
**** multi-workers: multiple worker threads per device&lt;br /&gt;
***** wake up all threads and let them contend for the work&lt;br /&gt;
***** improves parallelism, especially for RR tests&lt;br /&gt;
***** broadcast wakeup and contention&lt;br /&gt;
***** #vhost threads may be greater than #cpus&lt;br /&gt;
***** no numa consideration&lt;br /&gt;
***** regression with small packets / a small number of instances&lt;br /&gt;
**** per-cpu vhost thread:&lt;br /&gt;
***** pick a random thread in the same socket (except for the cpu that initiated the request) to handle the request&lt;br /&gt;
***** best performance in most conditions&lt;br /&gt;
***** schedules by itself and bypasses the host scheduler; only suitable for network load&lt;br /&gt;
***** regression with small packets / a small number of instances&lt;br /&gt;
*** Solution and challenge:&lt;br /&gt;
**** More testing&lt;br /&gt;
&lt;br /&gt;
== Design Goals ==&lt;br /&gt;
&lt;br /&gt;
=== Parallel send/receive processing ===&lt;br /&gt;
&lt;br /&gt;
To make sure the whole stack can work in parallel, the parallelism of not only the front-end (guest driver) but also the back-end (vhost and tap/macvtap) must be exploited. This is done by:&lt;br /&gt;
&lt;br /&gt;
* Allowing multiple sockets to be attached to tap/macvtap&lt;br /&gt;
* Using a multi-threaded vhost to serve as the backend of a multiqueue capable virtio-net adapter&lt;br /&gt;
* Using a multi-queue aware virtio-net driver to send and receive packets to/from each queue&lt;br /&gt;
&lt;br /&gt;
=== In order delivery ===&lt;br /&gt;
&lt;br /&gt;
Packets for a specific stream are delivered in order to the TCP/IP stack in guest.&lt;br /&gt;
&lt;br /&gt;
=== Low overhead ===&lt;br /&gt;
&lt;br /&gt;
The multiqueue implementation should have low overhead; cache locality and send-side scaling can be maintained by&lt;br /&gt;
&lt;br /&gt;
* making sure the packets from a single connection are mapped to a specific processor.&lt;br /&gt;
* delivering the send completion (TCP ACK) to the same vcpu that sent the data&lt;br /&gt;
* other considerations such as NUMA and HT&lt;br /&gt;
&lt;br /&gt;
=== No assumption about the underlying hardware ===&lt;br /&gt;
&lt;br /&gt;
The implementation should not target specific hardware or environments. For example, we should not optimize only for host nics with RSS or flow director support.&lt;br /&gt;
&lt;br /&gt;
=== Compatibility ===&lt;br /&gt;
* Guest ABI: Based on the virtio specification, the multiqueue implementation of virtio-net should keep compatibility with single queue. The multiqueue capability must be enabled through feature negotiation, which makes sure a single queue driver can work with a multiqueue backend, and a multiqueue driver can work with a single queue backend.&lt;br /&gt;
* Userspace ABI: As the changes may touch tun/tap, which may have non-virtualized users, the semantics of the existing ioctls must be kept in order not to break applications that use them. New functionality must be added through new ioctls.&lt;br /&gt;
&lt;br /&gt;
=== Management friendly ===&lt;br /&gt;
The backend (tap/macvtap) should provide an easy way to change the number of queues/sockets, and qemu with multiqueue support should also be management-software friendly: qemu should be able to accept file descriptors through the cmdline and via SCM_RIGHTS.&lt;br /&gt;
    &lt;br /&gt;
== High level Design ==&lt;br /&gt;
The main goal of multiqueue is to exploit the parallelism of each module involved in packet transmission and reception:&lt;br /&gt;
* macvtap/tap: For single queue virtio-net, one socket of macvtap/tap is abstracted as a queue for both tx and rx. We can reuse and extend this abstraction to let macvtap/tap dequeue and enqueue packets from multiple sockets. Each socket can then be treated as a tx/rx queue, making macvtap/tap in fact a multiqueue device in the host, so the host network code can transmit and receive packets in parallel.&lt;br /&gt;
* vhost: The parallelism can be achieved by using multiple vhost threads to handle multiple sockets. Currently there are two design choices:&lt;br /&gt;
** 1:1 mapping between vhost threads and sockets. This method needs no vhost changes: it just launches the same number of vhost threads as queues. Each vhost thread handles one tx ring and one rx ring, just as for single queue virtio-net.&lt;br /&gt;
** M:N mapping between vhost threads and sockets. This method allows a single vhost thread to poll more than one tx/rx ring and socket, and uses separate threads to handle tx and rx requests.&lt;br /&gt;
* qemu: qemu is in charge of the following things&lt;br /&gt;
** allow multiple tap file descriptors to be used for a single emulated nic&lt;br /&gt;
** a userspace multiqueue virtio-net implementation, used to maintain compatibility and to handle management and migration&lt;br /&gt;
** control the vhost based on the userspace multiqueue virtio-net&lt;br /&gt;
* guest driver&lt;br /&gt;
** Allocate multiple rx/tx queues&lt;br /&gt;
** Assign each queue an MSI-X vector in order to parallelize packet processing in the guest stack&lt;br /&gt;
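&lt;br /&gt;
The two vhost threading models above differ only in how threads map to queue pairs. As a toy illustration of the 1:1 shape (plain Python threads standing in for vhost worker threads; this is not vhost code, and all names are invented):&lt;br /&gt;

```python
import queue
import threading

def start_workers(numqueues, handle):
    # 1:1 model: one worker thread per queue pair, each draining
    # its own work queue independently of the others
    queues = [queue.Queue() for _ in range(numqueues)]

    def worker(q):
        while True:
            item = q.get()
            if item is None:  # sentinel: shut this worker down
                break
            handle(item)

    threads = [threading.Thread(target=worker, args=(q,)) for q in queues]
    for t in threads:
        t.start()
    return queues, threads
```

An M:N variant would instead share a smaller pool of worker threads across all the queues.&lt;br /&gt;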
&lt;br /&gt;
The big picture looks like:&lt;br /&gt;
[[Image:ver1.jpg|left]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Choices and considerations ===&lt;br /&gt;
* 1:1 or M:N: 1:1 is much simpler than M:N for both coding and queue/vhost management in qemu, and in theory it could provide better parallelism than M:N. Performance tests of both implementations are needed.&lt;br /&gt;
*  Whether to use per-cpu queues: modern 10gb cards (and Microsoft RSS) suggest a per-cpu queue abstraction that allocates as many tx/rx queues as there are cpus and maps them 1:1. This provides better parallelism and cache locality, and could also simplify other designs such as in-order delivery and flow director. For virtio-net, at least for guests with a small number of vcpus, per-cpu queues are the better choice.&lt;br /&gt;
&lt;br /&gt;
== Current status ==&lt;br /&gt;
&lt;br /&gt;
* macvtap/macvlan have basic multiqueue support.&lt;br /&gt;
* The bridge does not have queues, but when it uses a multiqueue tap as one of its ports, some optimization may be needed.&lt;br /&gt;
* 1:1 Implementation &lt;br /&gt;
** qemu parts: http://www.spinics.net/lists/kvm/msg52808.html&lt;br /&gt;
** tap and guest driver:  http://www.spinics.net/lists/kvm/msg59993.html&lt;br /&gt;
* M:N Implementation&lt;br /&gt;
** kk&#039;s newest series(qemu/vhost/guest drivers): http://www.spinics.net/lists/kvm/msg52094.html&lt;br /&gt;
&lt;br /&gt;
== Detailed Design ==&lt;br /&gt;
&lt;br /&gt;
=== Multiqueue Macvtap ===&lt;br /&gt;
&lt;br /&gt;
==== Basic design: ====&lt;br /&gt;
* Each socket is abstracted as a queue, and the basic idea is to allow multiple sockets to be attached to a single macvtap device.&lt;br /&gt;
* Queue attaching is done by opening the named inode multiple times&lt;br /&gt;
* Each time it is opened, a new socket is attached to the device and a file descriptor is returned, to be used by a virtio-net backend (qemu or vhost-net).&lt;br /&gt;
==== Parallel processing ====&lt;br /&gt;
In order to make the tx path lockless, macvtap uses NETIF_F_LLTX to avoid tx lock contention when the host transmits packets. So it is indeed a multiqueue network device from the point of view of the host.&lt;br /&gt;
==== Queue selector ====&lt;br /&gt;
It has a simple flow director implementation; when it needs to transmit packets to the guest, the queue number is determined by:&lt;br /&gt;
# if the skb has an rx queue mapping (for example, it comes from a mq nic), use it to choose the socket/queue&lt;br /&gt;
# if we can calculate the rxhash of the skb, use it to choose the socket/queue&lt;br /&gt;
# if the above two steps fail, fall back to the first available socket/queue&lt;br /&gt;
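&lt;br /&gt;
The selection order above can be sketched as follows (a hypothetical illustration; the function and argument names are invented and are not kernel API):&lt;br /&gt;

```python
def select_queue(skb_rxq, rxhash, numqueues):
    # 1. prefer the skb's recorded rx queue mapping, if any
    if skb_rxq is not None:
        return skb_rxq % numqueues
    # 2. otherwise hash the flow onto a queue
    if rxhash:
        return rxhash % numqueues
    # 3. both failed: fall back to the first socket/queue
    return 0
```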
=== Multiqueue tun/tap ===&lt;br /&gt;
==== Basic design ====&lt;br /&gt;
* Borrowing the idea from macvtap, we just allow multiple sockets to be attached to a single tap device.&lt;br /&gt;
* As there&#039;s no named inode for a tap device, new ioctls TUNATTACHQUEUE/TUNDETACHQUEUE are introduced to attach or detach a socket from tun/tap; a virtio-net backend can use them to add or delete a queue.&lt;br /&gt;
* All socket related structures are moved to the private_data of the file and initialized during file open.&lt;br /&gt;
* In order to keep the semantics of TUNSETIFF and make the changes transparent to legacy users of tun/tap, the allocation and initialization of the network device is still done in TUNSETIFF, and the first queue is automatically attached.&lt;br /&gt;
* TUNATTACHQUEUE is used to attach an unattached file/socket to a tap device.&lt;br /&gt;
* TUNDETACHQUEUE is used to detach a file from a tap device; it can temporarily disable a queue, which is useful for maintaining backward compatibility with guests (running a single queue driver on a multiqueue device).&lt;br /&gt;
&lt;br /&gt;
Example:&lt;br /&gt;
pseudo code to create a two-queue tap device:&lt;br /&gt;
# fd1 = open(&amp;quot;/dev/net/tun&amp;quot;)&lt;br /&gt;
# ioctl(fd1, TUNSETIFF, &amp;quot;tap&amp;quot;)&lt;br /&gt;
# fd2 = open(&amp;quot;/dev/net/tun&amp;quot;)&lt;br /&gt;
# ioctl(fd2, TUNATTACHQUEUE, &amp;quot;tap&amp;quot;)&lt;br /&gt;
then we have a two-queue tap device with fd1 and fd2 as its queue sockets.&lt;br /&gt;
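&lt;br /&gt;
To make the pseudo code slightly more concrete, here is a hedged sketch that only builds the ifreq argument TUNSETIFF expects (the flag values are from linux/if_tun.h; the TUNATTACHQUEUE ioctl described above is a draft and its request number is not given here, so no ioctl is actually issued):&lt;br /&gt;

```python
import struct

IFF_TAP = 0x0002    # from linux/if_tun.h
IFF_NO_PI = 0x1000  # from linux/if_tun.h

def pack_ifreq(name, flags):
    # an ifreq for TUNSETIFF: a 16-byte interface name followed
    # by a short flags field
    return struct.pack('16sH', name.encode(), flags)

req = pack_ifreq('tap0', IFF_TAP | IFF_NO_PI)
```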
&lt;br /&gt;
==== Parallel processing ====&lt;br /&gt;
Just like macvtap, NETIF_F_LLTX is also used by tun/tap to avoid tx lock contention, and tun/tap is likewise in fact a multiqueue network device in the host.&lt;br /&gt;
==== Queue selector (same as macvtap) ====&lt;br /&gt;
It has a simple flow director implementation; when it needs to transmit packets to the guest, the queue number is determined by:&lt;br /&gt;
# if the skb has an rx queue mapping (for example, it comes from a mq nic), use it to choose the socket/queue&lt;br /&gt;
# if we can calculate the rxhash of the skb, use it to choose the socket/queue&lt;br /&gt;
# if the above two steps fail, fall back to the first available socket/queue&lt;br /&gt;
&lt;br /&gt;
==== Further Optimizations? ====&lt;br /&gt;
&lt;br /&gt;
# rxhash can only be used for distributing workloads across vcpus. The target vcpu may not be the one expected to do the recvmsg(). So more optimizations may be needed, such as:&lt;br /&gt;
## A simple hash-to-queue table that records the cpu/queue used by each flow. It is updated when the guest sends packets, and when tun/tap transmits packets to the guest, this table can be used for queue selection.&lt;br /&gt;
## Some co-operation between the host and the guest driver to pass information such as which vcpu is issuing a recvmsg().&lt;br /&gt;
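&lt;br /&gt;
The hash-to-queue table idea above can be sketched as follows (an illustrative Python sketch of the bookkeeping only; all names are invented):&lt;br /&gt;

```python
class FlowTable:
    # record which queue a flow last used on the transmit side,
    # and reuse it when delivering packets of that flow back to
    # the guest; unknown flows fall back to plain rxhash hashing
    def __init__(self, numqueues):
        self.numqueues = numqueues
        self.table = {}

    def record_tx(self, rxhash, queue_index):
        self.table[rxhash] = queue_index % self.numqueues

    def select_rx_queue(self, rxhash):
        return self.table.get(rxhash, rxhash % self.numqueues)
```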
&lt;br /&gt;
=== vhost ===&lt;br /&gt;
* 1:1 without changes&lt;br /&gt;
* M:N [TBD]&lt;br /&gt;
&lt;br /&gt;
=== qemu changes ===&lt;br /&gt;
The changes in qemu contain two parts:&lt;br /&gt;
* Add generic multiqueue support to the nic layer: as the receiving function of the nic backend is only aware of VLANClientState, we must make it aware of the queue index, so&lt;br /&gt;
** Store queue_index in VLANClientState&lt;br /&gt;
** Store multiple VLANClientState in NICState&lt;br /&gt;
** Let netdev parameters accept multiple netdev ids, and link those tap based VLANClientState to their peers in NICState&lt;br /&gt;
* Userspace multiqueue support in virtio-net&lt;br /&gt;
** Allocate multiple virtqueues&lt;br /&gt;
** Expose the queue numbers through config space&lt;br /&gt;
** Enable multiqueue support in the backend only when the feature is negotiated&lt;br /&gt;
** Handle packet request based on the queue_index of virtqueue and VLANClientState&lt;br /&gt;
** migration handling&lt;br /&gt;
* Vhost enable/disable&lt;br /&gt;
** launch multiple vhost threads&lt;br /&gt;
** set up eventfd and control the start/stop of the vhost_net backend&lt;br /&gt;
Usage looks like:&lt;br /&gt;
qemu -netdev tap,id=hn0,fd=100 -netdev tap,id=hn1,fd=101 -device virtio-net-pci,netdev=hn0#hn1,queue=2 .....&lt;br /&gt;
&lt;br /&gt;
TODO: more user-friendly cmdline such as&lt;br /&gt;
qemu -netdev tap,id=hn0,fd=100,fd=101 -device virtio-net-pci,netdev=hn0,queues=2&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== guest driver ===&lt;br /&gt;
The changes in the guest driver are mainly:&lt;br /&gt;
* Allocate the number of tx and rx queue based on the queue number in config space&lt;br /&gt;
* Assign each queue a MSI-X vector&lt;br /&gt;
* Per-queue handling of TX/RX request&lt;br /&gt;
* Simply use skb_tx_hash() to choose the queue&lt;br /&gt;
&lt;br /&gt;
==== Future Optimizations ====&lt;br /&gt;
* Per-vcpu queues: allocate as many tx/rx queues as there are vcpus, and bind each tx/rx queue pair to a specific vcpu by:&lt;br /&gt;
** Set the MSI-X irq affinity for tx/rx.&lt;br /&gt;
** Use smp_processor_id() to choose the tx queue.&lt;br /&gt;
* Comments: in theory, this should improve parallelism. [TBD]&lt;br /&gt;
&lt;br /&gt;
* ...&lt;br /&gt;
&lt;br /&gt;
== Test ==&lt;br /&gt;
* Test tool: netperf, iperf&lt;br /&gt;
* Test protocol: TCP_STREAM, TCP_MAERTS, TCP_RR&lt;br /&gt;
** between localhost and guest&lt;br /&gt;
** between external host and guest with a 10gb direct link&lt;br /&gt;
** regression criteria: throughput and %cpu&lt;br /&gt;
* Test method:&lt;br /&gt;
** multiple sessions of netperf: 1 2 4 8 16&lt;br /&gt;
** compare with the single queue implementation&lt;br /&gt;
* Other&lt;br /&gt;
** use numactl to bind cpu nodes and memory nodes&lt;br /&gt;
** autotest implements a performance regression test, using a T-test&lt;br /&gt;
== Performance Numbers ==&lt;br /&gt;
[[Multiqueue-performance-Sep-13|Performance]]&lt;br /&gt;
== TODO ==&lt;br /&gt;
== Reference ==&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3691</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3691"/>
		<updated>2011-08-03T06:50:45Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* more vhost_net test&lt;br /&gt;
* add &#039;Multi-guests&#039;/&#039;guest to external boxes&#039; transfer tests&lt;br /&gt;
* port some whql tests to autotest (virtio_net, virtio_blk)&lt;br /&gt;
* add multiple queue support&lt;br /&gt;
* support custom qemu cmdlines&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks attributed to persons.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
** This may just be a matter of moving useful code from tests to kvm_test_utils.py to make it reusable&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
&lt;br /&gt;
== amos ==&lt;br /&gt;
&lt;br /&gt;
* Turn tsc_drift into a kvm unit test.&lt;br /&gt;
* Extend performance tests (framework &amp;amp; cases)&lt;br /&gt;
* Push redhat internal kvm subtests to upstream (the first phase is complete)&lt;br /&gt;
* Improve network subtests, especially stress tests&lt;br /&gt;
&lt;br /&gt;
== jason ==&lt;br /&gt;
&lt;br /&gt;
* Port network tests into windows&lt;br /&gt;
* Profiler of pidstat&lt;br /&gt;
* guest-&amp;gt;guest guest-&amp;gt;host netperf support&lt;br /&gt;
* More readable test result of netperf&lt;br /&gt;
&lt;br /&gt;
== cleber ==&lt;br /&gt;
&lt;br /&gt;
== yolkfull ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload(dbench,lmbench etc)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== RH Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* cpuflags.py (shuang)&lt;br /&gt;
* qmp_command.py (fyang)&lt;br /&gt;
* qmp_event_notification.py (fyang)&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* Test block device cancellation path using device mapper to generate errors ( after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV card etc)&lt;br /&gt;
* Passthrough the perf keys of run_autotest to autotest server&lt;br /&gt;
&lt;br /&gt;
== Pradeep ==&lt;br /&gt;
&lt;br /&gt;
* item 1&lt;br /&gt;
* item 2&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [http://autotest.kernel.org/newticket autotest defect tracking system]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3690</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3690"/>
		<updated>2011-08-03T06:30:14Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* more vhost_net test&lt;br /&gt;
* add &#039;Multi-guests&#039;/&#039;guest to external boxes&#039; transfer tests&lt;br /&gt;
* port some whql tests to autotest (virtio_net, virtio_blk)&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks attributed to persons.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
** This may just be a matter of moving useful code from tests to kvm_test_utils.py to make it reusable&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
&lt;br /&gt;
== amos ==&lt;br /&gt;
&lt;br /&gt;
* Turn tsc_drift into a kvm unit test.&lt;br /&gt;
* Extend performance tests (framework &amp;amp; cases)&lt;br /&gt;
* Push redhat internal kvm subtests to upstream (the first phase is complete)&lt;br /&gt;
* Improve network subtests, especially stress tests&lt;br /&gt;
&lt;br /&gt;
== jason ==&lt;br /&gt;
&lt;br /&gt;
* Port network tests into windows&lt;br /&gt;
* Profiler of pidstat&lt;br /&gt;
* guest-&amp;gt;guest guest-&amp;gt;host netperf support&lt;br /&gt;
* More readable test result of netperf&lt;br /&gt;
&lt;br /&gt;
== cleber ==&lt;br /&gt;
&lt;br /&gt;
== yolkfull ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload(dbench,lmbench etc)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== RH Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* cpuflags.py (shuang)&lt;br /&gt;
* qmp_command.py (fyang)&lt;br /&gt;
* qmp_event_notification.py (fyang)&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* Test block device cancellation path using device mapper to generate errors ( after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV card etc)&lt;br /&gt;
* Passthrough the perf keys of run_autotest to autotest server&lt;br /&gt;
&lt;br /&gt;
== Pradeep ==&lt;br /&gt;
&lt;br /&gt;
* item 1&lt;br /&gt;
* item 2&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [http://autotest.kernel.org/newticket autotest defect tracking system]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3689</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3689"/>
		<updated>2011-08-02T06:34:20Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* more vhost_net test&lt;br /&gt;
* add &#039;Multi-guests&#039;/&#039;guest to external boxes&#039; transfer tests&lt;br /&gt;
* port some whql tests to autotest (virtio_net, virtio_blk)&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks attributed to persons.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
** This may just be a matter of moving useful code from tests to kvm_test_utils.py to make it reusable&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
&lt;br /&gt;
== amos ==&lt;br /&gt;
&lt;br /&gt;
* Turn tsc_drift into a kvm unit test.&lt;br /&gt;
* Push redhat internal kvm subtests to upstream (the first phase is complete)&lt;br /&gt;
* Improve network subtests, especially stress tests&lt;br /&gt;
* Improve server-side migration&lt;br /&gt;
&lt;br /&gt;
== jason ==&lt;br /&gt;
&lt;br /&gt;
* Port network tests into windows&lt;br /&gt;
* Profiler of pidstat&lt;br /&gt;
* guest-&amp;gt;guest guest-&amp;gt;host netperf support&lt;br /&gt;
* More readable test result of netperf&lt;br /&gt;
&lt;br /&gt;
== cleber ==&lt;br /&gt;
&lt;br /&gt;
== yolkfull ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload(dbench,lmbench etc)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== RH Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* cpuflags.py (shuang)&lt;br /&gt;
* qmp_command.py (fyang)&lt;br /&gt;
* qmp_event_notification.py (fyang)&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* Test block device cancellation path using device mapper to generate errors ( after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV card etc)&lt;br /&gt;
* Passthrough the perf keys of run_autotest to autotest server&lt;br /&gt;
&lt;br /&gt;
== Pradeep ==&lt;br /&gt;
&lt;br /&gt;
* item 1&lt;br /&gt;
* item 2&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [http://autotest.kernel.org/newticket autotest defect tracking system]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3688</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3688"/>
		<updated>2011-08-02T06:29:10Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* more vhost_net test&lt;br /&gt;
* add &#039;Multi-guests&#039;/&#039;guest to external boxes&#039; transfer tests&lt;br /&gt;
* port some whql tests to autotest (virtio_net, virtio_blk)&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks attributed to persons.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
** This may just be a matter of moving useful code from tests to kvm_test_utils.py to make it reusable&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
&lt;br /&gt;
== amos ==&lt;br /&gt;
&lt;br /&gt;
* Turn tsc_drift into a kvm unit test.&lt;br /&gt;
* Push redhat internal kvm subtests to upstream (the first phase is complete)&lt;br /&gt;
* Improve network subtests, especially stress tests&lt;br /&gt;
* Improve server-side migration&lt;br /&gt;
&lt;br /&gt;
== jason ==&lt;br /&gt;
&lt;br /&gt;
* Port network tests into windows&lt;br /&gt;
* Profiler of pidstat&lt;br /&gt;
* guest-&amp;gt;guest guest-&amp;gt;host netperf support&lt;br /&gt;
* More readable test result of netperf&lt;br /&gt;
&lt;br /&gt;
== cleber ==&lt;br /&gt;
&lt;br /&gt;
== yolkfull ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload(dbench,lmbench etc)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== RH Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* cpuflags.py (shuang)&lt;br /&gt;
* qmp_command.py (fyang)&lt;br /&gt;
* qmp_event_notification.py (fyang)&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* Test block device cancellation path using device mapper to generate errors ( after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV card etc)&lt;br /&gt;
* Passthrough the perf keys of run_autotest to autotest server&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [http://autotest.kernel.org/newticket autotest defect tracking system]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3687</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3687"/>
		<updated>2011-08-02T06:27:03Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* more vhost_net test&lt;br /&gt;
* add &#039;Multi-guests&#039;/&#039;guest to external boxes&#039; transfer tests&lt;br /&gt;
* port some whql tests to autotest (virtio_net, virtio_blk)&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks attributed to persons.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
** This may just be a matter of moving useful code from tests to kvm_test_utils.py to make it reusable&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
&lt;br /&gt;
== amos ==&lt;br /&gt;
&lt;br /&gt;
* Turn tsc_drift into a kvm unit test.&lt;br /&gt;
* Push redhat internal kvm subtests to upstream (the first phase is complete)&lt;br /&gt;
* Improve network subtests, especially stress tests&lt;br /&gt;
* Improve server-side migration&lt;br /&gt;
&lt;br /&gt;
== jason ==&lt;br /&gt;
&lt;br /&gt;
* Port network tests into windows&lt;br /&gt;
* Profiler of pidstat&lt;br /&gt;
* guest-&amp;gt;guest guest-&amp;gt;host netperf support&lt;br /&gt;
* More readable test result of netperf&lt;br /&gt;
&lt;br /&gt;
== Cleber ==&lt;br /&gt;
&lt;br /&gt;
== yolkfull ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload(dbench,lmbench etc)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== RH Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* cpuflags.py (shuang)&lt;br /&gt;
* qmp_command.py (fyang)&lt;br /&gt;
* qmp_event_notification.py (fyang)&lt;br /&gt;
* pxe boot * net types&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* Test block device cancellation path using device mapper to generate errors ( after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV card etc)&lt;br /&gt;
* Different CPU flags support&lt;br /&gt;
* Passthrough the perf keys of run_autotest to autotest server&lt;br /&gt;
* Register the virtual machine into autotest server and run benchmark through autotest server&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [http://autotest.kernel.org/newticket autotest defect tracking system]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3686</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3686"/>
		<updated>2011-08-02T06:24:16Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* more vhost_net test&lt;br /&gt;
* add &#039;Multi-guests&#039;/&#039;guest to external boxes&#039; transfer tests&lt;br /&gt;
* port some whql tests to autotest (virtio_net, virtio_blk)&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks attributed to persons.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
** This may just be a matter of moving useful code from tests to kvm_test_utils.py to make it reusable&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
&lt;br /&gt;
== amos ==&lt;br /&gt;
&lt;br /&gt;
* Turn tsc_drift into a kvm unit test.&lt;br /&gt;
* Push redhat internal kvm subtests to upstream (the first phase is complete)&lt;br /&gt;
* Improve network subtests, especially stress tests&lt;br /&gt;
* Improve server-side migration&lt;br /&gt;
&lt;br /&gt;
== jason ==&lt;br /&gt;
&lt;br /&gt;
* Port network tests into windows&lt;br /&gt;
* Profiler of pidstat&lt;br /&gt;
* guest-&amp;gt;guest guest-&amp;gt;host netperf support&lt;br /&gt;
* More readable test result of netperf&lt;br /&gt;
&lt;br /&gt;
== yolkfull ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload(dbench,lmbench etc)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== RH Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* cpuflags.py (shuang)&lt;br /&gt;
* qmp_command.py (fyang)&lt;br /&gt;
* qmp_event_notification.py (fyang)&lt;br /&gt;
* pxe boot * net types&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* Test block device cancellation path using device mapper to generate errors ( after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV card etc)&lt;br /&gt;
* Different CPU flags support&lt;br /&gt;
* Passthrough the perf keys of run_autotest to autotest server&lt;br /&gt;
* Register the virtual machine into autotest server and run benchmark through autotest server&lt;br /&gt;
&lt;br /&gt;
== Cleber ==&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [http://autotest.kernel.org/newticket autotest defect tracking system]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3685</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3685"/>
		<updated>2011-08-02T06:19:51Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* more vhost_net test&lt;br /&gt;
* add &#039;Multi-guests&#039;/&#039;guest to external boxes&#039; transfer tests&lt;br /&gt;
* port some whql tests to autotest (virtio_net, virtio_blk)&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks attributed to persons.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
** This may just be a matter of moving useful code from tests to kvm_test_utils.py to make it reusable&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
&lt;br /&gt;
== amos ==&lt;br /&gt;
&lt;br /&gt;
* Turn tsc_drift into a kvm unit test.&lt;br /&gt;
* Push redhat internal kvm subtests to upstream (the first phase is complete)&lt;br /&gt;
* Improve network subtests, especially stress tests&lt;br /&gt;
&lt;br /&gt;
== jason ==&lt;br /&gt;
&lt;br /&gt;
* Port network tests into windows&lt;br /&gt;
* Profiler of pidstat&lt;br /&gt;
* guest-&amp;gt;guest guest-&amp;gt;host netperf support&lt;br /&gt;
* More readable test result of netperf&lt;br /&gt;
&lt;br /&gt;
== yolkfull ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload(dbench,lmbench etc)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== RH Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* cpuflags.py (shuang)&lt;br /&gt;
* qmp_command.py (fyang)&lt;br /&gt;
* qmp_event_notification.py (fyang)&lt;br /&gt;
* pxe boot * net types&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* Test block device cancellation path using device mapper to generate errors ( after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV card etc)&lt;br /&gt;
* Different CPU flags support&lt;br /&gt;
* Passthrough the perf keys of run_autotest to autotest server&lt;br /&gt;
* Register the virtual machine into autotest server and run benchmark through autotest server&lt;br /&gt;
&lt;br /&gt;
== Cleber ==&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [http://autotest.kernel.org/newticket autotest defect tracking system]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3677</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3677"/>
		<updated>2011-07-29T05:36:13Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* more vhost_net test&lt;br /&gt;
* Add Multi-guests transfer test&lt;br /&gt;
* transfer tests between guest(s) and external boxes&lt;br /&gt;
* add more whql tests (virtio_net, virtio_blk)&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks assigned to individuals.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
** This may just be a matter of moving useful code from tests to kvm_test_utils.py to make it reusable&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
&lt;br /&gt;
== Amos ==&lt;br /&gt;
&lt;br /&gt;
* Turn tsc_drift into a kvm unit test.&lt;br /&gt;
* Push Red Hat internal kvm subtests upstream. The ones left:&lt;br /&gt;
** hdparm.py&lt;br /&gt;
* Improve network subtests, especially stress tests&lt;br /&gt;
&lt;br /&gt;
== jason ==&lt;br /&gt;
&lt;br /&gt;
* Port network tests to Windows&lt;br /&gt;
* pidstat profiler&lt;br /&gt;
* guest-&amp;gt;guest and guest-&amp;gt;host netperf support&lt;br /&gt;
* More readable netperf test results&lt;br /&gt;
&lt;br /&gt;
== yolkfull ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload (dbench, lmbench, etc.)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* cpuflags.py&lt;br /&gt;
* qmp_command.py (fyang)&lt;br /&gt;
* qmp_event_notification.py (fyang)&lt;br /&gt;
* pxe boot * net types&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* Test the block device cancellation path, using device mapper to generate errors (after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV cards, etc.)&lt;br /&gt;
* Different CPU flags support&lt;br /&gt;
* Pass the perf keys of run_autotest through to the autotest server&lt;br /&gt;
* Register the virtual machine with the autotest server and run benchmarks through it&lt;br /&gt;
&lt;br /&gt;
== Cleber ==&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [[http://autotest.kernel.org/newticket autotest defect tracking system]]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3676</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3676"/>
		<updated>2011-07-29T05:33:50Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* More vhost_net tests&lt;br /&gt;
* Add a multi-guest transfer test&lt;br /&gt;
* Transfer test between guest(s) and extra boxes&lt;br /&gt;
* Add more WHQL tests (virtio_net, virtio_blk)&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks assigned to individuals.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
** This may just be a matter of moving useful code from tests to kvm_test_utils.py to make it reusable&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
&lt;br /&gt;
== Amos ==&lt;br /&gt;
&lt;br /&gt;
* Turn tsc_drift into a kvm unit test.&lt;br /&gt;
* Push Red Hat internal kvm subtests upstream. The ones left:&lt;br /&gt;
** hdparm.py&lt;br /&gt;
&lt;br /&gt;
== jason ==&lt;br /&gt;
&lt;br /&gt;
* Port network tests to Windows&lt;br /&gt;
* pidstat profiler&lt;br /&gt;
* guest-&amp;gt;guest and guest-&amp;gt;host netperf support&lt;br /&gt;
* More readable netperf test results&lt;br /&gt;
&lt;br /&gt;
== yolkfull ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload (dbench, lmbench, etc.)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* cpuflags.py&lt;br /&gt;
* qmp_command.py (fyang)&lt;br /&gt;
* qmp_event_notification.py (fyang)&lt;br /&gt;
* pxe boot * net types&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* Test the block device cancellation path, using device mapper to generate errors (after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV cards, etc.)&lt;br /&gt;
* Different CPU flags support&lt;br /&gt;
* Pass the perf keys of run_autotest through to the autotest server&lt;br /&gt;
* Register the virtual machine with the autotest server and run benchmarks through it&lt;br /&gt;
&lt;br /&gt;
== Cleber ==&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [[http://autotest.kernel.org/newticket autotest defect tracking system]]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3675</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3675"/>
		<updated>2011-07-29T05:32:42Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* More vhost_net tests&lt;br /&gt;
* Add a multi-guest transfer test&lt;br /&gt;
* Transfer test between guest(s) and extra boxes&lt;br /&gt;
* Add more WHQL tests (virtio_net, virtio_blk)&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks assigned to individuals.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
** This may just be a matter of moving useful code from tests to kvm_test_utils.py to make it reusable&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
&lt;br /&gt;
== Amos ==&lt;br /&gt;
&lt;br /&gt;
* Turn tsc_drift into a kvm unit test.&lt;br /&gt;
* Push Red Hat internal kvm subtests upstream. The ones left:&lt;br /&gt;
** hdparm.py&lt;br /&gt;
&lt;br /&gt;
== jason ==&lt;br /&gt;
&lt;br /&gt;
* Port network tests to Windows&lt;br /&gt;
* pidstat profiler&lt;br /&gt;
* guest-&amp;gt;guest and guest-&amp;gt;host netperf support&lt;br /&gt;
* More readable netperf test results&lt;br /&gt;
&lt;br /&gt;
== yolkfull ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload (dbench, lmbench, etc.)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* cpuflags.py&lt;br /&gt;
* qmp_command.py (fyang)&lt;br /&gt;
* qmp_event_notification.py (fyang)&lt;br /&gt;
* pxe boot * net types&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* Test the block device cancellation path, using device mapper to generate errors (after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV cards, etc.)&lt;br /&gt;
* Different CPU flags support&lt;br /&gt;
* Pass the perf keys of run_autotest through to the autotest server&lt;br /&gt;
* Register the virtual machine with the autotest server and run benchmarks through it&lt;br /&gt;
&lt;br /&gt;
== Cleber ==&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [[http://autotest.kernel.org/newticket autotest defect tracking system]]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3674</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3674"/>
		<updated>2011-07-29T05:32:02Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* More vhost_net tests&lt;br /&gt;
* Add a multi-guest transfer test&lt;br /&gt;
* Transfer test between guest(s) and extra boxes&lt;br /&gt;
* Add more WHQL tests (virtio_net, virtio_blk)&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks assigned to individuals.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
** This may just be a matter of moving useful code from tests to kvm_test_utils.py to make it reusable&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
&lt;br /&gt;
== akong ==&lt;br /&gt;
&lt;br /&gt;
* Turn tsc_drift into a kvm unit test.&lt;br /&gt;
* Push Red Hat internal kvm subtests upstream. The ones left:&lt;br /&gt;
** hdparm.py&lt;br /&gt;
&lt;br /&gt;
== jason ==&lt;br /&gt;
&lt;br /&gt;
* Port network tests to Windows&lt;br /&gt;
* pidstat profiler&lt;br /&gt;
* guest-&amp;gt;guest and guest-&amp;gt;host netperf support&lt;br /&gt;
* More readable netperf test results&lt;br /&gt;
&lt;br /&gt;
== yolkfull ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload (dbench, lmbench, etc.)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* cpuflags.py&lt;br /&gt;
* qmp_command.py (fyang)&lt;br /&gt;
* qmp_event_notification.py (fyang)&lt;br /&gt;
* pxe boot * net types&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* Test the block device cancellation path, using device mapper to generate errors (after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV cards, etc.)&lt;br /&gt;
* Different CPU flags support&lt;br /&gt;
* Pass the perf keys of run_autotest through to the autotest server&lt;br /&gt;
* Register the virtual machine with the autotest server and run benchmarks through it&lt;br /&gt;
&lt;br /&gt;
== Cleber ==&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [[http://autotest.kernel.org/newticket autotest defect tracking system]]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3673</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3673"/>
		<updated>2011-07-29T04:56:15Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* More vhost_net tests&lt;br /&gt;
* Add a multi-guest transfer test&lt;br /&gt;
* Transfer test between guest(s) and extra boxes&lt;br /&gt;
* Add more WHQL tests (virtio_net, virtio_blk)&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks assigned to individuals.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
** This may just be a matter of moving useful code from tests to kvm_test_utils.py to make it reusable&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
&lt;br /&gt;
== akong ==&lt;br /&gt;
&lt;br /&gt;
* Turn tsc_drift into a kvm unit test.&lt;br /&gt;
* Push Red Hat internal kvm subtests upstream. The ones left:&lt;br /&gt;
** hdparm.py&lt;br /&gt;
** cpuflags.py&lt;br /&gt;
** qmp_command.py&lt;br /&gt;
** qmp_event_notification.py&lt;br /&gt;
** format_disk.py (it has been folded into multi_disk.py, which is already upstream)&lt;br /&gt;
&lt;br /&gt;
== jason ==&lt;br /&gt;
&lt;br /&gt;
* Port network tests to Windows&lt;br /&gt;
* pidstat profiler&lt;br /&gt;
* guest-&amp;gt;guest and guest-&amp;gt;host netperf support&lt;br /&gt;
* More readable netperf test results&lt;br /&gt;
&lt;br /&gt;
== yolkfull ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload (dbench, lmbench, etc.)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* pxe boot * net types&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* multiple disks&lt;br /&gt;
* Test the block device cancellation path, using device mapper to generate errors (after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV cards, etc.)&lt;br /&gt;
* Different CPU flags support&lt;br /&gt;
* Pass the perf keys of run_autotest through to the autotest server&lt;br /&gt;
* Register the virtual machine with the autotest server and run benchmarks through it&lt;br /&gt;
&lt;br /&gt;
== Cleber ==&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [[http://autotest.kernel.org/newticket autotest defect tracking system]]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3659</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3659"/>
		<updated>2011-07-11T07:51:29Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* More vhost_net tests&lt;br /&gt;
* Add a multi-guest transfer test&lt;br /&gt;
* Transfer test between guest(s) and extra boxes&lt;br /&gt;
* Add more WHQL tests (virtio_net, virtio_blk)&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks assigned to individuals.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
** This may just be a matter of moving useful code from tests to kvm_test_utils.py to make it reusable&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
&lt;br /&gt;
== akong ==&lt;br /&gt;
&lt;br /&gt;
* Push Red Hat internal kvm subtests upstream. The ones left:&lt;br /&gt;
** hdparm.py&lt;br /&gt;
** cpuflags.py&lt;br /&gt;
** qmp_command.py&lt;br /&gt;
** qmp_event_notification.py&lt;br /&gt;
** floppy.py&lt;br /&gt;
** format_disk.py&lt;br /&gt;
** usb.py&lt;br /&gt;
&lt;br /&gt;
Patches already sent upstream:&lt;br /&gt;
** cdrom.py&lt;br /&gt;
** nmi_watchdog.py&lt;br /&gt;
&lt;br /&gt;
== jason ==&lt;br /&gt;
&lt;br /&gt;
* Port network tests to Windows&lt;br /&gt;
* pidstat profiler&lt;br /&gt;
* guest-&amp;gt;guest and guest-&amp;gt;host netperf support&lt;br /&gt;
* More readable netperf test results&lt;br /&gt;
&lt;br /&gt;
== yolkfull ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload (dbench, lmbench, etc.)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* pxe boot * net types&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* multiple disks&lt;br /&gt;
* Test the block device cancellation path, using device mapper to generate errors (after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV cards, etc.)&lt;br /&gt;
* Different CPU flags support&lt;br /&gt;
* Pass the perf keys of run_autotest through to the autotest server&lt;br /&gt;
* Register the virtual machine with the autotest server and run benchmarks through it&lt;br /&gt;
&lt;br /&gt;
== Cleber ==&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [[http://autotest.kernel.org/newticket autotest defect tracking system]]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3658</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3658"/>
		<updated>2011-07-11T06:18:00Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* More vhost_net tests&lt;br /&gt;
* Add a multi-guest transfer test&lt;br /&gt;
* Transfer test between guest(s) and extra boxes&lt;br /&gt;
* Add more WHQL tests (virtio_net, virtio_blk)&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks assigned to individuals.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
** This may just be a matter of moving useful code from tests to kvm_test_utils.py to make it reusable&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
&lt;br /&gt;
== akong ==&lt;br /&gt;
&lt;br /&gt;
* Push Red Hat internal kvm subtests upstream. The ones left:&lt;br /&gt;
** hdparm.py&lt;br /&gt;
** cpuflags.py&lt;br /&gt;
** nmi_watchdog.py&lt;br /&gt;
** qmp_command.py&lt;br /&gt;
** qmp_event_notification.py&lt;br /&gt;
** cdrom.py (patch sent)&lt;br /&gt;
** floppy.py&lt;br /&gt;
** format_disk.py&lt;br /&gt;
** usb.py&lt;br /&gt;
&lt;br /&gt;
== jason ==&lt;br /&gt;
&lt;br /&gt;
* Port network tests to Windows&lt;br /&gt;
* pidstat profiler&lt;br /&gt;
* guest-&amp;gt;guest and guest-&amp;gt;host netperf support&lt;br /&gt;
* More readable netperf test results&lt;br /&gt;
&lt;br /&gt;
== yolkfull ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload (dbench, lmbench, etc.)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* pxe boot * net types&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* multiple disks&lt;br /&gt;
* Test the block device cancellation path, using device mapper to generate errors (after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV cards, etc.)&lt;br /&gt;
* Different CPU flags support&lt;br /&gt;
* Pass the perf keys of run_autotest through to the autotest server&lt;br /&gt;
* Register the virtual machine with the autotest server and run benchmarks through it&lt;br /&gt;
&lt;br /&gt;
== Cleber ==&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [[http://autotest.kernel.org/newticket autotest defect tracking system]]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3533</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3533"/>
		<updated>2011-04-27T10:58:14Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* More vhost_net tests&lt;br /&gt;
* Add a multi-guest transfer test&lt;br /&gt;
* Transfer test between guest(s) and extra boxes&lt;br /&gt;
* Add more WHQL tests (virtio_net, virtio_blk)&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks assigned to individuals.&lt;br /&gt;
&lt;br /&gt;
* Autotest refactor for usage with libvirt/xen. See more details at the [[KVM Autotest Refactor page]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* Complete James Ren&#039;s patch for parametrized control files&lt;br /&gt;
* First commit found on 4720 - http://autotest.kernel.org/changeset/4720&lt;br /&gt;
* The reason is that we should implement things like job submission and the kvm autotest wrapper properly on top of the parametrized control file feature, which is currently only halfway implemented&lt;br /&gt;
&lt;br /&gt;
Tasks:&lt;br /&gt;
&lt;br /&gt;
** Move the kvm autotest libraries to the client/bin directory, so every client-side test can use them&lt;br /&gt;
** Review xen-autotest to see what the differences are between our vm class and theirs. Hopefully in the future, xen-autotest and libvirt-autotest will just be different vm implementations of our virt-autotest framework :)&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
=== TODO ===&lt;br /&gt;
&lt;br /&gt;
* Generate autotest API documentation with tools such as doxygen&lt;br /&gt;
* Walk through autotest with Ray Chauvin&lt;br /&gt;
&lt;br /&gt;
* Multi host migration - Already have a 1st patchset available, working to rebase it against a cleanup patchset made by Jason&lt;br /&gt;
&lt;br /&gt;
* Create &#039;maintenance jobs&#039; just to keep the machines up to date, with appropriate software installed.&lt;br /&gt;
* Hotplug tests during migration&lt;br /&gt;
* Work on setting up the conmux server for our internal test grid&lt;br /&gt;
* Research how to trigger jobs based on events rather than setting up recurring jobs via the web interface&lt;br /&gt;
* Set up a qemu-block job, with kwolf&#039;s qemu-block tree&lt;br /&gt;
* Set up a vhost job, with the upstream job (mst&#039;s trees for kernel and userspace)&lt;br /&gt;
* Add a BLOCKED status in the autotest database that tells the user that some feature testing was left out due to a problem with its dependencies.&lt;br /&gt;
* Eventually get rid of all dependency on slirp/userspace networking for feature testing, since it is not a supported feature. Slipstreaming the kickstart into the install CD is a possibility, and it ends up being simpler than configuring a dedicated internal network for testing.&lt;br /&gt;
* Make it possible to install the virtio network driver on Windows XP/2003 guests without having to resort to the MSI installer&lt;br /&gt;
&lt;br /&gt;
=== Done ===&lt;br /&gt;
&lt;br /&gt;
* Infrastructure for testing different vm storage backends: file/lvm/iSCSI - Done by Beijing team&lt;br /&gt;
* Made some internal patches appropriate for upstream&lt;br /&gt;
* Copy kvm-autotest@redhat.com on the email list of results&lt;br /&gt;
* Send code to retrieve host kernels based on koji/brew tags upstream&lt;br /&gt;
* Resolved UUID mount point check problem&lt;br /&gt;
* QMP suite integrated upstream&lt;br /&gt;
* Wrote code to retrieve kernels based on koji/brew tags&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
** This may just be a matter of moving useful code from tests to kvm_test_utils.py to make it reusable&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
&lt;br /&gt;
* Post-commit review of new network patches (committed on Oct 7 2010)&lt;br /&gt;
* Use more exceptions in utility functions:&lt;br /&gt;
** Add convenience functions to kvm_subprocess.py to make running shell commands shorter&lt;br /&gt;
** Add convenience functions to the VM class&lt;br /&gt;
* Refactor _get_command_output() and friends in kvm_monitor.py and add cmd_raw(), cmd_obj() and cmd_qmp() as required by Luiz Capitulino&#039;s test suite&lt;br /&gt;
* Use select() instead of sleep() in kvm_monitor.py&lt;br /&gt;
* Simplify migration code if possible (in kvm_vm.py and kvm_test_utils.py)&lt;br /&gt;
* Fix VM.get_ifname() somehow (uses self.vnc_port -- bad)&lt;br /&gt;
&lt;br /&gt;
== akong ==&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* Push Red Hat internal kvm subtests (more than 18) upstream&lt;br /&gt;
&#039;&#039;trans_hugepage_defrag.py trans_hugepage.py trans_hugepage_swapping.py hdparm.py cpuflags.py multi_disk.py nmi_watchdog.py qmp_command.py qmp_event_notification.py cdrom.py fillup_disk.py floppy.py format_disk.py lvm.py usb.py ... &#039;&#039; [doing]&lt;br /&gt;
stop_continue.py [in review]&lt;br /&gt;
image_copy.py, module_probe.py [done]&lt;br /&gt;
&lt;br /&gt;
* Use dnsmasq in unattended_install to replace userspace networking with a private bridge [v2 sent]&lt;br /&gt;
* Add a new test: check if guest transmits packets when link is up/down [done]&lt;br /&gt;
* Report problem with multiple NICs and MAC address tracking [problem fixed]&lt;br /&gt;
* Review and test Michael&#039;s WHQL patchsets and give him feedback [done]&lt;br /&gt;
* Bonding test [done]&lt;br /&gt;
&lt;br /&gt;
== jason ==&lt;br /&gt;
&lt;br /&gt;
* Tests in parallel with migration (reboot / unattended installation / file transfer / autotest client tests / guest_script, autoit? / memtest86+ (steps?)); does this need modifications to the framework?&lt;br /&gt;
* Refactor the network cmdline generation, using peer when possible and vlan for older qemu-kvm&lt;br /&gt;
* Port network tests to Windows&lt;br /&gt;
* pidstat profiler&lt;br /&gt;
* guest-&amp;gt;guest and guest-&amp;gt;host netperf support&lt;br /&gt;
* More readable netperf test results&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
* Migration to exec test suggested by mst&lt;br /&gt;
* Clean the Migration tests: mainly for exec and add offline support&lt;br /&gt;
* Write a &#039;cli&#039; wrapper to make autotest handy/easier for developers&lt;br /&gt;
* Refactor hotplug test using qdev and netdev_del/drive_del&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== yolkfull ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload (dbench, lmbench, etc.)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* pxe boot * net types&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* multiple disks&lt;br /&gt;
* Test the block device cancellation path, using device mapper to generate errors (after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV cards, etc.)&lt;br /&gt;
* Different CPU flags support&lt;br /&gt;
* Pass the perf keys of run_autotest through to the autotest server&lt;br /&gt;
* Register the virtual machine with the autotest server and run benchmarks through it&lt;br /&gt;
&lt;br /&gt;
== Cleber ==&lt;br /&gt;
&lt;br /&gt;
* Create initscripts for the scheduler daemon (monitor_db_babysitter)&lt;br /&gt;
* Revisit job parameterization to ease job submission&lt;br /&gt;
* Integrate Brew/Koji messaging with automatic job submission&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
&lt;br /&gt;
* Look and fix the block_hotplug test upstream&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [[http://autotest.kernel.org/newticket autotest defect tracking system]]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3459</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3459"/>
		<updated>2010-12-22T12:47:51Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* More vhost_net tests&lt;br /&gt;
* Add a multi-guest transfer test&lt;br /&gt;
* Transfer test between guest(s) and extra boxes&lt;br /&gt;
* Add more WHQL tests (virtio_net, virtio_blk)&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks assigned to individuals.&lt;br /&gt;
&lt;br /&gt;
* Autotest refactor for usage with libvirt/xen. See more details at the [[KVM Autotest Refactor page]]&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* Complete James Ren&#039;s patch for parametrized control files&lt;br /&gt;
* First commit found on 4720 - http://autotest.kernel.org/changeset/4720&lt;br /&gt;
* The reason is that we should implement things like job submission and the kvm autotest wrapper properly on top of the parametrized control file feature, which is currently only halfway implemented&lt;br /&gt;
&lt;br /&gt;
Tasks:&lt;br /&gt;
&lt;br /&gt;
** Move the kvm autotest libraries to the client/bin directory, so every client-side test can use them&lt;br /&gt;
** Review xen-autotest to see what the differences are between our vm class and theirs. Hopefully in the future, xen-autotest and libvirt-autotest will just be different vm implementations of our virt-autotest framework :)&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
=== TODO ===&lt;br /&gt;
&lt;br /&gt;
* Multi host migration - Already have a 1st patchset available, working to rebase it against a cleanup patchset made by Jason&lt;br /&gt;
&lt;br /&gt;
* Create &#039;maintenance jobs&#039; just to keep the machines up to date, with appropriate software installed.&lt;br /&gt;
* Hotplug tests during migration&lt;br /&gt;
* Work on setting up the conmux server for our internal test grid&lt;br /&gt;
* Research how to trigger jobs based on events rather than setting up recurring jobs via the web interface&lt;br /&gt;
* Set up a qemu-block job, with kwolf&#039;s qemu-block tree&lt;br /&gt;
* Set up a vhost job, with the upstream job (mst&#039;s trees for kernel and userspace)&lt;br /&gt;
* Add a BLOCKED status on the autotest database, that tells the user that some feature testing was left out due to a problem on its dependencies.&lt;br /&gt;
* Eventually get rid of all dependency on slirp/userspace networking to do any feature testing, since it is not a supported feature. Slipstreaming the kickstart in the install CD is a possibility, and it ends up being simpler than configuring an internal network thing for testing.&lt;br /&gt;
* Make it possible to install the virtio network driver under Windows XP/2003 guests without having to resort to the MSI installer&lt;br /&gt;
&lt;br /&gt;
=== Done ===&lt;br /&gt;
&lt;br /&gt;
* Infrastructure for testing different vm storage backends: file/lvm/iSCSI - Done by Beijing team&lt;br /&gt;
* Made some internal patches appropriate for upstream&lt;br /&gt;
* Copy kvm-autotest@redhat.com on the email list of results&lt;br /&gt;
* Send code to retrieve host kernels based on koji/brew tags upstream&lt;br /&gt;
* Resolved UUID mount point check problem&lt;br /&gt;
* QMP suite integrated upstream&lt;br /&gt;
* Wrote code to retrieve kernels based on koji/brew tags&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
** This may just be a matter of moving useful code from tests to kvm_test_utils.py to make it reusable&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
&lt;br /&gt;
* Post-commit review of new network patches (committed on Oct 7 2010)&lt;br /&gt;
* Use more exceptions in utility functions:&lt;br /&gt;
** Add convenience functions to kvm_subprocess.py to make running shell commands shorter&lt;br /&gt;
** Add convenience functions to the VM class&lt;br /&gt;
* Refactor _get_command_output() and friends in kvm_monitor.py and add cmd_raw(), cmd_obj() and cmd_qmp() as required by Luiz Capitulino&#039;s test suite&lt;br /&gt;
* Use select() instead of sleep() in kvm_monitor.py&lt;br /&gt;
* Simplify migration code if possible (in kvm_vm.py and kvm_test_utils.py)&lt;br /&gt;
* Fix VM.get_ifname() somehow (uses self.vnc_port -- bad)&lt;br /&gt;
&lt;br /&gt;
== akong ==&lt;br /&gt;
&lt;br /&gt;
* review and test michael&#039;s whql patchsets, give feedback to michael&lt;br /&gt;
* push Red Hat internal kvm subtests (18) upstream&lt;br /&gt;
&#039;&#039;trans_hugepage_defrag.py trans_hugepage.py trans_hugepage_swapping.py hdparm.py cpuflags.py multi_disk.py nmi_watchdog.py qmp_command.py qmp_event_notification.py cdrom.py fillup_disk.py floppy.py format_disk.py lvm.py module_probe.py stop_continue.py usb.py image_copy.py&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Add a new test: check if guest transmits packets when link is up/down [done]&lt;br /&gt;
* Report problem with multiple NICs and MAC address tracking [problem fixed]&lt;br /&gt;
* Using dnsmasq in unattended_install to replace userspace network with private bridge [request changed]&lt;br /&gt;
* Bonding test [done]&lt;br /&gt;
&lt;br /&gt;
== jason ==&lt;br /&gt;
&lt;br /&gt;
* Write a &#039;cli&#039; wrapper to make autotest handy/easier for developers&lt;br /&gt;
* Tests in parallel with migration (reboot / unattended installation / file transfer / autotest client tests / guest_script, autoit? / memtest86+ (steps?)); does this need modifications to the framework?&lt;br /&gt;
* Refactor hotplug test using qdev and netdev_del/drive_del&lt;br /&gt;
* Refactor the network cmdline generation, using peer when possible and vlan for older qemu-kvm&lt;br /&gt;
* Port network tests into windows&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
* Migration to exec test suggested by mst&lt;br /&gt;
* Clean the Migration tests: mainly for exec and add offline support&lt;br /&gt;
&lt;br /&gt;
== yolkfull ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload (dbench, lmbench, etc.)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* pxe boot * net types&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* multiple disks&lt;br /&gt;
* Test block device cancellation path using device mapper to generate errors (after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV card etc)&lt;br /&gt;
* Different CPU flags support&lt;br /&gt;
* Passthrough the perf keys of run_autotest to autotest server&lt;br /&gt;
* Register the virtual machine into autotest server and run benchmark through autotest server&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [[http://autotest.kernel.org/newticket autotest defect tracking system]]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3421</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3421"/>
		<updated>2010-11-29T08:37:36Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* more vhost_net test&lt;br /&gt;
* Add Multi-guests transfer test&lt;br /&gt;
* transfer test between guest(s) and extra boxes&lt;br /&gt;
* add more whql tests (virtio_net,virtio_blk)&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks assigned to individuals.&lt;br /&gt;
&lt;br /&gt;
* Autotest refactor for usage with libvirt/xen&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
=== TODO ===&lt;br /&gt;
&lt;br /&gt;
* Multi host migration - priority increased&lt;br /&gt;
&lt;br /&gt;
* Infrastructure for testing different vm storage backends: file/lvm/iSCSI&lt;br /&gt;
* Work on setting up the conmux server for our internal test grid&lt;br /&gt;
* Research how to trigger jobs based on events rather than using the web interface setting recurrent jobs&lt;br /&gt;
* Set up a qemu-block job, with kwolf&#039;s qemu-block tree&lt;br /&gt;
* Set up a vhost job, with the upstream job (mst&#039;s trees for kernel and userspace)&lt;br /&gt;
* Add a BLOCKED status on the autotest database, that tells the user that some feature testing was left out due to a problem on its dependencies.&lt;br /&gt;
* Eventually get rid of all dependency on slirp/userspace networking to do any feature testing, since it is not a supported feature. Slipstreaming the kickstart in the install CD is a possibility, and it ends up being simpler than configuring an internal network thing for testing.&lt;br /&gt;
* Make it possible to install the virtio network driver under Windows XP/2003 guests without having to resort to the MSI installer&lt;br /&gt;
&lt;br /&gt;
=== Done ===&lt;br /&gt;
&lt;br /&gt;
* Made some internal patches appropriate for upstream&lt;br /&gt;
* Copy kvm-autotest@redhat.com on the email list of results&lt;br /&gt;
* Send code to retrieve host kernels based on koji/brew tags upstream&lt;br /&gt;
* Resolved UUID mount point check problem&lt;br /&gt;
* QMP suite integrated upstream&lt;br /&gt;
* Wrote code to retrieve kernels based on koji/brew tags&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
** This may just be a matter of moving useful code from tests to kvm_test_utils.py to make it reusable&lt;br /&gt;
&lt;br /&gt;
=== done ===&lt;br /&gt;
&lt;br /&gt;
* Post-commit review of new network patches (committed on Oct 7 2010)&lt;br /&gt;
* Use more exceptions in utility functions:&lt;br /&gt;
** Add convenience functions to kvm_subprocess.py to make running shell commands shorter&lt;br /&gt;
** Add convenience functions to the VM class&lt;br /&gt;
* Refactor _get_command_output() and friends in kvm_monitor.py and add cmd_raw(), cmd_obj() and cmd_qmp() as required by Luiz Capitulino&#039;s test suite&lt;br /&gt;
* Use select() instead of sleep() in kvm_monitor.py&lt;br /&gt;
* Simplify migration code if possible (in kvm_vm.py and kvm_test_utils.py)&lt;br /&gt;
* Fix VM.get_ifname() somehow (uses self.vnc_port -- bad)&lt;br /&gt;
&lt;br /&gt;
== akong ==&lt;br /&gt;
&lt;br /&gt;
* push Red Hat internal kvm subtests (18) upstream&lt;br /&gt;
&#039;&#039;trans_hugepage_defrag.py trans_hugepage.py trans_hugepage_swapping.py hdparm.py cpuflags.py multi_disk.py nmi_watchdog.py qmp_command.py qmp_event_notification.py cdrom.py fillup_disk.py floppy.py format_disk.py lvm.py module_probe.py stop_continue.py usb.py image_copy.py&#039;&#039;&lt;br /&gt;
&lt;br /&gt;
* Add a new test: check if guest transmits packets when link is up/down&lt;br /&gt;
* Report problem with multiple NICs and MAC address tracking [problem fixed]&lt;br /&gt;
* Using dnsmasq in unattended_install to replace userspace network with private bridge [request changed]&lt;br /&gt;
* Bonding test [done]&lt;br /&gt;
&lt;br /&gt;
== jason ==&lt;br /&gt;
&lt;br /&gt;
* Write a &#039;cli&#039; wrapper to make autotest handy/easier for developers&lt;br /&gt;
* Tests in parallel with migration (reboot / unattended installation / file transfer / autotest client tests / guest_script, autoit? / memtest86+ (steps?)); does this need modifications to the framework?&lt;br /&gt;
* Clean the Migration tests: mainly for exec and add offline support&lt;br /&gt;
* Refactor hotplug test using qdev and netdev_del/drive_del&lt;br /&gt;
* Refactor the network cmdline generation, using peer when possible and vlan for older qemu-kvm&lt;br /&gt;
* Port network tests into windows&lt;br /&gt;
* Migration to exec test suggested by mst&lt;br /&gt;
&lt;br /&gt;
== yolkfull ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload (dbench, lmbench, etc.)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* pxe boot * net types&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* multiple disks&lt;br /&gt;
* Test block device cancellation path using device mapper to generate errors (after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV card etc)&lt;br /&gt;
* Different CPU flags support&lt;br /&gt;
* Passthrough the perf keys of run_autotest to autotest server&lt;br /&gt;
* Register the virtual machine into autotest server and run benchmark through autotest server&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [[http://autotest.kernel.org/newticket autotest defect tracking system]]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3322</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3322"/>
		<updated>2010-10-07T14:26:42Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* more vhost_net test&lt;br /&gt;
* Add Multi-guests transfer test&lt;br /&gt;
* transfer test between guest(s) and extra boxes&lt;br /&gt;
* add more whql tests (virtio_net,virtio_blk)&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks assigned to individuals.&lt;br /&gt;
&lt;br /&gt;
* Libvirt integration - Will hold on this work item for the next couple of months due to resource constraints.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
=== TODO ===&lt;br /&gt;
&lt;br /&gt;
* Make cache=off the default for guest images&lt;br /&gt;
* Multi host migration&lt;br /&gt;
* Add a BLOCKED status on the autotest database, that tells the user that some feature testing was left out due to a problem on its dependencies.&lt;br /&gt;
* Eventually get rid of all dependency on slirp/userspace networking to do any feature testing, since it is not a supported feature. Slipstreaming the kickstart in the install CD is a possibility, and it ends up being simpler than configuring an internal network thing for testing.&lt;br /&gt;
* Make it possible to install the virtio network driver under Windows XP/2003 guests without having to resort to the MSI installer&lt;br /&gt;
&lt;br /&gt;
=== Partially done/ Blocked ===&lt;br /&gt;
&lt;br /&gt;
* Include support for host kernel install on the KVM default control file - Using autotest standard API to get it done&lt;br /&gt;
** need to resolve bugs in boottool for full functionality&lt;br /&gt;
&lt;br /&gt;
=== Done ===&lt;br /&gt;
&lt;br /&gt;
* Apply network patchset (Yay!)&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
** This may just be a matter of moving useful code from tests to kvm_test_utils.py to make it reusable&lt;br /&gt;
* Post-commit review of new network patches (committed on Oct 7 2010)&lt;br /&gt;
* Use more exceptions in utility functions:&lt;br /&gt;
** Add convenience functions to kvm_subprocess.py to make running shell commands shorter&lt;br /&gt;
** Add convenience functions to the VM class&lt;br /&gt;
&lt;br /&gt;
=== Very old, possibly no longer relevant ===&lt;br /&gt;
&lt;br /&gt;
* Add docstrings to all functions that still lack them, including the ones in stepmaker.py, stepeditor.py and kvm_tests.py&lt;br /&gt;
* Rename all Windows ISOs currently used to their official MSDN names&lt;br /&gt;
* Fill the sections &amp;quot;Working with step files&amp;quot; and &amp;quot;Step file creation tips&amp;quot; in the wiki&lt;br /&gt;
&lt;br /&gt;
== akong ==&lt;br /&gt;
&lt;br /&gt;
* Using dnsmasq in unattended_install to replace userspace network with private bridge&lt;br /&gt;
* Bonding test&lt;br /&gt;
* Add a new test: check if guest transmits packets when link is up/down&lt;br /&gt;
&lt;br /&gt;
== Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* pxe boot * net types&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* multiple disks&lt;br /&gt;
* Test block device cancellation path using device mapper to generate errors (after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV card etc)&lt;br /&gt;
* Different CPU flags support&lt;br /&gt;
* Passthrough the perf keys of run_autotest to autotest server&lt;br /&gt;
* Register the virtual machine into autotest server and run benchmark through autotest server&lt;br /&gt;
(yolkfull)&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload (dbench, lmbench, etc.)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [[http://autotest.kernel.org/newticket autotest defect tracking system]]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3321</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3321"/>
		<updated>2010-10-07T13:16:42Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* Autotest support for executing tests in parallel&lt;br /&gt;
* more vhost_net test&lt;br /&gt;
* Add Multi-guests transfer test&lt;br /&gt;
* transfer test between guest(s) and extra boxes&lt;br /&gt;
* add more whql tests (virtio_net,virtio_blk)&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks assigned to individuals.&lt;br /&gt;
&lt;br /&gt;
* Libvirt integration - Will hold on this work item for the next couple of months due to resource constraints.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
=== TODO ===&lt;br /&gt;
&lt;br /&gt;
* Make cache=off the default for guest images&lt;br /&gt;
* Multi host migration&lt;br /&gt;
* Add a BLOCKED status on the autotest database, that tells the user that some feature testing was left out due to a problem on its dependencies.&lt;br /&gt;
* Eventually get rid of all dependency on slirp/userspace networking to do any feature testing, since it is not a supported feature. Slipstreaming the kickstart in the install CD is a possibility, and it ends up being simpler than configuring an internal network thing for testing.&lt;br /&gt;
* Make it possible to install the virtio network driver under Windows XP/2003 guests without having to resort to the MSI installer&lt;br /&gt;
&lt;br /&gt;
=== Partially done/ Blocked ===&lt;br /&gt;
&lt;br /&gt;
* Include support for host kernel install on the KVM default control file - Using autotest standard API to get it done&lt;br /&gt;
** need to resolve bugs in boottool for full functionality&lt;br /&gt;
&lt;br /&gt;
=== Done ===&lt;br /&gt;
&lt;br /&gt;
* Apply network patchset (Yay!)&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
** This may just be a matter of moving useful code from tests to kvm_test_utils.py to make it reusable&lt;br /&gt;
* Post-commit review of new network patches (committed on Oct 7 2010)&lt;br /&gt;
* Use more exceptions in utility functions:&lt;br /&gt;
** Add convenience functions to kvm_subprocess.py to make running shell commands shorter&lt;br /&gt;
** Add convenience functions to the VM class&lt;br /&gt;
&lt;br /&gt;
=== Very old, possibly no longer relevant ===&lt;br /&gt;
&lt;br /&gt;
* Add docstrings to all functions that still lack them, including the ones in stepmaker.py, stepeditor.py and kvm_tests.py&lt;br /&gt;
* Rename all Windows ISOs currently used to their official MSDN names&lt;br /&gt;
* Fill the sections &amp;quot;Working with step files&amp;quot; and &amp;quot;Step file creation tips&amp;quot; in the wiki&lt;br /&gt;
&lt;br /&gt;
== akong ==&lt;br /&gt;
&lt;br /&gt;
* Using dnsmasq in unattended_install to replace userspace network with private bridge&lt;br /&gt;
* Bonding test&lt;br /&gt;
* Add a new test: check if guest transmits packets when link is up/down&lt;br /&gt;
&lt;br /&gt;
== Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* pxe boot * net types&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* multiple disks&lt;br /&gt;
* Test block device cancellation path using device mapper to generate errors (after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV card etc)&lt;br /&gt;
* Different CPU flags support&lt;br /&gt;
* Passthrough the perf keys of run_autotest to autotest server&lt;br /&gt;
* Register the virtual machine into autotest server and run benchmark through autotest server&lt;br /&gt;
(yolkfull)&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload (dbench, lmbench, etc.)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [[http://autotest.kernel.org/newticket autotest defect tracking system]]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3318</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3318"/>
		<updated>2010-10-07T13:03:40Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* Autotest support for executing tests in parallel&lt;br /&gt;
* more vhost_net test&lt;br /&gt;
* Add Multi-guests transfer test&lt;br /&gt;
* transfer test between guest(s) and extra boxes&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks assigned to individuals.&lt;br /&gt;
&lt;br /&gt;
* Libvirt integration - Will hold on this work item for the next couple of months due to resource constraints.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
=== TODO ===&lt;br /&gt;
&lt;br /&gt;
* Make cache=off the default for guest images&lt;br /&gt;
* Multi host migration&lt;br /&gt;
* Add a BLOCKED status on the autotest database, that tells the user that some feature testing was left out due to a problem on its dependencies.&lt;br /&gt;
* Eventually get rid of all dependency on slirp/userspace networking to do any feature testing, since it is not a supported feature. Slipstreaming the kickstart in the install CD is a possibility, and it ends up being simpler than configuring an internal network thing for testing.&lt;br /&gt;
* Make it possible to install the virtio network driver under Windows XP/2003 guests without having to resort to the MSI installer&lt;br /&gt;
&lt;br /&gt;
=== Partially done/ Blocked ===&lt;br /&gt;
&lt;br /&gt;
* Include support for host kernel install on the KVM default control file - Using autotest standard API to get it done&lt;br /&gt;
** need to resolve bugs in boottool for full functionality&lt;br /&gt;
&lt;br /&gt;
=== Done ===&lt;br /&gt;
&lt;br /&gt;
* Apply network patchset (Yay!)&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
* Add a way to shut down the VMs when the whole job completes (as opposed to doing nothing or shutting down after every test)&lt;br /&gt;
* Add docstrings to all functions that still lack them, including the ones in stepmaker.py, stepeditor.py and kvm_tests.py&lt;br /&gt;
* Rename all Windows ISOs currently used to their official MSDN names&lt;br /&gt;
* Fill the sections &amp;quot;Working with step files&amp;quot; and &amp;quot;Step file creation tips&amp;quot; in the wiki&lt;br /&gt;
&lt;br /&gt;
== akong ==&lt;br /&gt;
&lt;br /&gt;
* Using dnsmasq in unattended_install to replace userspace network with private bridge&lt;br /&gt;
* Bonding test&lt;br /&gt;
* Add a new test: check if guest transmits packets when link is up/down&lt;br /&gt;
&lt;br /&gt;
== Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* pxe boot * net types&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* multiple disks&lt;br /&gt;
* Test block device cancellation path using device mapper to generate errors (after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV card etc)&lt;br /&gt;
* Different CPU flags support&lt;br /&gt;
* Passthrough the perf keys of run_autotest to autotest server&lt;br /&gt;
* Register the virtual machine into autotest server and run benchmark through autotest server&lt;br /&gt;
(yolkfull)&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload (dbench, lmbench, etc.)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [[http://autotest.kernel.org/newticket autotest defect tracking system]]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3317</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3317"/>
		<updated>2010-10-07T13:01:42Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== raw idea ==&lt;br /&gt;
&lt;br /&gt;
* Autotest support for executing tests in parallel&lt;br /&gt;
* more vhost_net test&lt;br /&gt;
* Add Multi-guests transfer test&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks assigned to individuals.&lt;br /&gt;
&lt;br /&gt;
* Libvirt integration - Will hold on this work item for the next couple of months due to resource constraints.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
=== TODO ===&lt;br /&gt;
&lt;br /&gt;
* Make cache=off the default for guest images&lt;br /&gt;
* Multi host migration&lt;br /&gt;
* Add a BLOCKED status on the autotest database, that tells the user that some feature testing was left out due to a problem on its dependencies.&lt;br /&gt;
* Eventually get rid of all dependency on slirp/userspace networking to do any feature testing, since it is not a supported feature. Slipstreaming the kickstart in the install CD is a possibility, and it ends up being simpler than configuring an internal network thing for testing.&lt;br /&gt;
* Make it possible to install the virtio network driver under Windows XP/2003 guests without having to resort to the MSI installer&lt;br /&gt;
&lt;br /&gt;
=== Partially done/ Blocked ===&lt;br /&gt;
&lt;br /&gt;
* Include support for host kernel install on the KVM default control file - Using autotest standard API to get it done&lt;br /&gt;
** need to resolve bugs in boottool for full functionality&lt;br /&gt;
&lt;br /&gt;
=== Done ===&lt;br /&gt;
&lt;br /&gt;
* Apply network patchset (Yay!)&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
* Add a way to shut down the VMs when the whole job completes (as opposed to doing nothing or shutting down after every test)&lt;br /&gt;
* Add docstrings to all functions that still lack them, including the ones in stepmaker.py, stepeditor.py and kvm_tests.py&lt;br /&gt;
* Rename all Windows ISOs currently used to their official MSDN names&lt;br /&gt;
* Fill the sections &amp;quot;Working with step files&amp;quot; and &amp;quot;Step file creation tips&amp;quot; in the wiki&lt;br /&gt;
&lt;br /&gt;
== akong ==&lt;br /&gt;
&lt;br /&gt;
* Using dnsmasq in unattended_install to replace userspace network with private bridge&lt;br /&gt;
* Bonding test&lt;br /&gt;
* Add a new test: check if guest transmits packets when link is up/down&lt;br /&gt;
&lt;br /&gt;
== Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* pxe boot * net types&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple nics&lt;br /&gt;
* multiple disks&lt;br /&gt;
* Test block device cancellation path using device mapper to generate errors (after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV card etc)&lt;br /&gt;
* Different CPU flags support&lt;br /&gt;
* Passthrough the perf keys of run_autotest to autotest server&lt;br /&gt;
* Register the virtual machine into autotest server and run benchmark through autotest server&lt;br /&gt;
(yolkfull)&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload (dbench, lmbench, etc.)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [[http://autotest.kernel.org/newticket autotest defect tracking system]]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3316</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3316"/>
		<updated>2010-10-07T12:52:14Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks assigned to individuals.&lt;br /&gt;
&lt;br /&gt;
* Libvirt integration - Will hold on this work item for the next couple of months due to resource constraints.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
=== TODO ===&lt;br /&gt;
&lt;br /&gt;
* Make cache=off the default for guest images&lt;br /&gt;
* Multi host migration&lt;br /&gt;
* Add a BLOCKED status on the autotest database, that tells the user that some feature testing was left out due to a problem on its dependencies.&lt;br /&gt;
* Eventually get rid of all dependency on slirp/userspace networking to do any feature testing, since it is not a supported feature. Slipstreaming the kickstart in the install CD is a possibility, and it ends up being simpler than configuring an internal network thing for testing.&lt;br /&gt;
* Make it possible to install the virtio network driver under Windows XP/2003 guests without having to resort to the MSI installer&lt;br /&gt;
&lt;br /&gt;
=== Partially done/ Blocked ===&lt;br /&gt;
&lt;br /&gt;
* Include support for host kernel install on the KVM default control file - Using autotest standard API to get it done&lt;br /&gt;
** need to resolve bugs in boottool for full functionality&lt;br /&gt;
&lt;br /&gt;
=== Done ===&lt;br /&gt;
&lt;br /&gt;
* Apply network patchset (Yay!)&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
* Add a way to shut down the VMs when the whole job completes (as opposed to doing nothing or shutting down after every test)&lt;br /&gt;
* Add docstrings to all functions that still lack them, including the ones in stepmaker.py, stepeditor.py and kvm_tests.py&lt;br /&gt;
* Rename all Windows ISOs currently used to their official MSDN names&lt;br /&gt;
* Fill the sections &amp;quot;Working with step files&amp;quot; and &amp;quot;Step file creation tips&amp;quot; in the wiki&lt;br /&gt;
&lt;br /&gt;
== akong ==&lt;br /&gt;
&lt;br /&gt;
* Use dnsmasq in unattended_install to replace the userspace network with a private bridge&lt;br /&gt;
* Bonding test&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* pxe boot with different net types&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple NICs&lt;br /&gt;
* Multiple disks&lt;br /&gt;
* Test the block device cancellation path using device mapper to generate errors (after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV cards, etc.)&lt;br /&gt;
* Different CPU flags support&lt;br /&gt;
* Pass through the perf keys of run_autotest to the autotest server&lt;br /&gt;
* Register the virtual machine with the autotest server and run benchmarks through the autotest server&lt;br /&gt;
(yolkfull)&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload (dbench, lmbench, etc.)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [http://autotest.kernel.org/newticket autotest defect tracking system]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3315</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3315"/>
		<updated>2010-10-07T12:50:58Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== general ==&lt;br /&gt;
&lt;br /&gt;
&#039;&#039;&#039;Note:&#039;&#039;&#039; This section is for major, multi-person items. We&#039;ll update the status weekly based on the status of the subtasks assigned to each person.&lt;br /&gt;
&lt;br /&gt;
* Libvirt integration - We will hold off on this work item for the next couple of months due to resource constraints.&lt;br /&gt;
&lt;br /&gt;
== lmr ==&lt;br /&gt;
&lt;br /&gt;
=== TODO ===&lt;br /&gt;
&lt;br /&gt;
* Make cache=off the default for guest images&lt;br /&gt;
* Multi host migration&lt;br /&gt;
* Add a BLOCKED status to the autotest database that tells the user that some feature testing was left out due to a problem with its dependencies.&lt;br /&gt;
* Eventually get rid of all dependencies on slirp/userspace networking for feature testing, since it is not a supported feature. Slipstreaming the kickstart into the install CD is a possibility, and it ends up being simpler than configuring an internal network for testing.&lt;br /&gt;
* Make it possible to install the virtio network driver under Windows XP/2003 guests without having to resort to the MSI installer&lt;br /&gt;
&lt;br /&gt;
=== Partially done / Blocked ===&lt;br /&gt;
&lt;br /&gt;
* Include support for host kernel install on the KVM default control file - Using autotest standard API to get it done&lt;br /&gt;
** needs bugs in boottool resolved to reach full functionality&lt;br /&gt;
&lt;br /&gt;
=== Done ===&lt;br /&gt;
&lt;br /&gt;
* Apply network patchset (Yay!)&lt;br /&gt;
&lt;br /&gt;
== mgoldish ==&lt;br /&gt;
&lt;br /&gt;
* Make WHQL tests run on our internal server&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it)&lt;br /&gt;
* Add a way to shut down the VMs when the whole job completes (as opposed to doing nothing or shutting down after every test)&lt;br /&gt;
* Add docstrings to all functions that still lack them, including the ones in stepmaker.py, stepeditor.py and kvm_tests.py&lt;br /&gt;
* Rename all Windows ISOs currently used to their official MSDN names&lt;br /&gt;
* Fill the sections &amp;quot;Working with step files&amp;quot; and &amp;quot;Step file creation tips&amp;quot; in the wiki&lt;br /&gt;
&lt;br /&gt;
== akong ==&lt;br /&gt;
&lt;br /&gt;
* Use dnsmasq in unattended_install to replace the userspace network with a private bridge&lt;br /&gt;
* Bonding test&lt;br /&gt;
&lt;br /&gt;
== yolkfull ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests&lt;br /&gt;
* Migration with/without workload (dbench, lmbench, etc.)&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example)&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives)&lt;br /&gt;
&lt;br /&gt;
== Beijing QE ==&lt;br /&gt;
&lt;br /&gt;
* pxe boot with different net types&lt;br /&gt;
* Further migration&lt;br /&gt;
* Multiple NICs&lt;br /&gt;
* Multiple disks&lt;br /&gt;
* Test the block device cancellation path using device mapper to generate errors (after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV cards, etc.)&lt;br /&gt;
* Different CPU flags support&lt;br /&gt;
* Pass through the perf keys of run_autotest to the autotest server&lt;br /&gt;
* Register the virtual machine with the autotest server and run benchmarks through the autotest server&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [http://autotest.kernel.org/newticket autotest defect tracking system]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3307</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3307"/>
		<updated>2010-10-05T12:23:24Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== Network Patchset ==&lt;br /&gt;
&lt;br /&gt;
Problems with the network patchset:&lt;br /&gt;
&lt;br /&gt;
1) multicast&lt;br /&gt;
 kvm.virtio_blk.smp2.Fedora.13.64.virtio_net.multicast Ping return non-zero value PING 225.0.0.1 (225.0.0.1) from 10.16.72.124 virtio_0_5900: 56(84) bytes of data.&lt;br /&gt;
&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
&amp;gt; +    # make sure guest replies to broadcasts&lt;br /&gt;
&amp;gt; +    cmd_broadcast = &amp;quot;echo 0 &amp;gt; /proc/sys/net/ipv4/icmp_echo_ignore&amp;quot;&lt;br /&gt;
&lt;br /&gt;
It&#039;s caused by this error: the sysctl name is truncated. The correct command is&lt;br /&gt;
       cmd_broadcast = &amp;quot;echo 0 &amp;gt; /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts&amp;quot;&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
&lt;br /&gt;
2) netperf&lt;br /&gt;
 17:35:11 DEBUG| Execute netperf client test: /root/autotest/client/tests/netperf2/netperf-2.4.5/src/netperf -t TCP_CRR -H   10.16.74.142 -l 60 -- -m 1&lt;br /&gt;
 17:35:45 ERROR| Fail to execute netperf test, protocol:TCP_CRR&lt;br /&gt;
 17:35:45 DEBUG| Execute netperf client test: /root/autotest/client/tests/netperf2/netperf-2.4.5/src/netperf -t UDP_RR -H 10.16.74.142 -l 60 -- -m 1&lt;br /&gt;
 17:36:06 ERROR| Fail to execute netperf test, protocol:UDP_RR&lt;br /&gt;
&lt;br /&gt;
3) vlan&lt;br /&gt;
 TestError: Fail to configure ip for eth0.1&lt;br /&gt;
 17:46:27 DEBUG| Sending command: ifconfig eth0.1 192.168.1.1&lt;br /&gt;
 17:46:27 DEBUG| Command failed; status: 255, output: SIOCSIFADDR: No such device eth0.1: unknown interface: No such device&lt;br /&gt;
 17:46:27 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.1 ]];then vconfig rem eth0.1;fi&lt;br /&gt;
 17:46:27 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.1 ]];then vconfig rem eth0.1;fi&lt;br /&gt;
 17:46:28 INFO | rem eth0.1&lt;br /&gt;
 17:46:28 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.2 ]];then vconfig rem eth0.2;fi&lt;br /&gt;
 17:46:28 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.2 ]];then vconfig rem eth0.2;fi&lt;br /&gt;
 17:46:28 INFO | rem eth0.2&lt;br /&gt;
 17:46:28 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.3 ]];then vconfig rem eth0.3;fi&lt;br /&gt;
 17:46:28 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.3 ]];then vconfig rem eth0.3;fi&lt;br /&gt;
 17:46:28 INFO | rem eth0.3&lt;br /&gt;
 17:46:28 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.4 ]];then vconfig rem eth0.4;fi&lt;br /&gt;
 17:46:29 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.4 ]];then vconfig rem eth0.4;fi&lt;br /&gt;
 17:46:29 INFO | rem eth0.4&lt;br /&gt;
 17:46:29 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.5 ]];then vconfig rem eth0.5;fi&lt;br /&gt;
 17:46:29 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.5 ]];then vconfig rem eth0.5;fi&lt;br /&gt;
 17:46:29 INFO | rem eth0.5&lt;br /&gt;
 17:46:29 ERROR| Test failed: TestError: Fail to configure ip for eth0.1&lt;br /&gt;
&lt;br /&gt;
4) ethtool&lt;br /&gt;
Address comment from mst:&lt;br /&gt;
&lt;br /&gt;
 &amp;gt; &amp;gt; Initialize the callbacks first and execute all the sub&lt;br /&gt;
 &amp;gt; &amp;gt; tests one by one, all the result will be check at the&lt;br /&gt;
 &amp;gt; &amp;gt; end. When execute this test, vhost should be enabled,&lt;br /&gt;
 &amp;gt; &amp;gt; then most of new features can be used. Vhost doesn&#039;t&lt;br /&gt;
 &amp;gt; &amp;gt; support VIRTIO_NET_F_MRG_RXBUF, so do not check large&lt;br /&gt;
 &amp;gt; &amp;gt; packets in received offload test.&lt;br /&gt;
&lt;br /&gt;
 Well, it does support that now, in any case VIRTIO_NET_F_MRG_RXBUF&lt;br /&gt;
 is not required for large packets: it&#039;s an optimization saving&lt;br /&gt;
 guest memory, really. So no need to special-case.&lt;br /&gt;
&lt;br /&gt;
== Framework ==&lt;br /&gt;
&lt;br /&gt;
* Fix problem with serial_login() - probably we forgot to set up the guest to hook a getty up to the serial port - DONE, by jasowang&lt;br /&gt;
&lt;br /&gt;
* Add a BLOCKED status to the autotest database that tells the user that some feature testing was left out due to a problem with its dependencies.&lt;br /&gt;
&lt;br /&gt;
* Eventually get rid of all dependencies on slirp/userspace networking for feature testing, since it is not a supported feature. Slipstreaming the kickstart into the install CD is a possibility, and it ends up being simpler than configuring an internal network for testing.&lt;br /&gt;
&lt;br /&gt;
* Make it possible to install the virtio network driver under Windows XP/2003 guests without having to resort to the MSI installer - assignee? Considerations: We would like to avoid slipstreaming the drivers into the ISO CD-ROM, since we feel it is too messy. If after an honest try we can&#039;t find another way, let&#039;s give slipstreaming a try.&lt;br /&gt;
&lt;br /&gt;
* Re-schedule the unittest job on the autotest server. DONE&lt;br /&gt;
&lt;br /&gt;
* Multi host migration - 1 week to recap the patch that was sent by QA, improve on it, have an updated patch by then - lmr&lt;br /&gt;
Deadline: 09/15 - Could not meet it; had to postpone&lt;br /&gt;
New tentative deadline: 09/22&lt;br /&gt;
&lt;br /&gt;
* Libvirt integration - 1 week coordinating what needs to be done, another 4 weeks (at least) implementing it - QA, lmr, mgoldish, help from libvirt developers&lt;br /&gt;
Deadline: 09/15 - Could not meet it; had to postpone&lt;br /&gt;
New tentative deadline: 09/22&lt;br /&gt;
&lt;br /&gt;
* Include support for host kernel install on the KVM default control file - Using autotest standard API to get it done - lmr&lt;br /&gt;
Tentative deadline: 09/22&lt;br /&gt;
&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it) [Michael]&lt;br /&gt;
* Add a way to shut down the VMs when the whole job completes (as opposed to doing nothing or shutting down after every test) [Michael]&lt;br /&gt;
* Randomly generated MAC address and ifname (BeiJing QE)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Client Side Test ===&lt;br /&gt;
&lt;br /&gt;
* Use dnsmasq in unattended_install to replace the userspace network with a private bridge [Amos] status: implemented in the internal tree&lt;br /&gt;
* Bonding test [Amos]&lt;br /&gt;
* If needed, write a netperf/iperf test [jasowang] [NEEDS MORE WORK]&lt;br /&gt;
* pxe boot with different net types [BeiJing QE]&lt;br /&gt;
* Further migration [BeiJing QE]&lt;br /&gt;
* Multiple NICs [BeiJing QE]&lt;br /&gt;
* Multiple disks [BeiJing QE]&lt;br /&gt;
* -vga std, nographics&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example) [Yolkfull]&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives) [Yolkfull]&lt;br /&gt;
* Test the block device cancellation path using device mapper to generate errors (after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV cards, etc.)&lt;br /&gt;
* Different CPU flags support [BeiJing QE]&lt;br /&gt;
* Use a private bridge and dnsmasq to do the unattended installation [BeiJing QE]&lt;br /&gt;
* Pass through the perf keys of run_autotest to the autotest server [BeiJing QE]&lt;br /&gt;
&lt;br /&gt;
== Server Side Tests ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests [Yolkfull]&lt;br /&gt;
* Migration with/without workload (dbench, lmbench, etc.) [Yolkfull]&lt;br /&gt;
* Register the virtual machine with the autotest server and run benchmarks through the autotest server [BeiJing QE]&lt;br /&gt;
&lt;br /&gt;
== Misc ==&lt;br /&gt;
&lt;br /&gt;
* Add docstrings to all functions that still lack them, including the ones in stepmaker.py, stepeditor.py and kvm_tests.py [Michael]&lt;br /&gt;
* Document the setupssh.iso and setuptelnet.iso creation procedures in the wiki&lt;br /&gt;
* Rename all Windows ISOs currently used to their official MSDN names&lt;br /&gt;
* Fill the sections &amp;quot;Working with step files&amp;quot; and &amp;quot;Step file creation tips&amp;quot; in the wiki [Michael]&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [http://autotest.kernel.org/newticket autotest defect tracking system]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
	<entry>
		<id>https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3306</id>
		<title>KVM-Autotest/TODO</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=KVM-Autotest/TODO&amp;diff=3306"/>
		<updated>2010-10-05T12:22:43Z</updated>

		<summary type="html">&lt;p&gt;Kongove: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;= KVM-Autotest TODO list =&lt;br /&gt;
&lt;br /&gt;
== Network Patchset ==&lt;br /&gt;
&lt;br /&gt;
Problems with the network patchset:&lt;br /&gt;
&lt;br /&gt;
1) multicast&lt;br /&gt;
 kvm.virtio_blk.smp2.Fedora.13.64.virtio_net.multicast Ping return non-zero value PING 225.0.0.1 (225.0.0.1) from 10.16.72.124 virtio_0_5900: 56(84) bytes of data.&lt;br /&gt;
&lt;br /&gt;
&amp;gt; +    # make sure guest replies to broadcasts&lt;br /&gt;
&amp;gt; +    cmd_broadcast = &amp;quot;echo 0 &amp;gt; /proc/sys/net/ipv4/icmp_echo_ignore&amp;quot;&lt;br /&gt;
&lt;br /&gt;
It&#039;s caused by this error: the sysctl name is truncated. The correct command is&lt;br /&gt;
&lt;br /&gt;
       cmd_broadcast = &amp;quot;echo 0 &amp;gt; /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts&amp;quot;&lt;br /&gt;
&lt;br /&gt;
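For reference, the intended guest-side fix can be sketched as follows (an illustrative fragment, not the test's actual code; it assumes a Linux guest with the standard ipv4 sysctl tree and root privileges):&lt;br /&gt;

```shell
# Allow the guest to answer broadcast/multicast ICMP echo requests.
# The broken command above wrote to a truncated sysctl name; the full
# name is icmp_echo_ignore_broadcasts.
echo 0 > /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts
```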
&lt;br /&gt;
2) netperf&lt;br /&gt;
 17:35:11 DEBUG| Execute netperf client test: /root/autotest/client/tests/netperf2/netperf-2.4.5/src/netperf -t TCP_CRR -H   10.16.74.142 -l 60 -- -m 1&lt;br /&gt;
 17:35:45 ERROR| Fail to execute netperf test, protocol:TCP_CRR&lt;br /&gt;
 17:35:45 DEBUG| Execute netperf client test: /root/autotest/client/tests/netperf2/netperf-2.4.5/src/netperf -t UDP_RR -H 10.16.74.142 -l 60 -- -m 1&lt;br /&gt;
 17:36:06 ERROR| Fail to execute netperf test, protocol:UDP_RR&lt;br /&gt;
&lt;br /&gt;
3) vlan&lt;br /&gt;
 TestError: Fail to configure ip for eth0.1&lt;br /&gt;
 17:46:27 DEBUG| Sending command: ifconfig eth0.1 192.168.1.1&lt;br /&gt;
 17:46:27 DEBUG| Command failed; status: 255, output: SIOCSIFADDR: No such device eth0.1: unknown interface: No such device&lt;br /&gt;
 17:46:27 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.1 ]];then vconfig rem eth0.1;fi&lt;br /&gt;
 17:46:27 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.1 ]];then vconfig rem eth0.1;fi&lt;br /&gt;
 17:46:28 INFO | rem eth0.1&lt;br /&gt;
 17:46:28 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.2 ]];then vconfig rem eth0.2;fi&lt;br /&gt;
 17:46:28 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.2 ]];then vconfig rem eth0.2;fi&lt;br /&gt;
 17:46:28 INFO | rem eth0.2&lt;br /&gt;
 17:46:28 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.3 ]];then vconfig rem eth0.3;fi&lt;br /&gt;
 17:46:28 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.3 ]];then vconfig rem eth0.3;fi&lt;br /&gt;
 17:46:28 INFO | rem eth0.3&lt;br /&gt;
 17:46:28 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.4 ]];then vconfig rem eth0.4;fi&lt;br /&gt;
 17:46:29 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.4 ]];then vconfig rem eth0.4;fi&lt;br /&gt;
 17:46:29 INFO | rem eth0.4&lt;br /&gt;
 17:46:29 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.5 ]];then vconfig rem eth0.5;fi&lt;br /&gt;
 17:46:29 DEBUG| Sending command: if [[ -e /proc/net/vlan/eth0.5 ]];then vconfig rem eth0.5;fi&lt;br /&gt;
 17:46:29 INFO | rem eth0.5&lt;br /&gt;
 17:46:29 ERROR| Test failed: TestError: Fail to configure ip for eth0.1&lt;br /&gt;
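The repeated cleanup commands in the log amount to this single guest-side loop (an illustrative sketch, not the test's actual code; it assumes the legacy vconfig tool and root privileges):&lt;br /&gt;

```shell
# Remove any leftover vlan subinterfaces eth0.1 .. eth0.5.
# /proc/net/vlan/DEV exists only while the vlan device DEV is
# configured, so the loop is a no-op when nothing needs cleaning up.
for i in 1 2 3 4 5; do
    if [ -e "/proc/net/vlan/eth0.$i" ]; then
        vconfig rem "eth0.$i"
    fi
done
```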
&lt;br /&gt;
4) ethtool&lt;br /&gt;
Address comment from mst:&lt;br /&gt;
&lt;br /&gt;
 &amp;gt; &amp;gt; Initialize the callbacks first and execute all the sub&lt;br /&gt;
 &amp;gt; &amp;gt; tests one by one, all the result will be check at the&lt;br /&gt;
 &amp;gt; &amp;gt; end. When execute this test, vhost should be enabled,&lt;br /&gt;
 &amp;gt; &amp;gt; then most of new features can be used. Vhost doesn&#039;t&lt;br /&gt;
 &amp;gt; &amp;gt; support VIRTIO_NET_F_MRG_RXBUF, so do not check large&lt;br /&gt;
 &amp;gt; &amp;gt; packets in received offload test.&lt;br /&gt;
&lt;br /&gt;
 Well, it does support that now, in any case VIRTIO_NET_F_MRG_RXBUF&lt;br /&gt;
 is not required for large packets: it&#039;s an optimization saving&lt;br /&gt;
 guest memory, really. So no need to special-case.&lt;br /&gt;
&lt;br /&gt;
== Framework ==&lt;br /&gt;
&lt;br /&gt;
* Fix problem with serial_login() - probably we forgot to set up the guest to hook a getty up to the serial port - DONE, by jasowang&lt;br /&gt;
&lt;br /&gt;
* Add a BLOCKED status to the autotest database that tells the user that some feature testing was left out due to a problem with its dependencies.&lt;br /&gt;
&lt;br /&gt;
* Eventually get rid of all dependencies on slirp/userspace networking for feature testing, since it is not a supported feature. Slipstreaming the kickstart into the install CD is a possibility, and it ends up being simpler than configuring an internal network for testing.&lt;br /&gt;
&lt;br /&gt;
* Make it possible to install the virtio network driver under Windows XP/2003 guests without having to resort to the MSI installer - assignee? Considerations: We would like to avoid slipstreaming the drivers into the ISO CD-ROM, since we feel it is too messy. If after an honest try we can&#039;t find another way, let&#039;s give slipstreaming a try.&lt;br /&gt;
&lt;br /&gt;
* Re-schedule the unittest job on the autotest server. DONE&lt;br /&gt;
&lt;br /&gt;
* Multi host migration - 1 week to recap the patch that was sent by QA, improve on it, have an updated patch by then - lmr&lt;br /&gt;
Deadline: 09/15 - Could not meet it; had to postpone&lt;br /&gt;
New tentative deadline: 09/22&lt;br /&gt;
&lt;br /&gt;
* Libvirt integration - 1 week coordinating what needs to be done, another 4 weeks (at least) implementing it - QA, lmr, mgoldish, help from libvirt developers&lt;br /&gt;
Deadline: 09/15 - Could not meet it; had to postpone&lt;br /&gt;
New tentative deadline: 09/22&lt;br /&gt;
&lt;br /&gt;
* Include support for host kernel install on the KVM default control file - Using autotest standard API to get it done - lmr&lt;br /&gt;
Tentative deadline: 09/22&lt;br /&gt;
&lt;br /&gt;
* Enable &amp;quot;guest-load&amp;quot; for VMs before/while tests are running (e.g. migration of a VM while a movie is playing on it) [Michael]&lt;br /&gt;
* Add a way to shut down the VMs when the whole job completes (as opposed to doing nothing or shutting down after every test) [Michael]&lt;br /&gt;
* Randomly generated MAC address and ifname (BeiJing QE)&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== Client Side Test ===&lt;br /&gt;
&lt;br /&gt;
* Use dnsmasq in unattended_install to replace the userspace network with a private bridge [Amos] status: implemented in the internal tree&lt;br /&gt;
* Bonding test [Amos]&lt;br /&gt;
* If needed, write a netperf/iperf test [jasowang] [NEEDS MORE WORK]&lt;br /&gt;
* pxe boot with different net types [BeiJing QE]&lt;br /&gt;
* Further migration [BeiJing QE]&lt;br /&gt;
* Multiple NICs [BeiJing QE]&lt;br /&gt;
* Multiple disks [BeiJing QE]&lt;br /&gt;
* -vga std, nographics&lt;br /&gt;
* Verify SMBIOS/DMI data (UUID, for example) [Yolkfull]&lt;br /&gt;
* Disk serial number (for IDE, SCSI, VirtIO drives) [Yolkfull]&lt;br /&gt;
* Test the block device cancellation path using device mapper to generate errors (after we had a crash in de_write_dma_cb)&lt;br /&gt;
* Extend pci_assignable to support other PCI devices (USB, video cards, TV cards, etc.)&lt;br /&gt;
* Different CPU flags support [BeiJing QE]&lt;br /&gt;
* Use a private bridge and dnsmasq to do the unattended installation [BeiJing QE]&lt;br /&gt;
* Pass through the perf keys of run_autotest to the autotest server [BeiJing QE]&lt;br /&gt;
&lt;br /&gt;
== Server Side Tests ==&lt;br /&gt;
&lt;br /&gt;
* Run netperf test between two guests [Yolkfull]&lt;br /&gt;
* Migration with/without workload (dbench, lmbench, etc.) [Yolkfull]&lt;br /&gt;
* Register the virtual machine with the autotest server and run benchmarks through the autotest server [BeiJing QE]&lt;br /&gt;
&lt;br /&gt;
== Misc ==&lt;br /&gt;
&lt;br /&gt;
* Add docstrings to all functions that still lack them, including the ones in stepmaker.py, stepeditor.py and kvm_tests.py [Michael]&lt;br /&gt;
* Document the setupssh.iso and setuptelnet.iso creation procedures in the wiki&lt;br /&gt;
* Rename all Windows ISOs currently used to their official MSDN names&lt;br /&gt;
* Fill the sections &amp;quot;Working with step files&amp;quot; and &amp;quot;Step file creation tips&amp;quot; in the wiki [Michael]&lt;br /&gt;
&lt;br /&gt;
== Bugs ==&lt;br /&gt;
&lt;br /&gt;
* Please open bugs on the [http://autotest.kernel.org/newticket autotest defect tracking system]&lt;/div&gt;</summary>
		<author><name>Kongove</name></author>
	</entry>
</feed>