<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://linux-kvm.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Dmitryf</id>
	<title>KVM - User contributions [en]</title>
	<link rel="self" type="application/atom+xml" href="https://linux-kvm.org/api.php?action=feedcontributions&amp;feedformat=atom&amp;user=Dmitryf"/>
	<link rel="alternate" type="text/html" href="https://linux-kvm.org/page/Special:Contributions/Dmitryf"/>
	<updated>2026-05-01T15:59:59Z</updated>
	<subtitle>User contributions</subtitle>
	<generator>MediaWiki 1.39.5</generator>
	<entry>
		<id>https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=4828</id>
		<title>NetworkingTodo</title>
		<link rel="alternate" type="text/html" href="https://linux-kvm.org/index.php?title=NetworkingTodo&amp;diff=4828"/>
		<updated>2013-06-23T17:17:58Z</updated>

		<summary type="html">&lt;p&gt;Dmitryf: &lt;/p&gt;
&lt;hr /&gt;
&lt;div&gt;This page should cover all networking related activity in KVM,&lt;br /&gt;
currently most info is related to virtio-net.&lt;br /&gt;
&lt;br /&gt;
TODO: add bugzilla entry links.&lt;br /&gt;
&lt;br /&gt;
=== projects in progress. Contributions are still very welcome! ===&lt;br /&gt;
&lt;br /&gt;
* vhost-net scalability tuning: threading for many VMs&lt;br /&gt;
      Plan: switch to workqueue shared by many VMs&lt;br /&gt;
      http://www.mail-archive.com/kvm@vger.kernel.org/msg69868.html&lt;br /&gt;
&lt;br /&gt;
http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/479e3578ed05bfac85257b4200427735!OpenDocument&lt;br /&gt;
&lt;br /&gt;
      Developer: Shirley Ma?, MST?&lt;br /&gt;
      Testing: netperf guest to guest&lt;br /&gt;
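      As a userspace analogy (not kernel code), the switch from a thread per VM to a shared workqueue has roughly this shape; all names and counts below are illustrative:&lt;br /&gt;

```python
# Userspace analogy of the vhost workqueue plan above: a small pool of
# workers shared by many VMs, instead of one dedicated thread per VM.
# All names/counts here are illustrative, not kernel identifiers.
from concurrent.futures import ThreadPoolExecutor

NUM_VMS = 32
POOL_WORKERS = 4   # shared workers rather than 32 dedicated threads

def drain_virtqueue(vm_id):
    # stand-in for processing one VM's pending virtqueue work
    return vm_id * 2

with ThreadPoolExecutor(max_workers=POOL_WORKERS) as pool:
    results = list(pool.map(drain_virtqueue, range(NUM_VMS)))
```

      A shared pool bounds the number of service threads no matter how many VMs are active.&lt;br /&gt;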
&lt;br /&gt;
* multiqueue support in macvtap&lt;br /&gt;
       multiqueue is only supported for tun.&lt;br /&gt;
       Add support for macvtap.&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* enable multiqueue by default&lt;br /&gt;
       Multiqueue causes regressions in some workloads&lt;br /&gt;
       (GSO tends to batch less when mq is enabled), so&lt;br /&gt;
       it is off by default. Detect such workloads and&lt;br /&gt;
       enable/disable multiqueue automatically so it can be on by default.&lt;br /&gt;
       https://patchwork.kernel.org/patch/2235191/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
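       Until auto-enabling lands, multiqueue can be turned on by hand; a hypothetical invocation (device names, queue count and image path are illustrative):&lt;br /&gt;

```shell
# Host side: multiqueue tap backend plus a virtio-net device with mq on.
# vectors is typically 2*queues+2 (tx/rx per queue, plus config/control).
qemu-system-x86_64 -enable-kvm -m 2048 \
  -netdev tap,id=net0,vhost=on,queues=4 \
  -device virtio-net-pci,netdev=net0,mq=on,vectors=10 \
  disk.img

# Guest side: activate the 4 queue pairs on the virtio-net interface.
ethtool -L eth0 combined 4
```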
&lt;br /&gt;
* rework on flow caches&lt;br /&gt;
       The current hlist implementation of flow caches has several limitations:&lt;br /&gt;
       1) in the worst case, lookup degrades to a slow linear search&lt;br /&gt;
       2) it does not scale&lt;br /&gt;
       https://patchwork.kernel.org/patch/2025121/&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
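       A minimal userspace sketch of what such a flow cache does (remember which queue a flow was last seen on); the class and names are hypothetical, and a real rework would also bound lookup cost:&lt;br /&gt;

```python
# Hypothetical userspace sketch of tun-style flow steering: remember
# which queue last handled a flow, fall back to spreading by hash.
# A real rework would also bound per-bucket search cost.
import hashlib

class FlowCache:
    def __init__(self, num_queues):
        self.num_queues = num_queues
        self.table = {}                # flow hash to queue index

    def flow_hash(self, src, dst, sport, dport, proto):
        key = f"{src}|{dst}|{sport}|{dport}|{proto}".encode()
        return int.from_bytes(hashlib.sha256(key).digest()[:4], "big")

    def record(self, flow, queue):
        # learned on tx: this flow was last serviced by this queue
        self.table[flow] = queue

    def select_queue(self, flow):
        # prefer the cached queue; otherwise hash-spread across queues
        return self.table.get(flow, flow % self.num_queues)

cache = FlowCache(4)
h = cache.flow_hash("10.0.0.1", "10.0.0.2", 12345, 80, "tcp")
cache.record(h, 2)
```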
       &lt;br /&gt;
* eliminate the extra copy in virtio-net driver&lt;br /&gt;
       We currently do an extra copy of 128 bytes for every packet.&lt;br /&gt;
       This could be eliminated for small packets by:&lt;br /&gt;
       1) use build_skb() and head frag&lt;br /&gt;
       2) bigger vnet header length ( &amp;gt;= NET_SKB_PAD + NET_IP_ALIGN )&lt;br /&gt;
       Or use a dedicated queue for receiving small packets? (may cause reordering)&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* make pktgen work for virtio-net (or partially orphan)&lt;br /&gt;
       virtio-net orphans the skb during tx,&lt;br /&gt;
       which makes pktgen wait forever on the refcnt.&lt;br /&gt;
       Jason&#039;s idea: introduce a flag to tell pktgen not to wait&lt;br /&gt;
       Discussion here: https://patchwork.kernel.org/patch/1800711/&lt;br /&gt;
       MST&#039;s idea: add a .ndo_tx_polling not only for pktgen&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* Add HW_VLAN_TX support for tap&lt;br /&gt;
       Eliminate the extra data moving for tagged packets&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* Announce self by guest driver&lt;br /&gt;
       Send gratuitous ARP (gARP) from the guest driver. The guest part is finished.&lt;br /&gt;
       The QEMU part is ongoing.&lt;br /&gt;
       V7 patches are here:&lt;br /&gt;
       http://lists.nongnu.org/archive/html/qemu-devel/2013-03/msg01127.html&lt;br /&gt;
       Developer: Jason Wang&lt;br /&gt;
&lt;br /&gt;
* guest programmable mac/vlan filtering with macvtap&lt;br /&gt;
        Developer: Dragos Tatulea?, Amos Kong&lt;br /&gt;
        Status: [[GuestProgrammableMacVlanFiltering]]&lt;br /&gt;
&lt;br /&gt;
* bridge without promisc mode in NIC&lt;br /&gt;
  given hardware support, teach bridge&lt;br /&gt;
  to program mac/vlan filtering in NIC&lt;br /&gt;
  Helps performance and security on noisy LANs&lt;br /&gt;
  http://comments.gmane.org/gmane.linux.network/266546&lt;br /&gt;
  Developer: Vlad Yasevich&lt;br /&gt;
&lt;br /&gt;
* reduce networking latency:&lt;br /&gt;
  allow handling short packets from softirq or VCPU context&lt;br /&gt;
  Plan:&lt;br /&gt;
    We are going through the scheduler 3 times&lt;br /&gt;
    (could be up to 5 if softirqd is involved)&lt;br /&gt;
    Consider RX: host irq -&amp;gt; io thread -&amp;gt; VCPU thread -&amp;gt;&lt;br /&gt;
    guest irq -&amp;gt; guest thread.&lt;br /&gt;
    This adds a lot of latency.&lt;br /&gt;
    We can cut it by some 1.5x if we do a bit of work&lt;br /&gt;
    either in the VCPU or softirq context.&lt;br /&gt;
  Testing: netperf TCP RR - should be improved drastically&lt;br /&gt;
           netperf TCP STREAM guest to host - no regression&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* Flexible buffers: put virtio header inline with packet data&lt;br /&gt;
  https://patchwork.kernel.org/patch/1540471/&lt;br /&gt;
  Developer: MST&lt;br /&gt;
&lt;br /&gt;
* device failover to allow migration with assigned devices&lt;br /&gt;
  https://fedoraproject.org/wiki/Features/Virt_Device_Failover&lt;br /&gt;
  Developer: Gal Hammer, Cole Robinson, Laine Stump, MST&lt;br /&gt;
&lt;br /&gt;
* Reuse vringh code for better maintainability&lt;br /&gt;
  Developer: Rusty Russell&lt;br /&gt;
&lt;br /&gt;
* Improve stats, make them more helpful for perf analysis&lt;br /&gt;
  Developer: Sriram Narasimhan&lt;br /&gt;
&lt;br /&gt;
* Bug: e1000 &amp;amp; rtl8139: changing the MAC address in the guest is not reflected in qemu (info network)&lt;br /&gt;
  Developer: Amos Kong&lt;br /&gt;
&lt;br /&gt;
* Enable GRO for packets coming to bridge from a tap interface&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
* Better support for windows LRO&lt;br /&gt;
  Extend virtio-header with statistics for GRO packets:&lt;br /&gt;
  number of packets coalesced and number of duplicate ACKs coalesced&lt;br /&gt;
  Developer: Dmitry Fleytman&lt;br /&gt;
&lt;br /&gt;
=== projects that are not started yet - no owner ===&lt;br /&gt;
&lt;br /&gt;
* netdev polling for virtio.&lt;br /&gt;
  There are two kinds of netdev polling:&lt;br /&gt;
  - netpoll - used for debugging&lt;br /&gt;
  - proposed low latency net polling&lt;br /&gt;
  See http://lkml.indiana.edu/hypermail/linux/kernel/1303.0/00553.html&lt;br /&gt;
&lt;br /&gt;
* receive side zero copy&lt;br /&gt;
  Ideally we have a NIC with accelerated RFS support,&lt;br /&gt;
  so we can feed the virtio rx buffers into the correct NIC queue.&lt;br /&gt;
  Depends on non-promisc NIC support in bridge.&lt;br /&gt;
&lt;br /&gt;
* IPoIB infiniband bridging&lt;br /&gt;
  Plan: implement macvtap for ipoib and virtio-ipoib&lt;br /&gt;
&lt;br /&gt;
* RDMA bridging&lt;br /&gt;
&lt;br /&gt;
* DMA engine (IOAT) use in tun&lt;br /&gt;
  Old patch here: [PATCH RFC] tun: dma engine support&lt;br /&gt;
  It does not speed things up. Need to see why and&lt;br /&gt;
  what can be done.&lt;br /&gt;
&lt;br /&gt;
* use kvm eventfd support for injecting level interrupts,&lt;br /&gt;
  enable vhost by default for level interrupts&lt;br /&gt;
&lt;br /&gt;
* virtio API extension: improve small packet/large buffer performance:&lt;br /&gt;
  support &amp;quot;reposting&amp;quot; buffers for mergeable buffers,&lt;br /&gt;
  support pool for indirect buffers&lt;br /&gt;
&lt;br /&gt;
* more GSO type support:&lt;br /&gt;
       The kernel does not yet support additional GSO types here: FCOE, GRE, UDP_TUNNEL&lt;br /&gt;
&lt;br /&gt;
* ring aliasing:&lt;br /&gt;
  using vhost-net as a networking backend, with virtio-net in QEMU&lt;br /&gt;
  as the guest-facing device.&lt;br /&gt;
  This gives you the best of both worlds: QEMU acts as a first&lt;br /&gt;
  line of defense against a malicious guest while still getting the&lt;br /&gt;
  performance advantages of vhost-net (zero-copy).&lt;br /&gt;
  In fact a bit of complexity in vhost was put there in the vague hope to&lt;br /&gt;
  support something like this: virtio rings are not translated through&lt;br /&gt;
  regular memory tables, instead, vhost gets a pointer to ring address.&lt;br /&gt;
  This allows qemu to act as a man in the middle,&lt;br /&gt;
  verifying the descriptors but not touching the packet data.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
=== vague ideas: path to implementation not clear ===&lt;br /&gt;
&lt;br /&gt;
* ring redesign:&lt;br /&gt;
      find a way to test raw ring performance &lt;br /&gt;
      fix cacheline bounces &lt;br /&gt;
      reduce interrupts&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
* support more queues&lt;br /&gt;
     We limit TUN to 8 queues, but we really want&lt;br /&gt;
     1 queue per guest CPU. The limit comes from the net&lt;br /&gt;
     core; we need to teach it to allocate an array of&lt;br /&gt;
     pointers rather than an array of queues.&lt;br /&gt;
     Jason has a draft patch to use a flex array.&lt;br /&gt;
     Another task is to move the flow caches out of tun_struct.&lt;br /&gt;
     Developer: Jason Wang&lt;br /&gt;
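     The pointer-array idea can be sketched in userspace terms (all names here are illustrative, not the kernel's):&lt;br /&gt;

```python
# Illustrative sketch (not kernel code) of the allocation change above:
# keep an array of queue pointers (None until attached) rather than an
# array of fully allocated queues, so raising the queue limit from 8 to
# one-per-guest-CPU stays cheap in memory.
class Queue:
    def __init__(self, idx):
        self.idx = idx
        self.ring = [None] * 256   # stand-in for a real descriptor ring

MAX_QUEUES = 64                    # e.g. one per guest CPU

# array of pointers: nothing allocated up front
queues = [None] * MAX_QUEUES

def attach_queue(idx):
    # allocate lazily, only when a queue is actually used
    if queues[idx] is None:
        queues[idx] = Queue(idx)
    return queues[idx]

q = attach_queue(3)
```

     Only attached queues pay for a ring; the other 63 slots stay a pointer each.&lt;br /&gt;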
&lt;br /&gt;
* irq/numa affinity:&lt;br /&gt;
     networking goes much faster with irq pinning:&lt;br /&gt;
     both with and without numa.&lt;br /&gt;
     what can be done to make the non-pinned setup go faster?&lt;br /&gt;
&lt;br /&gt;
* reduce conflict with VCPU thread&lt;br /&gt;
    if VCPU and networking run on same CPU,&lt;br /&gt;
    they conflict resulting in bad performance.&lt;br /&gt;
    Fix that, push vhost thread out to another CPU&lt;br /&gt;
    more aggressively.&lt;br /&gt;
&lt;br /&gt;
* rx mac filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we already have filtering in the bridge.&lt;br /&gt;
        We have a small table of addresses; it needs to be made larger&lt;br /&gt;
        if we only need filtering for unicast (multicast is handled by IMP filtering)&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in tun&lt;br /&gt;
        the need for this is still not understood, as we already have filtering in the bridge&lt;br /&gt;
&lt;br /&gt;
* vlan filtering in bridge&lt;br /&gt;
        IGMP snooping in bridge should take vlans into account&lt;br /&gt;
&lt;br /&gt;
* tx coalescing&lt;br /&gt;
        Delay several packets before kicking the device.&lt;br /&gt;
&lt;br /&gt;
* interrupt coalescing&lt;br /&gt;
        Reduce the number of interrupts.&lt;br /&gt;
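        On physical NICs this is the knob ethtool already exposes; a hypothetical example (device name and values are illustrative):&lt;br /&gt;

```shell
# Coalesce rx interrupts: wait up to 50us, or 32 frames, before raising
# an interrupt, trading a little latency for far fewer interrupts.
ethtool -C eth0 rx-usecs 50 rx-frames 32
# show the resulting coalescing settings
ethtool -c eth0
```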
&lt;br /&gt;
* bridging on top of macvlan &lt;br /&gt;
  add code to forward LRO status from macvlan (not macvtap)&lt;br /&gt;
  back to the lowerdev, so that setting up forwarding&lt;br /&gt;
  from macvlan disables LRO on the lowerdev&lt;br /&gt;
&lt;br /&gt;
* preserve packets exactly with LRO&lt;br /&gt;
  LRO is not normally compatible with forwarding.&lt;br /&gt;
  With virtio we are getting packets from a Linux host,&lt;br /&gt;
  so we could conceivably preserve packets exactly&lt;br /&gt;
  even with LRO. I am guessing other hardware could&lt;br /&gt;
  do this as well.&lt;br /&gt;
&lt;br /&gt;
=== testing projects ===&lt;br /&gt;
Keeping networking stable is highest priority.&lt;br /&gt;
&lt;br /&gt;
* Run weekly test on upstream HEAD covering test matrix with autotest&lt;br /&gt;
* Measure the effect of each of the above-mentioned optimizations&lt;br /&gt;
  - Use autotest network performance regression testing (that runs netperf)&lt;br /&gt;
  - Also test any wild idea that works. Some may be useful.&lt;br /&gt;
&lt;br /&gt;
=== non-virtio-net devices ===&lt;br /&gt;
* e1000: stabilize&lt;br /&gt;
&lt;br /&gt;
=== test matrix ===&lt;br /&gt;
&lt;br /&gt;
DOA test matrix (all combinations should work):&lt;br /&gt;
        vhost: test both on and off, obviously&lt;br /&gt;
        test: hotplug/unplug, vlan/mac filtering, netperf,&lt;br /&gt;
             file copy both ways: scp, NFS, NTFS&lt;br /&gt;
        guests: linux: release and debug kernels, windows&lt;br /&gt;
        conditions: plain run, run while under migration,&lt;br /&gt;
                vhost on/off migration&lt;br /&gt;
        networking setup: simple, qos with cgroups&lt;br /&gt;
        host configuration: host-guest, external-guest&lt;/div&gt;</summary>
		<author><name>Dmitryf</name></author>
	</entry>
</feed>