From KVM
(rewrote the page. TODO: add BZs, detailed project descriptions.)

Revision as of 08:42, 23 May 2013

This page should cover all networking-related activity in KVM; currently most of the information relates to virtio-net.

projects in progress. contributions are still very welcome!

  • vhost-net scalability tuning: threading for many VMs
      Plan: switch to workqueue shared by many VMs
      Developer: Shirley Ma?, MST
      Testing: netperf guest to guest
  • multiqueue support in macvtap
      multiqueue is only supported for tun.
      Add support for macvtap.
      Developer: Jason Wang
  • enable multiqueue by default
      Multiqueue causes regression in some workloads, thus
      it is off by default. Detect and enable/disable
      automatically so we can make it on by default
      Developer: Jason Wang
  • guest programmable mac/vlan filtering with macvtap
      Developer: Dragos Tatulea?, Amos Kong
      Status: GuestProgrammableMacVlanFiltering
  • bridge without promisc mode in NIC
      given hardware support, teach bridge
      to program mac/vlan filtering in NIC
      Helps performance and security on noisy LANs
      Developer: Vlad Yasevich
  • allow handling short packets from softirq or VCPU context
      Testing: netperf TCP RR - should be improved drastically
               netperf TCP STREAM guest to host - no regression
      Developer: MST
  • Flexible buffers: put virtio header inline with packet data
      Developer: MST
  • device failover to allow migration with assigned devices
      https://fedoraproject.org/wiki/Features/Virt_Device_Failover
      Developer: Gal Hammer, Cole Robinson, Laine Stump, MST
  • Reuse vringh code for better maintainability
      Developer: Rusty Russell

projects that are not started yet - no owner

  • receive side zero copy
      The ideal is a NIC with accelerated RFS support,
      so we can feed the virtio rx buffers into the correct NIC queue.
      Depends on non promisc NIC support in bridge.
  • IPoIB infiniband bridging
      Plan: implement macvtap for ipoib and virtio-ipoib
  • RDMA bridging
  • use kvm eventfd support for injecting level interrupts,
      enable vhost by default for level interrupts
  • DMA engine (IOAT) use in tun
  • virtio API extension: improve small packet/large buffer performance:
      support "reposting" buffers for mergeable buffers,
      support pool for indirect buffers

vague ideas: path to implementation not clear

  • ring redesign:
     find a way to test raw ring performance 
     fix cacheline bounces 
     reduce interrupts


  • support more queues
    We limit TUN to 8 queues 
  • irq/numa affinity:
    networking goes much faster with irq pinning:
    both with and without numa.
    what can be done to make the non-pinned setup go faster?
  • reduce conflict with VCPU thread
   if VCPU and networking run on same CPU,
   they conflict resulting in bad performance.
   Fix that, push vhost thread out to another CPU
   more aggressively.
  • rx mac filtering in tun
       the need for this is still not understood as we have filtering in bridge
       we have a small table of addresses, need to make it larger
       if we only need filtering for unicast (multicast is handled by IGMP filtering)
  • vlan filtering in tun
       the need for this is still not understood as we have filtering in bridge
  • vlan filtering in bridge
       IGMP snooping in bridge should take vlans into account
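
For reference on the irq/numa affinity item above: on Linux, irq pinning is normally done by writing a hexadecimal CPU bitmask to /proc/irq/<n>/smp_affinity. The following is only a minimal sketch of how such a mask can be built. The helper names cpus_to_smp_affinity and pin_irq are hypothetical, and the IRQ number in the usage line is a placeholder; the real numbers have to be looked up in /proc/interrupts on the host in question.

```python
def cpus_to_smp_affinity(cpus):
    """Build the hex mask string used by /proc/irq/<n>/smp_affinity.

    The mask is emitted as comma-separated 32-bit groups, most
    significant group first. (Helper name is illustrative only.)
    """
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU numbers must be non-negative")
        mask |= 1 << cpu
    if mask == 0:
        raise ValueError("at least one CPU is required")
    groups = []
    while mask:
        groups.append(f"{mask & 0xFFFFFFFF:08x}")
        mask >>= 32
    return ",".join(reversed(groups))


def pin_irq(irq, cpus):
    """Pin one IRQ to the given CPUs. Needs root; irq is host specific."""
    with open(f"/proc/irq/{irq}/smp_affinity", "w") as f:
        f.write(cpus_to_smp_affinity(cpus))


if __name__ == "__main__":
    # Only prints the mask; call pin_irq(<irq>, [2, 3]) as root to apply it.
    print(cpus_to_smp_affinity([2, 3]))
```

This only covers the pinned setup; it does not answer the open question of how to make the non-pinned setup faster.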


testing projects

Keeping networking stable is the highest priority.

  • Run weekly test on upstream HEAD covering test matrix with autotest

non-virtio-net devices

  • e1000: stabilize

test matrix

DOA test matrix (all combinations should work):

       vhost: test both on and off, obviously
       test: hotplug/unplug, vlan/mac filtering, netperf,
            file copy both ways: scp, NFS, NTFS
       guests: linux: release and debug kernels, windows
       conditions: plain run, run while under migration,
               vhost on/off migration
       networking setup: simple, qos with cgroups
       host configuration: host-guest, external-guest
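
Since all combinations should work, the matrix above can be enumerated mechanically, for example as input to an autotest job list. The sketch below is one possible way to do that, not an existing tool: the dictionary keys and the name iter_test_matrix are made up for illustration, and the file copy entries are split per protocol on the assumption that each one is run in both directions.

```python
from itertools import product

# Dimensions copied from the test matrix above. Key names are illustrative.
TEST_MATRIX = {
    "vhost": ["on", "off"],
    "test": [
        "hotplug/unplug",
        "vlan/mac filtering",
        "netperf",
        "file copy: scp",
        "file copy: NFS",
        "file copy: NTFS",
    ],
    "guest": ["linux release kernel", "linux debug kernel", "windows"],
    "condition": [
        "plain run",
        "run while under migration",
        "vhost on/off migration",
    ],
    "networking": ["simple", "qos with cgroups"],
    "host": ["host-guest", "external-guest"],
}


def iter_test_matrix(matrix=TEST_MATRIX):
    """Yield one dict per combination of the matrix dimensions."""
    keys = list(matrix)
    for values in product(*(matrix[key] for key in keys)):
        yield dict(zip(keys, values))


if __name__ == "__main__":
    cases = list(iter_test_matrix())
    print(len(cases), "combinations")
    print(cases[0])
```

With the values listed here this gives 2 x 6 x 3 x 3 x 2 x 2 = 432 combinations, which is a reasonable argument for running the matrix from an automated weekly job rather than by hand.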