Latest revision as of 10:37, 25 October 2015

ToDo

The following items need some love. Please post to the list if you are interested in helping out:

  • Emulate MSR_IA32_DEBUGCTL for guests which use it
  • Bring up Windows 95 and Windows 98 guests
  • Implement ACPI memory hotplug
  • Improve ballooning to try to use 2MB pages when possible ( in progress - kern.devel@gmail.com )

Networking TODO:

  • Has its own page (NetworkingTodo)

PCI TODO:

  • Has its own page (PCITodo)

MMU related:

  • Improve mmu page eviction algorithm (currently FIFO, change to approximate LRU).
  • Add a read-only memory type.
    • possible using mprotect()?
  • Implement A20M (A20 address line masking) for DOS and the like.
  • O(1) write protection by protecting the PML4Es, then on demand PDPTEs, PDEs, and PTEs
  • Simpler variant: don't drop large ptes when write protecting; just write protect them. When taking a write fault, either drop the large pte, or convert it to small ptes and write protect those (like O(1) write protection).
  • O(1) mmu invalidation using a generation number
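
The generation-number idea in the last item can be illustrated with a small user-space model. This is only a sketch, not KVM code: the class `ShadowPageCache` and its methods are hypothetical names chosen for illustration. It assumes each cached page records the generation it was created in, so that bumping a single counter makes every existing page stale without walking them.

```python
class ShadowPageCache:
    """Toy model of a page cache with O(1) bulk invalidation (hypothetical)."""

    def __init__(self):
        self.generation = 0   # generation that counts as valid right now
        self.pages = {}       # gfn -> (generation at insert time, payload)

    def insert(self, gfn, payload):
        # Tag every new page with the current generation.
        self.pages[gfn] = (self.generation, payload)

    def invalidate_all(self):
        # O(1): bump the counter instead of visiting every page.
        self.generation += 1

    def lookup(self, gfn):
        entry = self.pages.get(gfn)
        if entry is None:
            return None
        gen, payload = entry
        if gen != self.generation:
            # Page belongs to an older generation: reclaim it lazily.
            del self.pages[gfn]
            return None
        return payload
```

The cost of freeing stale pages does not disappear; it is deferred to the lookup that next touches each page.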

x86 emulator updates:

  • On-demand register access; copying all registers all the time is gross.
    • Can be done by adding 'available' and 'dirty' bitmasks
  • Implement MMX and SSE memory move instructions; useful for guests that use multimedia extensions for accessing VGA (partially done)
  • Implement an operation queue for the emulator. The emulator often calls userspace to perform a read or a write, but due to inversion of control it actually restarts instead of continuing. The queue would allow it to replay all previous operations until it reaches the point it last stopped.
    • if this is done, we can retire ->read_std() in favour of ->read_emulated().
  • convert more instructions to direct dispatch (function pointer in decode table)
  • move init_emulate_ctxt() into x86_decode_insn() and other emulator entry points
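
The 'available' and 'dirty' bitmasks suggested for on-demand register access could work roughly as in the following model. It is a sketch under assumptions, not emulator code: `RegisterCache`, `read_hw` and `write_hw` are hypothetical names, with the two callbacks standing in for whatever expensive path fetches or stores a single register.

```python
class RegisterCache:
    """Toy model of lazy register access with two bitmasks (hypothetical)."""

    def __init__(self, read_hw, write_hw):
        self.read_hw = read_hw    # callback: fetch one register by index
        self.write_hw = write_hw  # callback: store one register by index
        self.regs = {}
        self.avail = 0            # bit i set: regs[i] holds a valid copy
        self.dirty = 0            # bit i set: regs[i] must be written back

    def read(self, idx):
        bit = 1 << idx
        if not self.avail & bit:
            # First use: fetch only this register, not the whole set.
            self.regs[idx] = self.read_hw(idx)
            self.avail |= bit
        return self.regs[idx]

    def write(self, idx, value):
        bit = 1 << idx
        self.regs[idx] = value
        self.avail |= bit
        self.dirty |= bit

    def writeback(self):
        # Flush only the registers that were actually modified.
        idx, pending = 0, self.dirty
        while pending:
            if pending & 1:
                self.write_hw(idx, self.regs[idx])
            pending >>= 1
            idx += 1
        self.dirty = 0
```

An instruction that touches one register then costs one fetch and at most one store, instead of a full copy in each direction.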

Interactivity improvements:

  • If for several frames in a row a large proportion of the framebuffer pages are changing, then for the next few frames don't bother to get the dirty page log from kvm, but instead assume that all pages are dirty. This will reduce page fault overhead on highly interactive workloads.
  • When detecting keyboard/video/mouse activity, scale up the frame rate; when activity dies down, scale it back down (applicable to qemu as well).
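
The first heuristic above can be sketched as a small state machine. This is an illustrative model only: the class `DirtyLogHeuristic` and the parameters `threshold`, `busy_frames` and `skip_frames` are assumptions, since the list does not say what counts as "a large proportion" or "the next few frames".

```python
class DirtyLogHeuristic:
    """Decide per frame whether to fetch the dirty page log (hypothetical)."""

    def __init__(self, threshold=0.5, busy_frames=3, skip_frames=5):
        self.threshold = threshold      # dirty fraction that counts as busy
        self.busy_frames = busy_frames  # busy frames in a row before skipping
        self.skip_frames = skip_frames  # frames to treat as fully dirty
        self.busy_run = 0
        self.skip_left = 0

    def should_query_dirty_log(self):
        # While skipping, the caller redraws everything without the log.
        if self.skip_left > 0:
            self.skip_left -= 1
            return False
        return True

    def record_frame(self, dirty_pages, total_pages):
        # Call only for frames where the dirty log was actually fetched.
        if dirty_pages / total_pages >= self.threshold:
            self.busy_run += 1
        else:
            self.busy_run = 0
        if self.busy_run >= self.busy_frames:
            self.skip_left = self.skip_frames
            self.busy_run = 0
```

A display loop would call `should_query_dirty_log()` once per frame; when it returns `False`, it assumes all pages are dirty, and otherwise it fetches the log and reports the counts through `record_frame()`.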

Pass-through/VT-d related:

  • Enhance KVM QEMU to return error messages if the user attempts to pass through unsupported devices:
    • Devices with shared host IOAPIC interrupt
    • Conventional PCI devices
    • Devices without FLR capability
  • QEMU PCI pass-through patch needs to be enhanced to the same functionality as the corresponding file in Xen
    • Remove direct HW access by QEMU for probing PCI BAR size
    • PCI handling of various PCI configuration registers
    • Other enhancements that were done in Xen
  • Host shared interrupt support
  • VT-d2 support (WIP in Linux Kernel)
    • Queued invalidation
    • Interrupt remapping
    • ATS
  • USB 2.0 (EHCI) support

Bug fixes:

  • Less sexy but ever important, fixing bugs is one of the most important contributions

Random improvements

  • Utilize the SVM interrupt queue to avoid extra exits when guest interrupts are disabled

For the adventurous:

  • Emulate the VMX instruction set in qemu. This would be very beneficial to debugging kvm ( working on this - kern.devel@gmail.com ).
  • Add vmgl (http://lagarcavilla.org/vmgl/) support to qemu. Port to virtio. Write a Windows driver.
  • Keep this TODO up to date

Nested VMX

  • Implement performance features such as EPT and VPID

KVM Safe Mode

An ioctl() from userspace that tells KVM to disable one or more of the following features:

  • shadow paging (force direct mapping)
  • instruction emulation (require virtio or mmio hypercall)
  • task switches
  • mode switches (long mode / legacy mode / real mode)
  • IDT/GDT/LDT changes
  • IDT/GDT/LDT write protect
  • write protect important MSRs (*STAR etc)

The idea is both to protect the guest from attacks, and to protect the host from the guest.
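
One way to picture the argument to such an ioctl() is a bitmask with one bit per item in the list. The following is purely a sketch: no such interface is defined here, and the flag names, bit positions, and the helper `describe` are all hypothetical.

```python
from enum import IntFlag


class SafeModeFlag(IntFlag):
    """Hypothetical flag bits, one per item in the Safe Mode list."""
    SHADOW_PAGING = 1 << 0
    INSTRUCTION_EMULATION = 1 << 1
    TASK_SWITCHES = 1 << 2
    MODE_SWITCHES = 1 << 3
    DESCRIPTOR_TABLE_CHANGES = 1 << 4
    DESCRIPTOR_TABLE_WRITE_PROTECT = 1 << 5
    MSR_WRITE_PROTECT = 1 << 6


def describe(mask):
    """Return the names of all flags set in an integer mask."""
    mask = SafeModeFlag(mask)
    return [flag.name for flag in SafeModeFlag if flag in mask]


# Userspace would OR together the items it wants and pass the integer.
requested = SafeModeFlag.SHADOW_PAGING | SafeModeFlag.TASK_SWITCHES
```

A single integer keeps the request atomic: all selected restrictions would be applied by one call rather than one call per feature.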

DPDK related projects

  • virtio:
    • virtio-1: support for virtio pmd
    • virtio-1: support for AMD host
    • virtio-1: support for non-EPT processors
    • virtio-1: support for PCI-e
    • virtio-1: vhost IOMMU
    • virtio-net: report MTU to guest (fix OVS with tunneling)
    • virtio-net: emulate host offloads
    • multi-queue macvlan
    • ARI support
    • Kernel live migration support
  • vhost-user:
    • userspace live migration
    • libvirt support for OVS-DPDK
    • vhost-user unit-test (without OVS/DPDK)
  • misc
    • NPT support for fast MMIO
    • multi-queue macvlan
    • VFIO in QEMU - emulated-IOMMU support