Small look inside

Introduction

This text is a short explanation of what KVM does internally. It was written while kvm-54 was current, so later versions of KVM may differ from what is described here.

svm = secure virtual machine (AMD)

vmx = virtual machine extensions (Intel)

Loading Modules

svm (AMD)

When the module svm.ko is loaded, the kernel runs the function that the module registered with module_init(). As in most modules, this is the module's own init function, here called svm_init(). It lives in svm.c.

The svm_init() function does nothing special. It calls kvm_init() with a struct kvm_x86_ops. This structure is defined in x86.h, and kvm_init() is the init function in kvm_main.c. If we look at the init functions of svm.c and vmx.c, we see that both call kvm_init() and differ only in the set of kvm_x86_ops they pass and in the size they pass along, sizeof(struct vcpu_svm) or sizeof(struct vcpu_vmx).

Roundup:

  • We have a struct kvm_x86_ops (svm.c) filled with a lot of function pointers.
  • We have a struct vcpu_svm (kvm_svm.h), of which we only need the size for now.
  • We call kvm_init() (kvm_main.c).
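
The registration pattern described above can be sketched in plain userspace C. This is only an illustration under stated assumptions: every demo_* name is hypothetical, the ops struct is reduced to a single callback, and nothing here is copied from kvm_main.c or svm.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for struct kvm_x86_ops. */
struct demo_ops {
    const char *name;
    int (*hardware_setup)(void);
};

/* Globals filled in by the common init code. */
static const struct demo_ops *demo_active_ops;
static size_t demo_vcpu_size;

/* Stand-in for kvm_init(): remembers the backend and its vcpu size. */
static int demo_kvm_init(const struct demo_ops *ops, size_t vcpu_size)
{
    assert(ops != NULL);
    if (demo_active_ops != NULL)
        return -1;                  /* a backend is already registered */
    demo_active_ops = ops;
    demo_vcpu_size = vcpu_size;
    return ops->hardware_setup();
}

/* Backend-specific part, loosely modelled on what svm.c provides. */
struct demo_vcpu_svm {
    unsigned long regs[16];
};

static int demo_svm_hardware_setup(void)
{
    return 0;
}

static const struct demo_ops demo_svm_ops = {
    .name = "svm",
    .hardware_setup = demo_svm_hardware_setup,
};

/* Stand-in for svm_init(): hands its ops table and vcpu size to the core. */
int demo_svm_init(void)
{
    return demo_kvm_init(&demo_svm_ops, sizeof(struct demo_vcpu_svm));
}
```

A vmx-style backend would look the same apart from its own ops table and its own vcpu struct, which is the point of the shared kvm_init() entry.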

In kvm_init() the first thing that happens is a call to kvm_init_debug(). This function creates some debugfs entries. The kvm_stats_debugfs_item structs are initialized in x86.c, so you can look there to see which debugfs entries will exist. debugfs must be mounted before it can be used:

mount -t debugfs none /sys/kernel/debug

and if you want to add it to /etc/fstab, add this line:

none /sys/kernel/debug debugfs defaults 0 0


Then it calls the function kvm_arch_init() with the struct kvm_x86_ops, which was passed in from svm.c through the opaque argument. The function kvm_arch_init() is defined in x86.c.

When we look at kvm_arch_init() we see that it calls kvm_mmu_module_init(), which is defined in mmu.c. This function creates three lookaside caches. Then it calls kvm_init_msr_list(). MSR stands for model-specific register.

In kvm_init_msr_list() we see that it reads the model-specific registers listed in the array msrs_to_save[] with rdmsr_safe(), which fails gracefully instead of faulting when a register does not exist on this CPU.
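
A minimal sketch of how such a "probe each entry safely" loop can work, assuming that entries which cannot be read are dropped from the list (that filtering step is an assumption, it is not stated in the text above). The probe function is a hypothetical stand-in for rdmsr_safe() and simply rejects odd indices so the example can run in userspace.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical probe: returns 0 if the register can be read, <0 otherwise.
 * The kernel would use rdmsr_safe() here; this stand-in rejects odd indices. */
static int demo_probe_msr(uint32_t index)
{
    return (index % 2 == 0) ? 0 : -1;
}

/* Compact the list in place so that only readable entries remain.
 * Returns the new number of entries. */
size_t demo_filter_msr_list(uint32_t *list, size_t n)
{
    size_t i, j = 0;

    assert(list != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (demo_probe_msr(list[i]) < 0)
            continue;               /* not readable on this "CPU" */
        list[j++] = list[i];
    }
    return j;
}
```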

We go back to kvm_arch_init() in x86.c. Its next job is to check whether the computer has KVM support and then to point the global pointer named kvm_x86_ops at the struct kvm_x86_ops that was passed in. Then it calls kvm_mmu_set_nonpresent_ptes(), which is defined in mmu.c. Now control returns to kvm_init() in kvm_main.c.

Then kvm_init() allocates a page for the global struct page pointer named bad_page. After that it calls kvm_arch_hardware_setup(), which is defined in x86.c and simply returns kvm_x86_ops->hardware_setup(). We are using the svm module, so the kvm_x86_ops struct was initialized with functions from svm.c. Searching for .hardware_setup in the svm_x86_ops struct, we see that it is set to svm_hardware_setup().

In svm_hardware_setup() we first allocate two pages. Then memset() fills them with the byte 0xff, starting at the page address of struct iopm_pages.... Then we allocate one more page and do the same. After that it calls set_msr_interception() to set up which MSRs should be intercepted. Finally, the macro for_each_online_cpu() expands to a loop over all online CPUs in the computer, and the loop body calls svm_cpu_init() for each of them.
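
To illustrate why filling a permission map with 0xff is a sensible default, here is a small userspace sketch. It assumes the convention "bit set means intercept", one bit per I/O port, and a 4096-byte page. The demo_* names and DEMO_PAGE_SIZE are hypothetical and not taken from svm.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096u

/* Allocate `pages` pages and set every bit, so that everything is
 * intercepted until a bit is explicitly cleared. */
unsigned char *demo_alloc_intercept_map(size_t pages)
{
    unsigned char *map;

    assert(pages > 0);
    map = malloc(pages * DEMO_PAGE_SIZE);
    if (map != NULL)
        memset(map, 0xff, pages * DEMO_PAGE_SIZE);
    return map;
}

/* Assumed convention: bit set = access to this port is intercepted. */
int demo_port_intercepted(const unsigned char *map, unsigned port)
{
    return (map[port / 8] >> (port % 8)) & 1;
}

/* Clear one bit to let the guest access a port directly. */
void demo_allow_port(unsigned char *map, unsigned port)
{
    map[port / 8] &= (unsigned char)~(1u << (port % 8));
}
```

Starting from "intercept everything" means a forgotten entry costs performance rather than isolation.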

The function svm_cpu_init() allocates memory for a struct svm_cpu_data. Its member "cpu" is set to the CPU that is being initialized, and its member "save_area" gets one freshly allocated page. Then there is the line per_cpu(svm_data, cpu) = svm_data;. The per_cpu macro sits in percpu.h and in turn uses the RELOC_HIDE() macro, which is located in compiler-gcc.h. At the beginning of svm.c the line static DEFINE_PER_CPU(struct svm_cpu_data *, svm_data); creates a per-CPU variable at compile time. To get the value for the current CPU one can use get_cpu_var(), but to access another processor's copy of the variable one uses per_cpu(). So the assignment stores the new svm_cpu_data pointer in exactly one CPU's copy of svm_data. Then the for_each_online_cpu loop continues with the next CPU. Afterwards every CPU in the system has its own memory for a svm_cpu_data struct and one allocated page for svm_data->save_area.
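
The idea of one pointer slot per CPU can be modelled in userspace with a plain array indexed by CPU number. This is a sketch under assumptions: the real per_cpu() machinery works differently, the CPU count and page size are made-up constants, and all demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_NR_CPUS   4
#define DEMO_PAGE_SIZE 4096u

struct demo_cpu_data {
    int cpu;
    void *save_area;
};

/* One pointer slot per CPU, playing the role of the per-CPU variable. */
static struct demo_cpu_data *demo_per_cpu[DEMO_NR_CPUS];

/* Allocate the data for one CPU and publish it in that CPU's slot. */
int demo_cpu_init(int cpu)
{
    struct demo_cpu_data *d;

    assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
    d = calloc(1, sizeof(*d));
    if (d == NULL)
        return -1;
    d->cpu = cpu;
    d->save_area = calloc(1, DEMO_PAGE_SIZE);
    if (d->save_area == NULL) {
        free(d);
        return -1;
    }
    demo_per_cpu[cpu] = d;
    return 0;
}

/* Equivalent of the for_each_online_cpu() loop in the text. */
int demo_setup_all_cpus(void)
{
    int cpu;

    for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
        if (demo_cpu_init(cpu) != 0)
            return -1;
    return 0;
}
```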

Now we are back in kvm_init() in kvm_main.c. We loop over every online CPU and call smp_call_function_single() for each of them, which runs a function on one specific CPU. The function we hand to smp_call_function_single() is kvm_arch_check_processor_compat(), which lies in x86.c and only calls kvm_x86_ops->check_processor_compatibility. We look again into svm.c at svm_check_processor_compat() and see that this function does not seem to return anything itself. As far as I can tell it only stores 0 through the pointer it is given, but this still needs confirming. Apart from that, it seems that only the return value of smp_call_function_single() in smpcommon.c is interesting. It returns smp_ops.smp_call_function_mask(mask, func, info, wait);.
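
The "run a check on every CPU and collect the result" loop can be sketched as follows. It assumes that the check reports its result through the void pointer argument, which is a guess about svm_check_processor_compat() and not something the text confirms. demo_call_on_cpu() is a hypothetical stand-in for smp_call_function_single() and simply calls the function directly.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4

typedef void (*demo_cpu_fn)(void *info);

/* Stand-in for smp_call_function_single(): a kernel would run fn on
 * `cpu`; here it is called directly. Returns 0 on success. */
static int demo_call_on_cpu(int cpu, demo_cpu_fn fn, void *info)
{
    if (cpu < 0 || cpu >= DEMO_NR_CPUS)
        return -1;
    fn(info);
    return 0;
}

/* Assumed shape of the check: the result is stored through the pointer. */
static void demo_check_compat(void *rtn)
{
    assert(rtn != NULL);
    *(int *)rtn = 0;                /* 0 = this CPU is compatible */
}

/* Loop over all CPUs and stop at the first failure. */
int demo_check_all_cpus(void)
{
    int cpu, r;

    for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++) {
        r = -1;
        if (demo_call_on_cpu(cpu, demo_check_compat, &r) != 0 || r < 0)
            return -1;
    }
    return 0;
}
```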

After that we are back in kvm_init(), where on_each_cpu() calls the function hardware_enable() on every CPU. This function gets the CPU number with raw_smp_processor_id() and checks it with cpu_isset(). If this CPU is not yet set in the mask, it calls cpu_set() and then kvm_arch_hardware_enable(). The function kvm_arch_hardware_enable() only calls kvm_x86_ops->hardware_enable.
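
The check-then-set logic makes hardware_enable() safe to call more than once per CPU. A minimal sketch, assuming a single unsigned long is wide enough to serve as the CPU mask. All demo_* names are hypothetical, and the call counter exists only so the effect can be observed.

```c
#include <assert.h>

static unsigned long demo_cpus_enabled;   /* bit n set = CPU n enabled */
static int demo_enable_calls;             /* counts arch-level enables */

/* Stand-in for kvm_arch_hardware_enable(). */
static void demo_arch_hardware_enable(void)
{
    demo_enable_calls++;
}

/* Enable virtualization on `cpu` unless it is already marked enabled. */
void demo_hardware_enable(int cpu)
{
    assert(cpu >= 0 && cpu < (int)(8 * sizeof(demo_cpus_enabled)));
    if (demo_cpus_enabled & (1UL << cpu))
        return;                           /* like cpu_isset() being true */
    demo_cpus_enabled |= 1UL << cpu;      /* like cpu_set() */
    demo_arch_hardware_enable();
}
```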

We look at svm_hardware_enable() in svm.c and see that it also calls raw_smp_processor_id() to get the CPU id. Then it checks whether this CPU has SVM support built in. After that it fetches the svm_cpu_data struct with per_cpu(svm_data, me) and checks that svm_data is not NULL. Now it assigns values to the svm_cpu_data struct members?. Now it reads and writes some model-specific registers, which enables the SVM extension on the CPU. The next wrmsrl() call writes ...????

Back in kvm_init() we look at register_cpu_notifier(), which takes a lock with mutex_lock() and, while holding it, calls raw_notifier_chain_register() to add a notifier to a raw notifier chain. The notifier_block we register points at kvm_cpu_hotplug(). This function reacts to three notifications, CPU_DYING, CPU_UP_CANCELED and CPU_ONLINE, and disables or enables virtualization on the affected CPU.
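
A hotplug callback of this kind is essentially a switch over the event type. The sketch below assumes that CPU_DYING and CPU_UP_CANCELED lead to a disable and CPU_ONLINE to an enable. The enum values are hypothetical and are not the kernel's constants, and the per-CPU state is a plain array.

```c
#include <assert.h>

#define DEMO_NR_CPUS 4

/* Hypothetical event codes, named after the notifications in the text. */
enum demo_cpu_event {
    DEMO_CPU_ONLINE,
    DEMO_CPU_UP_CANCELED,
    DEMO_CPU_DYING,
    DEMO_CPU_OTHER
};

static int demo_virt_on[DEMO_NR_CPUS];    /* 1 = virtualization enabled */

/* Stand-in for kvm_cpu_hotplug(): react to three events, ignore the rest. */
int demo_cpu_hotplug(enum demo_cpu_event ev, int cpu)
{
    assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
    switch (ev) {
    case DEMO_CPU_DYING:
    case DEMO_CPU_UP_CANCELED:
        demo_virt_on[cpu] = 0;            /* CPU goes away: disable */
        break;
    case DEMO_CPU_ONLINE:
        demo_virt_on[cpu] = 1;            /* CPU came up: enable */
        break;
    default:
        break;                            /* not interesting for us */
    }
    return 0;
}
```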

And register_reboot_notifier registers a function which will be called at reboot time.

Now we register the sysdev class with sysdev_class_register().

Then we add a system device to the tree with sysdev_register().

Now we create a slab cache with kmem_cache_create(), whose objects have the size vcpu_size. This kmem cache lets us meet the alignment requirements of fx_save.????

When we called kvm_init() from svm_init() we passed THIS_MODULE to kvm_init(), and now we set the svm module as the owner of kvm_chardev_ops, which is a file_operations struct. This struct has only two functions initialized, .unlocked_ioctl and .compat_ioctl, and both are set to the kvm_dev_ioctl function.

Once this is set, we call misc_register() to register the miscellaneous device. Miscellaneous devices have the major device number 10. You can check this with

cat /proc/devices|grep misc

The kvm device is then called /dev/kvm for userspace access.
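
From userspace, the only way into the device is the ioctl handler mentioned above. As a minimal sketch, the following function opens the device node and issues KVM_GET_API_VERSION from <linux/kvm.h>. It assumes a Linux system with the kernel headers installed. The function name and the choice to return -1 on any failure are my own.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/kvm.h>

/* Returns the KVM API version reported by the device at `path`,
 * or -1 if the device cannot be opened or the ioctl fails. */
int demo_kvm_api_version(const char *path)
{
    int fd, version;

    assert(path != NULL);
    fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;                  /* no KVM, or no permission */
    version = ioctl(fd, KVM_GET_API_VERSION, 0);
    close(fd);
    return version;
}
```

Calling demo_kvm_api_version("/dev/kvm") on a machine without the module loaded simply yields -1.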


After that, we set kvm_sched_in and kvm_sched_out as the members of kvm_preempt_ops, which is of type struct preempt_ops. With these functions a task can ask the scheduler to notify it whenever it is preempted or scheduled back in. This allows the task to swap any special-purpose registers, like the FPU or Intel's VT registers.
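
The notifier idea can be sketched as a pair of callbacks that a scheduler invokes around a context switch. This is a userspace model with hypothetical demo_* names. The real preempt notifier structures and signatures may differ, and the "state" swapped here is just an integer that records where the task's special registers are loaded.

```c
#include <assert.h>
#include <stddef.h>

struct demo_notifier;

/* Reduced stand-in for struct preempt_ops. */
struct demo_preempt_ops {
    void (*sched_in)(struct demo_notifier *n, int cpu);
    void (*sched_out)(struct demo_notifier *n);
};

struct demo_notifier {
    const struct demo_preempt_ops *ops;
    int loaded_on_cpu;              /* -1 while the state is saved away */
};

/* Task is scheduled back in: reload its special state on this CPU. */
static void demo_sched_in(struct demo_notifier *n, int cpu)
{
    n->loaded_on_cpu = cpu;
}

/* Task is preempted: save its special state. */
static void demo_sched_out(struct demo_notifier *n)
{
    n->loaded_on_cpu = -1;
}

static const struct demo_preempt_ops demo_ops = {
    .sched_in = demo_sched_in,
    .sched_out = demo_sched_out,
};

void demo_notifier_init(struct demo_notifier *n)
{
    assert(n != NULL);
    n->ops = &demo_ops;
    n->loaded_on_cpu = -1;
}

/* What a scheduler would do around a context switch. */
void demo_switch_out(struct demo_notifier *n) { n->ops->sched_out(n); }
void demo_switch_in(struct demo_notifier *n, int cpu) { n->ops->sched_in(n, cpu); }
```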

Now we call kvm_init_anon_inode(), which calls anon_inode_init() in anon_inodes.c. This function creates an anonymous inode.

Then we call the function preempt_notifier_sys_init() which ...???

After that, the kvm module is loaded.

Roundup:

  • Create 3 lookaside caches for ...
  • Read MSR's into array msrs_to_save for ...
  • Setup kvm_x86_ops
  • Setup which MSRs should be intercepted
  • Allocated a struct svm_cpu_data for every online CPU
  • Allocated a page for the save_area member of every svm_cpu_data
  • Checked hardware compatibility
  • Registered notifiers for CPU hotplug and reboot

Here is a picture that shows everything that is created during kvm_init(). picture1

Here is a picture that shows what is destroyed by kvm_exit(). picture2

vmx (Intel)

When the module vmx.ko is loaded, the kernel runs the function that the module registered with module_init(). As in most modules, this is the module's own init function, here called vmx_init(). It lives in vmx.c.