E500 virtual CPU specification

From KVM
Revision as of 11:20, 26 July 2013 by Stuyoder (talk | contribs)

Freescale Power Architecture Book E Virtual CPU Specification
-------------------------------------------------------------

Version 1.6
Copyright 2011-2013, Freescale Semiconductor, Inc.
May 28, 2013

CONTENTS

   1. OVERVIEW
      1.1 Introduction
      1.2 References
      1.3 Definitions
          1.3.1 Virtual CPU (vcpu) and Emulated CPU
          1.3.2 Guest Operating System
          1.3.3 Boundedly Undefined
          1.3.4 Volatile
      1.4 Revision History
   2. IMPLEMENTED CATEGORIES
   3. REGISTERS
      3.1 Version Registers
      3.2 Machine State Register (MSR)
      3.3 CPU Index and Processor ID Register (PIR)
      3.4 External PID Load Context (EPLC) and
          External PID Store Context (EPSC) Registers
      3.5 Timebase (TBL and TBU)
      3.6 L1 and L2 Cache Control Registers
      3.7 Branch Unit Control and Status Register (BUCSR)
      3.8 Core Device Control and Status Register 0 (CDCSR0)
      3.9 Debug Registers
      3.10 PID1 and PID2 Registers
      3.11 Embedded Processor Control Register (EPCR)
      3.12 Hardware Threads
   4. INSTRUCTIONS
      4.1 Cache Locking Instructions
      4.3 System Call instruction
      4.4 Wait for Interrupt Instruction
      4.5 Reservations
      4.5 tlbilx
      4.6 msgsnd/msgclr
   5. MMU
      5.1 Overview
      5.2 TLBnCFG NENTRY and ASSOC
      5.3 IPROT=0
      5.4 MMUCFG
   6. EXCEPTIONS
      6.1 Debug Interrupts
   7. HYPERVISOR SPECIFIC CONSIDERATIONS
      7.1 KVM & Cacheable and Cache-inhibited Mappings

1. OVERVIEW

1.1 Introduction

Virtualization enables multiple operating systems to run on a system,
each in its own isolated virtual machine. Hypervisors create and manage
virtual machines, one part of which is a virtual CPU (or vcpu). A vcpu
emulates a physical CPU: the behavior of instructions, registers, and
exceptions on the vcpu is nearly identical to that of the physical CPU
being emulated.

This document defines a virtual implementation of a CPU based on
Freescale's implementation of Book III E of the Power ISA.

In this document, the vcpu architecture is defined in terms of
differences between the vcpu and the following physical CPUs, which an
implementation may emulate:

   -e500v2
   -e500mc
   -e5500
   -e6500

Operating system developers should understand the differences between
the vcpu and the CPU being emulated.

1.2 References

   1. Power ISA - Version 2.06 Revision B
      http://www.power.org/resources/downloads/PowerISA_V2.06B_V2_PUBLIC.pdf

   2. EREF 2.0: A Programmer's Reference Manual for Freescale Power
      Architecture® Processors
      http://www.freescale.com/files/32bit/doc/ref_manual/EREF_RM.pdf

   3. PowerPC(tm) e500 Core Family Reference Manual
      http://www.freescale.com/files/32bit/doc/ref_manual/E500CORERM.pdf

   4. e500mc Core Reference Manual, Freescale Semiconductor.
      http://www.freescale.com/files/32bit/doc/ref_manual/E500MCRM.pdf

   5. e5500 Core Reference Manual, Freescale Semiconductor.
      Download at freescale.com with document e5500RM

   6. ePAPR (Embedded Power Architecture Platform Requirements)
      version 1.1
      https://www.power.org/resources/downloads/Power_ePAPR_APPROVED_v1.1.pdf

1.3 Definitions

1.3.1 Virtual CPU (vcpu) and Emulated CPU

A 'virtual CPU' (or vcpu) is the CPU as seen by software running in a
virtual machine. The vcpu emulates, or behaves similarly to, some
physical CPU: the 'emulated CPU'.

1.3.2 Guest Operating System

A 'guest operating system' (or 'guest') is an operating system running
in a virtual machine created by a hypervisor.

1.3.3 Boundedly Undefined

The definition of the term 'boundedly undefined' used in this
specification is identical to that of the Power ISA:

   The results of executing a given instruction are said to be
   boundedly undefined if they could have been achieved by executing
   an arbitrary finite sequence of instructions (none of which yields
   boundedly undefined results) in the state the processor was in
   before executing the given instruction.
   Boundedly undefined results for a given instruction may vary
   between implementations, and between different executions on the
   same implementation.

1.3.4 Volatile

The definition of the term 'volatile' used in this specification is
identical to that of the Power ISA:

   Bits in a register or array (e.g., TLB) are considered volatile if
   they may change even if not explicitly modified by software.

1.4 Revision History

   Version   Date         Change
   --------------------------------------------------
   1.2       10/26/2011   updated references, msgsnd/msgclr,
                          cache control registers
   1.3       1/5/2012     Added e6500 CPU definitions
   1.4       5/2/2012     Added definitions for EPCR and MMUCFG
   1.5       6/25/2012    Category table updates for categories
                          that were missing
   1.6       5/28/2013    Added missing references to e6500,
                          updated references links, minor
                          clarifications

2. IMPLEMENTED CATEGORIES

Table 2-1 below identifies the categories of the Power Architecture and
EREF implemented by vcpu implementations for e500v2, e500mc, e5500, and
e6500.

   X   indicates category is supported in vcpu
   -   indicates category is not supported in vcpu

(Note: any categories not listed are not supported in vcpu)

   Table 2-1
   ------------------------------------------------------------------------
                                                   Virtual CPU
   Feature/Category                  Abrv.      e500v2  e500mc  e5500  e6500
   ------------------------------------------------------------------------
   Base                              B          X       X       X      X
   Embedded                          E          X       X       X      X
   Alternate Timebase                ATB        X       X       X      X
   Cache Specification               CS         X       X       X      X
   Decorated Storage                 DS         -       X       X      X
   Embedded.Enhanced Debug           E.ED       -       X       X      X
   Embedded.External PID             E.PD       -       X       X      X
   Embedded.Hypervisor               E.HV       -       -       -      -
   Embedded.Little-Endian            E.LE       [1]     [1]     [1]    [1]
   Embedded.Performance Monitor      E.PM       X       X       X      X
   Embedded.Processor Control        E.PC       -       X       X      X
   Embedded.Cache Locking            E.CL       X       X       X      X
   External Proxy                    EXP        -       X       X      X
   Floating Point                    FP         -       X       X      X
   Floating Point.Record             FP.R       -       X       X      X
   Memory Coherence                  MMC        X       X       X      X
   Signal Processing Engine          SP         X       -       -      -
   Embedded Float Scalar Double      SP.FD      X       -       -      -
   Embedded Float Scalar Single      SP.FS      X       -       -      -
   Embedded Float Vector             SP.FV      X       -       -      -
   Store Conditional Page Mobility   SCPM       -       X       X      X
   Wait                              WT         -       X       X      X
   64-bit                            64         -       -       X      X
   Embedded.Page Table               E.PT       -       -       -      X
   Embedded.Hypervisor.LRAT          E.HV.LRAT  -       -       -      -
   Embedded Multi-Threading          E.EM       -       -       -      -
   Vector (AltiVec)                  V          -       -       -      X
   Enhanced Reservations             ER         -       -       -      X
     (Load and Reserve and
      Store Cond.)
   Data Cache Extended Operations    DEO        -       X       X      X
   Cache Stashing                    CS         -       X       X      X
   ------------------------------------------------------------------------
   [1] Little-Endian mappings are supported for data but not
       instructions.

The ePAPR 1.1 specification (see section 1.2, References) defines
"power-isa-*" properties on CPU nodes that specify which Power
Architecture categories are implemented.

   Property:    power-isa-*
   Usage:       optional
   Value:       <empty>
   Description: If the power-isa-version property exists, then for
                each category from the Categories section of Book I of
                the Power ISA version indicated, the existence of a
                property named power-isa-[CAT], where [CAT] is the
                abbreviated category name with all uppercase letters
                converted to lowercase, indicates that the category is
                supported by the implementation.
                For example, if the power-isa-version property exists
                and its value is "2.06" and the power-isa-e.hv property
                exists, then the implementation supports
                [Category:Embedded.Hypervisor] as defined in Power ISA
                Version 2.06.

A hypervisor should advertise implemented CPU categories on CPU nodes.
An operating system should examine these properties to determine
categories implemented by a virtual CPU.

3. REGISTERS

This section describes differences between registers in a virtual CPU
compared to the CPU being emulated.

3.1 Version Registers

The Processor Version Register (PVR) and System Version Register (SVR)
return the values of the CPU being emulated.

Note: a guest should be careful about the assumptions it makes based on
the PVR, because the virtual CPU differs from the CPU being emulated as
described in this specification.

3.2 Machine State Register (MSR)

The machine state register (MSR) in the vcpu differs from the
architecture in the following fields:

   Bits   Name   Description
   ---------------------------------------------------------------
   35     GS     Guest state. MSR[GS] is read-only and is
                 always '1'. (See Note1)

   37     UCLE   User-mode cache lock enable. Is writeable and
                 behaves as per the architecture if the vcpu
                 implements category "Embedded.Cache Locking".
                 Otherwise is '0' and is read-only.

   54     DE     Debug interrupt enable. Is writeable and behaves
                 as per the architecture if DBCR0[EDM]=0. If
                 DBCR0[EDM]=1, then MSR[DE]=0 and is read-only.
                 (See Note2)

   58     IS     On e500v2 'IS' must equal 'DS'

   59     DS     On e500v2 'DS' must equal 'IS'
   ---------------------------------------------------------------

   Notes
   -----
   Note1 - The MSR[GS] bit is defined only when the CPU being
           emulated implements Category: Embedded.Hypervisor.

   Note2 - If the vcpu implements category "Embedded.Enhanced Debug",
           when MSR[DE]=1, the registers SPRG9, DSRR0, and DSRR1 are
           volatile.

3.3 CPU Index and Processor ID Register (PIR)

The Processor ID Register is read-only.

At virtual machine initialization, each vcpu in the virtual machine is
assigned a unique index (within the partition) that can be used to
distinguish the CPU from other CPUs in the partition. This CPU index
value can be read by using the mfspr instruction to read the processor
ID register (PIR).

The CPU index is used in several instances:

   -The index enables software to detect whether a CPU is the boot CPU
    in an SMP configuration. The CPU index of the boot CPU is set by
    software in the device tree header (see ePAPR in section 1.2,
    References).

   -If the vcpu implements Category: Embedded.Processor Control, the
    index is used as a parameter to the msgsnd and msgclr instructions
    to specify the targeted CPU for intra-partition signaling.

   -Interrupt source configuration in the VMPIC interrupt controller
    allows specifying the index of the CPU that is configured to
    receive the interrupt.

Each CPU node is described in the device tree. The reg property for
each CPU node has a value that matches the CPU index.

3.4 External PID Load Context (EPLC) and External PID Store Context
    (EPSC) Registers

A virtual CPU may implement [Category: Embedded.External PID] of the
Power ISA. EPLC and EPSC specify the context for external PID loads and
stores as defined by the Power ISA.

The EGS and ELPID fields in EPLC and EPSC specify the hypervisor
context and are not accessible by supervisor-level software on the
vcpu. Values written to the EGS and ELPID fields are ignored.

3.5 Timebase (TBL and TBU)

The TBU and TBL are read-only.

3.6 L1 and L2 Cache Control Registers

The behavior of the L1 and L2 cache control registers depends on
whether the virtual CPU implements category "Embedded.Cache Locking".

All L1 and L2 cache control registers and L2 error registers can be
read regardless of whether category "Embedded.Cache Locking" is
implemented.

When category "Embedded.Cache Locking" is _not_ implemented:

   -The L1CSR0[CUL] and L1CSR1[ICUL] fields can be written. For all
    other fields writes have no effect.

When category "Embedded.Cache Locking" is implemented:

   -Writes to the flash lock clearing bits are supported:
    L1CSR0[CLFR], L1CSR1[ICLFR], L2CSR0[L2FLC]

   -Writes to the L1CSR0 sticky status bits are supported:
    L1CSR0[CUL], L1CSR0[CSLC], L1CSR0[CLO]

   -Writes to the L1CSR1 sticky status bits are supported:
    L1CSR1[ICUL], L1CSR1[ICSLC], L1CSR1[ICLO]

   -Writes to the L2CSR0 sticky status bits are supported:
    L2CSR0[L2LO]

   -Support for L1CSR0[DCBZ32] is implementation defined

   -For all other fields writes have no effect.

3.7 Branch Unit Control and Status Register (BUCSR)

If the CPU being emulated implements BUCSR, the BUCSR fields are
identical to those of the CPU being emulated. The BUCSR can be read,
but is not writeable on the vcpu. Writes are NOPs and do not affect
architectural state.

3.8 Core Device Control and Status Register 0 (CDCSR0)

If the emulated CPU implements CDCSR0, the CDCSR0 fields are identical
to those of the CPU being emulated. CDCSR0 can be read but is not
writeable on the vcpu. Writes are NOPs and do not affect architectural
state.

3.9 Debug Registers

The DBCR0 register in the vcpu is always readable.

If DBCR0[EDM]=1, then the implementation has not granted debug
resources to the vcpu. In this case all accesses to debug registers
(except reading DBCR0) are boundedly undefined.

If DBCR0[EDM]=0, then the debug registers implemented by the CPU being
emulated are supported except as noted below.

Writes to the debug registers and fields in Table 2-7 are not supported
and are ignored. Reads return 0x0.

   Table 2-7
   -----------------------------------------------
   Debug         Register fields
   Register      not supported
   -----------------------------------------------
   DBCR0         IDM  - internal debug mode
                 FT   - freeze timers
                 IRPT - interrupt taken
                 RET  - return debug event
                 RST  - reset
   DBSRWR        all fields
   DDAM          all fields
   DEVENT        all fields
   -----------------------------------------------

When MSR[DE]=1, the registers SPRG9, DSRR0, and DSRR1 are volatile.
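
The DBCR0[EDM] rule above can be illustrated with a minimal C sketch. It works on a DBCR0 value that has already been read, so it stays portable. The mask assumes EDM is the most significant bit of the 32-bit register (bit 32 in Power ISA numbering); the function names and the supported_mask parameter are hypothetical and not part of this specification.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: DBCR0[EDM] is bit 32 in Power ISA numbering, i.e. the
 * most significant bit of the 32-bit register. Verify against the
 * reference manual of the core being emulated. */
#define DBCR0_EDM 0x80000000u

/* Returns true when the hypervisor has granted debug resources to the
 * vcpu (DBCR0[EDM]=0). Only in that case may a guest touch the other
 * debug registers; otherwise accesses are boundedly undefined. */
static bool vcpu_debug_granted(uint32_t dbcr0)
{
    return (dbcr0 & DBCR0_EDM) == 0;
}

/* Models a register whose unsupported fields ignore writes and read
 * back as 0. supported_mask is a hypothetical mask of the fields the
 * vcpu does implement. */
static uint32_t vcpu_effective_write(uint32_t value, uint32_t supported_mask)
{
    return value & supported_mask;
}
```

A guest would typically read DBCR0 once, call vcpu_debug_granted() on the result, and skip all debug register setup when it returns false.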

3.10 PID1 and PID2 Registers

The e500v2 vcpu does not implement the PID1 and PID2 SPRs.

3.11 Embedded Processor Control Register (EPCR)

EPCR is implemented if category 64-bit is supported. All bits in EPCR
are reserved except for ICM.

   Bits   Name   Description
   ---------------------------------------------------------------
   38     ICM    Controls the computation mode for all interrupts.
                 At interrupt time, EPCR[ICM] is copied into
                 MSR[CM].
                    0 Interrupts will execute in 32-bit mode.
                    1 Interrupts will execute in 64-bit mode.
   ---------------------------------------------------------------

3.12 Hardware Threads

e6500 CPUs have multiple hardware threads, but threads may not be
exposed in a virtual machine. No assumption should be made about the
number of threads available based on mechanisms such as PIR. An e6500
vcpu may have only one thread.

The Thread Management Configuration Register 0 (TMCFG0) should be used
to determine the number of threads in a virtual CPU.

4. INSTRUCTIONS

4.1 Cache Locking Instructions

The behavior of cache locking instructions (dcbtls, dcbtstls, dcblc,
icbtls, icblc) depends on whether the virtual CPU implements category
"Embedded.Cache Locking".

When category "Embedded.Cache Locking" is implemented, cache locking
instructions behave as per the architecture.

If cache locking is not implemented, executing cache locking
instructions is effectively a nop: the operation is ignored.

4.3 System Call instruction

The sc instruction behaves as per the architecture except for the
following: in user mode (MSR[PR]=1), sc with LEV = 1 results in a
program exception with ESR[PPR] set (privileged instruction exception).

4.4 Wait for Interrupt Instruction

The 'wait' instruction stops synchronous processor activity, including
the fetching of instructions, until an asynchronous interrupt occurs.

A spurious 'wakeup' is possible: instruction fetching may resume even
though no vcpu interrupt and no loss of reservation has occurred.
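
Because wakeups may be spurious, guest code should re-check its wake condition after every 'wait' rather than assume that an event arrived. The C sketch below shows that loop shape under stated assumptions: the 'wait' instruction is hidden behind a hypothetical cpu_wait callback so the example stays portable, and the sim_* helpers are a stand-in environment for demonstration only. It ignores the interrupt-masking details a real idle loop needs in order to avoid a lost wakeup.

```c
#include <stdbool.h>

/* Hypothetical hooks: cpu_wait would wrap the 'wait' instruction on
 * real hardware, and pending reports whether there is work to do. */
typedef void (*cpu_wait_fn)(void *ctx);
typedef bool (*work_pending_fn)(void *ctx);

/* Sleeps until work is pending, tolerating spurious wakeups by
 * re-checking the condition after every return from cpu_wait.
 * Returns the number of times the CPU was put to sleep. */
static unsigned wait_for_work(work_pending_fn pending,
                              cpu_wait_fn cpu_wait, void *ctx)
{
    unsigned sleeps = 0;

    while (!pending(ctx)) {
        cpu_wait(ctx);   /* may return with no event delivered */
        sleeps++;
    }
    return sleeps;
}

/* Demonstration-only environment: each simulated wait returns
 * without delivering work until the counter reaches zero. */
struct sim_cpu {
    int waits_before_work;
};

static bool sim_pending(void *ctx)
{
    return ((struct sim_cpu *)ctx)->waits_before_work == 0;
}

static void sim_wait(void *ctx)
{
    struct sim_cpu *cpu = ctx;

    if (cpu->waits_before_work > 0)
        cpu->waits_before_work--;
}
```

The condition check lives in the caller's loop, not in the wait primitive, so the loop stays correct however many times the vcpu resumes without an event.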

4.5 Reservations

The ability to emulate an atomic operation using "load with
reservation" and "store conditional" instructions is based on the
conditional behavior of "store conditional", the reservation set by
"load with reservation", and the clearing of that reservation if the
target location is modified by another processor or mechanism before
the "store conditional" performs its store.

The following considerations should be understood regarding potential
reservation loss. With the vcpu, a reservation may be broken for the
following reasons:

   -Any of the reasons listed in the Power ISA for which a reservation
    may be lost.

   -An asynchronous interrupt in the physical CPU may cause a loss of
    a reservation, including interrupts not visible to or caused by
    guest software.

   -A reservation may be broken if software executes a privileged
    instruction or utilizes a privileged facility. Privileged
    instructions and facilities are defined by the Power ISA.

4.5 tlbilx

The tlbilx instruction is supported on e500mc, e5500, and e6500 virtual
CPU implementations even though category E.HV is not supported.

4.6 msgsnd/msgclr

The msgsnd and msgclr instructions are defined by category
"Embedded.Processor Control" and are supported if the category is
implemented on the CPU being emulated.

The vcpu does not implement category "Embedded.Hypervisor". An attempt
to use the E.HV features of msgsnd/msgclr is boundedly undefined.

5. MMU

5.1 Overview

Software running on a vcpu implementation should not make assumptions
about the configuration or geometry of the vcpu's MMU based on the PVR
register. Instead, software should determine MMU configuration from the
MMUCFG and TLBnCFG registers. The vcpu's MMU configuration may be
different from that of the CPU being emulated.

5.2 TLBnCFG NENTRY and ASSOC

The Power ISA [1] specifies how TLBnCFG[NENTRY] and TLBnCFG[ASSOC]
should be interpreted.
This definition is summarized in the table below:

   NENTRY   ASSOC    Meaning
   --------------------------------------------------------------------
   0        0        no TLB present

   0        1        TLB geometry is completely implementation-defined.
                     MAS0[ESEL] is ignored

   0        >1       TLB geometry and number of entries are
                     implementation defined, but the TLB has known
                     associativity. For tlbre and tlbwe, a set of TLB
                     entries is selected by an implementation
                     dependent function of MAS8[TGS][TLPID],
                     MAS1[TS][TID][TSIZE], and MAS2[EPN]. MAS0[ESEL]
                     is used to select among entries in this set,
                     except on tlbwe if MAS0[HES]=1.

   n > 0    n or 0   TLB is fully associative
   --------------------------------------------------------------------

5.3 IPROT=0

A TLB entry with IPROT=0 may be evicted at any time.

5.4 MMUCFG

The LPIDSIZE field (bits 36-39) can be used by software to detect
whether category E.HV is present. A value of 0 indicates that E.HV
functionality is not present.

6. EXCEPTIONS

6.1 Debug Interrupts

The vcpu does not support delayed/deferred debug interrupts:

   -If MSR[DE]=0 and a debug condition occurs in the vcpu, no bit is
    set in DBSR.

   -Writes to DBSRWR have no effect.

   -Imprecise debug events (DBSR[IDE]) and unconditional debug events
    (DBSR[UDE]) are not supported.

   -If a debug event happens with MSR[DE]=1, and the software running
    on the vcpu fails to clear DBSR before re-enabling MSR[DE], another
    debug interrupt will not occur.

7. HYPERVISOR SPECIFIC CONSIDERATIONS

7.1 Cacheable and Cache-inhibited Mappings on KVM

As part of virtual machine initialization and setup, QEMU (the virtual
machine manager) also creates mappings to guest address regions. Because
QEMU maps guest RAM as cacheable, a guest must use I=0 for RAM mappings
and I=1 for other mappings in order to avoid creating an
architecture-violating alias with QEMU's mapping.
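
The MMU discovery rules in sections 5.2 and 5.4 can be sketched in C as follows. The sketch decodes register values that have already been read. The field positions are assumptions: ASSOC in bits 32-39 and NENTRY in bits 52-63 of TLBnCFG following the Power ISA layout, and LPIDSIZE in bits 36-39 of MMUCFG as stated in section 5.4. The enum, function names, and the set-associative case (NENTRY > 0 with 0 < ASSOC < NENTRY, which the summary table does not list) are illustrative additions, not part of this specification.

```c
#include <stdint.h>

/* Assumed field positions within the 32-bit register values
 * (Power ISA bit n of a 64-bit register maps to 1 << (63 - n)). */
#define TLBNCFG_ASSOC_SHIFT    24      /* bits 32-39 */
#define TLBNCFG_ASSOC_MASK     0xffu
#define TLBNCFG_NENTRY_MASK    0xfffu  /* bits 52-63 */
#define MMUCFG_LPIDSIZE_SHIFT  24      /* bits 36-39 */
#define MMUCFG_LPIDSIZE_MASK   0xfu

enum tlb_geometry {
    TLB_ABSENT,         /* NENTRY=0, ASSOC=0 */
    TLB_IMPL_DEFINED,   /* NENTRY=0, ASSOC=1 */
    TLB_KNOWN_ASSOC,    /* NENTRY=0, ASSOC>1 */
    TLB_FULLY_ASSOC,    /* NENTRY=n>0, ASSOC=n or 0 */
    TLB_SET_ASSOC       /* not in the summary table: 0 < ASSOC < NENTRY */
};

/* Classifies a TLB array from its TLBnCFG value. */
static enum tlb_geometry classify_tlb(uint32_t tlbncfg)
{
    uint32_t nentry = tlbncfg & TLBNCFG_NENTRY_MASK;
    uint32_t assoc = (tlbncfg >> TLBNCFG_ASSOC_SHIFT) & TLBNCFG_ASSOC_MASK;

    if (nentry == 0) {
        if (assoc == 0)
            return TLB_ABSENT;
        if (assoc == 1)
            return TLB_IMPL_DEFINED;
        return TLB_KNOWN_ASSOC;
    }
    if (assoc == 0 || assoc == nentry)
        return TLB_FULLY_ASSOC;
    return TLB_SET_ASSOC;
}

/* Returns nonzero when MMUCFG[LPIDSIZE] is nonzero; a value of 0
 * means E.HV functionality is not present. */
static int mmucfg_has_ehv(uint32_t mmucfg)
{
    return ((mmucfg >> MMUCFG_LPIDSIZE_SHIFT) & MMUCFG_LPIDSIZE_MASK) != 0;
}
```

On a vcpu that follows this specification, mmucfg_has_ehv() is expected to return 0, and TLB sizing should come from classify_tlb() rather than from assumptions based on the PVR.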