Freescale Power Architecture Book E Virtual CPU Specification
-------------------------------------------------------------
Version 1.6
Copyright 2011-2013, Freescale Semiconductor, Inc.
May 28, 2013

                                CONTENTS

      1. OVERVIEW
          1.1 Introduction
          1.2 References
          1.3 Definitions
              1.3.1 Virtual CPU (vcpu) and Emulated CPU
              1.3.2 Guest Operating System
              1.3.3 Boundedly Undefined
              1.3.4 Volatile
          1.4 Revision History
      2. IMPLEMENTED CATEGORIES
      3. REGISTERS
          3.1 Version Registers
          3.2 Machine State Register (MSR)
          3.3  CPU Index and Processor ID Register (PIR)
          3.4  External PID Load Context (EPLC) and External PID Store Context
               (EPSC) Registers
          3.5  Timebase (TBL and TBU)
          3.6  L1 and L2 Cache Control Registers
          3.7  Branch Unit Control and Status Register (BUCSR)
          3.8  Core Device Control and Status Register 0 (CDCSR0)
          3.9  Debug Registers
          3.10 PID1 and PID2 Registers
          3.11 Embedded Processor Control Register (EPCR)
          3.12 Hardware Threads
      4. INSTRUCTIONS
          4.1 Cache Locking Instructions
          4.3 System Call instruction
          4.4 Wait for Interrupt Instruction
          4.5 Reservations
          4.5 tlbilx
          4.6 msgsnd/msgclr
      5. MMU
          5.1 Overview
          5.2 TLBnCFG NENTRY and ASSOC
          5.3 IPROT=0
          5.4 MMUCFG
      6. EXCEPTIONS
          6.1 Debug Interrupts
      7. HYPERVISOR SPECIFIC CONSIDERATIONS
          7.1 Cacheable and Cache-inhibited Mappings on KVM


1.  OVERVIEW

    1.1 Introduction

        Virtualization enables multiple operating systems to run on a system,
        each in their own isolated virtual machine.  Hypervisors create and
        manage virtual machines, one part of which is a virtual CPU (or vcpu).
        A vcpu emulates a physical CPU, and the behavior of instructions,
        registers, and exceptions on the vcpu is nearly identical to
        that of the physical CPU being emulated.

        This document defines a virtual implementation of a CPU based
        on Freescale's implementation of Book III E of the Power ISA.

        In this document, the vcpu architecture is defined in terms of
        differences between the vcpu and the following physical
        CPUs which an implementation may emulate:
           -e500v2
           -e500mc
           -e5500
           -e6500

        The differences between the vcpu and the CPU being emulated
        should be understood by operating system developers.

    1.2 References

        1.  Power ISA - Version 2.06 Revision B
        http://www.power.org/resources/downloads/PowerISA_V2.06B_V2_PUBLIC.pdf

        2.  EREF 2.0: A Programmer’s Reference Manual for Freescale Power
        Architecture® Processors
        http://www.freescale.com/files/32bit/doc/ref_manual/EREF_RM.pdf

        3.  PowerPC(tm) e500 Core Family Reference Manual
        http://www.freescale.com/files/32bit/doc/ref_manual/E500CORERM.pdf

        4.  e500mc Core Reference Manual, Freescale Semiconductor.
        http://www.freescale.com/files/32bit/doc/ref_manual/E500MCRM.pdf

        5.  e5500 Core Reference Manual, Freescale Semiconductor.
        Download at freescale.com with document e5500RM

        6.  ePAPR (Embedded Power Architecture Platform Requirements)
        version 1.1
        https://www.power.org/resources/downloads/Power_ePAPR_APPROVED_v1.1.pdf

    1.3 Definitions

        1.3.1 Virtual CPU (vcpu) and Emulated CPU

              A 'virtual CPU' (or vcpu) is the CPU as seen by
              software running in a virtual machine.  The vcpu
              emulates or behaves similarly to some physical CPU--
              the 'emulated CPU'.

        1.3.2 Guest Operating System

              A 'guest operating system' (or 'guest') is an operating
              system running in a virtual machine created by a hypervisor.

        1.3.3 Boundedly Undefined

              The definition of the term 'boundedly undefined' used in this
              specification is identical to the Power ISA:
          
                  The results of executing a given instruction are said to be
                  boundedly undefined if they could have been achieved by
                  executing an arbitrary finite sequence of instructions
                  (none of which yields boundedly undefined results) in the
                  state the processor was in before executing the given
                  instruction. Boundedly undefined results for a given
                  instruction may vary between implementations, and between
                  different executions on the same implementation.

        1.3.4 Volatile

              The definition of the term 'volatile' used in this
              specification is identical to the Power ISA:

                  Bits in a register or array (e.g., TLB) are considered
                  volatile if they may change even if not explicitly
                  modified by software.

    1.4 Revision History

        Version     Date      Change
        --------------------------------------------------
        1.2        10/26/2011  updated references, msgsnd/msgclr,
                               cache control registers
        1.3        1/5/2012    Added e6500 CPU definitions
        1.4        5/2/2012    Added definitions for EPCR and MMUCFG
        1.5        6/25/2012   Category table updates for categories
                               that were missing 
        1.6        5/28/2013   Added missing references to e6500,
                               updated references links, minor 
                               clarifications

2. IMPLEMENTED CATEGORIES

   Table 2-1 below identifies the categories of the Power Architecture
   and EREF implemented by vcpu implementations for e500v2, e500mc, e5500,
   and e6500.

       X indicates category is supported in vcpu
       - indicates category is not supported in vcpu
         (Note: any categories not listed are not supported in vcpu)

                                   Table 2-1
    ------------------------------------------------------------------------
                                                       Virtual CPU
    Feature/Category                    Abbr.   e500v2  e500mc  e5500  e6500
    ------------------------------------------------------------------------
    Base                                B          X       X       X      X
    Embedded                            E          X       X       X      X
    Alternate Timebase                  ATB        X       X       X      X
    Cache Specification                 CS         X       X       X      X
    Decorated Storage                   DS         -       X       X      X
    Embedded.Enhanced Debug             E.ED       -       X       X      X
    Embedded.External PID               E.PD       -       X       X      X
    Embedded.Hypervisor                 E.HV       -       -       -      -
    Embedded.Little-Endian              E.LE      [1]     [1]     [1]    [1]
    Embedded.Performance Monitor        E.PM       X       X       X      X
    Embedded.Processor Control          E.PC       -       X       X      X
    Embedded.Cache Locking              E.CL       X       X       X      X
    External Proxy                      EXP        -       X       X      X
    Floating Point                      FP         -       X       X      X
       Floating Point.Record            FP.R       -       X       X      X
    Memory Coherence                    MMC        X       X       X      X
    Signal Processing Engine            SP         X       -       -      -
        Embedded Float Scalar Double    SP.FD      X       -       -      -
        Embedded Float Scalar Single    SP.FS      X       -       -      -
        Embedded Float Vector           SP.FV      X       -       -      -
    Store Conditional Page Mobility     SCPM       -       X       X      X
    Wait                                WT         -       X       X      X
    64-bit                              64         -       -       X      X
    Embedded.Page Table                 E.PT       -       -       -      X
    Embedded.Hypervisor.LRAT            E.HV.LRAT  -       -       -      -
    Embedded Multi-Threading            E.EM       -       -       -      -
    Vector (AltiVec)                    V          -       -       -      X
    Enhanced Reservations               ER         -       -       -      X
     (Load and Reserve and Store Cond.)
    Data Cache Extended Operations      DEO        -       X       X      X
    Cache Stashing                      CS         -       X       X      X
    ------------------------------------------------------------------------
    [1] Little-Endian mappings are supported for data but not instructions.

    The ePAPR 1.1 specification (see Section 1.2, References) defines
    "power-isa-*" properties on CPU nodes that specify which Power
    Architecture categories are implemented.

        Property: power-isa-*
        Usage: optional
        Value: <empty>
        Description:
           If the power-isa-version property exists, then for each
           category from the Categories section of Book I of the Power
           ISA version indicated, the existence of a property named
           power-isa-[CAT], where [CAT] is the abbreviated category
           name with all uppercase letters converted to lowercase,
           indicates that the category is supported by the implementation.

           For example, if the power-isa-version property exists and
           its value is "2.06" and the power-isa-e.hv property exists,
           then the implementation supports [Category:Embedded.Hypervisor]
           as defined in Power ISA Version 2.06.

    A hypervisor should advertise implemented CPU categories on 
    CPU nodes.

    An operating system should examine these properties
    to determine categories implemented by a virtual CPU.
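
    As an illustration of the naming rule above, the following C sketch
    builds the property name for a category abbreviation.  The helper
    name power_isa_prop_name is hypothetical and is not defined by ePAPR;
    a guest would pass the result to whatever device tree lookup routine
    it already uses.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Build the device tree property name for a Power ISA category
 * abbreviation, e.g. "E.HV" -> "power-isa-e.hv".
 * Returns 0 on success, -1 if the output buffer is too small.
 * (Hypothetical helper for illustration only.)
 */
int power_isa_prop_name(const char *abbrev, char *out, size_t outlen)
{
    static const char prefix[] = "power-isa-";
    const size_t plen = sizeof(prefix) - 1;
    size_t i;

    assert(abbrev != NULL && out != NULL);
    if (outlen < plen + strlen(abbrev) + 1)
        return -1;

    memcpy(out, prefix, plen);
    /* Uppercase letters become lowercase; other characters are kept. */
    for (i = 0; abbrev[i] != '\0'; i++)
        out[plen + i] = (char)tolower((unsigned char)abbrev[i]);
    out[plen + i] = '\0';
    return 0;
}
```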

3.  REGISTERS

    This section describes differences between registers in a virtual CPU 
    compared to the CPU being emulated.

    3.1 Version Registers

        The Processor Version Register (PVR) and System Version Register (SVR)
        return the values of the CPU being emulated.

        Note: a guest should take care regarding what assumptions are made
        based on PVR as there are differences between the virtual
        CPU and the CPU being emulated as described in this specification.

    3.2 Machine State Register (MSR)

        The machine state register (MSR) in the vcpu differs from that of
        the emulated CPU in the MSR fields described in the table below.

             Bits    Name              Description
            ---------------------------------------------------------------
              35      GS       Guest state. MSR[GS] is read-only and is
                               always '1'. (See Note1)

              37     UCLE      User-mode cache lock enable.  Is writeable
                               and behaves as per the architecture if the
                               vcpu implements category "Embedded.Cache Locking".
                               Otherwise is '0' and is read-only.

              54      DE       Debug interrupt enable.  Is writeable and
                               behaves as per the architecture if
                               DBCR0[EDM]=0.  If DBCR0[EDM]=1, then MSR[DE]=0
                               and is read-only. (See Note2)

              58      IS       On e500v2 'IS' must equal 'DS'
              59      DS       On e500v2 'DS' must equal 'IS'

            ---------------------------------------------------------------

         Notes
         -----

             Note1 - The MSR[GS] bit is defined only when the CPU being
                     emulated implements Category: Embedded.Hypervisor.

             Note2 - If the vcpu implements category "Embedded.Enhanced Debug",
                     when MSR[DE]=1, the registers SPRG9, DSRR0, and DSRR1
                     are volatile.
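
        The bit numbers in the table above can be turned into masks as
        in the following C sketch.  It assumes the Power ISA convention
        in which bit 0 is the most significant bit of a 64-bit register;
        the macro and function names are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit n, counting bit 0 as the MSB of a 64-bit register (assumption). */
#define PPC_BIT64(n) (UINT64_C(1) << (63 - (n)))

#define MSR_GS   PPC_BIT64(35)   /* guest state, read-only 1 on the vcpu */
#define MSR_UCLE PPC_BIT64(37)   /* user-mode cache lock enable */
#define MSR_DE   PPC_BIT64(54)   /* debug interrupt enable */
#define MSR_IS   PPC_BIT64(58)   /* instruction address space */
#define MSR_DS   PPC_BIT64(59)   /* data address space */

/* e500v2 vcpu rule from the table: MSR[IS] must equal MSR[DS]. */
bool msr_valid_for_e500v2(uint64_t msr)
{
    return ((msr & MSR_IS) != 0) == ((msr & MSR_DS) != 0);
}

/* Select address space 0 or 1 for both instruction and data accesses. */
uint64_t msr_set_address_space(uint64_t msr, bool as1)
{
    msr &= ~(MSR_IS | MSR_DS);
    if (as1)
        msr |= MSR_IS | MSR_DS;
    assert(msr_valid_for_e500v2(msr));
    return msr;
}
```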

    3.3  CPU Index and Processor ID Register (PIR)

         The Processor ID Register is read-only.

         At virtual machine initialization, each vcpu in the virtual machine is
         assigned a unique index (within the partition) that can be used to
         distinguish the CPU from other CPUs in the partition.

         This CPU index value can be read by using the mfspr instruction to
         read the processor ID register (PIR).
 
         The CPU index is used in several instances:

            -The index enables software to detect whether a CPU is the boot
             CPU in an SMP configuration. The CPU index of the boot CPU is
             set by software in the device tree header (see the ePAPR
             specification listed in Section 1.2).

            -If the vcpu implements Category: Embedded.Processor Control, the
             index is used as a parameter to the msgsnd and msgclr instructions
             to specify the targeted CPU for intra-partition signaling.

            -Interrupt source configuration in the VMPIC interrupt controller
             allows specifying the index of the CPU that is configured to
             receive the interrupt.

          Each CPU node is described in the device tree. The reg property
          for each CPU node has a value that matches the CPU index.
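
         A minimal sketch of the boot CPU check, assuming the flattened
         device tree header layout in which boot_cpuid_phys is the
         big-endian 32-bit word at byte offset 28.  The PIR value is
         passed in as a parameter (a guest would obtain it with mfspr),
         and the function name is_boot_cpu is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FDT_MAGIC               UINT32_C(0xd00dfeed)
#define FDT_OFF_BOOT_CPUID_PHYS 28  /* byte offset of boot_cpuid_phys */
#define FDT_HEADER_MIN_SIZE     32  /* enough to cover that field */

/* Read a big-endian 32-bit value independent of host endianness. */
static uint32_t be32_at(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/*
 * Return true if the CPU index read from PIR matches the boot CPU
 * recorded in the device tree header.
 */
bool is_boot_cpu(const uint8_t *fdt, size_t len, uint32_t pir)
{
    assert(fdt != NULL);
    if (len < FDT_HEADER_MIN_SIZE || be32_at(fdt) != FDT_MAGIC)
        return false;
    return be32_at(fdt + FDT_OFF_BOOT_CPUID_PHYS) == pir;
}
```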

    3.4  External PID Load Context (EPLC) and External PID Store Context
         (EPSC) Registers

         A virtual CPU may implement [Category: Embedded.External PID]
         of the Power ISA.  EPLC and EPSC specify the context for external
         PID loads and stores as defined by the Power ISA.

         The EGS and ELPID fields in EPLC and EPSC specify the hypervisor
         context and are not accessible by supervisor level software on
         the vcpu.  Values written to the EGS and ELPID fields are
         ignored.

    3.5  Timebase (TBL and TBU)

         The TBU and TBL are read-only.

    3.6  L1 and L2 Cache Control Registers

        The behavior of the L1 and L2 cache control registers is dependent
        on whether the virtual CPU implements category "Embedded.Cache Locking".

        All L1 and L2 cache control registers and L2 error registers
        can be read regardless of whether category "Embedded.Cache Locking"
        is implemented.

        When category "Embedded.Cache Locking" is _not_ implemented:
           -The L1CSR0[CUL] and L1CSR1[ICUL] fields can be written.  For all
            other fields writes have no effect.

        When category "Embedded.Cache Locking" is implemented:
           -Writes to the flash lock clearing bits are supported--
            L1CSR0[CLFR], L1CSR1[ICLFR], L2CSR0[L2FLC]
           -Writes to the L1CSR0 sticky status bits are supported--
            L1CSR0[CUL], L1CSR0[CSLC], L1CSR0[CLO]
           -Writes to the L1CSR1 sticky status bits are supported--
            L1CSR1[ICUL], L1CSR1[ICSLC], L1CSR1[ICLO]
           -Writes to the L2CSR0 sticky status bits are supported--
            L2CSR0[L2LO]
           -Support for L1CSR0[DCBZ32] is implementation defined
           -For all other fields writes have no effect.

    3.7  Branch Unit Control and Status Register (BUCSR)

         If the cpu being emulated implements BUCSR, the BUCSR fields are
         identical to those of the cpu being emulated. The BUCSR can be read,
         but is not writeable on the vcpu. Writes are NOPs and do not affect
         architectural state.
      
    3.8  Core Device Control and Status Register 0 (CDCSR0)

         If the emulated CPU implements CDCSR0, the CDCSR0 fields are
         identical to those of the CPU being emulated. CDCSR0 can be read but
         is not writeable on the vcpu. Writes are a NOP and result in no
         architectural state changes.

    3.9  Debug Registers

         The DBCR0 register in the vcpu is always readable.

         If DBCR0[EDM]=1, then the implementation has not granted debug
         resources to the vcpu.  In this case all accesses to debug
         registers (except reading DBCR0) are boundedly undefined.

         If DBCR0[EDM]=0, then the debug registers implemented by
         the CPU being emulated are supported except as noted below.

         Writes to the debug registers and fields in Table 2-7 are not
         supported and are ignored.  Reads return 0x0.

                               Table 2-7
               -----------------------------------------------
               Debug        Register fields
               Register     not supported
               -----------------------------------------------
                DBCR0       IDM - internal debug mode
                            FT - freeze timers
                            IRPT - interrupt taken
                            RET - return debug event
                            RST - reset

                DBSRWR      all fields

                DDAM        all fields

                DEVENT      all fields
               -----------------------------------------------

        When MSR[DE]=1, the registers SPRG9, DSRR0, and DSRR1 are
        volatile.
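
         The DBCR0[EDM] check described in this section might be coded
         as in the sketch below.  It assumes that EDM is the most
         significant bit of the 32-bit DBCR0 value and that MSR[DE] is
         bit 54 of the MSR; register values are passed in as parameters
         and the function names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DBCR0_EDM UINT32_C(0x80000000) /* assumed: EDM is the MSB of DBCR0 */
#define MSR_DE    UINT64_C(0x200)      /* MSR bit 54, counting bit 0 as MSB */

/* True if the implementation has granted debug resources to the vcpu. */
bool vcpu_debug_granted(uint32_t dbcr0)
{
    return (dbcr0 & DBCR0_EDM) == 0;
}

/*
 * Compute the MSR value a guest debugger would request.  MSR[DE] is
 * requested only when debug resources are granted; otherwise it is left
 * clear, matching its read-only 0 behavior when DBCR0[EDM]=1.
 */
uint64_t msr_request_debug(uint64_t msr, uint32_t dbcr0)
{
    uint64_t out = vcpu_debug_granted(dbcr0) ? (msr | MSR_DE)
                                             : (msr & ~MSR_DE);

    assert(vcpu_debug_granted(dbcr0) || (out & MSR_DE) == 0);
    return out;
}
```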

    3.10 PID1 and PID2 Registers

         The e500v2 vcpu does not implement the PID1 and PID2 SPRs.

    3.11 Embedded Processor Control Register (EPCR)

         EPCR is implemented if category 64-bit is supported.  All bits
         in EPCR are reserved except for ICM.

             Bits    Name              Description
            ---------------------------------------------------------------
              38      ICM      Controls the computation mode for all interrupts.
                               At interrupt time, EPCR[ICM] is copied into
                               MSR[CM].

                                   0 Interrupts will execute in 32-bit mode.
                                   1 Interrupts will execute in 64-bit mode.
            ---------------------------------------------------------------
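
         A sketch of selecting the interrupt computation mode, assuming
         EPCR is a 32-bit register whose bits are numbered 32 to 63, so
         that bit 38 corresponds to mask 0x02000000.  The value would be
         read and written with mfspr/mtspr on a real vcpu; here it is a
         plain parameter, and the function names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* EPCR bit 38 in 32..63 numbering: 1 << (63 - 38) (assumption). */
#define EPCR_ICM UINT32_C(0x02000000)

/* Read-modify-write that touches only ICM; other bits are reserved. */
uint32_t epcr_set_interrupt_mode(uint32_t epcr, bool use_64bit)
{
    uint32_t out = use_64bit ? (epcr | EPCR_ICM) : (epcr & ~EPCR_ICM);

    assert((out & ~EPCR_ICM) == (epcr & ~EPCR_ICM));
    return out;
}

/* True if interrupts will execute in 64-bit mode. */
bool epcr_interrupts_64bit(uint32_t epcr)
{
    return (epcr & EPCR_ICM) != 0;
}
```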

    3.12 Hardware Threads

        e6500 CPUs have multiple hardware threads, but threads may
        not be exposed in a virtual machine.  No assumption should
        be made about the number of threads available based on mechanisms
        such as PIR.  An e6500 vcpu may have only one thread.

        The Thread Management Configuration Register 0 (TMCFG0) should
        be used to determine the number of threads in a virtual CPU.
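
        A sketch of extracting the thread count from a TMCFG0 value.
        The field position used here (NTHRD in the low-order six bits)
        is an assumption to be checked against the e6500 core reference
        manual; the register value is passed in as a parameter and the
        function name is illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed position of TMCFG0[NTHRD]: low-order six bits. */
#define TMCFG0_NTHRD_MASK UINT32_C(0x0000003f)

/* Number of threads the vcpu reports through TMCFG0. */
unsigned vcpu_thread_count(uint32_t tmcfg0)
{
    unsigned n = (unsigned)(tmcfg0 & TMCFG0_NTHRD_MASK);

    assert(n <= 63);
    return n;
}
```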

4.  INSTRUCTIONS

    4.1 Cache Locking Instructions

        The behavior of cache locking instructions (dcbtls, dcbtstls,
        dcblc, icbtls, icblc) is dependent on whether the virtual CPU
        implements category "Embedded.Cache Locking".  When category
        "Embedded.Cache Locking" is implemented, cache locking instructions
        behave as per the architecture.  If cache locking is not implemented,
        executing a cache locking instruction is effectively a nop-- the
        operation is ignored.

    4.3 System Call instruction

        The sc instruction behaves as per the architecture except for the
        following: in user mode (MSR[PR]=1), sc with LEV=1 results in a
        program exception with ESR[PPR] set (privileged instruction exception).

    4.4 Wait for Interrupt Instruction

        The 'wait' instruction stops synchronous processor activity,
        including the fetching of instructions, until an asynchronous
        interrupt occurs.  A spurious 'wakeup' is possible, in which
        instruction fetching resumes even though neither a vcpu
        interrupt nor a loss of reservation has occurred.
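
        Because a wakeup may be spurious, guest code would normally
        re-check the condition it is waiting for.  The sketch below
        shows that pattern in portable C.  The wait_fn hook stands in
        for executing the 'wait' instruction, and struct fake_vcpu with
        fake_wait is a test double; all of these names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hook that would execute 'wait' on a real vcpu (hypothetical). */
typedef void (*wait_fn)(void *ctx);

/*
 * Block until *event becomes true.  The loop re-tests the flag after
 * every wakeup, so a spurious wakeup only costs one extra iteration.
 * Returns the number of wakeups observed.
 */
unsigned wait_for_event(const volatile bool *event, wait_fn do_wait, void *ctx)
{
    unsigned wakeups = 0;

    assert(event != NULL && do_wait != NULL);
    while (!*event) {
        do_wait(ctx);
        wakeups++;
    }
    return wakeups;
}

/* Test double: returns spuriously a few times, then raises the event. */
struct fake_vcpu {
    volatile bool event;
    unsigned spurious_left;
};

void fake_wait(void *ctx)
{
    struct fake_vcpu *v = ctx;

    if (v->spurious_left > 0)
        v->spurious_left--;     /* woke up, but nothing happened */
    else
        v->event = true;        /* the awaited interrupt arrived */
}
```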

    4.5 Reservations

        The ability to emulate an atomic operation using "load with
        reservation" instructions and "store conditional" instructions
        is based on the conditional behavior of "store conditional",
        the reservation set by "load with reservation", and the clearing
        of that reservation if the target location is modified by another
        processor or mechanism before the "store conditional" performs
        its store.

        The following considerations should be understood regarding potential
        reservation loss. With the vcpu, a reservation may be broken for
        the following reasons:

           -Any of the reasons for which the Power ISA states that a
            reservation may be lost

           -An asynchronous interrupt in the physical CPU may cause a loss
            of a reservation, including interrupts not visible to or caused
            by guest software.

           -A reservation may be broken if software executes a privileged
            instruction or utilizes a privileged facility. Privileged
            instructions and facilities are defined by the Power ISA.
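
        Since a reservation can be lost for reasons the guest cannot
        observe, atomic sequences should always be written as retry
        loops.  The sketch below uses C11 atomics rather than inline
        assembly; on Power targets compilers typically implement the
        weak compare-exchange with a load-with-reservation and
        store-conditional pair, which is an assumption about the
        toolchain.  The function name is illustrative.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Atomically add delta to *p and return the new value.  The weak
 * compare-exchange is allowed to fail even when *p still holds the
 * expected value (for example after a lost reservation), so it is
 * simply retried.  On failure 'old' is refreshed with the current value.
 */
uint32_t atomic_add_retry(_Atomic uint32_t *p, uint32_t delta)
{
    uint32_t old;

    assert(p != NULL);
    old = atomic_load_explicit(p, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(
               p, &old, old + delta,
               memory_order_acq_rel, memory_order_relaxed)) {
        /* retry with the refreshed value of 'old' */
    }
    return old + delta;
}
```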

    4.5 tlbilx

        The tlbilx instruction is supported on e500mc, e5500, and e6500
        virtual CPU implementations even though category E.HV is not
        supported.

    4.6 msgsnd/msgclr

        The msgsnd and msgclr instructions are defined by category
        "Embedded.Processor Control" and are supported if the category 
        is implemented on the cpu being emulated.

        The vcpu does not implement category "Embedded.Hypervisor".
        An attempt to use the E.HV features of msgsnd/msgclr is
        boundedly undefined.

5.  MMU

    5.1 Overview

        Software running on a vcpu implementation should not make
        assumptions about the configuration or geometry of the vcpu's
        MMU based on the PIR register.  Instead, software should determine
        MMU configuration from the MMUCFG and TLBnCFG registers.  The vcpu's
        MMU configuration may be different from the CPU being emulated.

    5.2 TLBnCFG NENTRY and ASSOC

        The Power ISA [1] specifies how TLBnCFG[NENTRY] and TLBnCFG[ASSOC]
        should be interpreted.  This definition is summarized in
        the table below:

          NENTRY  ASSOC      Meaning
          --------------------------------------------------------------------
           0       0         no TLB present

           0       1         TLB geometry is completely implementation-defined.
                             MAS0[ESEL] is ignored
 
           0       >1        TLB geometry and number of entries is
                             implementation defined, but has known
                             associativity.
                             For tlbre and tlbwe, a set of TLB entries is
                             selected by an implementation dependent function
                             of MAS8[TGS][TLPID], MAS1[TS][TID][TSIZE], and
                             MAS2[EPN]. MAS0[ESEL] is used to select among
                             entries in this set, except on tlbwe if
                             MAS0[HES]=1.

           n > 0   n or 0    TLB is fully associative
          --------------------------------------------------------------------
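
        The table can be expressed as a small decoder, sketched below.
        It assumes ASSOC occupies the high-order 8 bits and NENTRY the
        low-order 12 bits of the 32-bit TLBnCFG value.  The final
        set-associative case is not listed in the table and is included
        as an assumption; the enum and function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define TLBNCFG_ASSOC_SHIFT 24
#define TLBNCFG_ASSOC_MASK  UINT32_C(0xff)   /* assumed field width */
#define TLBNCFG_NENTRY_MASK UINT32_C(0xfff)  /* assumed field width */

enum tlb_geometry {
    TLB_ABSENT,        /* NENTRY=0, ASSOC=0 */
    TLB_IMPL_DEFINED,  /* NENTRY=0, ASSOC=1: MAS0[ESEL] is ignored */
    TLB_KNOWN_ASSOC,   /* NENTRY=0, ASSOC>1: ESEL selects within a set */
    TLB_FULLY_ASSOC,   /* NENTRY=n>0, ASSOC=n or 0 */
    TLB_SET_ASSOC      /* remaining case, not in the table (assumption) */
};

enum tlb_geometry tlb_classify(uint32_t tlbncfg)
{
    uint32_t assoc  = (tlbncfg >> TLBNCFG_ASSOC_SHIFT) & TLBNCFG_ASSOC_MASK;
    uint32_t nentry = tlbncfg & TLBNCFG_NENTRY_MASK;

    assert(assoc <= 0xff && nentry <= 0xfff);
    if (nentry == 0) {
        if (assoc == 0)
            return TLB_ABSENT;
        if (assoc == 1)
            return TLB_IMPL_DEFINED;
        return TLB_KNOWN_ASSOC;
    }
    if (assoc == 0 || assoc == nentry)
        return TLB_FULLY_ASSOC;
    return TLB_SET_ASSOC;
}
```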

    5.3 IPROT=0

        A TLB entry with IPROT=0 may be evicted at any time.

    5.4 MMUCFG

        The LPIDSIZE field (bits 36-39) can be used by software to
        detect whether category E.HV is present.  A value of 0 indicates
        that E.HV functionality is not present.
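
        A sketch of the LPIDSIZE test, assuming MMUCFG is a 32-bit
        register whose bits are numbered 32 to 63, so that bits 36-39
        are extracted with a right shift of 24.  The register value is
        passed in as a parameter and the function names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits 36-39 in 32..63 numbering: shift = 63 - 39 (assumption). */
#define MMUCFG_LPIDSIZE_SHIFT 24
#define MMUCFG_LPIDSIZE_MASK  UINT32_C(0xf)

unsigned mmucfg_lpidsize(uint32_t mmucfg)
{
    unsigned size = (unsigned)((mmucfg >> MMUCFG_LPIDSIZE_SHIFT) &
                               MMUCFG_LPIDSIZE_MASK);

    assert(size <= 15);
    return size;
}

/* LPIDSIZE == 0 indicates that E.HV functionality is not present. */
bool mmucfg_has_ehv(uint32_t mmucfg)
{
    return mmucfg_lpidsize(mmucfg) != 0;
}
```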

6.  EXCEPTIONS

    6.1 Debug Interrupts

        The vcpu does not support delayed/deferred debug interrupts:

            -If MSR[DE]=0 and a debug condition occurs in the
             vcpu, no bit is set in DBSR.

            -Writes to DBSRWR have no effect.

            -Imprecise debug events (DBSR[IDE]) and unconditional debug
             events (DBSR[UDE]) are not supported.

            -If a debug event happens with MSR[DE] = 1, and the software
             running on the vcpu fails to clear DBSR before re-enabling
             MSR[DE], another debug interrupt will not occur.

7.  HYPERVISOR SPECIFIC CONSIDERATIONS

    7.1 Cacheable and Cache-inhibited Mappings on KVM

        As part of virtual machine initialization and setup, QEMU (the virtual
        machine manager) also creates mappings to guest address regions.
        A guest must have I=0 (cacheable) for RAM mappings and I=1
        (cache-inhibited) for other mappings in order to avoid creating an
        architecture-violating alias with QEMU's mapping.