Popek and Goldberg virtualization requirements
The Popek and Goldberg virtualization requirements are a set of conditions sufficient for a computer architecture to support system virtualization efficiently. They were introduced by Gerald J. Popek and Robert P. Goldberg in their 1974 article "Formal Requirements for Virtualizable Third Generation Architectures".[1] Even though the requirements are derived under simplifying assumptions, they still represent a convenient way of determining whether a computer architecture supports efficient virtualization and provide guidelines for the design of virtualized computer architectures.
VMM definition
System virtual machines are capable of virtualizing a full set of hardware resources, including a processor (or processors), memory and storage resources and peripheral devices. A virtual machine monitor (VMM, also called hypervisor) is the piece of software that provides the abstraction of a virtual machine. There are three properties of interest when analyzing the environment created by a VMM:[2]
- Equivalence / Fidelity: A program running under the VMM should exhibit a behavior essentially identical to that demonstrated when running on an equivalent machine directly.
- Resource control / Safety: The VMM must be in complete control of the virtualized resources.
- Efficiency / Performance: A statistically dominant fraction of machine instructions must be executed without VMM intervention.
In the terminology of Popek and Goldberg, a VMM must present all three properties. In the terminology used in the reference book of Smith and Nair (2005), VMMs are typically assumed to satisfy the equivalence and resource control properties, and those additionally meeting the performance property are called efficient VMMs.[3]
Popek and Goldberg describe the characteristics that the instruction set architecture (ISA) of the physical machine must possess in order to run VMMs which possess the above properties. Their analysis derives such characteristics using a model of "third generation architectures" (e.g., IBM 360, Honeywell 6000, DEC PDP-10) that is nevertheless general enough to be extended to modern machines. This model includes a processor that operates in either system or user mode, and has access to linear, uniformly addressable memory. It is assumed that a subset of the instruction set is available only when in system mode and that memory is addressed relative to a relocation register. I/O and interrupts are not modelled.
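The machine model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class and field names are hypothetical, and the relocation register is assumed to hold a base and a bound, so that an address outside the bound causes a memory trap.

```python
from dataclasses import dataclass


class MemoryTrap(Exception):
    """Raised when a program-relative address falls outside the bound."""


@dataclass
class Machine:
    memory: list          # linear, uniformly addressable storage
    mode: str = "user"    # processor mode: "system" or "user"
    pc: int = 0           # program counter
    base: int = 0         # relocation register, base part (assumed layout)
    bound: int = 0        # relocation register, bound part (assumed layout)

    def translate(self, addr: int) -> int:
        """Map a program-relative address to a physical address."""
        if not 0 <= addr < self.bound:
            raise MemoryTrap(addr)
        return self.base + addr

    def load(self, addr: int) -> int:
        """Read one memory word through the relocation register."""
        return self.memory[self.translate(addr)]


# A program confined to 8 words starting at physical address 16.
m = Machine(memory=list(range(100, 164)), base=16, bound=8)
value = m.load(3)  # reads physical address 19
```

Because every memory access goes through `translate`, whoever controls the relocation register controls which part of memory a program can see, which is why instructions that change it are treated as sensitive.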
Virtualization theorems
To derive their virtualization theorems, which give sufficient (but not necessary) conditions for virtualization, Popek and Goldberg classify the instructions of an ISA into three groups:
- Privileged instructions: Those that trap if the processor is in user mode and do not trap if it is in system mode (supervisor mode).
- Control sensitive instructions: Those that attempt to change the configuration of resources in the system.
- Behavior sensitive instructions: Those whose behavior or result depends on the configuration of resources (the content of the relocation register or the processor's mode).
The main result of Popek and Goldberg's analysis can then be expressed as follows.
Theorem 1. For any conventional third-generation computer, an effective VMM may be constructed if the set of sensitive instructions for that computer is a subset of the set of privileged instructions.
Intuitively, the theorem states that to build a VMM it is sufficient that all instructions that could affect the correct functioning of the VMM (sensitive instructions) always trap and pass control to the VMM. This guarantees the resource control property. Non-privileged instructions must instead be executed natively (i.e., efficiently). The holding of the equivalence property also follows.
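The condition of Theorem 1 is a plain subset test, which the following sketch makes explicit. The toy ISA and its opcode names are hypothetical and serve only to illustrate the check.

```python
def satisfies_theorem_1(sensitive: set, privileged: set) -> bool:
    """True if every sensitive instruction is also privileged."""
    return sensitive <= privileged


def critical_instructions(sensitive: set, privileged: set) -> set:
    """Sensitive instructions that do not trap in user mode."""
    return sensitive - privileged


# Hypothetical toy ISA, not taken from any real architecture.
privileged = {"SET_RELOC", "SET_MODE", "HALT"}
control_sensitive = {"SET_RELOC", "SET_MODE"}
behavior_sensitive = {"READ_MODE"}
sensitive = control_sensitive | behavior_sensitive

ok = satisfies_theorem_1(sensitive, privileged)
offenders = critical_instructions(sensitive, privileged)
```

In this toy ISA the check fails because `READ_MODE` reveals the processor mode without trapping; making it privileged would satisfy the condition.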
This theorem also provides a simple technique for implementing a VMM, called trap-and-emulate virtualization, more recently called classic virtualization: because every sensitive instruction traps when executed in user mode, all the VMM has to do is trap and emulate each one of them.[4][5]
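A minimal sketch of trap-and-emulate follows. It assumes a hypothetical three-opcode ISA in which `SET_RELOC` and `GET_MODE` are privileged; the guest always runs in real user mode, and the VMM keeps the guest's virtual mode and relocation register in its own data structures.

```python
class Trap(Exception):
    """Signals that a privileged instruction was attempted in user mode."""

    def __init__(self, op, arg):
        super().__init__(op)
        self.op, self.arg = op, arg


PRIVILEGED = {"SET_RELOC", "GET_MODE"}  # hypothetical opcodes


def cpu_execute(instr, regs):
    """Model of the hardware running guest code in user mode."""
    op, arg = instr
    if op in PRIVILEGED:
        raise Trap(op, arg)       # control passes to the VMM
    if op == "ADD":
        regs["acc"] += arg        # innocuous instruction, runs natively
    else:
        raise ValueError(f"unknown opcode {op}")


class VMM:
    def __init__(self):
        self.vmode = "system"     # the mode the guest believes it is in
        self.vreloc = 0           # the guest's virtual relocation register
        self.traps = 0

    def emulate(self, trap, regs):
        """Apply the effect of a trapped instruction to the virtual state."""
        self.traps += 1
        if trap.op == "SET_RELOC":
            self.vreloc = trap.arg
        elif trap.op == "GET_MODE":
            regs["r1"] = 1 if self.vmode == "system" else 0

    def run(self, program):
        regs = {"acc": 0, "r1": 0}
        for instr in program:
            try:
                cpu_execute(instr, regs)
            except Trap as trap:
                self.emulate(trap, regs)
        return regs


vmm = VMM()
final = vmm.run([("ADD", 2), ("SET_RELOC", 64), ("ADD", 3), ("GET_MODE", None)])
```

The two `ADD` instructions never reach the VMM, while the two privileged instructions are each emulated against the virtual state, so the guest observes the system mode it expects.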
A related problem is that of deriving sufficient conditions for recursive virtualization, that is, the conditions under which a VMM that can run on a copy of itself can be built. Popek and Goldberg present the following (sufficient) conditions.
Theorem 2. A conventional third-generation computer is recursively virtualizable if:
- it is virtualizable and
- a VMM without any timing dependencies can be constructed for it.
Some architectures, like the non-hardware-assisted x86, do not meet these conditions, so they cannot be virtualized in the classic way. Such architectures can still be fully virtualized (in the x86 case, at the CPU and MMU level) by other techniques such as binary translation, which replaces the sensitive instructions that do not generate traps,[4] sometimes called critical instructions. In theory this additional processing makes the VMM less efficient,[5] but hardware traps also have a non-negligible performance cost. A well-tuned caching binary translation system may achieve comparable performance, as x86 binary translation did relative to first-generation x86 hardware assist, which merely made sensitive instructions trappable.[6] Effectively, this gives a theorem with different sufficiency conditions.
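The idea of a caching binary translator can be sketched as follows. This is a deliberately simplified illustration, not the design of any particular product: the opcode names, the `CALL_VMM` marker and the block-address keys are all hypothetical.

```python
CRITICAL = {"READ_MODE"}  # hypothetical sensitive but non-trapping opcode


class Translator:
    def __init__(self):
        self.cache = {}          # guest block address -> translated block
        self.translations = 0    # how many blocks were actually translated

    def translate(self, block):
        """Rewrite critical opcodes into explicit calls into the VMM."""
        self.translations += 1
        return [
            ("CALL_VMM", op) if op in CRITICAL else ("NATIVE", op)
            for op in block
        ]

    def fetch(self, addr, guest_code):
        """Return the translated block, translating only on a cache miss."""
        if addr not in self.cache:
            self.cache[addr] = self.translate(guest_code[addr])
        return self.cache[addr]


guest_code = {0x100: ["ADD", "READ_MODE", "JMP"]}
translator = Translator()
first = translator.fetch(0x100, guest_code)
second = translator.fetch(0x100, guest_code)  # served from the cache
```

The translation cost is paid once per block; later executions reuse the cached result, which is what allows such a system to approach the performance of trapping hardware.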
Handling critical instructions
The conditions for ISA virtualization expressed in Theorem 1 may be relaxed at the expense of the efficiency property. VMMs for ISAs that are non-virtualizable in the Popek and Goldberg sense have routinely been built.
The virtualization of such architectures requires correct handling of critical instructions, i.e., sensitive but unprivileged instructions. One approach, known as patching, adopts techniques commonly used in dynamic recompilation: critical instructions are discovered at run-time and replaced with a trap into the VMM. Various mechanisms, such as the caching of emulation code or hardware assists, have been proposed to make the patching process more efficient. A different approach is that of paravirtualization, which requires guest operating systems to be modified (ported) before running in the virtual environment.
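Patching can be illustrated with a short sketch. It assumes a hypothetical guest code page represented as a list of opcode names and a hypothetical `TRAP` opcode; the original opcodes are saved so that the VMM can emulate them when the trap fires.

```python
def patch_critical(code, critical):
    """Replace critical opcodes in place with TRAP and return the originals."""
    saved = {}
    for index, op in enumerate(code):
        if op in critical:
            saved[index] = op      # remembered for later emulation
            code[index] = "TRAP"   # now guaranteed to enter the VMM
    return saved


# Hypothetical guest code page and critical opcodes.
code = ["ADD", "READ_MODE", "ADD", "SET_FLAGS"]
saved = patch_critical(code, {"READ_MODE", "SET_FLAGS"})
```

After patching, the page contains only instructions that either run natively or trap, so the rest of the VMM can proceed as in the trap-and-emulate case.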
Instruction sets of common architectures
This section presents some relevant architectures and how they relate to the virtualization requirements.
PDP-10
The PDP-10 architecture has a few instructions which are sensitive (alter or query the processor's mode) but not privileged.[7] These instructions save or restore the condition codes containing USER or IOT bits:
- JSR: jump to subroutine
- JSP: jump and save program counter
- PUSHJ: push down and jump
- JRST: jump and restore
System/370
All sensitive instructions in the System/370 are privileged: it satisfies the virtualization requirements.[8]
Motorola MC68000
The Motorola MC68000 has a single unprivileged sensitive instruction:
- MOVE from SR
This instruction is sensitive because it allows access to the entire status register, which includes not only the condition codes but also the user/supervisor bit, interrupt level, and trace control. In most later family members, starting with the MC68010, the MOVE from SR instruction was made privileged, and a new MOVE from CCR instruction was provided to allow access to the condition code register only.[9][10]
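The problem can be made concrete with a small model of the status register. The bit positions follow the M68000 layout (supervisor bit at bit 13, condition codes in the low five bits); the function names and the guest scenario are hypothetical and only sketch the effect.

```python
S_BIT = 1 << 13     # user/supervisor bit of the status register
CCR_MASK = 0x001F   # condition codes X, N, Z, V, C


def move_from_sr(sr):
    """MC68000 behavior: the whole status register is readable in user mode."""
    return sr & 0xFFFF


def move_from_ccr(sr):
    """MC68010 and later: unprivileged code sees the condition codes only."""
    return sr & CCR_MASK


def emulated_move_from_sr(real_sr, virtual_supervisor):
    """What a VMM could return once MOVE from SR traps (illustrative)."""
    value = real_sr & CCR_MASK
    if virtual_supervisor:
        value |= S_BIT
    return value


# A guest OS believes it runs in supervisor mode, but the VMM has
# deprivileged it: the real SR has the S bit clear and the Z flag set.
real_sr = 0x0004
leaked = move_from_sr(real_sr)
fixed = emulated_move_from_sr(real_sr, virtual_supervisor=True)
```

On the MC68000 the guest reads `leaked`, sees a clear supervisor bit and can tell that it is not running on the bare machine, which violates the equivalence property. Once the instruction traps, the VMM can supply `fixed` instead.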
IA-32 (x86)
The IA-32 instruction set of the Pentium processor contains 18 sensitive, unprivileged instructions.[11] They can be categorized in two groups:
- Sensitive register instructions: read or change sensitive registers or memory locations such as a clock register or interrupt registers:
  - SGDT, SIDT, SLDT
  - SMSW
  - PUSHF, POPF
- Protection system instructions: reference the storage protection system, memory or address relocation system:
  - LAR, LSL, VERR, VERW
  - POP
  - PUSH
  - CALL FAR, JMP FAR, INT n, RETF
  - STR
  - MOV (segment registers)
The introduction of the Intel VT-x and AMD-V extensions (in 2005 and 2006, respectively) allows x86 processors to meet the Popek and Goldberg virtualization requirements.
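POPF is a commonly cited example from the list above. The sketch below is a heavily simplified model that tracks only the interrupt flag (bit 9 of EFLAGS) and assumes protected mode; the function name and the choice of privilege levels for the guest are hypothetical.

```python
IF = 1 << 9  # interrupt enable flag in EFLAGS


def popf(eflags, popped, cpl, iopl):
    """Simplified POPF: returns the new EFLAGS value, never traps."""
    if cpl <= iopl:
        return popped                         # privileged enough: IF is loaded
    return (popped & ~IF) | (eflags & IF)     # IF silently keeps its old value


# A deprivileged guest kernel (assumed to run at CPL 1 with IOPL 0)
# tries to disable interrupts by popping a flags image with IF clear.
as_guest = popf(eflags=IF, popped=0, cpl=1, iopl=0)
on_bare_metal = popf(eflags=IF, popped=0, cpl=0, iopl=0)
```

The same instruction yields different results depending on the privilege level, and the deprivileged case produces no trap, so a classic trap-and-emulate VMM never gets the chance to update the guest's virtual interrupt flag.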
IA-64
The effort needed to support virtualization on the IA-64 architecture is described in a 2004 article by Magenheimer and Christian.[12]
SPARC
A "hyperprivileged" mode for the UltraSPARC architecture was specified in UltraSPARC Architecture 2005.[13] It defines a sun4v platform[14] which is a superset of the sun4u platform, but is still compliant with the SPARC v9 Level-1[15] specification.
PowerPC
All sensitive instructions in the PowerPC instruction set are privileged.[16][17]
Performance in practice
The efficiency requirement in Popek and Goldberg's definition of a VMM concerns only the execution of non-privileged instructions, which must execute natively. This is what distinguishes a VMM from the more general class of hardware emulation software. Unfortunately, even on an architecture that meets Popek and Goldberg's requirements, the performance of a virtual machine can differ significantly from that of the actual hardware. Early experiments performed on the System/370 (which meets the formal requirements of Theorem 1) showed that the performance of a virtual machine could be as low as 21% of that of the native machine in some benchmarks. The cost of trapping and emulating privileged instructions in the VMM can be significant. This led the IBM engineers to introduce a number of hardware assists, which roughly doubled the performance of the System/370 virtual machines.[18] Assists were added in several stages. In the end, there were over 100 assists on the late-model System/370 machines.[19]
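A back-of-the-envelope model shows how a small fraction of trapping instructions can dominate run time. The model is an assumption made for illustration, not taken from the cited experiments: each trapped instruction is charged a fixed extra cost, measured in units of the average native instruction time, and the cost value used below is hypothetical.

```python
def relative_performance(trap_fraction, trap_cost):
    """Virtual machine speed as a fraction of native speed."""
    return 1.0 / (1.0 + trap_fraction * trap_cost)


def trap_fraction_for(performance, trap_cost):
    """Fraction of trapping instructions that yields a given performance."""
    return (1.0 / performance - 1.0) / trap_cost


# Hypothetical cost of 1000 instruction times per trap-and-emulate cycle.
cost = 1000
fraction = trap_fraction_for(0.21, cost)
check = relative_performance(fraction, cost)
```

Under these assumed numbers, fewer than one instruction in two hundred needs to trap for performance to fall to 21% of native, even though a statistically dominant fraction of instructions still executes directly.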
One of the main driving factors for the development of hardware assists for the System/370 was virtual memory itself. When the guest was an operating system that itself implemented virtual memory, even non-privileged instructions could experience longer execution times, a penalty imposed by the requirement to access translation tables not used in native execution (see shadow page tables).[20]
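The shadow page table idea mentioned above amounts to composing two mappings. The sketch below assumes page tables can be represented as dictionaries from page number to page number; the function name and the sample mappings are hypothetical.

```python
def build_shadow(guest_pt, vmm_pt):
    """Compose guest-virtual -> guest-physical with guest-physical -> host-physical."""
    return {
        gva: vmm_pt[gpa]
        for gva, gpa in guest_pt.items()
        if gpa in vmm_pt  # pages the VMM has not backed are left unmapped
    }


guest_pt = {0: 5, 1: 7}    # maintained by the guest operating system
vmm_pt = {5: 42, 7: 13}    # maintained by the VMM
shadow = build_shadow(guest_pt, vmm_pt)
```

The hardware walks only the shadow table, so the VMM has to rebuild or update it whenever the guest changes its own table. That maintenance is the extra work that has no counterpart in native execution.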
References
- [1] Popek, G. J.; Goldberg, R. P. (July 1974). "Formal requirements for virtualizable third generation architectures". Communications of the ACM 17 (7): 412–421. doi:10.1145/361011.361073.
- [2] Rogier Dittner, David Rule, The Best Damn Server Virtualization Book Period, Syngress, 2007, ISBN 1-59749-217-5, p. 19.
- [3] Smith and Nair, p. 387.
- [4] Adams and Agesen, 2006, pp. 2–3.
- [5] Smith and Nair, p. 391.
- [6] Adams and Agesen, pp. 1 and 5.
- [7] S. W. Galley (1969). "PDP-10 Virtual machines". Proc. ACM SIGARCH-SIGOPS Workshop on Virtual Computer Systems. pp. 30–34.
- [8] Smith and Nair, p. 395.
- [9] M68000 8-/16-32-Bit Microprocessor User's Manual, Ninth Edition. Phoenix, AZ, USA: Motorola, Inc. 1993.
- [10] Motorola M68000 Family Programmer's Reference Manual. Phoenix, AZ, USA: Motorola, Inc. 1992.
- [11] John Scott Robin and Cynthia E. Irvine (2000). "Analysis of the Intel Pentium's Ability to Support a Secure Virtual Machine Monitor". Proc. 9th USENIX Security Symposium.
- [12] Daniel J. Magenheimer and Thomas W. Christian (2004). "vBlades: Optimized Paravirtualization for the Itanium Processor Family". Proc. 3rd Virtual Machine Research & Technology Symposium. USENIX. pp. 73–82.
- [13] Weaver, David (2007-05-17). UltraSPARC Architecture 2005: One Architecture.... Multiple Innovative Implementations (Draft D0.9) (PDF). Santa Clara, CA, USA: Sun Microsystems, Inc.
- [14] Sun Microsystems, Inc. (2006-01-24). UltraSPARC Virtual Machine Specification (PDF). Santa Clara, CA, USA.
- [15] Weaver, David L.; Tom Germond (1994). The SPARC Architecture Manual: Version 9 (PDF). San Jose, CA, USA: SPARC International, Inc. ISBN 0-13-825001-4.
- [16] http://www.pagetable.com/?p=15
- [17] http://www.cs.cmu.edu/~410-s07/lectures/L38_Virtualization.pdf
- [18] Smith and Nair, pp. 415–416 and 426.
- [19] Gum, p. 535.
- [20] Gum, p. 533.
Notes
- Smith, James; Ravi Nair (2005). Virtual Machines. Morgan Kaufmann. ISBN 1-55860-910-5.
- Adams, Keith; Agesen, Ole (October 21–25, 2006). "A Comparison of Software and Hardware Techniques for x86 Virtualization" (PDF). Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems, San Jose, CA, USA, 2006. ACM 1-59593-451-0/06/0010. Retrieved 2006-12-22.
- P. H. Gum, System/370 Extended Architecture: Facilities for Virtual Machines, IBM J. Res. Develop., Vol. 27, No. 6, Nov. 1983, pp. 530–544