Physics processing unit
A physics processing unit (PPU) is a dedicated microprocessor designed to handle the calculations of physics, especially in the physics engine of video games.
Examples of calculations involving a PPU might include rigid body dynamics, soft body dynamics, collision detection, fluid dynamics, hair and clothing simulation, finite element analysis, and fracturing of objects.
The idea is that specialized processors offload time-consuming tasks from a computer's CPU, much as a GPU performs graphics operations in the main CPU's place. The term was coined by Ageia to describe its PhysX chip. Several other technologies in the CPU-GPU spectrum have some features in common with it, although Ageia's solution was the only complete one designed, marketed, supported, and placed within a system exclusively as a PPU.
History
An early academic PPU research project[1][2] named SPARTA (Simulation of Physics on A Real-Time Architecture) was carried out at Penn State[3] and the University of Georgia. This was a simple FPGA-based PPU that was limited to two dimensions. The project was later extended into a considerably more advanced ASIC-based system named HELLAS.
February 2006 saw the release of the first dedicated PPU, PhysX, from Ageia (later merged into Nvidia). The unit is most effective in accelerating particle systems, with only a small performance improvement measured for rigid body physics.[4] The Ageia PPU is documented in depth in the company's US patent application #20050075849.[5] Nvidia/Ageia no longer produces PPUs, and hardware acceleration for physics processing is now supported through some of Nvidia's graphics processing units.
Figures: Example SPARTA animation; SPARTA printed circuit board; HELLAS die photo.
AGEIA PhysX
The first processor to be advertised as a PPU was called the PhysX chip, introduced by a fabless semiconductor company called AGEIA. Games wishing to take advantage of the PhysX PPU must use AGEIA's PhysX SDK (formerly known as the NovodeX SDK).
The chip consists of a general-purpose RISC core controlling an array of custom SIMD floating-point VLIW processors working in local banked memories, with a switch fabric to manage transfers between them. There is no cache hierarchy as in a CPU or GPU.
PhysX cards were sold by board partners, much as video cards are. ASUS, BFG Technologies,[6] and ELSA Technologies were the primary manufacturers. PCs with the cards already installed were available from system builders such as Alienware, Dell, and Falcon Northwest.[7]
In February 2008, Nvidia bought Ageia Technologies. It eventually cut off the ability to process PhysX on the AGEIA PPU and on Nvidia GPUs in systems with active ATI/AMD GPUs, which made PhysX appear to be exclusive to Nvidia. In March 2008, however, Nvidia announced that it would make PhysX an open standard for everyone,[8] so that the main graphics-processor manufacturers would have PhysX support in their next-generation graphics cards. Nvidia also announced that PhysX would be available for some of its already released graphics cards through a driver update.
See physics engine for a discussion of academic research PPU projects.
PhysX P1 (PPU) hardware specifications
ASUS and BFG Technologies bought licenses to manufacture alternate versions of AGEIA's PPU, the PhysX P1 with 128 MB GDDR3:
- Multi-core device based on the MIPS instruction set with integrated physics acceleration hardware and memory subsystem with "tons of cores"[9][10]
- 125 million transistors[11]
- 182 mm² die size
- Fabrication process: 130 nm
- Peak power consumption: 30 W
- Memory: 128 MB GDDR3 RAM with 128-bit interface
- Interface: 32-bit PCI 3.0 (ASUS also made a PCI Express version of the card)
- Sphere collision tests: 530 million per second (maximum capability)
- Convex collision tests: 530,000 per second (maximum capability)
- Peak instruction bandwidth: 20 billion per second
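The sphere collision figure above counts individual pairwise tests. As a rough illustration of what one such test involves, the sketch below checks whether two spheres overlap by comparing the squared distance between their centers with the squared sum of their radii, which avoids a square root. This is not Ageia's implementation; the function name `spheres_overlap` and the tuple-based center layout are assumptions made for the example.

```python
def spheres_overlap(center_a, radius_a, center_b, radius_b):
    """Return True if two spheres touch or intersect.

    Centers are (x, y, z) tuples; this layout is an assumption of the sketch.
    """
    dx = center_a[0] - center_b[0]
    dy = center_a[1] - center_b[1]
    dz = center_a[2] - center_b[2]
    dist_sq = dx * dx + dy * dy + dz * dz  # squared center distance
    reach = radius_a + radius_b            # largest distance that still overlaps
    return dist_sq <= reach * reach
```

For example, `spheres_overlap((0, 0, 0), 1.0, (1.5, 0, 0), 1.0)` reports an overlap, while moving the second center to `(3, 0, 0)` does not.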
Havok FX
The Havok SDK is a major competitor to the PhysX SDK. It is used in more than 150 games, including major titles such as Half-Life 2, Halo 3 and Dead Rising.[12]
To compete with the PhysX PPU, an edition known as Havok FX was to take advantage of multi-GPU technology from ATI (CrossFire) and NVIDIA (SLI) using existing cards to accelerate certain physics calculations.[13]
Havok's solution divides the physics simulation into effect and gameplay physics, with effect physics being offloaded (if possible) to the GPU as Shader Model 3.0 instructions and gameplay physics being processed on the CPU as normal. The important distinction between the two is that effect physics do not affect gameplay (dust or small debris from an explosion, for example); the vast majority of physics operations are still performed in software. This approach differs significantly from the PhysX SDK, which moves all calculations to the PhysX card if it is present.
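The effect/gameplay split described above can be pictured as a routing decision made per task. The following is a minimal sketch of that idea only, not Havok's API: the `PhysicsTask` type, the `affects_gameplay` flag, and the `route_tasks` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class PhysicsTask:
    # Hypothetical task record; not part of any real SDK.
    name: str
    affects_gameplay: bool


def route_tasks(tasks, gpu_available):
    """Split tasks into (cpu_queue, gpu_queue) lists of task names.

    Effect physics goes to the GPU only when one is available;
    gameplay physics always stays on the CPU.
    """
    cpu_queue, gpu_queue = [], []
    for task in tasks:
        if not task.affects_gameplay and gpu_available:
            gpu_queue.append(task.name)
        else:
            cpu_queue.append(task.name)
    return cpu_queue, gpu_queue
```

With a GPU present, a task such as explosion dust would land in the GPU queue while player collision stays on the CPU; without one, both fall back to the CPU queue.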
Since Havok's acquisition by Intel, Havok FX appears to have been shelved or cancelled.[14]
GPUs vs PPUs
The drive toward GPGPU has made GPUs more suitable for the job of a PPU. DX10 added integer data types, a unified shader architecture, and a geometry shader stage, which allow a broader range of algorithms to be implemented. Modern GPUs support compute shaders, which run across an indexed space and do not require any graphical resources, just general-purpose data buffers. Nvidia CUDA provides a little more in the way of inter-thread communication and a scratchpad-style workspace associated with the threads.
Nonetheless, GPUs are built around a larger number of slower, longer-latency threads, are designed around texture and framebuffer data paths, and have poor branching performance; this distinguishes them from PPUs and the Cell as being less well optimized for taking over game-world simulation tasks.
The Codeplay Sieve compiler supports the PPU, indicating that the Ageia PhysX chip would be suitable for GPGPU-type tasks. However, Ageia seems unlikely to pursue this market.
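To make the "indexed space plus data buffers" model concrete, here is a small CPU-side emulation of how a compute-style kernel is shaped. It is a sketch under stated assumptions: `integrate_kernel` and `dispatch` are hypothetical names, plain Python lists stand in for GPU buffers, and the loop runs sequentially, whereas a real GPU would run the invocations in parallel.

```python
def integrate_kernel(i, positions, velocities, dt):
    """One invocation: advance element i by one explicit Euler step."""
    positions[i] += velocities[i] * dt


def dispatch(kernel, count, *args):
    """Emulate a 1-D dispatch by calling the kernel once per index.

    A GPU would execute these invocations concurrently; this loop is
    only a sequential stand-in for illustration.
    """
    for i in range(count):
        kernel(i, *args)
```

Each invocation touches only the buffer element selected by its own index, which is what lets such kernels run without any graphics resources.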
Intel Larrabee & AMD Fusion
It is speculated that Intel's Larrabee (a throughput-optimized many-core implementation of the x86 architecture) will be well-suited to the role of a PPU; like the Cell, it sits between the CPU and the GPU in the spectrum of general purpose processing versus specialized high-performance back-end processing. Intel has confirmed that Larrabee's memory architecture will not use scratchpads like the Cell or Ageia PPU, and will instead be closer to a conventional CPU cache hierarchy. However, it will have extensions to enable high-throughput computing (most likely a full complement of cache-control instructions).
AMD has declared its long-term intention to enable AMD APUs to use Radeon as a vector co-processor, sharing resources such as the cache hierarchy. This configuration has started to materialize in the form of Heterogeneous System Architecture.
PS2 - VU0
Although very different from the PhysX, one could argue the PlayStation 2's VU0 is an early, limited implementation of a PPU. Conversely, one could describe a PPU to a PS2 programmer as an evolved replacement for VU0. Its feature set and placement within the system are geared toward accelerating game update tasks including physics and AI; it can offload such calculations, working off its own instruction stream whilst the CPU is operating on something else. Being a DSP, however, it is much more dependent on the CPU to do useful work in a game engine, and would not be capable of implementing a full physics API, so it cannot be classed as a PPU. VU0 is also capable of providing additional vertex processing power, though this is more a property of the pathways in the system than of the unit itself.
This usage is similar to Havok FX or GPU physics in that an auxiliary unit's general-purpose floating-point power is used to complement the CPU in either graphics or physics roles.
See also
- Physics Abstraction Layer
- Microsoft Robotics Studio
- Scratchpad RAM - relevant to the distributed memory architecture of the Ageia PhysX PPU
- GPGPU - for applications of existing GPUs to the same physics problems PPUs are designed for
- Cell
- Adapteva
- OpenCL
References
- S. Yardi, B. Bishop, T. Kelliher, "HELLAS: A Specialised Architecture for Interactive Deformable Object Modeling", ACM Southeast Conference, Melbourne, FL, March 10–12, 2006, pp. 56–61.
- B. Bishop, T. Kelliher, "Specialized Hardware for Deformable Object Modeling," IEEE Transactions on Circuits and Systems for Video Technology, 13(11):1074–1079, Nov. 2003.
- "SPARTA Homepage". Cse.psu.edu. Retrieved 2010-08-16.
- "Exclusive: ASUS Debuts AGEIA PhysX Hardware". AnandTech. Retrieved 2010-08-16.
- "United States Patent Application: 0050086040". Appft1.uspto.gov. Retrieved 2010-08-16.
- http://www.bfgtech.com/news_8.31.05.html
- "BFG Tech ad for the PhysX". Maximum PC (Future US). May 2006. p. 6. ISSN 1522-4279. Retrieved 2009-09-16.
- Nvidia offers PhysX support to AMD / ATI
- "PhysX FAQ". NVIDIA Corporation.
- Nicholas Blachford (2006). "Lets Get Physical: Inside The PhysX Physics Processor".
- Legit Reviews - ASUS's AGEIA PhysX P1 Card
- Games using Havok
- Havok FX product information
- Shilov, Anton (2007-11-19). "GPU Physics Dead for Now, Says AMD’s Developer Relations Chief". Xbit Laboratories. Retrieved 2007-11-26.
External links
- AGEIA Official Website (no longer available)
- AGEIA Physx Processor Website (no longer available)
- Projects using PhysX SDK (no longer available)
- BFG AGEIA PhysX Card Review
- Planet PhysX News & Information Page (no longer available)
- PC Hardware: AGEIA PhysX Interview (no longer available)
- PC Perspective: AGEIA PhysX Physics Processing Unit Preview (no longer available)
- Havok FX physics engine (middleware library) SDK (no longer available)
- NVIDIA CUDA Toolkit and SDK
- PhysX Toolkit and SDK