Soot (software)
In static program analysis, Soot is a language manipulation and optimization framework consisting of intermediate languages for the Java programming language. It was developed by the Sable Research Group at McGill University, which is also known for SableVM, a Java virtual machine, and for the AspectBench Compiler, an open research compiler for AspectJ. In 2010, two research papers on Soot (Vallée-Rai et al. 1999 and Pominville et al. 2000) were selected as IBM CASCON First Decade High Impact Papers, among 12 other papers from the 425 entries.[1]
Soot provides four intermediate representations, which other analysis programs can access and build upon through its API:[2]
- Baf: a near-bytecode representation.
- Jimple: a simplified version of Java source code that has a maximum of three components per statement.
- Shimple: an SSA variation of Jimple (similar to GIMPLE).
- Grimp: an aggregated version of Jimple suitable for decompilation and code inspection.
The current Soot release also includes detailed program analyses that can be used out of the box, such as context-sensitive flow-insensitive points-to analysis,[3] call-graph analysis and domination analysis (answering the question "must event a follow event b?"). It also includes a decompiler called Dava.
Soot is free software available under the GNU Lesser General Public License (LGPL).
Jimple
Jimple is an intermediate representation of a Java program designed to be easier to optimize than Java bytecode. It is typed, has a concrete syntax and is based on three-address code.
Jimple includes only 15 different operations, which simplifies flow analysis. By contrast, Java bytecode includes over 200 different operations.[4][5]
Unlike in Java bytecode, local and stack variables in Jimple are typed, and Jimple is inherently type safe.
Converting to Jimple, or "Jimplifying" (after "simplifying"), is the conversion of bytecode to three-address code. The idea behind the conversion, first investigated by Clark Verbrugge, is to associate a variable with each position in the stack. Stack operations then become assignments involving the stack variables.
Example
Consider the following bytecode, taken from Vallée-Rai (1998):[6]
iload 1   // load variable x1, and push it on the stack
iload 2   // load variable x2, and push it on the stack
iadd      // pop two values, and push their sum on the stack
istore 1  // pop a value from the stack, and store it in variable x1
The above translates to the following three-address code:
stack1 = x1               // iload 1
stack2 = x2               // iload 2
stack1 = stack1 + stack2  // iadd
x1 = stack1               // istore 1
In general the resulting code does not have static single assignment form.
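The translation above can be reproduced with a short sketch that tracks the operand-stack height and names one variable per stack position. This is an illustration only, not Soot's implementation: the class name JimplifySketch and the method translate are hypothetical, only the three instructions from the example (iload, iadd, istore) are handled, and the input is assumed to be straight-line code with no branches.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the stack-position-to-variable idea.
// Not part of Soot; handles only iload, iadd and istore.
class JimplifySketch {

    /** Translates straight-line bytecode into three-address statements. */
    static List<String> translate(List<String> bytecode) {
        List<String> out = new ArrayList<>();
        int height = 0; // operand-stack height, known statically at each instruction
        for (String insn : bytecode) {
            String[] parts = insn.trim().split("\\s+");
            switch (parts[0]) {
                case "iload": // push: assign the local to the next stack variable
                    height++;
                    out.add("stack" + height + " = x" + parts[1]);
                    break;
                case "iadd": // pop two, push one: the result lands in the lower slot
                    if (height < 2) {
                        throw new IllegalArgumentException("stack underflow at: " + insn);
                    }
                    out.add("stack" + (height - 1) + " = stack" + (height - 1)
                            + " + stack" + height);
                    height--;
                    break;
                case "istore": // pop: assign the top stack variable to the local
                    if (height < 1) {
                        throw new IllegalArgumentException("stack underflow at: " + insn);
                    }
                    out.add("x" + parts[1] + " = stack" + height);
                    height--;
                    break;
                default:
                    throw new IllegalArgumentException("unsupported instruction: " + insn);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> code = List.of("iload 1", "iload 2", "iadd", "istore 1");
        translate(code).forEach(System.out::println);
    }
}
```

In the output for the example, stack1 is assigned twice (once by iload 1 and once by iadd), which shows why the result is generally not in static single assignment form.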
References
- http://dl.acm.org/citation.cfm?id=1925805
- http://www.sable.mcgill.ca/soot/
- http://www.sable.mcgill.ca/soot/tutorial/analysis/index.html
- Vallée-Rai, Raja (1998). "The Jimple Framework".
- Vallée-Rai, Raja; Hendren, Laurie J. (1998). "Jimple: Simplifying Java Bytecode for Analyses and Transformations".
- Vallée-Rai 1998.
Further reading
- Vallée-Rai, Raja; Co, Phong; Gagnon, Etienne; Hendren, Laurie; Lam, Patrick; Sundaresan, Vijay (1999). "Soot: A Java bytecode optimization framework". Proceedings of the 1999 conference of the Centre for Advanced Studies on Collaborative research (CASCON '99). Republished in "CASCON First Decade High Impact Papers". CASCON '10.
- Pominville, Patrice; Qian, Feng; Vallée-Rai, Raja; Hendren, Laurie; Verbrugge, Clark (2000). "A framework for optimizing Java using attributes". Republished in "CASCON First Decade High Impact Papers". CASCON '10.
- Lam, Patrick; Bodden, Eric; Lhoták, Ondřej; Hendren, Laurie (2011). "The Soot framework for Java program analysis: a retrospective" (PDF). Cetus Users and Compiler Infrastructure Workshop.