Some compilers are using Three-Address-Code (TAC) as an intermediate representation. This representation is very simple to understand and write. Moreover, it’s easy to run some optimization on this representation.
Each TAC statement has this general form : result = operand1 operator operand2
For example, here are some TAC statements:
a = 1 x = a * 3 if x > a goto test param "dddd" call print test: param "asdf" call print
In this post, we will see some of the local optimizations that can be applied on TAC. A local optimization is an optimization that is applied locally to a basic block. A basic block is a set of TAC statements that has only one entry point and one exit point. Once the first instruction of the basic block is executed, the rest of the instructions are necessarily executed exactly once. These optimizations are easy to design and implement. If you want to run global optimizations (through all the basic blocks of a function) or even Interprocedural Optimization (IPO), you will need a far more complex framework to run optimizations. I will try to write something on global optimization when I will have implemented some of them in EDDI.
The goal of optimization is of course to replace some statements with more efficient statements.
The list presented in this post is not exhaustive, this is only the optimizations that I’ve implemented in EDDIC, but this represent most of the local optimizations.
The first three optimization techniques can be applied independently on each statement of the program.
1. Arithmetic Identities
The first optimization is about arithmetic identities. There are some properties in math that we can use to simplify simple TAC statement.
Here are all the identities that are simplified in EDDIC:
- x = a + 0 or x = 0 + a => x = a
- x = a – 0 => x = a
- x = 0 – a => x = -a
- x = a – a => x = 0
- x = a * 1 or x = 1 * a => x = a
- x = a * 0 or x = 0 * a => x = 0
- x = a * -1 or x = -1 * a => x = -a
- x = a / 1 => x = a
- x = a / -1 => x = -a
- x = 0 / a => x = 0
- x = a / a => x = 1
All the expressions on the right are more efficient to compute than the one on the left.
2. Reduce in strength
Another easy optimization is the reduction of strength of some math operations. For example, an addition is cheaper than multiplication and multiplication is cheaper than division. If your language does not have floating point math, the only reduction that can be done is this one:
Here are all the identities that are simplified in EDDIC:
- x = 2 * a or x = a * 2 => x = a + a
With floating point math, we can do a little better:
- x = a / 2 => x = a * 0.5
- x = a / 4 => x = a * 0.25
- etc…
3. Constant folding
When both operands on the right side of the TAC statement are integers, we can replace the math operation directly by the result of the computation.
With a and b being any integer, we can transform these TAC statements:
- x = 1 + 2 => x = 3
- x = 3 – 1 => x = 2
- x = 3 * 2 => x = 6
- x = 5 / 2 => x = 2
- x = 5 % 2 => x = 1
We can also use this optimization to simplify conditional jumps. For example, if 3 > 2 goto B2 can be replaced by goto B2.
More than being way more efficient statements, it also enables other optimization to be performed on the TAC program.