After having read my first book about compilers, I decided to try another one more focused on optimizations. For that, I chose “Advanced Compiler Design and Implementation“, by Steven S. Muchnick.
This book covers several subjects about compilers, but more than 60% of the text is about compiler optimizations.
The first chapter introduces the main concepts of compiler design. It also explains why optimization is so important in a compiler.
The algorithms of this book are presented in ICAN (Informal Compiler Algorithm Notation) notation. The chapter two provides both a brief and a full description of this notation. In my case, the brief description has been enough to understand the algorithms presented in the following chapters, but it can be useful for a deep understanding of the notation to read the full description.
The next chapter covers Symbol Table. It also includes a way to generate load and store instructions directly based on the information contained in the Symbol Table. Then, the fourth chapter presents the intermediate representations used in that book. This book uses three different intermediate languages: A high-level one, a medium-level one and a low-level. This chapter covers each of them in details. The importance of the design of an intermediate representation is also discussed here. There will be two more intermediate forms used in the book, static single-assignment (SSA) and program dependence graphs that are discussed later in the book.
The chapter five gives some information about the different runtime support of some architectures. It is very useful to know how to handle high-level languages at runtime. The next one is about producing code generators automatically from machine descriptions. Three approaches are covered in this chapter.
With the seventh chapter, the optimization techniques start. This chapter covers control-flow analysis. It will introduce several techniques that can be used to perform this kind of analysis, namely depth first search and dominators, interval analysis and structural analysis. These analysis can be used to identify structures like loops and branches in the intermediate representations. The chapter eight covers data-flow analysis. This chapter introduces a lot of mathematical concepts like lattices or flow functions. It takes some time to understand completely the concepts of this chapter, but the explanations are very good. Again, three methods of doing this analysis are studied. It covers iterative data-flow analysis, control-tree techniques and slotwise analysis. Another techniques are also introduced, but not covered in details. The chapter 9 covers dependence analysis. This analysis will be vital for optimizations on arrays and loops and to instruction scheduling techniques that will be studied later. Finally, the chapter ten introduces alias-analysis techniques.
Once the analysis techniques have been covered, the other chapters are about optimization themselves. The chapter 11 introduces optimizations. It explains which optimizations should be performed at which level and in which order. It also describes briefly the optimizations that are covered in the next chapters. You will see that the following chapters are very rich, each of them containing a lot of optimizations that can be performed.
The first optimizations that are covered (in chapter 12) are the so-called early optimizations. It includes scalar replacement of aggregates, value numbering, copy propagation and sparse conditional constant propagation. It also covers constant folding and algebraic simplifications. After that, the optimizations that reduce redundancy are covered. Again, several techniques are covered, common subexpression elimination, forward substitution, loop invariant code motion, partial redundancy elimination and code hoisting. Then, the loop optimizations are introduced. This chapter first introduces a way to identify induction variables in a loop and then covers some optimization that can be used. For example, strength reduction and unnecessary bounds checking optimizations are covered.
The next two chapters are more related to low-level problematic. The chapter 15 covers optimizations that can be applied to reduce the cost of procedures. It discusses tail-call optimization, procedure integration, in-line expansion, leaf-routine optimization and shrink wrapping. The, the chapter 16 covers a very important subject that is Register Allocation. It covers several techniques like cost based methods and global graph coloring.
The chapter 17 deals with code scheduling. It is a technique that reorder instructions to take best advantage of the pipelines built into ,modern processors. First, local approaches (within a basic block) are discussed and then optimization for scheduling across basic-block boundaries are covered. For the two subjects, several techniques are discussed. The chapter 18 covers low-level optimizations like unreachable-code elimination, loop inversion, dead-code elimination, etc… This chapter is very broad and very interesting too.
The chapter 19 covers more complex optimization: the inteprocedural optimizations. Several techniques for doing inteprocedural analysis are covered in details as well as several optimizations depending on these analysis, like constant propagation. This chapter is not very simple to understand and even less to apply, but it is very interesting. The chapter 20 is the last about optimizations. It covers techniques to improve the memory hierarchy usage. The first optimizations are about instruction-cache: instruction prefetching, procedure sorting and procedure splitting for example. Then, data-cache optimizations are covered. It includes data prefetching and scalar replacement of array elements in details and gives an outline for some other optimizations.
Finally, the chapter 21 studies four different compilers to see what optimizations are applied and in which order. Their intermediate forms are also studied. It is very interesting how this is done in real-world compiler.
To conclude, I think that this book is really great. It covers a lot of optimizations that can be implemented in a compiler. All the optimizations are covered in details with code samples and examples of applying the optimization on some code. However, it has to be said that this book is not easy to read and sometimes it is hard to understand exactly what means a specific optimization and in what it differs from some close technique. If you want to write an aggressive optimizer compiler or just write some optimizations for an existing one, you should consider to take a look at this book.
If you know another good book on Compilers, I will be glad to hear about it.