Thursday 30 September 2010

Compiler Optimizations

The task of every compiler is to translate high-level language source code to machine code that will run on target processor. This may be achieved through assembly language file or by linking object files, the ultimate goal of every compiler is to generate code for the target processor. In principle this is a simple task. Every high level statement can be translated in a series of target instructions. However, without some optimizations this code would be very inefficient. Unoptimized code still works, but it is slower and the files are bigger.


The nature of high-level statements is to operate with variables. Loading and storing of variables can happen in any order. But transfers from and to memory are slower comparing to transfers between registers. And if some value is stored to memory location and immediately after it is needed again in a different calculation, then it doesn't make sense to load it again since it was already present in some register. With proper register loading we can save a lot of redundant loads and stores. There are many optimization algorithms to make the code as efficient as possible. In fact, the compiler optimization is a science. There are many books written on this subject.

Most optimization algorithms are based on control and data flow analysis. There are many optimization approaches: to reduce jumps, to remove dead, redundant or common code, to remove redundant loads and stores, to use registers efficiently, to replace slow instructions with faster ones, to replace specific arithmetic calculations with short instructions, etc. Each optimization is based on some code or data properties. Some of the common optimizations include: constant folding, integer arithmetic optimizations, dead code elimination, branch elimination, code-block reordering, loop-invariant code motion, loop inversion, induction variable elimination, Instruction selection, instruction combining, register allocation, common sub-expression elimination, peephole optimizations, etc.

The basic rule of every optimization is that the original functionality of the program should not be changed. All optimization algorithms are based on the assumption that the program under analysis is the only one who changes memory locations. In reality interrupts or hardware ports can break this rule. Therefore proper actions must be taken to prevent optimizations on memory locations which might be modified outside the code we are trying to optimize. An additional programming technique that complicates optimizations is using pointers. But with proper analysis it is possible to apply safe optimizations on most parts of the code.

You can optimize the code for speed or for code size. In each case the program runs faster or occupies less memory. The later is extremely important in embedded programming where the memory of microcontrollers is limited. Compiler optimizations are an important part of every compiler. Unoptimized code is a waste of memory and time on any target system.

A practical example of  compiler optimizations in action is the Pascal compiler for 8051 microcontrollers which uses Turbo Pascal syntax and generates optimized code. If you are interested in practical compiler construction you can examine the Turbo Pascal internals. This is Turbo Pascal compiler written in Turbo Pascal which can be used as an excellent book on compiler design.

No comments:

Post a Comment