Three-Address Code: Compiler Optimization Techniques Explained
Understanding Intermediate Code in Modern Compilers
When programmers write source code, they prioritize readability and functionality over raw efficiency. This is where compilers step in, transforming human-friendly code into optimized machine instructions. Three-address code serves as a crucial intermediate representation (IR) that bridges high-level logic and low-level execution. Unlike abstract syntax trees, three-address code provides a linear, assembly-like structure where each instruction contains at most one operator and three operands. For example, x = y + z * w decomposes into:
t1 = z * w
t2 = y + t1
x = t2
This decomposition enables powerful machine-independent optimizations before final code generation. Industry compilers like GCC and LLVM leverage such IRs to perform transformations that would be impractical directly on source code or syntax trees.
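The decomposition above can be sketched as a recursive lowering pass. In this minimal Python sketch, the tuple-based expression format and the `t1, t2, …` temp-naming scheme are illustrative assumptions, not any real compiler's IR:

```python
import itertools

def lower(expr, code, temps):
    """Recursively flatten a nested (op, left, right) tuple into
    three-address instructions, returning the operand holding the result."""
    if isinstance(expr, str):              # leaf: a variable name is already an operand
        return expr
    op, left, right = expr
    a = lower(left, code, temps)
    b = lower(right, code, temps)
    temp = f"t{next(temps)}"               # fresh temporary for this subexpression
    code.append(f"{temp} = {a} {op} {b}")
    return temp

code, temps = [], itertools.count(1)
code.append(f"x = {lower(('+', 'y', ('*', 'z', 'w')), code, temps)}")
print("\n".join(code))
# t1 = z * w
# t2 = y + t1
# x = t2
```

Each operator gets its own instruction and its own temporary, which is exactly what makes later per-instruction analysis tractable.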
How Three-Address Code Enables Optimization
Three-address code’s simplicity creates optimization opportunities through its explicit sequencing and reduced operational complexity. Key characteristics include:
- Explicit temporary variables for complex expression evaluation
- Symbolic labels for control flow (e.g., if x < 10 goto L3)
- Backpatching to resolve forward jumps during single-pass generation
- Machine agnosticism, allowing optimizations to be applied before any target-specific code generation
During IR generation, compilers replace source variables with symbol table pointers. Conditional jumps use labels like L1, which are later resolved to actual addresses. The video reveals a critical insight: "Backpatching allows compilers to generate jump targets in a single pass by maintaining a 'to-fix' list." This technique is foundational to efficient compilation.
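As a rough illustration of that "to-fix" list, the sketch below (the data layout and helper names are hypothetical) emits jumps with an unknown target and patches them all once the label's address becomes known:

```python
code = []      # instruction list; jump targets start out as None
to_fix = {}    # label -> indices of jump instructions awaiting that label

def emit_jump(label, op="goto"):
    """Emit a forward jump whose target address is not yet known."""
    to_fix.setdefault(label, []).append(len(code))
    code.append([op, label, None])          # placeholder target

def place_label(label):
    """Resolve a label and backpatch every jump recorded for it."""
    addr = len(code)                        # label refers to the next instruction
    for i in to_fix.pop(label, []):
        code[i][2] = addr

emit_jump("L1")                 # forward jump: L1 not yet placed
code.append(["z = 0"])
place_label("L1")               # L1 resolves to address 2; jump is patched
code.append(["i = i + 1"])
print(code[0])
# ['goto', 'L1', 2]
```

The generator never has to rescan earlier instructions: it only touches the indices recorded on the to-fix list, which is what makes single-pass generation possible.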
Basic Blocks: The Units of Local Optimization
Compilers partition three-address code into basic blocks—sequential instructions with single entry and exit points. Control flow only enters at the start and exits via jumps or fall-through at the end. Consider this loop example:
L1: if x > y goto L2
z = 0
goto L3
L2: z = 1
L3: i = i + 1
if i < 10 goto L1
Basic blocks identified:
- L1: if x > y goto L2 (conditional test block)
- z = 0 and goto L3 (fall-through branch block)
- L2: z = 1 (taken-branch assignment block)
- L3: i = i + 1 and if i < 10 goto L1 (loop increment block)
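The standard leader-based partitioning that produces these blocks can be sketched as follows; instructions are plain strings and labels/jumps are detected textually, which is a simplification of real IR handling:

```python
def basic_blocks(instrs):
    """Split a list of three-address instructions into basic blocks using
    the leader rule: the first instruction, any labeled instruction, and
    any instruction following a jump each start a new block."""
    leaders = {0}
    for i, ins in enumerate(instrs):
        if ":" in ins:                      # labeled instruction is a leader
            leaders.add(i)
        if "goto" in ins and i + 1 < len(instrs):
            leaders.add(i + 1)              # instruction after a jump is a leader
    cuts = sorted(leaders) + [len(instrs)]
    return [instrs[a:b] for a, b in zip(cuts, cuts[1:])]

tac = ["L1: if x > y goto L2", "z = 0", "goto L3",
       "L2: z = 1", "L3: i = i + 1", "if i < 10 goto L1"]
for block in basic_blocks(tac):
    print(block)
# ['L1: if x > y goto L2']
# ['z = 0', 'goto L3']
# ['L2: z = 1']
# ['L3: i = i + 1', 'if i < 10 goto L1']
```

Each resulting block has a single entry (its leader) and a single exit, so an optimizer can reason about it in isolation.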
Why basic blocks matter:
- Enable local optimizations without considering entire program flow
- Allow instruction reordering within blocks
- Form nodes in control flow graphs (CFGs) for global analysis
- Simplify dead code elimination and register allocation
Optimization Techniques in Action
Within basic blocks, compilers apply these key optimizations:
Dead Code Elimination
Instructions assigning values never used later are removed. For example:
a = 5 // Never referenced elsewhere
b = x * 2
→ The a=5 assignment is deleted.
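A local pass for this can be sketched as a backward liveness scan over one block; the (target, operands) instruction shape and the live_out parameter are assumptions for illustration:

```python
def eliminate_dead(block, live_out):
    """Walk a basic block backward, keeping an assignment only if its
    target is live (read later in the block, or live on block exit)."""
    live = set(live_out)
    kept = []
    for target, operands in reversed(block):
        if target in live:
            kept.append((target, operands))
            live.discard(target)            # this instruction defines the target
            live.update(operands)           # reads keep their sources alive
        # else: dead assignment, dropped
    return kept[::-1]

block = [("a", ["5"]), ("b", ["x", "2"])]
print(eliminate_dead(block, live_out={"b"}))
# [('b', ['x', '2'])]  — the a = 5 assignment is removed
```

Scanning backward is the natural direction here: whether an assignment is dead depends only on what is read after it.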
Common Subexpression Elimination
Repeated calculations are identified and cached:
t1 = x + y
z = t1 * 3
t2 = x + y // Duplicate
w = t2 / 4
→ Transforms to:
t1 = x + y
z = t1 * 3
w = t1 / 4 // Reuses t1
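One way to sketch this transformation is a table of available expressions keyed by (op, operand, operand). Real compilers typically use value numbering over a normalized IR, and this toy version assumes no operand is reassigned within the block:

```python
def cse(block):
    """Local common subexpression elimination over (target, op, a, b)
    tuples: the first computation of an expression is kept, and later
    duplicates are replaced by a rename to the earlier result."""
    available = {}                  # (op, a, b) -> name already holding the value
    out, rename = [], {}
    for target, op, a, b in block:
        a, b = rename.get(a, a), rename.get(b, b)
        key = (op, a, b)
        if key in available:
            rename[target] = available[key]   # reuse the cached result
        else:
            available[key] = target
            out.append((target, op, a, b))
    return out

block = [("t1", "+", "x", "y"), ("z", "*", "t1", "3"),
         ("t2", "+", "x", "y"), ("w", "/", "t2", "4")]
print(cse(block))
# [('t1', '+', 'x', 'y'), ('z', '*', 't1', '3'), ('w', '/', 't1', '4')]
```

The rename map is what lets `w = t2 / 4` become `w = t1 / 4` without a separate rewriting pass.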
Constant Folding and Propagation
Compilers precompute constant expressions:
x = 3
y = x * 5 // 15
z = y + 2 // 17
→ Becomes:
x = 3
y = 15
z = 17
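A single forward pass can combine both steps: known constants are propagated into operands, and any instruction whose operands are all constants is folded at compile time. The instruction tuples below are an illustrative encoding, not a real IR:

```python
def fold(block):
    """Constant propagation plus folding over (target, op, a, b) tuples,
    where op '=' is a plain copy. Only + and * are handled in this sketch."""
    consts, out = {}, []
    for target, op, a, b in block:
        a = consts.get(a, a)                # propagate known constants
        b = consts.get(b, b)
        if op == "=" and isinstance(a, int):
            consts[target] = a              # record x = 3 as a known constant
        elif isinstance(a, int) and isinstance(b, int):
            a = {"+": a + b, "*": a * b}[op]  # fold at "compile time"
            consts[target] = a
            op, b = "=", None
        out.append((target, op, a, b))
    return out

block = [("x", "=", 3, None), ("y", "*", "x", 5), ("z", "+", "y", 2)]
print(fold(block))
# [('x', '=', 3, None), ('y', '=', 15, None), ('z', '=', 17, None)]
```

Note how folding `y = x * 5` into `y = 15` immediately enables folding `z = y + 2` in the same pass — optimizations often feed each other this way.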
Algebraic Simplification
Mathematical identities simplify operations:
a = b + 0 → a = b
c = d * 1 → c = d
e = f * 2 → e = f + f // Faster on some architectures
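These rewrite rules can be sketched as simple pattern matches applied to one instruction at a time:

```python
def simplify(target, op, a, b):
    """Apply the algebraic identities listed above to a single
    (target, op, a, b) instruction; '=' denotes a plain copy."""
    if op == "+" and b == 0:
        return (target, "=", a, None)       # a + 0 -> a
    if op == "*" and b == 1:
        return (target, "=", a, None)       # a * 1 -> a
    if op == "*" and b == 2:
        return (target, "+", a, a)          # a * 2 -> a + a (strength reduction)
    return (target, op, a, b)               # no rule applies

print(simplify("a", "+", "b", 0))   # ('a', '=', 'b', None)
print(simplify("e", "*", "f", 2))   # ('e', '+', 'f', 'f')
```

Real compilers apply such rules only when they are provably safe for the type involved (e.g., floating-point identities need extra care).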
Advanced Techniques: DAGs and Peephole Optimization
Beyond basic blocks, compilers deploy:
Directed Acyclic Graphs (DAGs):
- Represent expressions while identifying common subexpressions
- Enable optimal instruction sequencing during code regeneration
Peephole Optimization:
A "sliding window" examines short instruction sequences to:
- Remove redundant jumps (e.g., goto L1 where L1 itself holds goto L2 → direct goto L2)
- Simplify sequences (e.g., x = x + 0 → eliminated)
- Replace slow instructions with faster equivalents (e.g., mul x, 8 → left_shift x, 3)
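A toy peephole pass, assuming textual instructions and a precomputed map from each label to the instruction stored there, might look like:

```python
def peephole(instrs, labels):
    """Tiny peephole pass: drop x = x + 0 no-ops and thread a jump whose
    target is itself an unconditional jump (one step of jump threading)."""
    out = []
    for ins in instrs:
        parts = ins.split()
        if ins.endswith("+ 0") and parts[0] == parts[2]:
            continue                         # x = x + 0 -> eliminated
        if ins.startswith("goto "):
            target = parts[1]
            if labels.get(target, "").startswith("goto "):
                ins = labels[target]         # goto L1 where L1: goto L2 -> goto L2
        out.append(ins)
    return out

labels = {"L1": "goto L2"}                   # instruction found at label L1
print(peephole(["x = x + 0", "goto L1"], labels))
# ['goto L2']
```

Production peephole optimizers run patterns like these repeatedly until no window changes, since one rewrite frequently exposes another.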
Why Optimization Isn't Perfect
Despite sophisticated techniques:
- Full optimization is computationally infeasible (many subproblems are NP-hard or even undecidable)
- Compilers prioritize safe transformations preserving program behavior
- Trade-offs exist between compilation time and optimization depth
- Target-specific optimizations occur later in the pipeline
As the video emphasizes: "Optimization is a misnomer—compilers improve code, but 'optimal' is unattainable in practice." The key is achieving significant efficiency gains without altering functionality.
Practical Implications for Developers
- Write clear code first: Modern compilers handle low-level optimizations effectively
- Avoid premature optimization: Focus on algorithm efficiency rather than micro-optimizations
- Understand compiler flags: Use settings like -O2 (GCC) or /O2 (MSVC) to enable IR optimizations
- Profile before optimizing: Target bottlenecks revealed by profiling tools
Key Takeaways
- Three-address code simplifies complex expressions into single-operation instructions
- Basic blocks enable localized, machine-independent optimizations
- Dead code elimination and constant propagation reduce runtime overhead
- DAGs expose redundancy within blocks, while peephole techniques clean up short instruction sequences in the generated code
- Compilers balance optimization depth against compilation speed
Crucially, these transformations occur transparently—developers get faster code without modifying source. When writing loops or conditionals, which optimization technique do you suspect provides the most significant performance lift in your projects? Share your observations below.