Abstract Syntax Trees: Representing and Traversing Program Structure

After parsing the source code, compilers use Abstract Syntax Trees (ASTs) to represent the hierarchical structure of a program. ASTs are the foundation for most compiler operations like semantic analysis, optimization, and code generation.

What is an Abstract Syntax Tree?

An Abstract Syntax Tree (AST) is a tree representation of the abstract syntactic structure of source code. Unlike parse trees, ASTs omit syntactic details like parentheses and focus on the essential structure and meaning of the code.

AST Node Types

Common AST node types include:

Program: The root node representing the entire program
Statements: Declarations, assignments, conditionals, loops
Expressions: Binary operations, literals, identifiers, function calls
Literals: Numbers, strings, booleans
Identifiers: Variable and function names

Example AST Structure

For the expression (a + b) * c, the AST would look like:

AST Traversal

There are several ways to traverse an AST:

Depth-First Traversal: Visit all children before siblings
Breadth-First Traversal: Visit all nodes at the same level before going deeper
Visitor Pattern: Define separate visitor objects for each node type

Implementing AST Traversal in JavaScript

Let's implement a visitor pattern to traverse our AST:

class ASTVisitor {
  visitProgram(node) {
    for (const statement of node.body) {
      this.visit(statement);
    }
  }
  
  visitBinaryExpression(node) {
    this.visit(node.left);
    this.visit(node.right);
  }
  
  visitVariableDeclaration(node) {
    if (node.initializer) {
      this.visit(node.initializer);
    }
  }
  
  visitExpressionStatement(node) {
    this.visit(node.expression);
  }
  
  visitIdentifier(node) {
    // Leaf node, nothing to traverse
  }
  
  visitLiteral(node) {
    // Leaf node, nothing to traverse
  }
  
  visitBlockStatement(node) {
    for (const statement of node.body) {
      this.visit(statement);
    }
  }
  
  visitGroupExpression(node) {
    this.visit(node.expression);
  }
  
  visit(node) {
    switch (node.type) {
      case 'Program': return this.visitProgram(node);
      case 'BinaryExpression': return this.visitBinaryExpression(node);
      case 'VariableDeclaration': return this.visitVariableDeclaration(node);
      case 'ExpressionStatement': return this.visitExpressionStatement(node);
      case 'Identifier': return this.visitIdentifier(node);
      case 'Literal': return this.visitLiteral(node);
      case 'BlockStatement': return this.visitBlockStatement(node);
      case 'GroupExpression': return this.visitGroupExpression(node);
      default:
        throw new Error(`Unknown node type: ${node.type}`);
    }
  }
}

// Example: AST Printer Visitor
class ASTPrinter extends ASTVisitor {
  constructor() {
    super();
    this.indentLevel = 0;
    this.result = '';
  }
  
  indent() {
    return ' '.repeat(this.indentLevel * 2);
  }
  
  visitProgram(node) {
    this.result += `${this.indent()}Program\n`;
    this.indentLevel++;
    super.visitProgram(node);
    this.indentLevel--;
  }
  
  visitBinaryExpression(node) {
    this.result += `${this.indent()}BinaryExpression: ${node.operator}\n`;
    this.indentLevel++;
    super.visitBinaryExpression(node);
    this.indentLevel--;
  }
  
  visitVariableDeclaration(node) {
    this.result += `${this.indent()}VariableDeclaration: ${node.kind} ${node.name}\n`;
    this.indentLevel++;
    super.visitVariableDeclaration(node);
    this.indentLevel--;
  }
  
  visitExpressionStatement(node) {
    this.result += `${this.indent()}ExpressionStatement\n`;
    this.indentLevel++;
    super.visitExpressionStatement(node);
    this.indentLevel--;
  }
  
  visitIdentifier(node) {
    this.result += `${this.indent()}Identifier: ${node.name}\n`;
  }
  
  visitLiteral(node) {
    this.result += `${this.indent()}Literal: ${node.value}\n`;
  }
  
  visitBlockStatement(node) {
    this.result += `${this.indent()}BlockStatement\n`;
    this.indentLevel++;
    super.visitBlockStatement(node);
    this.indentLevel--;
  }
  
  visitGroupExpression(node) {
    this.result += `${this.indent()}GroupExpression\n`;
    this.indentLevel++;
    super.visitGroupExpression(node);
    this.indentLevel--;
  }
  
  print(ast) {
    this.visit(ast);
    return this.result;
  }
}

// Example usage
const sourceCode = `
let x = 10;
let y = (x + 5) * 2;
`;

const lexer = new Lexer(sourceCode); // Assuming Lexer is defined
const tokens = lexer.tokenize();
const parser = new Parser(tokens); // Assuming Parser is defined
const ast = parser.parse();

const printer = new ASTPrinter();
console.log(printer.print(ast));

Output:

Program
  VariableDeclaration: let x
    Literal: 10
  VariableDeclaration: let y
    BinaryExpression: *
      GroupExpression
        BinaryExpression: +
          Identifier: x
          Literal: 5
      Literal: 2

Semantic Analysis with AST

A key use of ASTs is for semantic analysis, which checks for errors that syntax alone can't catch:

class SemanticAnalyzer extends ASTVisitor {
  constructor() {
    super();
    this.symbolTable = new Map(); // Track variable declarations
    this.errors = [];
  }
  
  visitVariableDeclaration(node) {
    // Check for redeclaration
    if (this.symbolTable.has(node.name)) {
      this.errors.push(`Variable '${node.name}' is already declared`);
    }
    
    // Add to symbol table
    this.symbolTable.set(node.name, {
      kind: node.kind,
      initialized: node.initializer !== null
    });
    
    super.visitVariableDeclaration(node);
  }
  
  visitIdentifier(node) {
    // Check for undefined variables
    if (!this.symbolTable.has(node.name)) {
      this.errors.push(`Variable '${node.name}' is not defined`);
    }
  }
  
  analyze(ast) {
    this.visit(ast);
    return this.errors;
  }
}

// Usage:
const analyzer = new SemanticAnalyzer();
const semanticErrors = analyzer.analyze(ast);

if (semanticErrors.length > 0) {
  console.error('Semantic errors found:');
  semanticErrors.forEach(error => console.error(` - ${error}`));
} else {
  console.log('No semantic errors found.');
}

AST Transformation

ASTs can be transformed for various purposes such as optimization:

class ConstantFoldingOptimizer extends ASTVisitor {
  visitBinaryExpression(node) {
    // First recursively optimize the children
    super.visitBinaryExpression(node);
    
    // If both operands are literals, fold the expression
    if (node.left.type === 'Literal' && node.right.type === 'Literal') {
      let result;
      switch (node.operator) {
        case '+': result = node.left.value + node.right.value; break;
        case '-': result = node.left.value - node.right.value; break;
        case '*': result = node.left.value * node.right.value; break;
        case '/': result = node.left.value / node.right.value; break;
        default: return node; // Can't fold
      }
      
      // Replace the binary expression with a literal
      return {
        type: 'Literal',
        value: result
      };
    }
    
    return node;
  }
  
  visit(node) {
    const originalNodeType = node.type;
    const optimizedNode = super.visit(node);
    
    // Return the potentially transformed node
    return optimizedNode || node;
  }
  
  optimize(ast) {
    return this.visit(JSON.parse(JSON.stringify(ast))); // Deep clone the AST
  }
}

AST Visualization

Common AST Operations

Symbol Table Construction: Track variables, functions, and their scopes
Type Checking: Ensure operations are performed on compatible types
Dead Code Elimination: Remove unreachable or unused code
Constant Propagation: Replace variables with their constant values
Code Generation: Translate the AST into target code

Conclusion

Abstract Syntax Trees are a powerful representation of program structure that enables semantic analysis, optimization, and code generation. By using visitor patterns and other traversal techniques, we can manipulate and analyze our program's structure effectively.

In the next section, we'll look at how to generate code from our AST to complete the compilation process.

What is an Abstract Syntax Tree?​

AST Node Types​

Example AST Structure​

AST Traversal​

Implementing AST Traversal in JavaScript​

Semantic Analysis with AST​

AST Transformation​

AST Visualization​

Common AST Operations​

Conclusion​