Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Overview

The Wipple compiler is designed around a database of nodes, where nodes can have arbitrary facts. A node is a handle to an expression, pattern, type, or other program element. Facts represent the results of parsing (e.g., Syntax), name resolution (e.g., Defined and Resolved), type checking (e.g., Typed and Bounds), and other phases. The IDE can query these facts to obtain the information it needs for error messages, autocomplete, etc.

Syntax

Nodes that represent program elements written directly in the source code have the Syntax fact, which contains a span (a file path and range) and the underlying AST value. The AST is generated by a recursive-descent parser.

All AST values implement Visit, which the compiler calls to traverse the program and perform name resolution and instantiation. Most nodes call visit on their children directly, but some forward to other nodes created while visiting. For example, operators (such as 1 + 2) instead create a virtual function call node (such as Add 1 2) and visit that.

Type checking

Wipple’s type system operates on groups of nodes. Concrete types, such as String, are assigned to the group as a whole rather than individual expressions.

The type checker accepts constraints (defined on nodes) as input. There are five types of constraints:

  1. Group constraints place two nodes in the same group.

  2. Type constraints assign a concrete type to the group of a representative node. When more than one concrete type is assigned to the same group, the types are unified if possible.

    For example, the expression n :: Number would generate a group constraint between n and Number. Then, the node representing Number would generate a type constraint for the concrete type Number, resulting in n and Number sharing this type.

    Of note is that the node representing the source code Number and the concrete type Number are different objects in the compiler — most nodes representing types in the source code contribute equivalent type constraints, but some contribute different constraints. For instance, placeholders (_) do not contribute any constraints, and instead inherit their type from the group they belong to; _ is not a concrete type. This allows code like (1, 2, 3) :: List _ to resolve to List Number. Similarly, the first time a type parameter is used, it contributes a type constraint, but subsequent uses instead contribute group constraints linking back to the first use.

  3. Instantiate constraints take a generic definition and makes a copy of all of the definition’s constraints, where any mentioned type parameters are replaced with fresh nodes, and where mentions of the definition node are replaced by the instantiated node. This design straightforwardly preserves groups, bounds, and other information beyond the definition’s concrete type.

    For example, the definition id :: value -> value would generate a group constraint between the two values, and a type constraint assigning the concrete type parameter value to this group. Then, calling id 123 would generate a fresh node representing the instantiated value, along with the original group constraint and type constraint modified to use this fresh node. Finally, the literal 123 generates a type constraint assigning Number to the instantiated value group, resulting in the function call expression id 123 also having type Number.

  4. Bound constraints resolve trait bounds.

    To resolve bounds, the type checker iterates over each available instance, unifying the instance’s instantiated parameters with the bound’s. If any errors occur during unification, the instance is discarded and the type checker is restored to its previous state. For example, given the instances As-Sequence String String and As-Sequence (List element) element and the bound As-Sequence (List Number) x, the first instance fails because String and List _ do not unify, and the second instance succeeds, assigning Number to the group containing x.

    During bound resolution, if more than one instance matches, the bound is considered ambiguous and is added back to the queue. To prevent infinite loops, the type checker stops if no progress is made (i.e., a concrete type is assigned to a group or a bound is resolved) after iterating through the entire queue.

  5. Default constraints act like type constraints, but only if the node’s group does not already have another concrete type.

Querying and feedback generation

Feedback (error messages) is created by querying the database for nodes with specific facts. For example, the conflicting-types error message is produced for any node belonging to a group with multiple concrete types.

Code generation

Wipple compiles to an intermediate representation (IR) via a Codegen fact. Then, the IR is monomorphized and compiled to JavaScript.