Contract Extraction
The contract extraction system parses source code files to discover what entities and routines are actually implemented. This information is compared against CODEMANIFEST declarations to verify completeness.
Entry Point
contract(language: str, cell_path: str) -> list[EntityContract | RoutineContract]
- language -- the programming language of the source code, used to dispatch to the correct parser.
- cell_path -- the file or directory path containing the source code.
- Returns a list of EntityContract and RoutineContract instances representing what the code implements.
The language parameter is resolved from the project configuration. Based on its value, the system selects the appropriate tree-sitter parser.
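The dispatch step can be pictured as a lookup from language name to extractor. The following is a minimal sketch, not the project's actual implementation: the EXTRACTORS registry, the extract_python and extract_go helpers, and the lowercase language keys are all hypothetical names introduced for illustration.

```python
from typing import Callable

# Hypothetical per-language extractors. In the real system each one would
# wrap the matching tree-sitter parser; here they are empty placeholders.
def extract_python(cell_path: str) -> list:
    return []

def extract_go(cell_path: str) -> list:
    return []

# Hypothetical registry mapping a language name to its extractor.
EXTRACTORS: dict[str, Callable[[str], list]] = {
    "python": extract_python,
    "go": extract_go,
}

def contract(language: str, cell_path: str) -> list:
    """Select the extractor for `language` and run it on `cell_path`."""
    try:
        extractor = EXTRACTORS[language.lower()]
    except KeyError:
        raise ValueError(f"unsupported language: {language!r}") from None
    return extractor(cell_path)
```

A table-driven dispatch like this keeps adding a language down to one new registry entry, and an unknown language fails loudly instead of silently returning an empty list.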
Data Structures
BaseContract
Abstract base for all extracted contracts.
| Property | Type | Description |
|---|---|---|
| name | str | The name of the entity or routine. |
| signature | str | The type signature as declared in code. |
| contract | str | The contract identifier linking to the CODEMANIFEST declaration. |
EntityContract
Represents a class, struct, or interface extracted from source code.
| Property | Type | Description |
|---|---|---|
| properties | list[PropertyContract] | Properties of the entity. |
| methods | list[MethodContract] | Methods of the entity. |
Inherits name, signature, and contract from BaseContract.
PropertyContract
A property belonging to an entity.
| Property | Type | Description |
|---|---|---|
| name | str | Property name. |
| signature | str | Property type signature. |
MethodContract
A method belonging to an entity.
| Property | Type | Description |
|---|---|---|
| name | str | Method name. |
| signature | str | Method signature including parameters and return type. |
RoutineContract
Represents a standalone function (not belonging to any entity).
| Property | Type | Description |
|---|---|---|
| name | str | Function name. |
| signature | str | Function signature. |
| contract | str | Contract identifier. |
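One way to picture how the structures in this section fit together is as plain dataclasses. This is a sketch under assumptions: the field names and types come from the tables above, but the use of dataclasses, the empty-list defaults, and RoutineContract inheriting from BaseContract are guesses rather than confirmed details.

```python
from dataclasses import dataclass, field

@dataclass
class BaseContract:
    name: str        # name of the entity or routine
    signature: str   # type signature as declared in code
    contract: str    # identifier linking to the CODEMANIFEST declaration

@dataclass
class PropertyContract:
    name: str
    signature: str

@dataclass
class MethodContract:
    name: str
    signature: str

@dataclass
class EntityContract(BaseContract):
    # Empty-list defaults are an assumption made for convenience.
    properties: list[PropertyContract] = field(default_factory=list)
    methods: list[MethodContract] = field(default_factory=list)

@dataclass
class RoutineContract(BaseContract):
    pass

# Illustrative values only; the real identifier format is not specified here.
user = EntityContract(
    name="User",
    signature="class User",
    contract="User",
    properties=[PropertyContract("email", "str")],
    methods=[MethodContract("rename", "(self, new_name: str) -> None")],
)
```

Members are nested inside their entity, so the flat list returned by contract() only has to hold top-level entities and routines.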
Supported Languages
Each language uses a dedicated tree-sitter grammar to parse source code into an AST, then walks the AST to extract contract information.
Python (tree-sitter-python)
Extracts:
- Classes with their methods, properties, and decorators.
- Standalone functions.
- Method signatures including parameter lists and return annotations.
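To make the Python extraction concrete, the sketch below pulls classes, annotated properties, methods, and standalone functions out of a small source string. Note the substitution: it uses Python's standard ast module rather than tree-sitter-python, so that it runs without third-party packages. The extract and signature_of helpers and the dictionary layout are hypothetical, and decorators and async functions are left out for brevity.

```python
import ast

SOURCE = '''
class User:
    email: str

    def rename(self, new_name: str) -> None:
        pass

def load_user(user_id: int) -> User:
    pass
'''

def signature_of(fn: ast.FunctionDef) -> str:
    """Render the parameter list and return annotation as source text."""
    returns = f" -> {ast.unparse(fn.returns)}" if fn.returns else ""
    return f"({ast.unparse(fn.args)}){returns}"

def extract(source: str) -> dict:
    """Collect top-level classes (entities) and functions (routines)."""
    entities, routines = {}, {}
    for node in ast.parse(source).body:  # top-level declarations only
        if isinstance(node, ast.ClassDef):
            methods = {
                m.name: signature_of(m)
                for m in node.body
                if isinstance(m, ast.FunctionDef)
            }
            # Only annotated class attributes are treated as properties.
            properties = {
                p.target.id: ast.unparse(p.annotation)
                for p in node.body
                if isinstance(p, ast.AnnAssign) and isinstance(p.target, ast.Name)
            }
            entities[node.name] = {"methods": methods, "properties": properties}
        elif isinstance(node, ast.FunctionDef):
            routines[node.name] = signature_of(node)
    return {"entities": entities, "routines": routines}

result = extract(SOURCE)
```

Here result["routines"] maps load_user to "(user_id: int) -> User", and the User entity carries one property (email) and one method (rename). A tree-sitter walk has the same shape, only with grammar-specific node types in place of ast classes.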
Go (tree-sitter-go)
Extracts:
- Structs and interfaces.
- Functions and methods (including receiver functions).
- Struct fields as properties.
Kotlin (tree-sitter-kotlin)
Extracts:
- Classes and data classes.
- Functions and extension functions.
- Properties with their type annotations.
Swift (tree-sitter-swift)
Extracts:
- Classes, structs, and protocols.
- Functions and methods.
- Properties with access level and type.
JavaScript (tree-sitter-javascript)
Extracts:
- Classes with methods.
- Functions and arrow functions.
- Class properties.
Extraction Process
For each supported language, the extraction follows the same pattern:
- Read source files -- load all relevant source files from cell_path.
- Parse with tree-sitter -- build a syntax tree using the language-specific grammar.
- Walk the tree -- traverse the AST looking for declarations (classes, functions, methods, properties).
- Build contracts -- for each declaration, create the corresponding contract object with name, signature, and nested members.
- Return results -- collect all contracts into a flat list.
The extracted contracts are then compared against the CODEMANIFEST declarations to identify missing implementations or undeclared code.
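The comparison itself can be sketched as two set differences. This is an illustration under assumptions: it matches declarations to implementations purely by contract identifier, and the compare function and its result keys are hypothetical. The actual comparison may also check signatures or nested members.

```python
def compare(declared: set[str], extracted: set[str]) -> dict[str, list[str]]:
    """Report identifiers present on only one side of the comparison."""
    return {
        # Declared in CODEMANIFEST but not found in the source code.
        "missing": sorted(declared - extracted),
        # Found in the source code but never declared.
        "undeclared": sorted(extracted - declared),
    }

report = compare(
    declared={"User", "load_user"},
    extracted={"User", "helper"},
)
# report == {"missing": ["load_user"], "undeclared": ["helper"]}
```

Sorting the results keeps the report stable between runs, which makes it easier to diff in CI output.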
Where to Next
- AST Factory -- how CODEMANIFEST declarations are parsed.
- Validation Rules -- rules that validate contract declarations.
- Architecture Overview -- the full pipeline context.