Conventional Commit Abstract Syntax Tree.
cast is a Working Draft for representing conventional commit messages in a syntax tree. It implements unist. It can represent conventional commit messages as defined by the Conventional Commits specification.
This document defines a format for representing conventional commit messages as an abstract syntax tree.
- Introduction
- Where this specification fits
- Types
- Nodes (abstract)
- Nodes
- Mixin
- Content model
- Conventional Commit Mapping
- Round-trip Conversion
- Examples
- Utilities
- References
Development of cast started in January 2025, as part of a conventional commit parser project that needed to provide both structured JSON output and AST-based transformations.
This specification is written in a Web IDL-like grammar.
cast extends unist, a format for syntax trees, to benefit from its ecosystem of utilities.
cast relates to JavaScript in that it has utilities for working with compliant syntax trees in JavaScript. However, cast is not limited to JavaScript and can be used in other programming languages.
cast relates to the unified ecosystem in that cast syntax trees can be used with unified processors for transformation, validation, and serialization tasks.
cast relates to conventional commits in that it provides a structured representation of commit messages that follow the conventional commit format, enabling programmatic analysis and transformation.
interface Literal <: UnistLiteral {
  value: string
}
Literal (UnistLiteral) represents an abstract interface in cast containing a value.
Its value field is a string.
interface Parent <: UnistParent {
  children: [CAstContent]
}
Parent (UnistParent) represents an abstract interface in cast containing other nodes (said to be children). Its content is limited to only other cast content.
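Because every Parent exposes a children array and every Literal does not, a tree can be traversed generically without knowing the concrete node types. The following is a minimal sketch; the helper name `walk` is hypothetical and is not part of cast or unist.

```javascript
// Hypothetical helper: depth-first, pre-order traversal of a cast tree.
function walk(node, visitor) {
  visitor(node);
  // Only Parent nodes carry a `children` array; Literal nodes do not.
  if (Array.isArray(node.children)) {
    for (const child of node.children) {
      walk(child, visitor);
    }
  }
}

// Usage: collect node types in visit order.
const tree = {
  type: 'root',
  children: [
    {
      type: 'header',
      children: [
        {type: 'type', value: 'feat'},
        {type: 'scope', value: 'api'}
      ]
    }
  ]
};
const seen = [];
walk(tree, (node) => seen.push(node.type));
// seen is now ['root', 'header', 'type', 'scope']
```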
interface Root <: Parent {
  type: 'root'
  children: [Content]
}
Root (Parent) represents a conventional commit message document. Root can be used as the root of a tree, never as a child. Its content model is content.
For example, the following commit message:
feat(api): add user authentication

This commit adds JWT-based authentication to the API.
It includes login, logout, and token refresh endpoints.

BREAKING CHANGE: authentication is now required for all API endpoints
Resolves: #123
Yields:
{
  type: 'root',
  children: [
    {
      type: 'header',
      children: [/* header content */]
    },
    {
      type: 'body',
      children: [/* body content */]
    },
    {
      type: 'footer',
      children: [/* footer content */]
    }
  ]
}
interface Header <: Parent {
  type: 'header'
  children: [HeaderContent]
}
Header (Parent) represents the first line of a conventional commit message. Header can be used where content is expected. Its content model is header content.
The header contains the commit type, optional scope, optional breaking change indicator, and description.
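To illustrate how the four header components map to nodes, here is a minimal sketch of a header parser. The function name `parseHeader` and the regular expression are assumptions made for this example; a real parser may accept a wider grammar.

```javascript
// Assumed header shape: `type(scope)!: description`, with scope and `!` optional.
const HEADER_PATTERN = /^([A-Za-z]+)(?:\(([^()]+)\))?(!)?: (.+)$/;

// Hypothetical helper: turn one header line into a cast `header` node.
function parseHeader(line) {
  const match = HEADER_PATTERN.exec(line);
  if (match === null) {
    throw new Error(`Not a conventional commit header: ${line}`);
  }
  const [, type, scope, bang, description] = match;
  const breaking = bang === '!';
  const children = [{type: 'type', value: type}];
  if (scope !== undefined) {
    children.push({type: 'scope', value: scope});
  }
  if (breaking) {
    // Bang is a Parent holding a single Text node.
    children.push({type: 'bang', children: [{type: 'text', value: '!'}]});
  }
  children.push({
    type: 'description',
    breaking,
    value: description,
    children: [{type: 'text', value: description}]
  });
  return {type: 'header', children};
}

const header = parseHeader('feat(api)!: add user authentication');
// header.children holds type, scope, bang, and description nodes, in that order.
```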
interface Type <: Literal {
  type: 'type'
  value: string
}
Type (Literal) represents the type of change being committed.
Type can be used where header content is expected.
Its content is represented by its value field.
Common conventional commit types include: feat, fix, docs, style, refactor, test, chore, etc.
For example:
{type: 'type', value: 'feat'}
interface Scope <: Literal {
  type: 'scope'
  value: string
}
Scope (Literal) represents the scope of the change being committed.
Scope can be used where header content is expected.
Its content is represented by its value field.
The scope is optional and appears in parentheses after the type.
For example:
{type: 'scope', value: 'api'}
interface Bang <: Parent {
  type: 'bang'
  children: [Text]
}
Bang (Parent) represents a breaking change indicator in the header.
Bang can be used where header content is expected.
Its content model consists of a single Text node containing '!'.
The breaking change indicator appears as an exclamation mark (!) after the type or scope.
The Bang node is a syntactic element used for round-trip conversion.
The semantic meaning of a breaking change is captured by the breaking field on Description and Trailer nodes.
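Since the breaking semantics live on Description and Trailer nodes rather than on Bang, a consumer can decide whether a commit is breaking by reading only those fields. A minimal sketch, with the hypothetical helper name `isBreaking`:

```javascript
// Hypothetical helper: a commit is breaking when its description or any
// trailer carries `breaking: true`. Bang nodes are ignored on purpose,
// because they are purely syntactic.
function isBreaking(root) {
  for (const section of root.children) {
    for (const node of section.children || []) {
      const carriesFlag = node.type === 'description' || node.type === 'trailer';
      if (carriesFlag && node.breaking === true) {
        return true;
      }
    }
  }
  return false;
}

const commit = {
  type: 'root',
  children: [
    {type: 'header', children: [
      {type: 'type', value: 'feat'},
      {type: 'description', breaking: false, value: 'add login', children: [
        {type: 'text', value: 'add login'}
      ]}
    ]},
    {type: 'footer', children: [
      {type: 'trailer', breaking: true, children: [
        {type: 'trailerkey', children: [{type: 'text', value: 'BREAKING CHANGE'}]},
        {type: 'trailervalue', children: [{type: 'text', value: 'API has changed'}]}
      ]}
    ]}
  ]
};
const breaking = isBreaking(commit); // true, because of the trailer
```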
For example:
{
  type: 'bang',
  children: [
    {type: 'text', value: '!'}
  ]
}
interface Description <: Parent {
  type: 'description'
  children: [TextContent]
  breaking: boolean
  value: string
}
Description (Parent) represents the description of the change. Description can be used where header content is expected. Its content model is text content.
The breaking field is true when the commit header contains a breaking change indicator (!), and false otherwise.
The value field contains the complete description text extracted from all child text nodes, providing convenient access to the description without traversing the children array.
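One way to keep the value field in sync with the children is to derive it by concatenation. This is a sketch only; `descriptionValue` is a hypothetical helper, and it relies on both Text and IssueReference nodes exposing a string value.

```javascript
// Hypothetical helper: rebuild a description's `value` from its children.
function descriptionValue(description) {
  return description.children.map((child) => child.value).join('');
}

const description = {
  type: 'description',
  breaking: false,
  value: '',
  children: [
    {type: 'text', value: 'close '},
    {type: 'issueReference', value: '#42', prefix: '#', id: 42}
  ]
};
description.value = descriptionValue(description); // 'close #42'
```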
For example:
{
  type: 'description',
  breaking: false,
  value: 'add user authentication',
  children: [
    {type: 'text', value: 'add user authentication'}
  ]
}
Example with breaking change:
{
  type: 'description',
  breaking: true,
  value: 'add user authentication',
  children: [
    {type: 'text', value: 'add user authentication'}
  ]
}
interface Body <: Parent {
  type: 'body'
  children: [BodyContent]
}
Body (Parent) represents the body of the commit message. Body can be used where content is expected. Its content model is body content.
The body provides additional context about the change and is separated from the header by a blank line.
For example:
{
  type: 'body',
  children: [
    {
      type: 'line',
      children: [
        {type: 'text', value: 'This commit adds JWT-based authentication to the API.'}
      ]
    },
    {
      type: 'line',
      children: [
        {type: 'text', value: 'It includes login, logout, and token refresh endpoints.'}
      ]
    }
  ]
}
interface Footer <: Parent {
  type: 'footer'
  children: [FooterContent]
}
Footer (Parent) represents the footer section of the commit message. Footer can be used where content is expected. Its content model is footer content.
The footer contains git trailers and is separated from the body by a blank line.
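The blank-line separation between header, body, and footer suggests a simple way to split a raw message before building nodes. The sketch below is an assumption-laden illustration: `splitSections` and `TRAILER_LINE` are hypothetical names, and the footer heuristic (the last paragraph counts as a footer only if its first line looks like a trailer) would misclassify a body paragraph that starts with something like `Note: ...`.

```javascript
// Assumed trailer start: `Token: value`, `Token #value`, or `BREAKING CHANGE: value`.
const TRAILER_LINE = /^(BREAKING CHANGE|[A-Za-z][A-Za-z0-9-]*)(: | #)/;

// Hypothetical helper: split a raw commit message into its three sections.
function splitSections(message) {
  const lines = message.split('\n');
  const header = lines[0];

  // Group the remaining lines into paragraphs separated by blank lines.
  const paragraphs = [];
  let current = [];
  for (const line of lines.slice(1)) {
    if (line.trim() === '') {
      if (current.length > 0) paragraphs.push(current);
      current = [];
    } else {
      current.push(line);
    }
  }
  if (current.length > 0) paragraphs.push(current);

  // The last paragraph is the footer only if it starts with a trailer.
  let footer = [];
  const last = paragraphs[paragraphs.length - 1];
  if (last !== undefined && TRAILER_LINE.test(last[0])) {
    footer = paragraphs.pop();
  }

  const body = paragraphs.map((paragraph) => paragraph.join('\n')).join('\n\n');
  return {header, body, footer};
}
```

The returned `footer` is an array of trailer lines, `body` is the remaining text with paragraph breaks preserved, and `header` is the first line unchanged.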
interface Trailer <: Parent {
  type: 'trailer'
  children: [TrailerContent]
  breaking: boolean
}
Trailer (Parent) represents a single git trailer in the footer. Trailer can be used where footer content is expected. Its content model is trailer content.
A trailer consists of a token and a value separated by a colon.
The breaking field is true when the trailer represents a breaking change (e.g., BREAKING CHANGE: or BREAKING-CHANGE:), and false otherwise.
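The token/value split and the breaking rule can be sketched as follows. `parseTrailer` and its pattern are hypothetical; for brevity the value is kept as a single Text node, whereas a fuller parser would also extract issue references.

```javascript
// Assumed trailer shape: a token, a colon and a space, then the value.
const TRAILER_PATTERN = /^(BREAKING CHANGE|[A-Za-z][A-Za-z0-9-]*): (.*)$/;

// Hypothetical helper: turn one footer line into a cast `trailer` node,
// or return null when the line is not a trailer.
function parseTrailer(line) {
  const match = TRAILER_PATTERN.exec(line);
  if (match === null) {
    return null;
  }
  const [, key, value] = match;
  return {
    type: 'trailer',
    // Both spellings mark a breaking change.
    breaking: key === 'BREAKING CHANGE' || key === 'BREAKING-CHANGE',
    children: [
      {type: 'trailerkey', children: [{type: 'text', value: key}]},
      {type: 'trailervalue', children: [{type: 'text', value}]}
    ]
  };
}
```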
For example:
{
  type: 'trailer',
  breaking: false,
  children: [
    {type: 'trailerkey', children: [{type: 'text', value: 'Resolves'}]},
    {type: 'trailervalue', children: [{type: 'text', value: '#123'}]}
  ]
}
Breaking change trailer:
{
  type: 'trailer',
  breaking: true,
  children: [
    {type: 'trailerkey', children: [{type: 'text', value: 'BREAKING CHANGE'}]},
    {type: 'trailervalue', children: [{type: 'text', value: 'API has changed'}]}
  ]
}
interface TrailerKey <: Parent {
  type: 'trailerkey'
  children: [TextContent]
}
TrailerKey (Parent) represents the key part of a git trailer. TrailerKey can be used where trailer content is expected. Its content model is text content.
Common trailer keys include: BREAKING CHANGE, Resolves, Fixes, Reviewed-by, Co-authored-by, etc.
For example:
{
  type: 'trailerkey',
  children: [
    {type: 'text', value: 'Resolves'}
  ]
}
interface TrailerValue <: Parent {
  type: 'trailervalue'
  children: [TextContent]
}
TrailerValue (Parent) represents the value part of a git trailer. TrailerValue can be used where trailer content is expected. Its content model is text content.
interface Line <: Parent {
  type: 'line'
  children: [TextContent]
}
Line (Parent) represents a single line of text in the body or trailer value. Line can be used where body content is expected. Its content model is text content.
Lines are separated by newline characters and can contain both plain text and issue references, allowing for precise tracking of inline elements.
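Splitting a line into plain text and issue references can be sketched with a single global regular expression. The helper name `parseLine` is hypothetical, and the pattern only recognizes the `#123` and `GH-456` forms; other reference styles are out of scope for this sketch.

```javascript
// Assumed reference forms: `#123` and `GH-456`.
const ISSUE_PATTERN = /(#|GH-)(\d+)/g;

// Hypothetical helper: turn one line of text into a cast `line` node.
function parseLine(text) {
  const children = [];
  let lastIndex = 0;
  for (const match of text.matchAll(ISSUE_PATTERN)) {
    // Emit the plain text that precedes this reference, if any.
    if (match.index > lastIndex) {
      children.push({type: 'text', value: text.slice(lastIndex, match.index)});
    }
    children.push({
      type: 'issueReference',
      value: match[0],
      prefix: match[1],
      id: Number(match[2])
    });
    lastIndex = match.index + match[0].length;
  }
  // Emit any trailing text after the last reference.
  if (lastIndex < text.length) {
    children.push({type: 'text', value: text.slice(lastIndex)});
  }
  return {type: 'line', children};
}
```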
For example:
{
  type: 'line',
  children: [
    {type: 'text', value: 'This fixes issue '},
    {type: 'issueReference', value: '#123', prefix: '#', id: 123},
    {type: 'text', value: ' in the parser'}
  ]
}
interface Text <: Literal {
  type: 'text'
  value: string
}
Text (Literal) represents textual content.
Text can be used where text content is expected.
Its content is represented by its value field.
For example:
{type: 'text', value: 'add user authentication'}
interface IssueReference <: Literal {
  type: 'issueReference'
  value: string
  prefix: string
  id: number
}
IssueReference (Literal) represents a reference to an issue or pull request.
IssueReference can be used where text content is expected.
Its content is represented by its value field, with additional prefix and id fields for structured access.
For example:
{
  type: 'issueReference',
  value: '#123',
  prefix: '#',
  id: 123
}
interface mixin PositionalInfo {
  position: Position?
}
PositionalInfo represents positional information of a node in the source commit message. This mixin can be applied to any node to preserve source location information for error reporting and source mapping.
All CAST nodes should include position information when parsed from source text to enable:
- Precise error reporting with line and column numbers
- Source mapping for transformations
- IDE integrations with hover information and diagnostics
- Linting tools with exact error locations
interface Position {
  start: Point
  end: Point
}
Position represents the location of a node in a source commit message.
The start field represents the place of the first character of the node.
The end field represents the place of the first character after the node.
interface Point {
  line: number >= 1
  column: number >= 1
  offset: number >= 0
}
Point represents one place in a source commit message.
The line field (1-indexed integer) represents a line in the source.
The column field (1-indexed integer) represents a column in the source.
The offset field (0-indexed integer) represents a character in the source.
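The relationship between the three fields can be shown by deriving line and column from an offset. This is a sketch with a hypothetical helper name, `pointAt`; it counts offsets in UTF-16 code units, which is an assumption of this example rather than a requirement stated above.

```javascript
// Hypothetical helper: compute a Point for a 0-indexed offset into `source`.
function pointAt(source, offset) {
  if (offset < 0 || offset > source.length) {
    throw new RangeError(`Offset out of range: ${offset}`);
  }
  let line = 1;
  let lineStart = 0; // offset of the first character on the current line
  for (let index = 0; index < offset; index++) {
    if (source[index] === '\n') {
      line++;
      lineStart = index + 1;
    }
  }
  // Lines and columns are 1-indexed; the offset stays 0-indexed.
  return {line, column: offset - lineStart + 1, offset};
}

const source = 'feat: add\n\nbody';
const start = pointAt(source, 0);  // {line: 1, column: 1, offset: 0}
const body = pointAt(source, 11);  // {line: 3, column: 1, offset: 11}
```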
When converting from CST to CAST, position information should be preserved as follows:
- Token-based nodes (Type, Scope): Use exact token positions
- Composite nodes (Header, Body, Footer, TrailerKey, TrailerValue): Span from first to last child
- Text nodes: Preserve exact character ranges including whitespace
- Issue references: Use substring positions within trailer values
Position information is optional but strongly recommended for nodes parsed from source text.
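For composite nodes, the "span from first to last child" rule can be sketched as below. `spanOf` is a hypothetical helper; because position is optional, children without one are skipped.

```javascript
// Hypothetical helper: derive a composite node's Position from its children.
// Returns undefined when no child carries positional information.
function spanOf(children) {
  const positioned = children.filter((child) => child.position != null);
  if (positioned.length === 0) {
    return undefined;
  }
  return {
    start: positioned[0].position.start,
    end: positioned[positioned.length - 1].position.end
  };
}

// Usage: position a header built from `feat(api)`.
const typeNode = {
  type: 'type',
  value: 'feat',
  position: {
    start: {line: 1, column: 1, offset: 0},
    end: {line: 1, column: 5, offset: 4}
  }
};
const scopeNode = {
  type: 'scope',
  value: 'api',
  position: {
    start: {line: 1, column: 6, offset: 5},
    end: {line: 1, column: 9, offset: 8}
  }
};
const headerPosition = spanOf([typeNode, scopeNode]);
// headerPosition starts at offset 0 and ends at offset 8.
```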
type CAstContent = Content
Each node in cast falls into one or more categories of Content that group nodes with similar characteristics together.
type Content = Header | Body | Footer
Content represents the top-level sections of a conventional commit message.
type HeaderContent = Type | Scope | Bang | Description
Header content represents the components that can appear in the commit header.
type BodyContent = Line
Body content represents the content that can appear in the commit body. Body content consists of line nodes, which can contain text and issue references.
type FooterContent = Trailer
Footer content represents the content that can appear in the commit footer.
type TrailerContent = TrailerKey | TrailerValue
Trailer content represents the components of a git trailer.
type TextContent = Text | IssueReference
Text content represents textual content that may contain issue references.
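The content model lends itself to a table-driven check. The sketch below encodes the categories above as a map from parent type to allowed child types; the names `ALLOWED_CHILDREN` and `validate` are hypothetical, and only the parent/child relationship is checked (not field types or ordering).

```javascript
// Allowed child types per parent type, transcribed from the content model.
const ALLOWED_CHILDREN = new Map([
  ['root', ['header', 'body', 'footer']],
  ['header', ['type', 'scope', 'bang', 'description']],
  ['bang', ['text']],
  ['description', ['text', 'issueReference']],
  ['body', ['line']],
  ['line', ['text', 'issueReference']],
  ['footer', ['trailer']],
  ['trailer', ['trailerkey', 'trailervalue']],
  ['trailerkey', ['text', 'issueReference']],
  ['trailervalue', ['text', 'issueReference']]
]);

// Hypothetical helper: collect content model violations below `node`.
function validate(node, path = node.type, errors = []) {
  const allowed = ALLOWED_CHILDREN.get(node.type);
  if (allowed === undefined) {
    return errors; // Literal nodes have no children to check.
  }
  (node.children || []).forEach((child, index) => {
    const childPath = `${path}.children[${index}]`;
    if (!allowed.includes(child.type)) {
      errors.push(`${childPath}: ${child.type} is not allowed in ${node.type}`);
    }
    validate(child, childPath, errors);
  });
  return errors;
}
```

An empty array means the tree satisfies the content model; each entry otherwise names the offending path.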
This section maps elements of the Conventional Commits specification to cast nodes:
| Conventional Commit Element | CAST Node | Description |
|---|---|---|
| `<type>` | Type | The type of change (feat, fix, etc.) |
| `(<scope>)` | Scope | Optional scope in parentheses |
| `!` | Bang | Breaking change indicator |
| `<description>` | Description | Short description of the change |
| Body paragraph | Body | Extended description |
| `<token>: <value>` | Trailer | Git trailer (footer) |
| `BREAKING CHANGE:` | Trailer (special) | Breaking change description |
| `#123`, `GH-456` | IssueReference | Issue/PR references |
The cast specification is designed to support lossless round-trip conversion:
- Parse: Commit message text → CAST
- Transform: Modify the CAST (validate, lint, reformat)
- Serialize: CAST → Commit message text
Key design principles for round-trip compatibility:
- Preserve whitespace: Significant whitespace is preserved in Text nodes
- Maintain structure: All structural elements are represented as nodes
- Position tracking: Optional positional information preserves source locations
- No information loss: All parts of the original commit message are represented
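The serialize step can be sketched as a small recursive printer. This is an illustration under stated assumptions, not a lossless implementation: it assumes that separators (parentheses around the scope, `: ` after the header prefix and trailer key, newlines between lines, and blank lines between sections) are not stored in Text nodes, so it always emits them in normalized form. The function names are hypothetical.

```javascript
// Concatenate the text of a node: Parents use their children, Literals their value.
function textOf(node) {
  if (Array.isArray(node.children)) {
    return node.children.map(textOf).join('');
  }
  return node.value;
}

function serializeHeader(header) {
  let out = '';
  for (const node of header.children) {
    if (node.type === 'scope') {
      out += `(${textOf(node)})`;
    } else if (node.type === 'description') {
      out += `: ${textOf(node)}`;
    } else {
      out += textOf(node); // type and bang are emitted verbatim
    }
  }
  return out;
}

function serializeTrailer(trailer) {
  const [key, value] = trailer.children;
  return `${textOf(key)}: ${textOf(value)}`;
}

// Hypothetical entry point: cast tree -> commit message text.
function serialize(root) {
  const sections = root.children.map((section) => {
    switch (section.type) {
      case 'header':
        return serializeHeader(section);
      case 'body':
        return section.children.map(textOf).join('\n');
      case 'footer':
        return section.children.map(serializeTrailer).join('\n');
      default:
        throw new Error(`Unknown section: ${section.type}`);
    }
  });
  // Sections are separated by one blank line.
  return sections.join('\n\n');
}
```

A strictly lossless serializer would instead need to consult positional information or whitespace-bearing Text nodes to reproduce unusual spacing exactly.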
Input:
feat: add user authentication
AST:
{
  type: 'root',
  breaking: false,
  children: [
    {
      type: 'header',
      children: [
        {type: 'type', value: 'feat'},
        {type: 'description', breaking: false, value: 'add user authentication', children: [
          {type: 'text', value: 'add user authentication'}
        ]}
      ]
    }
  ]
}
Input:
feat(api)!: add user authentication

This commit adds JWT-based authentication to the API.
It includes login, logout, and token refresh endpoints.

BREAKING CHANGE: authentication is now required for all API endpoints
Resolves: #123
Co-authored-by: Jane Doe <jane@example.com>
AST:
{
  type: 'root',
  breaking: true,
  children: [
    {
      type: 'header',
      children: [
        {type: 'type', value: 'feat'},
        {type: 'scope', value: 'api'},
        {type: 'bang', children: [{type: 'text', value: '!'}]},
        {type: 'description', breaking: true, value: 'add user authentication', children: [
          {type: 'text', value: 'add user authentication'}
        ]}
      ]
    },
    {
      type: 'body',
      children: [
        {type: 'line', children: [
          {type: 'text', value: 'This commit adds JWT-based authentication to the API.'}
        ]},
        {type: 'line', children: [
          {type: 'text', value: 'It includes login, logout, and token refresh endpoints.'}
        ]}
      ]
    },
    {
      type: 'footer',
      children: [
        {
          type: 'trailer',
          breaking: true,
          children: [
            {type: 'trailerkey', children: [{type: 'text', value: 'BREAKING CHANGE'}]},
            {type: 'trailervalue', children: [
              {type: 'text', value: 'authentication is now required for all API endpoints'}
            ]}
          ]
        },
        {
          type: 'trailer',
          breaking: false,
          children: [
            {type: 'trailerkey', children: [{type: 'text', value: 'Resolves'}]},
            {type: 'trailervalue', children: [
              {type: 'issueReference', value: '#123', prefix: '#', id: 123}
            ]}
          ]
        },
        {
          type: 'trailer',
          breaking: false,
          children: [
            {type: 'trailerkey', children: [{type: 'text', value: 'Co-authored-by'}]},
            {type: 'trailervalue', children: [
              {type: 'text', value: 'Jane Doe <jane@example.com>'}
            ]}
          ]
        }
      ]
    }
  ]
}
- unist: Universal Syntax Tree. T. Wormer; et al.
- Conventional Commits: Conventional Commits.
- Git Trailers: Git Trailers Documentation.
- JavaScript: ECMAScript Language Specification. Ecma International.
- Web IDL: Web IDL. C. McCormack. W3C.