Oleg Tsybulskyi alogic0

  • Germantown, Tennessee, USA
@alogic0
alogic0 / zig_vtables.md
Last active May 19, 2026 03:57
Zig vtables

In Zig, vtables (virtual tables) are a manual pattern used to implement dynamic dispatch, allowing different types to be used through a single, shared interface. [^1][^2][^3][^4][^5] Unlike languages like C++ or Java, Zig does not have "classes" or "interfaces" as built-in keywords. Instead, you construct vtables yourself by creating a struct of function pointers. [^6][^7][^8][^9][^10]

How Zig vtables Work

A typical Zig interface is a "fat pointer" consisting of two main parts: [^3][^11]

  1. A Context Pointer (ptr): A pointer to the specific implementation's data, usually typed as *anyopaque.
  2. A VTable Pointer (vtable): A pointer to a constant struct containing function pointers that define the interface's behavior. [^3][^12][^13]
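
The two-part layout above can be sketched as follows. This is a minimal illustration, not code from the standard library: `Shape`, `Rect`, `Square`, and `areaImpl` are hypothetical names chosen for the example.

```zig
const std = @import("std");

// Hypothetical interface: a fat pointer made of a context pointer and a vtable pointer.
const Shape = struct {
    ptr: *anyopaque,
    vtable: *const VTable,

    const VTable = struct {
        area: *const fn (ctx: *anyopaque) f64,
    };

    // Dynamic dispatch: look up the function in the vtable and pass the context along.
    pub fn area(self: Shape) f64 {
        return self.vtable.area(self.ptr);
    }
};

const Rect = struct {
    w: f64,
    h: f64,

    fn areaImpl(ctx: *anyopaque) f64 {
        // Recover the concrete type from the type-erased context pointer.
        const self: *Rect = @ptrCast(@alignCast(ctx));
        return self.w * self.h;
    }

    // One constant vtable shared by every Rect.
    const vtable = Shape.VTable{ .area = areaImpl };

    pub fn shape(self: *Rect) Shape {
        return .{ .ptr = self, .vtable = &vtable };
    }
};

const Square = struct {
    side: f64,

    fn areaImpl(ctx: *anyopaque) f64 {
        const self: *Square = @ptrCast(@alignCast(ctx));
        return self.side * self.side;
    }

    const vtable = Shape.VTable{ .area = areaImpl };

    pub fn shape(self: *Square) Shape {
        return .{ .ptr = self, .vtable = &vtable };
    }
};

pub fn main() void {
    var r = Rect{ .w = 2, .h = 3 };
    var s = Square{ .side = 4 };
    // Both values are used through the same interface type.
    const shapes = [_]Shape{ r.shape(), s.shape() };
    for (shapes) |sh| std.debug.print("area = {d}\n", .{sh.area()});
}
```

The caller only ever sees `Shape`; which `areaImpl` runs is decided at runtime by the vtable pointer stored in each value.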

Example: std.mem.Allocator

The most prominent use of this pattern in the Zig Standard Library is the Allocator. It allows functions to accept any allocator (like ArenaAllocator or GeneralPurposeAllocator) without knowing exactly which one is being used at co

alogic0 / lexer_4.md
Created April 26, 2026 07:58
Phase 4: Incrementalism and the "Dirty" Range

Now we reach the defining feature of Tree-sitter: Incremental Parsing. In a traditional parser, if a user has a 10,000-line file and types a single }, you re-parse the whole file. In Tree-sitter, you only re-parse what changed.

To do this in Zig, we need to introduce Edit Mapping and Node Reuse.


1. The Edit Structure

When a user types, we receive an "edit" event. This isn't just the new string; it's the range of bytes that were replaced.
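
One way such an edit record might look is sketched below. It is an assumption, loosely modeled on the byte-offset fields of Tree-sitter's `TSInputEdit`; the helpers `delta`, `mapOffset`, and `touches` are hypothetical, and the reuse check is deliberately conservative and simplified.

```zig
const std = @import("std");

// Hypothetical edit record: bytes [start_byte, old_end_byte) in the old text
// were replaced by bytes [start_byte, new_end_byte) in the new text.
const Edit = struct {
    start_byte: u32,
    old_end_byte: u32,
    new_end_byte: u32,

    // Net change in document length; negative for deletions.
    fn delta(self: Edit) i64 {
        return @as(i64, self.new_end_byte) - @as(i64, self.old_end_byte);
    }

    // Maps an offset in the old text to the new text.
    // Returns null when the offset fell inside the replaced range.
    fn mapOffset(self: Edit, old_offset: u32) ?u32 {
        if (old_offset < self.start_byte) return old_offset;
        if (old_offset < self.old_end_byte) return null;
        const shifted: u32 = @intCast(@as(i64, old_offset) + self.delta());
        return shifted;
    }

    // Conservative "dirty" test: a node whose old byte range overlaps or
    // touches the edited range is not reused in this simplified sketch.
    fn touches(self: Edit, node_start: u32, node_end: u32) bool {
        return node_start <= self.old_end_byte and node_end >= self.start_byte;
    }
};

pub fn main() void {
    // The user typed one character at byte offset 10.
    const edit = Edit{ .start_byte = 10, .old_end_byte = 10, .new_end_byte = 11 };
    std.debug.print("delta = {d}\n", .{edit.delta()});
    if (edit.mapOffset(20)) |o| std.debug.print("old offset 20 -> new offset {d}\n", .{o});
    std.debug.print("node [0, 5) dirty? {}\n", .{edit.touches(0, 5)});
}
```

Nodes that lie entirely before the edit keep their offsets, nodes entirely after it are shifted by `delta`, and only nodes that touch the edited range need to be re-parsed.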

alogic0 / lexer_3.md
Created April 26, 2026 07:49
Phase 3: The Stack and the State Machine

To move from "manual reductions" to a real Tree-sitter-like algorithm, we have to introduce the LR Stack. In a bottom-up parser, the stack doesn't just hold nodes; it holds States.

In Zig, we can represent this by combining our NodeId with a StateId. This tells the parser: "I am currently in the middle of a function definition, and I just saw an identifier. What do I expect next?"


1. Defining the State

The "State" is an integer that refers to a row in a giant transition table (the Parse Table). In Tree-sitter, this table is generated from your grammar.js.
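
A toy version of this idea is sketched below. Everything in it is an assumption made for illustration: the three-state `parse_table` is hand-written rather than generated from a grammar, and `Symbol`, `Action`, `StackEntry`, `Stack`, and `step` are hypothetical names.

```zig
const std = @import("std");

const NodeId = u32;
const StateId = u16;

// Each stack entry pairs a parser state with the node that led to it.
const StackEntry = struct { state: StateId, node: NodeId };

const Symbol = enum(u8) { identifier, lparen, eof };

const Action = union(enum) {
    shift: StateId, // push the symbol and move to this state
    reduce: u16, // hypothetical production index (unused in this toy table)
    accept,
    err,
};

// Toy parse table: one row per state, one column per symbol.
const parse_table = [_][3]Action{
    .{ .{ .shift = 1 }, .err, .err }, // state 0: expects an identifier
    .{ .err, .{ .shift = 2 }, .err }, // state 1: expects '('
    .{ .err, .err, .accept }, // state 2: expects end of input
};

fn lookup(state: StateId, sym: Symbol) Action {
    return parse_table[state][@intFromEnum(sym)];
}

// Fixed-capacity stack, to keep the sketch free of allocation.
const Stack = struct {
    entries: [64]StackEntry = undefined,
    len: usize = 0,

    fn push(self: *Stack, e: StackEntry) void {
        std.debug.assert(self.len < self.entries.len);
        self.entries[self.len] = e;
        self.len += 1;
    }

    fn top(self: *const Stack) StackEntry {
        return self.entries[self.len - 1];
    }
};

// One parser step: consult the table using the state on top of the stack.
fn step(stack: *Stack, sym: Symbol, node: NodeId) Action {
    const action = lookup(stack.top().state, sym);
    switch (action) {
        .shift => |next| stack.push(.{ .state = next, .node = node }),
        else => {},
    }
    return action;
}

pub fn main() void {
    var stack = Stack{};
    stack.push(.{ .state = 0, .node = 0 }); // start state
    const input = [_]Symbol{ .identifier, .lparen, .eof };
    for (input, 1..) |sym, i| {
        const action = step(&stack, sym, @intCast(i));
        std.debug.print("{s}: now in state {d}, action {s}\n", .{ @tagName(sym), stack.top().state, @tagName(action) });
    }
}
```

The state on top of the stack is what answers "what do I expect next?": the parser never inspects the nodes themselves to decide, only the table row for the current state.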

alogic0 / lexer_2_1.md
Created April 20, 2026 09:03
Lesson 2, extension about Zig memory management

Since you're architecting a high-performance parser for large-scale data (like your work with millions of records), Zig's approach to memory is your greatest ally. It avoids the "hidden" costs of garbage collection by making every allocation explicit.

In Zig, if a function needs memory, it must ask for an Allocator.

1. Manual Memory Management: The Allocator

Zig has no implicit global allocator. Instead, you pass an Allocator (an interface) to any structure that needs to grow, such as your Parser.

  • Explicit Control: You decide if memory lives on the stack, the heap, or a fixed-size buffer.
  • The defer Keyword: To prevent memory leaks, Zig uses defer to schedule cleanup (such as a free call) that runs when the enclosing scope exits.
  • Safety: Using the GeneralPurposeAllocator (GPA) during development detects "double-frees" when they happen and reports memory leaks when the allocator is deinitialized.
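
The points above can be sketched in a few lines. This is a minimal illustration under stated assumptions: `makeSquares` is a hypothetical helper, and a `FixedBufferAllocator` over a stack buffer stands in for whatever allocator a real parser would receive.

```zig
const std = @import("std");

// Hypothetical helper: it needs memory, so it asks for an allocator explicitly.
fn makeSquares(allocator: std.mem.Allocator, n: usize) ![]usize {
    const out = try allocator.alloc(usize, n);
    for (out, 0..) |*slot, i| slot.* = i * i;
    return out;
}

pub fn main() !void {
    // Explicit control: a fixed-size buffer on the stack backs every allocation.
    var buf: [256]u8 = undefined;
    var fba = std.heap.FixedBufferAllocator.init(&buf);
    const allocator = fba.allocator();

    const squares = try makeSquares(allocator, 8);
    // defer: the free runs when main's scope exits, on every return path.
    defer allocator.free(squares);

    std.debug.print("{any}\n", .{squares});
}
```

Because `makeSquares` only sees the `std.mem.Allocator` interface, the same function works unchanged with a leak-checking allocator in tests and a faster one in production.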
alogic0 / lexer_2.md
Created April 20, 2026 08:46
Lesson 2 in Lexer Generator

To move from a flat stream of tokens to a Syntax Tree, we need to define how tokens relate to each other hierarchically.

In Tree-sitter, the goal is to create a Concrete Syntax Tree (CST). Unlike an Abstract Syntax Tree (AST), which throws away "useless" characters like parentheses or semicolons, a CST keeps everything so that the code can be reconstructed exactly as it was written.

1. The Node Structure

In Zig, we want this to be memory-efficient. Instead of using a pointer for every child (which scatters nodes across memory and tends to cause cache misses), we can use an index-based approach.

pub const NodeId = u32;
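
The excerpt stops at the `NodeId` definition. As a hedged illustration of the index-based idea only (the `Node` and `Tree` layouts below are assumptions, not the author's actual structures), children can be stored as a contiguous range in a shared array of `NodeId` values:

```zig
const std = @import("std");

pub const NodeId = u32;

// Hypothetical node layout: no pointers, only indices and byte offsets.
const Node = struct {
    kind: u16,
    start_byte: u32,
    end_byte: u32,
    first_child: u32, // index into Tree.children
    child_count: u32,
};

// Hypothetical tree: all nodes live in one array, all child links in another.
const Tree = struct {
    nodes: []const Node,
    children: []const NodeId,

    fn childrenOf(self: Tree, id: NodeId) []const NodeId {
        const n = self.nodes[id];
        return self.children[n.first_child .. n.first_child + n.child_count];
    }
};

pub fn main() void {
    // A CST for "(x)": node 0 is the parenthesized expression, nodes 1..3 are
    // its children. The parentheses are kept, as a concrete syntax tree requires.
    const nodes = [_]Node{
        .{ .kind = 0, .start_byte = 0, .end_byte = 3, .first_child = 0, .child_count = 3 },
        .{ .kind = 1, .start_byte = 0, .end_byte = 1, .first_child = 0, .child_count = 0 },
        .{ .kind = 2, .start_byte = 1, .end_byte = 2, .first_child = 0, .child_count = 0 },
        .{ .kind = 3, .start_byte = 2, .end_byte = 3, .first_child = 0, .child_count = 0 },
    };
    const children = [_]NodeId{ 1, 2, 3 };
    const tree = Tree{ .nodes = &nodes, .children = &children };

    for (tree.childrenOf(0)) |child| {
        const n = tree.nodes[child];
        std.debug.print("child {d}: bytes {d}..{d}\n", .{ child, n.start_byte, n.end_byte });
    }
}
```

A `u32` index is half the size of a 64-bit pointer, and walking a node's children reads one contiguous slice instead of chasing separately allocated objects.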
alogic0 / lexer_1.md
Created April 18, 2026 07:00
Lexer 1

To understand Tree-sitter using Zig, we start with the absolute atom of parsing: the Lexer.

In a traditional compiler, the Lexer (or Scanner) turns a string of characters into a stream of Tokens. In an incremental system like Tree-sitter, the lexer must be able to start at any byte offset, but for our first step, we will build a linear Lexer in Zig.


1. The Core Data Structures

First, we define what our "Tokens" look like. A Zig enum fits well here, and it can later serve as the tag of a tagged union when tokens need more complex metadata.
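
A minimal sketch of such token definitions, together with a linear lexer over them, might look like this. The token kinds and the names `TokenKind`, `Token`, and `Lexer` are assumptions chosen for the example, not the definitions from the original lesson.

```zig
const std = @import("std");

// Hypothetical token kinds for a tiny expression language.
const TokenKind = enum { identifier, number, lparen, rparen, unknown, eof };

// A token records its kind and its byte range in the source, not a copy of the text.
const Token = struct {
    kind: TokenKind,
    start: usize,
    end: usize,
};

const Lexer = struct {
    source: []const u8,
    pos: usize = 0,

    fn atDigit(self: *const Lexer) bool {
        return self.pos < self.source.len and std.ascii.isDigit(self.source[self.pos]);
    }

    fn atIdentChar(self: *const Lexer) bool {
        if (self.pos >= self.source.len) return false;
        const c = self.source[self.pos];
        return std.ascii.isAlphanumeric(c) or c == '_';
    }

    // Scans forward from the current position and returns the next token.
    fn next(self: *Lexer) Token {
        while (self.pos < self.source.len and std.ascii.isWhitespace(self.source[self.pos])) {
            self.pos += 1;
        }
        const start = self.pos;
        if (start >= self.source.len) return .{ .kind = .eof, .start = start, .end = start };

        const c = self.source[start];
        self.pos += 1;
        const kind: TokenKind = switch (c) {
            '(' => .lparen,
            ')' => .rparen,
            '0'...'9' => blk: {
                while (self.atDigit()) self.pos += 1;
                break :blk .number;
            },
            'a'...'z', 'A'...'Z', '_' => blk: {
                while (self.atIdentChar()) self.pos += 1;
                break :blk .identifier;
            },
            else => .unknown,
        };
        return .{ .kind = kind, .start = start, .end = self.pos };
    }
};

pub fn main() void {
    var lexer = Lexer{ .source = "foo(42)" };
    while (true) {
        const tok = lexer.next();
        std.debug.print("{s} [{d}, {d})\n", .{ @tagName(tok.kind), tok.start, tok.end });
        if (tok.kind == .eof) break;
    }
}
```

Storing byte ranges rather than substrings keeps each `Token` small, and the offsets are what an incremental system would need later to decide which tokens an edit affects.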

alogic0 / tree-sitter-learning-path.md
Last active April 18, 2026 06:32
Parser generator

Tree-sitter is a high-speed, incremental parsing system that has revolutionized how editors like Neovim and VS Code handle syntax highlighting and code navigation. To understand its algorithms, you need to bridge the gap between traditional context-free grammars and modern incremental state management.

Here is your structured learning path to mastering the mechanics behind Tree-sitter.


Phase 1: The Theoretical Foundation

Before diving into Tree-sitter’s source code, you must understand the "Classical" way of parsing. Tree-sitter is based on LR(1) parsing, extended with GLR (generalized LR) techniques for handling ambiguity and with support for incremental re-parsing.

alogic0 / analytic
Created April 17, 2026 02:58
gen-z-sitter_opiniot.txt
✦ This repository is a highly disciplined and architecturally mature rewrite of the Tree-sitter generator in Zig. It stands out for its rigorous
engineering standards, incremental milestone-driven approach, and clear focus on real-world compatibility.
Here is a breakdown of my assessment:
1. Exceptional Planning and Organization
The project is one of the most well-documented "work-in-progress" compilers I've seen.
* Milestone Rigor: Every stage of development is tracked via specific MILESTONE_X_IMPLEMENTATION_CHECKLIST.md files. You have successfully
navigated from basic scaffold (M0) all the way through complex parse-table serialization and real-world external scanner integration (M31).
* The Master Plan: MASTER_PLAN.md and MASTER_PLAN_2.md provide a high-level strategic roadmap that balances internal architectural purity with the
alogic0 / fog2_12.txt
Created September 24, 2024 14:02
English exercise
Ex 1
2. She's been writing articles about global warming since last month.
3. They haven't been living in New York for a few years.
4. They've been living in Toronto since 2013.
5. They've been driving a fuel-efficient car since last year.
6. Pete hasn't been working since last year.
7. Pete and Amanda have been thinking of traveling to Africa since last year.
8. Amanda has been reading a lot about Africa for a few months.
9. Pete has been studying zoology back in school since last month.
alogic0 / money.md
Last active September 22, 2024 09:12
English phrases and words related to money, expenses, and taxes

Certainly! Here are some essential English phrases and words related to money, expenses, and taxes that your friend should know when living in Memphis (or anywhere in the United States):

Money and Banking

  • Currency: The official currency is the US Dollar (USD).
  • Bank Account: To open a bank account, you might need an ID and proof of address.
  • Checking Account: An account used for daily transactions.
  • Savings Account: An account used to save money and earn interest.
  • Debit Card: A card linked to your bank account for transactions.
  • Credit Card: A card that allows you to borrow money up to a certain limit.
  • ATM (Automated Teller Machine): A machine to withdraw or deposit money.