@emberian
Created February 14, 2025 10:34

Okay, let's get our hands dirty and prototype a proc-macro for the Refined Syntax Idea 4! This will be a non-trivial proc-macro, but breaking it down into steps will make it manageable. We'll focus on the core parsing and code generation, without implementing all the features initially (like full expression parsing within the handlers). We'll build up the functionality iteratively.

Project Setup (Proc-Macro Crate)

First, create a new proc-macro crate within your muchin project. This is crucial because proc-macros must be in a separate crate. Assuming your main muchin crate is in a directory called muchin, do the following:

cd muchin
cargo new muchin_macros --lib

Then, edit muchin/Cargo.toml to add a dependency on muchin_macros:

# In muchin/Cargo.toml
[dependencies]
muchin_macros = { path = "muchin_macros" }
# ... other dependencies ...

And inside muchin_macros/Cargo.toml, make sure you have:

# In muchin_macros/Cargo.toml
[package]
name = "muchin_macros"
version = "0.1.0"
edition = "2021"

[lib]
proc-macro = true

[dependencies]
syn = { version = "2.0", features = ["full"] } # For parsing Rust code
quote = "1.0" # For generating Rust code
proc-macro2 = "1.0" # For TokenStream manipulation
paste = "1.0" # (Optional) For identifier manipulation

muchin_macros/src/lib.rs (The Proc-Macro Implementation)

Now, let's start building the proc-macro itself in muchin_macros/src/lib.rs. We'll do this in stages:

Stage 1: Basic Macro Structure and Input Parsing (Skeleton)

use proc_macro::TokenStream;
use quote::{quote, format_ident};
// Some of these imports (format_ident, braced, bracketed, parenthesized) are only used in later stages.
use syn::{parse_macro_input, parse::{Parse, ParseStream, Result}, Ident, Token, braced, bracketed, parenthesized};

// `Token![...]` only covers Rust's built-in tokens, so the DSL's own
// keywords are declared with `syn::custom_keyword!`.
mod kw {
    syn::custom_keyword!(model);
    syn::custom_keyword!(state_type);
}

// Data structure to hold the parsed model definition
struct MuchinModel {
    model_name: Ident,
    state_type: Ident,
    // We will add more fields here in later stages
}

// Parsing implementation for MuchinModel
impl Parse for MuchinModel {
    fn parse(input: ParseStream) -> Result<Self> {
        // Parse the `model` keyword, the model name, and an optional comma
        input.parse::<kw::model>()?;
        let model_name: Ident = input.parse()?;
        input.parse::<Option<Token![,]>>()?;

        // Parse the `state_type` keyword, the state type, and an optional comma
        input.parse::<kw::state_type>()?;
        let state_type: Ident = input.parse()?;
        input.parse::<Option<Token![,]>>()?;

        // Parse the rest of the model definition (actions, register, handlers)
        // We will fill this in later stages

        Ok(MuchinModel {
            model_name,
            state_type,
        })
    }
}


#[proc_macro]
pub fn muchin_model(input: TokenStream) -> TokenStream {
    // Parse the input into our `MuchinModel` struct
    let model_def = parse_macro_input!(input as MuchinModel);

    // Generate code (for now, just placeholder items built from the parsed data)
    let model_name = model_def.model_name;
    let state_type = model_def.state_type;

    let expanded = quote! {
        // Placeholder: Generated code will go here
        struct #model_name; // Example of using model_name
        type MyStateType = #state_type; //Example
        
    };

    TokenStream::from(expanded)
}

Explanation (Stage 1):

  1. Dependencies: We import syn for parsing Rust syntax and quote for generating Rust code. proc_macro2 and paste are declared in Cargo.toml but are not needed by this stage.
  2. MuchinModel Struct: We define a struct MuchinModel to hold the parsed information from the macro input. For now, it only stores the model_name and state_type.
  3. Parse Trait Implementation: We implement the Parse trait from syn for our MuchinModel struct. The parse method is the core of the parser. It takes a ParseStream (representing the input token stream) and tries to parse it according to our DSL grammar.
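    • We use input.parse()? to match keywords (e.g., model, state_type). Note that Token![...] only covers Rust's built-in tokens (such as , or =>), so DSL-specific keywords have to be declared with syn::custom_keyword! before they can be parsed or peeked.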
    • We use input.parse()? with type hints (e.g., let model_name: Ident = ...) to parse identifiers, types, etc.
    • We use braced!, bracketed!, parenthesized! later on for parsing blocks.
  4. muchin_model Proc Macro: This is the entry point for our proc-macro.
    • parse_macro_input!: Parses the input TokenStream using our MuchinModel's Parse implementation.
    • quote!: This is where we generate Rust code. For now, it's a placeholder that just creates a unit struct named after the model and a type alias (MyStateType) for the state type. We'll expand this significantly.
    • TokenStream::from(...): Converts the generated code (from quote!) back into a TokenStream to be returned.
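
To make the control flow of parse concrete without pulling in syn, here is a std-only analogy. It is not how syn works internally: ModelHeader, tokenize, and the helper functions are hypothetical names, and tokens are plain strings split on whitespace and commas. The only point is the order of operations: keyword, identifier, optional comma, and finally a check that nothing is left over (parse_macro_input! also rejects unconsumed input).

```rust
use std::iter::Peekable;
use std::vec::IntoIter;

/// Simplified stand-in for `MuchinModel` (strings instead of `syn::Ident`).
#[derive(Debug, PartialEq)]
struct ModelHeader {
    model_name: String,
    state_type: String,
}

type Tokens = Peekable<IntoIter<String>>;

/// Split the input into word tokens and comma tokens.
fn tokenize(src: &str) -> Tokens {
    src.replace(',', " , ")
        .split_whitespace()
        .map(str::to_string)
        .collect::<Vec<_>>()
        .into_iter()
        .peekable()
}

/// Consume one token and require it to be the given keyword.
fn expect_keyword(tokens: &mut Tokens, keyword: &str) -> Result<(), String> {
    match tokens.next() {
        Some(tok) if tok == keyword => Ok(()),
        Some(tok) => Err(format!("expected `{keyword}`, found `{tok}`")),
        None => Err(format!("expected `{keyword}`, found end of input")),
    }
}

/// Consume one token and require it to be something other than a comma.
fn parse_ident(tokens: &mut Tokens) -> Result<String, String> {
    match tokens.next() {
        Some(tok) if tok != "," => Ok(tok),
        Some(tok) => Err(format!("expected identifier, found `{tok}`")),
        None => Err("expected identifier, found end of input".to_string()),
    }
}

/// Consume a comma if one is next; otherwise leave the stream untouched.
fn skip_optional_comma(tokens: &mut Tokens) {
    if tokens.peek().map(String::as_str) == Some(",") {
        tokens.next();
    }
}

/// Same order of steps as the Stage 1 `parse` method.
fn parse_header(src: &str) -> Result<ModelHeader, String> {
    let mut tokens = tokenize(src);
    expect_keyword(&mut tokens, "model")?;
    let model_name = parse_ident(&mut tokens)?;
    skip_optional_comma(&mut tokens);
    expect_keyword(&mut tokens, "state_type")?;
    let state_type = parse_ident(&mut tokens)?;
    skip_optional_comma(&mut tokens);
    // Leftover tokens are an error, as they are for a real proc-macro parse.
    if let Some(tok) = tokens.next() {
        return Err(format!("unexpected token `{tok}`"));
    }
    Ok(ModelHeader { model_name, state_type })
}
```

Calling parse_header("model MyExampleModel, state_type MyState,") yields the two names, while input that starts with anything other than model is rejected with an error message.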

Testing Stage 1:

To test this basic skeleton, in your main muchin crate (where you'll use the macro), try something like:

// In your main muchin crate:
use muchin_macros::muchin_model;

// Define a dummy state type (for now)
#[derive(Debug, Default)] // Add Debug for inspection
struct MyState;

muchin_model! {
    model MyExampleModel,
    state_type MyState,
}

fn main() {
    // Example of usage (will be more elaborate later)
    let _model = MyExampleModel; // Ensure the generated struct is usable
    println!("{:?}", MyStateType::default());

}

Run cargo build. This should:

  1. Compile successfully (if there are no syntax errors in your macro).
  2. Expand the muchin_model! macro into the placeholder code (which just defines an empty struct and a type alias).

If the build succeeds and you don't get any errors from syn, you've got the basic parsing and code generation working!
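
For reference, the Stage 1 expansion should be equivalent to writing the two placeholder items by hand. The sketch below does exactly that, with no macro involved, so you can see what main ends up working with; stage1_debug_output is a hypothetical helper added only to make the result observable.

```rust
// The dummy state type from the test above.
#[derive(Debug, Default)]
struct MyState;

// Hand-written equivalent of what
// `muchin_model! { model MyExampleModel, state_type MyState, }`
// is expected to expand to at this stage.
struct MyExampleModel;
type MyStateType = MyState;

// Hypothetical helper: does what `main` does, but returns the text
// instead of printing it.
fn stage1_debug_output() -> String {
    let _model = MyExampleModel; // the generated unit struct is usable
    format!("{:?}", MyStateType::default())
}
```

Because MyStateType is only an alias, MyStateType::default() is the same call as MyState::default().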

Stage 2: Parsing Action Declarations

Now, let's extend the macro to parse action declarations.

// In muchin_macros/src/lib.rs

// ... (Previous code from Stage 1) ...

// Custom keywords used by the DSL. `Token![...]` only works for Rust's
// built-in tokens, so every DSL keyword is declared here.
// If an earlier stage already declared a `kw` module, this one replaces it.
mod kw {
    syn::custom_keyword!(model);
    syn::custom_keyword!(state_type);
    syn::custom_keyword!(action);
    syn::custom_keyword!(Pure);
    syn::custom_keyword!(Effectful);
}

// Enum to represent ActionKind (Pure/Effectful)
#[derive(Debug)]
enum ParsedActionKind {
    Pure,
    Effectful,
}

// Struct to represent a parsed action.
// No `#[derive(Debug)]` here: `syn::Type` only implements `Debug` when
// syn's "extra-traits" feature is enabled.
struct ParsedAction {
    name: Ident,
    params: Vec<(Ident, syn::Type)>, // (param_name, param_type)
    kind: ParsedActionKind,
}

// Parsing implementation for Action
impl Parse for ParsedAction {
    fn parse(input: ParseStream) -> Result<Self> {
        input.parse::<kw::action>()?;
        let name: Ident = input.parse()?;

        // Parse parameters (inside parentheses)
        let paren_content;
        parenthesized!(paren_content in input);
        let params = syn::punctuated::Punctuated::<ActionParam, Token![,]>::parse_terminated(&paren_content)?
            .into_iter()
            .map(|param| (param.name, param.ty)) // ActionParam -> (name, type)
            .collect();

        // Parse ActionKind (Pure/Effectful)
        let kind = if input.peek(kw::Pure) {
            input.parse::<kw::Pure>()?;
            ParsedActionKind::Pure
        } else if input.peek(kw::Effectful) {
            input.parse::<kw::Effectful>()?;
            ParsedActionKind::Effectful
        } else {
            return Err(input.error("expected `Pure` or `Effectful` after action parameters"));
        };

        // Consume the handler (`=> { ... }`) and an optional trailing comma.
        // The body is skipped for now; handlers are parsed in a later stage.
        input.parse::<Token![=>]>()?;
        let body;
        braced!(body in input);
        let _handler: proc_macro2::TokenStream = body.parse()?;
        input.parse::<Option<Token![,]>>()?;

        Ok(ParsedAction { name, params, kind })
    }
}


struct ActionParam {
    name: Ident,
    ty: syn::Type
}


impl Parse for ActionParam {
    fn parse(input: ParseStream) -> Result<Self> {
        let name = input.parse()?;
        let _: Token![:] = input.parse()?;
        let ty = input.parse()?;
        Ok(ActionParam { name, ty })
    }
}


// Add actions to MuchinModel
struct MuchinModel {
    model_name: Ident,
    state_type: Ident,
    actions: Vec<ParsedAction>, // Add the actions field
}

// Modify parsing for MuchinModel
impl Parse for MuchinModel {
    fn parse(input: ParseStream) -> Result<Self> {
        input.parse::<kw::model>()?;
        let model_name: Ident = input.parse()?;
        input.parse::<Option<Token![,]>>()?;

        input.parse::<kw::state_type>()?;
        let state_type: Ident = input.parse()?;
        input.parse::<Option<Token![,]>>()?;

        // Parse actions; each action consumes its own `=> { ... }` handler
        let mut actions = Vec::new();
        while input.peek(kw::action) {
            actions.push(input.parse()?);
        }

        Ok(MuchinModel {
            model_name,
            state_type,
            actions,
        })
    }
}

Key Changes in Stage 2:

  1. ParsedAction Struct: This struct holds the parsed information for a single action:

    • name: The identifier of the action (e.g., Init, PollCreateSuccess).
    • params: A vector of tuples, each representing a parameter: (parameter_name, parameter_type).
    • kind: An enum (ParsedActionKind) indicating whether the action is Pure or Effectful.
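  2. ActionParam Struct: Holds a single parsed name: type parameter. It exists so that Punctuated has a type to parse; each one is stored in ParsedAction as a (name, type) tuple.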

  3. Parse for ParsedAction: The parse method for ParsedAction now:

    • Parses the action keyword.
    • Parses the action name.
    • Parses the parameters within parentheses (). It uses Punctuated from syn to handle comma-separated parameters.
    • Parses the Pure or Effectful keyword to determine the ActionKind.
  4. MuchinModel Changes:

    • Adds an actions field: Vec<ParsedAction> to store the parsed actions.
    • The parse method for MuchinModel now includes a loop:
      • while input.peek(...): The loop peeks at the next token and continues for as long as it is the action keyword, which signals another action definition.
      • actions.push(input.parse()?);: Parses and adds the action to the actions vector.
    • Each action is followed by => { ... }. The parser consumes the arrow and the braced body without interpreting the body yet.
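
The peek-driven loop is the part of Stage 2 most worth internalizing, so here is a std-only analogy built on Peekable. It is a sketch under simplifying assumptions, not syn: Kind and parse_actions are hypothetical names, tokens are plain string slices, and parameters and handler bodies are left out entirely.

```rust
/// Simplified stand-in for `ParsedActionKind`.
#[derive(Debug, PartialEq)]
enum Kind {
    Pure,
    Effectful,
}

/// Collect `action <Name> <Kind>` triples for as long as the next token is
/// `action`, then hand back whatever tokens were not consumed.
fn parse_actions<'a>(
    tokens: &[&'a str],
) -> Result<(Vec<(String, Kind)>, Vec<&'a str>), String> {
    let mut iter = tokens.iter().copied().peekable();
    let mut actions = Vec::new();

    // Look at the next token without consuming it, like `input.peek(...)`.
    while iter.peek() == Some(&"action") {
        iter.next(); // consume the `action` keyword
        let name = iter.next().ok_or("expected an action name")?.to_string();
        let kind = match iter.next() {
            Some("Pure") => Kind::Pure,
            Some("Effectful") => Kind::Effectful,
            _ => return Err("expected `Pure` or `Effectful`".to_string()),
        };
        actions.push((name, kind));
    }

    // The loop stops at the first token that is not `action`.
    Ok((actions, iter.collect()))
}
```

Fed the tokens action Init Pure action TcpListen Effectful register_model, the loop collects two actions and leaves register_model untouched for whatever parses next, which is the behavior the real parser relies on.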

Testing Stage 2:

Modify your main muchin crate to test this new parsing:

use muchin_macros::muchin_model;
use muchin::automaton::action::Redispatch;

#[derive(Debug, Default)] // Add Debug for inspection
struct MyState;

muchin_model! {
    model MyExampleModel,
    state_type MyState,

    action Init(instance: Uid, on_success: Redispatch<Uid>, on_error: Redispatch<(Uid, String)>) Pure => {},
    action PollCreateSuccess(poll: Uid) Pure => {},
    action PollCreateError(poll: Uid, error: String) Pure => {},
    action TcpListen(listener: Uid, address: String) Effectful => {},
}

fn main() {
    // For now, we just build to check for parsing errors
}

Run cargo build. If it compiles without errors, your macro is now successfully parsing action declarations!

Stage 3: Generating Action Enum and Action Trait Implementation

Now, let's generate the actual enum for actions and implement the Action trait. This is where we use quote! to generate Rust code.

// ... (Previous code from Stage 1 and 2) ...
// Inside muchin_macros/src/lib.rs

#[proc_macro]
pub fn muchin_model(input: TokenStream) -> TokenStream {
    let model_def = parse_macro_input!(input as MuchinModel);

    let model_name = &model_def.model_name;
    let state_type = &model_def.state_type;

    // 1. Generate Action Enum Name (e.g., MyTcpModelAction)
    let action_enum_name = format_ident!("{}Action", model_name);
    let effectful_action_enum_name = format_ident!("{}EffectfulAction", model_name);

    // 2. Generate Action Enum Variants
    let mut pure_variants = vec![];
    let mut effectful_variants = vec![];

    for action in &model_def.actions {
        let action_name = &action.name;
        // Tuple variants carry only types, so the parameter names are dropped
        // here (`Init(instance: Uid)` would not be valid Rust).
        let param_types = action.params.iter().map(|(_name, ty)| ty);
        let variant = quote! {
            #action_name(#(#param_types),*)
        };

        match action.kind {
            ParsedActionKind::Pure => pure_variants.push(variant),
            ParsedActionKind::Effectful => effectful_variants.push(variant),
        }
    }

    // 3. Generate Action Enum (using quote!)
    let pure_action_enum = quote! {
        #[derive(Clone, PartialEq, Eq, ::type_uuid::TypeUuid, ::serde_derive::Serialize, ::serde_derive::Deserialize, Debug)]
        #[uuid = "00000000-0000-0000-0000-000000000000"]  // TODO: Generate a real UUID
        pub enum #action_enum_name {
            #(#pure_variants),*
        }

        impl ::muchin::automaton::action::Action for #action_enum_name {
            const KIND: ::muchin::automaton::action::ActionKind = ::muchin::automaton::action::ActionKind::Pure;
        }
    };

    let effectful_action_enum = quote! {
        #[derive(Clone, PartialEq, Eq, ::type_uuid::TypeUuid, ::serde_derive::Serialize, ::serde_derive::Deserialize, Debug)]
        #[uuid = "00000000-0000-0000-0000-000000000001"]  // TODO: Generate a real UUID
        pub enum #effectful_action_enum_name {
            #(#effectful_variants),*
        }

        impl ::muchin::automaton::action::Action for #effectful_action_enum_name {
            const KIND: ::muchin::automaton::action::ActionKind = ::muchin::automaton::action::ActionKind::Effectful;
        }
    };

    // 4. Combine generated code
    let expanded = quote! {
        #pure_action_enum
        #effectful_action_enum
    };

    TokenStream::from(expanded)
}

Key Changes in Stage 3:

  1. Action Enum Name: We generate the action enum names (e.g., MyExampleModelAction and MyExampleModelEffectfulAction) using format_ident!. This builds a valid Rust identifier from a format string.
  2. Action Enum Variants: We iterate through the parsed actions and create a quote! fragment for each variant:
    • action.name: The action variant name.
    • params: Each (name, ty) tuple contributes its type to the variant. Tuple variants have no field names, so the parameter names are not emitted.
    • quote! { #action_name(...) }: This generates a tuple variant (e.g., Init(Uid, ...)), which is the form the test below constructs.
  3. quote! for Enum Definition: We use quote! to construct the entire enum definition, including:
    • Derive macros: Clone, PartialEq, Eq, TypeUuid, Serialize, Deserialize, Debug.
    • #[uuid = "..."]: Important: You'll need to generate a unique UUID for each action enum. You can use the uuid crate for this. For this example, I'm using a placeholder.
    • pub enum #action_enum_name { ... }: Defines the enum with the generated name.
    • #(#pure_variants),* and #(#effectful_variants),*: This is where the generated variants are inserted. The #(...),* syntax is a "repetition" in quote!. It iterates over the vector and inserts each variant, separated by commas.
  4. Action Trait Implementation:
    impl ::muchin::automaton::action::Action for #action_enum_name {
        const KIND: ::muchin::automaton::action::ActionKind = ::muchin::automaton::action::ActionKind::Pure;
    }
    We implement the Action trait. We use ::muchin::automaton::action::ActionKind::Pure and ::muchin::automaton::action::ActionKind::Effectful to refer to the correct enum variants.
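
If the naming and repetition steps feel abstract, it can help to picture them with plain strings. The following std-only sketch is an analogy only: quote! works on tokens, not text, and action_enum_name and render_enum are hypothetical helpers that do not exist in the macro crate.

```rust
/// Mirrors `format_ident!("{}Action", model_name)` with plain strings.
fn action_enum_name(model_name: &str) -> String {
    format!("{}Action", model_name)
}

/// Mirrors a `#(...),*` repetition: render each variant, then join the
/// pieces with commas.
fn render_enum(enum_name: &str, variants: &[(&str, Vec<&str>)]) -> String {
    let body = variants
        .iter()
        // One tuple variant per action: `Name(Type, Type, ...)`.
        .map(|(name, types)| format!("{}({})", name, types.join(", ")))
        .collect::<Vec<_>>()
        .join(", ");
    format!("pub enum {} {{ {} }}", enum_name, body)
}
```

For a model named MyExampleModel with the variants PollCreateSuccess(Uid) and PollCreateError(Uid, String), render_enum produces the text of a MyExampleModelAction enum with the two variants separated by a comma, which is the shape the real repetition emits as tokens.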

Testing Stage 3:

Use a more complete example in your main crate:

use muchin::automaton::action::{Action, ActionKind, Redispatch, Timeout};
use muchin::automaton::state::Uid;
use muchin_macros::muchin_model;
use serde_derive::{Deserialize, Serialize};

#[derive(Debug, Default, Deserialize, Serialize)]
struct MyState;

muchin_model! {
    model MyExampleModel,
    state_type MyState,

    action Init(instance: Uid, on_success: Redispatch<Uid>, on_error: Redispatch<(Uid, String)>) Pure => {},
    action PollCreateSuccess(poll: Uid) Pure => {},
    action PollCreateError(poll: Uid, error: String) Pure => {},
    action TcpListen(listener: Uid, address: String, on_success: Redispatch<Uid>, on_error: Redispatch<(Uid, String)>) Effectful => {},
}

fn main() {
    // Test if the generated enum and variants are usable
    let init_action = MyExampleModelAction::Init(
        Uid::default(),
        Redispatch::new("dummy", |_| panic!()),
        Redispatch::new("dummy", |_| panic!()),
    );

    println!("{:?}", init_action);
    let _poll_action = MyExampleModelEffectfulAction::TcpListen(Uid::default(), "127.0.0.1:8080".to_string(), Redispatch::new("dummy", |_| panic!()), Redispatch::new("dummy", |_| panic!()));

}

Run cargo build and cargo run. The build should succeed, and the output will show the debug print of the Init action. This confirms that the macro is generating the enum and variants correctly.
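
To see what "generating the enum and variants correctly" means, here is a hand-written approximation of the Stage 3 expansion for a subset of the actions. Everything muchin-specific is replaced by a stand-in so the snippet compiles on its own: the Uid, ActionKind, and Action definitions below are assumptions that only copy the shape used in the generated code, the TypeUuid and serde derives are omitted, and the actions that take Redispatch parameters are left out. kind_of is a hypothetical helper for reading the associated constant.

```rust
// Stand-ins for muchin's types (assumed shapes, not the real definitions).
#[derive(Clone, Debug, Default, PartialEq, Eq)]
pub struct Uid(u64);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ActionKind {
    Pure,
    Effectful,
}

pub trait Action {
    const KIND: ActionKind;
}

// Approximate expansion: one enum per action kind, tuple variants only.
#[derive(Clone, Debug, PartialEq, Eq)]
pub enum MyExampleModelAction {
    PollCreateSuccess(Uid),
    PollCreateError(Uid, String),
}

impl Action for MyExampleModelAction {
    const KIND: ActionKind = ActionKind::Pure;
}

#[derive(Clone, Debug, PartialEq, Eq)]
pub enum MyExampleModelEffectfulAction {
    TcpListen(Uid, String),
}

impl Action for MyExampleModelEffectfulAction {
    const KIND: ActionKind = ActionKind::Effectful;
}

// Hypothetical helper: read the kind from a value's type.
fn kind_of<A: Action>(_action: &A) -> ActionKind {
    A::KIND
}
```

The kind is a property of the enum type rather than of an individual variant, which is why the macro sorts actions into two enums instead of generating one.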

Stage 4: Handling register_model (Simplified)

For now, let's handle the register_model block with a simplified approach, just parsing and storing the dependencies. We'll generate the full registration logic later.

// In muchin_macros/src/lib.rs

// ... (Previous code from Stages 1-3) ...

// Struct to represent a dependency
#[derive(Debug)]
struct Dependency {
    name: Ident,
}

// Parsing implementation for Dependency
impl Parse for Dependency {
    fn parse(input: ParseStream) -> Result<Self> {
        let name: Ident = input.parse()?;
        Ok(Dependency { name })
    }
}

// Add dependencies to MuchinModel
struct MuchinModel {
    model_name: Ident,
    state_type: Ident,
    actions: Vec<ParsedAction>,
    dependencies: Vec<Dependency>, // Add the dependencies field
}

// Custom keywords, now including the two used by the register_model block.
// If an earlier stage already declared a `kw` module, this one replaces it.
mod kw {
    syn::custom_keyword!(model);
    syn::custom_keyword!(state_type);
    syn::custom_keyword!(action);
    syn::custom_keyword!(Pure);
    syn::custom_keyword!(Effectful);
    syn::custom_keyword!(register_model);
    syn::custom_keyword!(dependencies);
}

// Modify parsing for MuchinModel to include dependencies
impl Parse for MuchinModel {
    fn parse(input: ParseStream) -> Result<Self> {
        input.parse::<kw::model>()?;
        let model_name: Ident = input.parse()?;
        input.parse::<Option<Token![,]>>()?;

        input.parse::<kw::state_type>()?;
        let state_type: Ident = input.parse()?;
        input.parse::<Option<Token![,]>>()?;

        // Parse actions (same as before)
        let mut actions = Vec::new();
        while input.peek(kw::action) {
            actions.push(input.parse()?);
        }

        // Parse `register_model: { ... }`
        input.parse::<kw::register_model>()?;
        input.parse::<Token![:]>()?;
        let register_block;
        braced!(register_block in input);

        // Parse `dependencies: [A, B, ...]` within register_model
        register_block.parse::<kw::dependencies>()?;
        register_block.parse::<Token![:]>()?;
        let dependencies_block;
        bracketed!(dependencies_block in register_block);
        let dependencies = syn::punctuated::Punctuated::<Dependency, Token![,]>::parse_terminated(&dependencies_block)?
            .into_iter()
            .collect();

        // Optional trailing commas after the list and after the block
        register_block.parse::<Option<Token![,]>>()?;
        input.parse::<Option<Token![,]>>()?;

        Ok(MuchinModel {
            model_name,
            state_type,
            actions,
            dependencies,
        })
    }
}

Key Changes in Stage 4:

  1. Dependency Struct: A simple struct to hold the parsed dependency (just the type name for now).
  2. Parse for Dependency: Parses a single identifier representing the dependency type.
  3. dependencies Field in MuchinModel: Added to store the parsed dependencies.
  4. Parsing register_model Block: The parse method for MuchinModel now:
    • Parses the register_model keyword and the colon that follows it.
    • Uses braced! to parse the content within the {} block.
    • Parses the dependencies keyword and its colon.
    • Uses bracketed! to parse the content within the [] (the list of dependencies).
    • Uses Punctuated to parse comma-separated dependencies; a trailing comma is allowed.
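
The bracketed, comma-separated list is simple enough to mimic with std alone. The sketch below is an analogy for the bracketed! plus Punctuated::parse_terminated steps, not a description of syn's implementation: parse_dependency_list and is_ident are hypothetical helpers that work on text rather than tokens, and the identifier check is deliberately simplified.

```rust
/// Simplified identifier check: a letter or underscore, followed by
/// letters, digits, or underscores.
fn is_ident(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(c) if c.is_alphabetic() || c == '_' => {
            chars.all(|c| c.is_alphanumeric() || c == '_')
        }
        _ => false,
    }
}

/// Parse `[A, B, C]` (optionally with one trailing comma) into names.
fn parse_dependency_list(src: &str) -> Result<Vec<String>, String> {
    // Like `bracketed!`: require the delimiters and keep only the inside.
    let inner = src
        .trim()
        .strip_prefix('[')
        .and_then(|rest| rest.strip_suffix(']'))
        .ok_or_else(|| "expected a bracketed list like `[A, B]`".to_string())?;

    let mut items: Vec<&str> = inner.split(',').map(str::trim).collect();
    // A trailing comma (or an empty list) leaves one empty last piece.
    if items.last() == Some(&"") {
        items.pop();
    }
    // Every remaining piece must be a single identifier.
    if let Some(bad) = items.iter().find(|item| !is_ident(item)) {
        return Err(format!("`{bad}` is not a valid dependency name"));
    }
    Ok(items.into_iter().map(String::from).collect())
}
```

Both [TimeState, MioState] and [TimeState, MioState,] produce the same two names, while a doubled comma or missing brackets is reported as an error.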

Testing Stage 4:

Update your main crate's example:

use muchin::automaton::action::{Action, ActionKind, Redispatch, Timeout};
use muchin::automaton::state::Uid;
use muchin_macros::muchin_model;
use serde_derive::{Deserialize, Serialize};

#[derive(Debug, Default, Deserialize, Serialize)]
struct MyState;
#[derive(Debug, Default, Deserialize, Serialize)]
struct TimeState;
#[derive(Debug, Default, Deserialize, Serialize)]
struct MioState;


muchin_model! {
    model MyExampleModel,
    state_type MyState,

    action Init(instance: Uid, on_success: Redispatch<Uid>, on_error: Redispatch<(Uid, String)>) Pure => {},
    action PollCreateSuccess(poll: Uid) Pure => {},
    action PollCreateError(poll: Uid, error: String) Pure => {},
    action TcpListen(listener: Uid, address: String, on_success: Redispatch<Uid>, on_error: Redispatch<(Uid, String)>) Effectful => {},

    register_model: {
        dependencies: [TimeState, MioState],
    },
}

fn main() {}

Rebuild. If there are no errors, your macro is now correctly parsing the register_model block and extracting the dependencies.

Next Steps (Future Stages - Not Implemented Here):

  • Stage 5: Generate RegisterModel Implementation: Use the parsed dependencies to generate the RegisterModel trait implementation for the model.
  • Stage 6: Parse Action Handlers: Parse the => { ... } blocks associated with each action. This will likely involve using syn's expression parsing capabilities to parse the Rust code within the blocks.
  • Stage 7: Generate process_pure and process_effectful: Generate the match statements within process_pure and process_effectful, using the parsed action handlers.
  • Stage 8: Handle callback! Macro Invocations: Parse and process callback! invocations within the handler blocks, potentially generating the callback action types as you mentioned.
  • Stage 9: Error Handling: Add robust error handling to the proc-macro to provide helpful error messages if the DSL syntax is incorrect.
  • Stage 10: UUID Generation: Integrate a mechanism for generating unique UUIDs for the action enums (using the uuid crate or similar).

This multi-stage approach allows you to build the proc-macro incrementally, testing each stage thoroughly before moving on to the next. The parsing logic, especially for the action handlers and potentially nested closures, will become the most challenging part, but syn provides the necessary tools to handle this. Remember to consult the syn documentation extensively as you work through the parsing of more complex Rust code structures.
