@mikehostetler
Created May 7, 2026 13:19
Lua Tool Layer for Jido/Jidoka Agents

Summary

This proposal explores a Lua-based tool layer for agents that need access to large application capability surfaces without sending every tool schema to the model on every turn.

The core idea is simple:

  • Developers continue to write normal Jido.Action or Jidoka.Tool modules.
  • Those actions are collected into an Elixir catalog data structure.
  • The agent sees a small model-visible interface for querying, describing, and executing catalog capabilities.
  • Lua is used as a compact, sandboxed language for composing catalog queries and selected host actions.

The goal is not to replace normal tools. Normal tools are still the right fit for small, explicit capability sets. The Lua layer is for agents that need to work with hundreds or thousands of possible actions while keeping model context small.

Problem

Modern tool-calling agents typically send tool names, descriptions, and schemas with each model request. That works well for a handful of tools, but it breaks down as the tool set grows:

  • tool schemas consume substantial input tokens
  • the model must choose from too many options
  • unrelated tools add noise to every turn
  • multi-step tasks require repeated model/tool round trips
  • large application APIs are difficult to expose cleanly

For example, an internal operations agent may need access to CRM, billing, support, commerce, identity, reporting, and admin actions. That could easily mean 100 to 1,000+ actions. Loading every schema into every model call is wasteful and likely harms tool selection.

Proposal

Introduce a Lua tool layer backed by a catalog of ordinary Jido/Jidoka actions.

At runtime, the model sees only a small interface:

  • lua_tools_query - search and plan against the action catalog
  • lua_tools_describe - fetch exact specs for selected actions
  • lua_tools_execute - run a short sandboxed Lua script against selected actions

The full catalog remains in Elixir. The model never receives the whole catalog. It asks for a relevant slice, receives compact specs, then executes only the selected capabilities.

Catalog Data Model

The catalog should be a plain Elixir data structure over normal actions:

%Jidoka.LuaTools.Catalog{
  id: "backoffice",
  description: "Backoffice customer, billing, order, and support operations.",
  entries: %{
    "billing.invoice.list_unpaid" => %Jidoka.LuaTools.Entry{
      id: "billing.invoice.list_unpaid",
      action: MyApp.Billing.Tools.ListUnpaidInvoices,
      namespace: ["billing", "invoice"],
      operation: :read,
      entities: ["customer", "invoice"],
      tags: ["billing", "invoice", "unpaid", "collections"],
      aliases: ["open invoices", "past due invoices"],
      description: "List unpaid invoices for a customer.",
      input_fields: ["customer_id", "limit"],
      output_fields: ["invoice_id", "amount_cents", "due_date", "status"],
      mutates?: false,
      risk: :low
    }
  },
  index: %Jidoka.LuaTools.Index{}
}

The action module remains the executable unit. The catalog adds metadata, indexing, policy, discovery, and execution constraints around it.
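As a rough illustration of "plain data over normal actions", the sketch below builds an id-keyed entry map from a list. It is not the Jidoka API: `CatalogSketch`, its `build/1` function, the choice of required keys, and the rule for defaulting `mutates?` are all assumptions, and entries are plain maps so the example runs without Jido or Jidoka.

```elixir
defmodule CatalogSketch do
  # Hypothetical stand-in for the proposal's catalog: a map of entry id => entry.
  # Entries are plain maps here; the real design uses structs.

  @required_keys [:id, :action, :namespace, :operation]

  @doc "Builds an id-keyed map, rejecting incomplete or duplicate entries."
  def build(entries) do
    Enum.reduce(entries, %{}, fn entry, acc ->
      missing = Enum.reject(@required_keys, &Map.has_key?(entry, &1))

      cond do
        missing != [] ->
          raise ArgumentError, "entry is missing keys: #{inspect(missing)}"

        Map.has_key?(acc, entry.id) ->
          raise ArgumentError, "duplicate entry id: #{entry.id}"

        true ->
          # Assumption: when mutates? is not given, anything other than a
          # :read operation is treated as mutating.
          mutates? = entry.operation != :read
          Map.put(acc, entry.id, Map.put_new(entry, :mutates?, mutates?))
      end
    end)
  end
end
```

Failing at build time on duplicates or missing metadata keeps bad entries out of the indexes, which seems preferable to discovering them during a model turn.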

Runtime Catalog Views

The catalog should be adjustable between turns. A running agent should not always receive the same global surface.

Each turn can operate against a catalog view:

%Jidoka.LuaTools.View{
  catalog_id: "backoffice",
  actor: actor,
  tenant_id: tenant_id,
  allowed_namespaces: ["crm", "billing"],
  denied_operations: [:admin_delete],
  discovered_ids: MapSet.new(),
  described_ids: MapSet.new()
}

The view can be narrowed by:

  • actor permissions
  • tenant or workspace
  • current session context
  • handoff owner
  • workflow step
  • guardrail state
  • prior search results
  • per-turn options

This keeps host code in control. The model can only execute actions that the view allows and the turn has selected.
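A minimal sketch of how a view might gate execution follows. `ViewSketch`, `check/2`, `mark_described/2`, and the error atoms are hypothetical names, the struct keeps only three of the fields shown above, and matching on the first namespace segment is an assumption about how `allowed_namespaces` would be interpreted.

```elixir
defmodule ViewSketch do
  # Hypothetical mirror of the proposal's view, reduced to the fields used below.
  defstruct allowed_namespaces: [], denied_operations: [], described_ids: MapSet.new()

  @doc "Returns :ok when the entry may be executed under this view, else {:error, reason}."
  def check(%__MODULE__{} = view, entry) do
    cond do
      # Assumption: the first namespace segment ("crm", "billing", ...) is what gets allowed.
      hd(entry.namespace) not in view.allowed_namespaces ->
        {:error, :namespace_not_allowed}

      entry.operation in view.denied_operations ->
        {:error, :operation_denied}

      # Execution is limited to actions that were described earlier in the turn.
      not MapSet.member?(view.described_ids, entry.id) ->
        {:error, :not_described}

      true ->
        :ok
    end
  end

  @doc "Records that an action was described, widening what the turn may execute."
  def mark_described(%__MODULE__{} = view, id) do
    %{view | described_ids: MapSet.put(view.described_ids, id)}
  end
end
```

Because the view is an immutable value, narrowing or widening it between turns is just a matter of passing a different struct to the next turn.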

Catalog Querying With Lua

The most important step is catalog discovery. At large scale, a search API with a fixed set of parameters may be too rigid. Instead, lua_tools_query can accept a small Lua query script that composes indexed catalog primitives.

The script should not scan the raw catalog. It should only call safe, host-backed index operations:

local customer_lookup = catalog.search("find customer account by company name", {
  domains = {"crm", "accounts"},
  limit = 100
})

local invoice_lookup = catalog.search("list unpaid invoices for a customer", {
  domains = {"billing"},
  limit = 100
})

local refund_notes = catalog.search("draft refund note without issuing refund", {
  domains = {"billing", "support"},
  include_mutations = false,
  limit = 100
})

return catalog.plan({
  catalog.pick(customer_lookup, {
    needs_input = "company_name",
    needs_output = "customer_id",
    limit = 2
  }),
  catalog.pick(invoice_lookup, {
    needs_input = "customer_id",
    needs_output = "invoice_id",
    mutates = false,
    limit = 3
  }),
  catalog.pick(refund_notes, {
    needs_input = "invoice_id",
    mutates = false,
    limit = 3
  })
})

The Lua query runtime should expose primitives like:

  • catalog.search(query, opts)
  • catalog.namespace(names)
  • catalog.tags(tags)
  • catalog.entities(entities)
  • catalog.inputs(fields)
  • catalog.outputs(fields)
  • catalog.filter(result_set, opts)
  • catalog.intersect(a, b)
  • catalog.union(a, b)
  • catalog.boost(result_set, opts)
  • catalog.top(result_set, n)
  • catalog.facets(result_set)
  • catalog.pick(result_set, opts)
  • catalog.plan(steps)

Every primitive is backed by Elixir indexes. Lua provides a compact planning language, not a replacement search engine.
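To make "backed by Elixir" concrete, here is one way the set-style primitives could look on the host side. The proposal does not specify how result sets are represented or how scores combine, so the id-to-score map, the summing and max rules, and the `ResultSetSketch` module name are all assumptions.

```elixir
defmodule ResultSetSketch do
  # Assumption: a result set is a map of action id => relevance score.

  @doc "Keeps ids present in both sets, summing their scores."
  def intersect(a, b) do
    for {id, score} <- a, Map.has_key?(b, id), into: %{} do
      {id, score + Map.fetch!(b, id)}
    end
  end

  @doc "Keeps ids from either set, taking the higher score on overlap."
  def union(a, b), do: Map.merge(a, b, fn _id, x, y -> max(x, y) end)

  @doc "Returns the n best ids, breaking score ties by id so output is deterministic."
  def top(set, n) do
    set
    |> Enum.sort_by(fn {id, score} -> {-score, id} end)
    |> Enum.take(n)
    |> Enum.map(fn {id, _score} -> id end)
  end
end
```

The Lua side would then only hold opaque handles to these sets, which keeps raw catalog data out of the sandbox.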

Description Step

After querying, the model should fetch exact specs for the smallest useful set of actions:

lua_tools_describe(%{
  ids: [
    "crm.customer.search",
    "billing.invoice.list_unpaid",
    "billing.refund.draft_note"
  ]
})

The response should include only what is needed to call those actions:

billing.invoice.list_unpaid(args) -> list<invoice>

Args:
- customer_id: string, required
- limit: integer, optional, default 25

Returns:
- id: string
- amount_cents: integer
- due_date: string
- status: string

Example:
local invoices = billing.invoice.list_unpaid({
  customer_id = customer.id,
  limit = 10
})

Safety:
read_only

This is where the token savings come from: the model receives a compact slice of the catalog, not every schema.
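A compact description like the one above could be rendered from entry metadata with very little code. In this sketch the spec shape (`:id`, `:returns`, `:args` as `{name, type, required?}` tuples, `:mutates?`) and the `DescribeSketch` module are hypothetical; a real implementation would presumably derive the argument list from the action's own schema, and defaults and examples are left out for brevity.

```elixir
defmodule DescribeSketch do
  # Assumption: each spec is a map with :id, :returns, :args, and :mutates?.

  @doc "Renders one compact, model-facing description as plain text."
  def render(spec) do
    arg_lines =
      Enum.map(spec.args, fn {name, type, required?} ->
        "- #{name}: #{type}, #{if required?, do: "required", else: "optional"}"
      end)

    safety = if spec.mutates?, do: "mutates", else: "read_only"

    Enum.join(
      ["#{spec.id}(args) -> #{spec.returns}", "", "Args:"] ++
        arg_lines ++ ["", "Safety:", safety],
      "\n"
    )
  end
end
```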

Execution Step

Execution uses a separate Lua sandbox from catalog querying. It can call only the actions that have been selected and allowed for the turn:

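local customers = crm.customer.search({query = "Acme", limit = 1})
local customer = customers[1]

-- Stop early if the search returned nothing; indexing nil would raise.
if not customer then
  return {error = "customer not found"}
end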

local invoices = billing.invoice.list_unpaid({
  customer_id = customer.id,
  limit = 10
})

local notes = {}

for _, invoice in ipairs(invoices) do
  table.insert(notes, billing.refund.draft_note({
    invoice_id = invoice.id
  }))
end

return {
  customer = customer,
  invoices = invoices,
  notes = notes
}

Host code validates every action call:

  • action id is allowed in the current view
  • arguments match the action schema
  • actor/context permissions pass
  • mutation policy passes
  • result is sanitized before returning to the model
  • every call is traced
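The first few checks in this list can be expressed as a short pipeline that stops at the first failure. This is only a sketch: `CallGuardSketch`, the `ctx` map, the `:required` key on entries, and the error tuples are hypothetical, argument checking is reduced to "required keys are present", and permission checks, sanitization, and tracing are left out.

```elixir
defmodule CallGuardSketch do
  @doc "Runs the checks in order and returns the first failure, or :ok."
  def validate(entry, args, ctx) do
    with :ok <- allowed(entry, ctx),
         :ok <- args_ok(entry, args),
         :ok <- mutation_ok(entry, ctx) do
      :ok
    end
  end

  # The action id must be in the set the current view allows.
  defp allowed(entry, ctx) do
    if MapSet.member?(ctx.allowed_ids, entry.id), do: :ok, else: {:error, :not_allowed}
  end

  # Only checks presence of required arguments; full validation would be
  # delegated to the action's own schema.
  defp args_ok(entry, args) do
    case Enum.reject(entry.required, &Map.has_key?(args, &1)) do
      [] -> :ok
      missing -> {:error, {:missing_args, missing}}
    end
  end

  # Mutating actions are blocked while the layer runs in read-only mode.
  defp mutation_ok(%{mutates?: true}, %{mode: :read_only}), do: {:error, :mutation_blocked}
  defp mutation_ok(_entry, _ctx), do: :ok
end
```

Returning tagged errors rather than raising lets the host turn each failure into a short message the model can act on.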

Agent Prompting

When the Lua layer is enabled, the agent should receive a short instruction:

You have access to a Lua tool layer for a large catalog of application actions.

When a task may require hidden application actions, call lua_tools_query first.
Write a short Lua query script using the catalog API to search, filter, rank, or
plan over the catalog. Do not guess action ids. Do not scan the full catalog.

After querying, call lua_tools_describe for the smallest set of candidate action
ids needed for the task. Use only described and allowed actions in
lua_tools_execute.

Prefer read-only actions unless the user clearly asked to change application
state. For mutations, follow the action safety notes and ask for confirmation
when authorization is unclear.

The prompt should include a tiny catalog card, not the full action list:

Available catalog domains:
- crm: accounts, customers, contacts, leads
- billing: invoices, payments, refunds, credits
- support: tickets, macros, escalations, SLAs
- commerce: orders, products, inventory, fulfillment

Safety Model

The Lua layer should be conservative:

  • fresh Lua VM per query or execution
  • no file, OS, network, package loading, or dynamic loading APIs
  • no raw catalog iteration
  • bounded result sets
  • hard timeout
  • maximum reduction count (BEAM scheduler work units) per script
  • memory/process limit
  • action callback count limit
  • mutation count limit
  • read-only mode by default
  • explicit mutates?: true metadata
  • policy hooks for every mutation
  • trace every hidden action call

The model should never gain more authority than the host view grants.
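Two of these limits, the hard timeout and the memory limit, can be sketched with plain BEAM primitives. Here `fun` stands in for evaluating a Lua script (no Lua VM is involved), `SandboxLimitsSketch` and its option names are hypothetical, and the default values are placeholders. A reduction cap and the callback and mutation counters are not covered.

```elixir
defmodule SandboxLimitsSketch do
  @doc "Runs fun in an isolated process with a wall-clock timeout and a heap cap."
  def run(fun, opts \\ []) do
    timeout_ms = Keyword.get(opts, :timeout_ms, 1_000)
    max_heap_words = Keyword.get(opts, :max_heap_words, 1_000_000)
    parent = self()
    ref = make_ref()

    # Monitored but not linked, so a killed worker cannot take the caller down.
    {pid, mon} =
      spawn_monitor(fn ->
        # The VM kills this process if its heap grows past the cap.
        Process.flag(:max_heap_size, %{size: max_heap_words, kill: true, error_logger: false})
        send(parent, {ref, fun.()})
      end)

    receive do
      {^ref, result} ->
        Process.demonitor(mon, [:flush])
        {:ok, result}

      {:DOWN, ^mon, :process, ^pid, reason} ->
        {:error, {:crashed, reason}}
    after
      timeout_ms ->
        Process.exit(pid, :kill)
        Process.demonitor(mon, [:flush])
        {:error, :timeout}
    end
  end
end
```

Running each script in a throwaway process also gives the "fresh VM per query or execution" property a natural home, since nothing survives the process.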

Scale Considerations

For a catalog on the order of 100,000 actions, search cannot scan every entry on each request. It needs prebuilt indexes:

  • namespace index
  • tag index
  • entity index
  • operation index
  • input field index
  • output field index
  • name and alias inverted index
  • description lexical index
  • policy/context filters
  • optional usage/popularity ranking

Search should be deterministic and local. The model provides a query plan; the host executes that plan against indexes.
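As an illustration of one such index, the sketch below builds a tag index once and answers multi-tag lookups by intersecting posting sets. The contents of the index struct are not specified in this proposal, so the map of tag to `MapSet` of ids, the `TagIndexSketch` module, and its function names are assumptions.

```elixir
defmodule TagIndexSketch do
  # Assumption: entries are maps with :id and :tags.

  @doc "Builds tag => MapSet of ids once, ahead of any query."
  def build(entries) do
    for entry <- entries, tag <- entry.tags, reduce: %{} do
      acc -> Map.update(acc, tag, MapSet.new([entry.id]), &MapSet.put(&1, entry.id))
    end
  end

  @doc "Returns sorted ids carrying every given tag, touching only the matching postings."
  def lookup(_index, []), do: []

  def lookup(index, tags) do
    tags
    |> Enum.map(&Map.get(index, &1, MapSet.new()))
    |> Enum.reduce(fn set, acc -> MapSet.intersection(acc, set) end)
    |> Enum.sort()
  end
end
```

Sorting the final ids keeps results deterministic, so the same query plan produces the same output for a given catalog.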

Hard runtime caps are important:

  • query returns at most 20 compact hits
  • describe accepts at most 10 action ids
  • execute allows at most 20 action ids
  • script timeout around 500-1500 ms
  • host callback count around 25-50
  • mutation count low unless explicitly approved
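A sketch of how the first three caps might be enforced is shown below. The cap values come from the list above; the `CapsSketch` module, its function names, and the choice to truncate query results but reject oversized describe and execute requests are assumptions.

```elixir
defmodule CapsSketch do
  # Values taken from the caps listed above.
  @caps %{query_hits: 20, describe_ids: 10, execute_ids: 20}

  @doc "Query results are truncated to the cap rather than rejected."
  def cap_hits(hits), do: Enum.take(hits, Map.fetch!(@caps, :query_hits))

  @doc "Describe and execute requests over the cap are rejected outright."
  def check_ids(kind, ids) when kind in [:describe_ids, :execute_ids] do
    limit = Map.fetch!(@caps, kind)

    if length(ids) <= limit do
      :ok
    else
      {:error, {:too_many_ids, limit}}
    end
  end
end
```

Including the limit in the error gives the model enough information to retry with a smaller request.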

Measurement

This idea should be validated against a baseline where actions are exposed as normal model-visible tools.

Measure:

  • total input tokens
  • total output tokens
  • tool schema tokens
  • query/describe tokens
  • number of model turns
  • number of host action calls
  • success rate
  • wrong-action rate
  • invalid-argument rate
  • Lua runtime error rate
  • unsafe mutation attempts
  • latency
  • tokens per successful task

The Lua layer is worth the complexity only if it improves token cost and tool selection quality without hurting task success or safety.
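The "tokens per successful task" metric is the easiest one to misread, because failed runs still cost tokens. A minimal way to compute it, assuming each run is recorded as a map with `:input_tokens`, `:output_tokens`, and `:success?` (a hypothetical shape, as is the `MetricsSketch` module):

```elixir
defmodule MetricsSketch do
  @doc "Total tokens across all runs divided by the number of successful runs."
  def tokens_per_success(runs) do
    # Failed runs count toward the numerator: their tokens were still spent.
    total = Enum.reduce(runs, 0, fn run, acc -> acc + run.input_tokens + run.output_tokens end)
    successes = Enum.count(runs, fn run -> run.success? end)

    if successes == 0, do: :no_successes, else: total / successes
  end
end
```

Computing the same figure for the baseline and the Lua layer over an identical task set gives a single number that reflects both token cost and success rate.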

Expected wins:

  • fewer tool schemas in model context
  • fewer irrelevant capabilities competing for attention
  • fewer model/tool round trips for compositional tasks
  • better ability to expose large application APIs

Expected costs:

  • extra query/describe steps
  • Lua script generation errors
  • more runtime/debugging infrastructure
  • stronger need for tracing and inspection

Developer Experience

The setup should feel like exposing a domain API, not managing a Lua runtime:

defmodule MyApp.BackofficeCatalog do
  use Jidoka.LuaTools.Catalog,
    id: "backoffice",
    description: "Backoffice CRM, billing, and support operations."

  action MyApp.CRM.Tools.SearchCustomers,
    id: "crm.customer.search",
    namespace: ["crm", "customer"],
    tags: ["crm", "customer", "search"],
    aliases: ["find customer", "lookup account"],
    operation: :read

  action MyApp.Billing.Tools.ListUnpaidInvoices,
    id: "billing.invoice.list_unpaid",
    namespace: ["billing", "invoice"],
    tags: ["billing", "invoice", "unpaid"],
    operation: :read
end

Then attach it to an agent:

capabilities do
  lua_tools MyApp.BackofficeCatalog,
    prefix: "backoffice",
    mode: :read_only
end

The catalog remains compatible with normal Jido/Jidoka action execution. The Lua layer adds discovery, composition, policy, and tracing around existing actions.

Open Questions

  • Should query and describe be separate tools, or one combined discovery tool?
  • Should execution use Lua only, or also support a structured action-plan form?
  • How should repeated successful scripts be promoted into first-class workflows?
  • What metadata is required for high-quality catalog search?
  • Should ranking include prior successful executions from traces?
  • Where should this live: Jidoka core, an optional package, or eventually Jido.AI?

Positioning

This is not a default agent feature. It is a scale feature for large capability surfaces.

Use normal tools when the agent has a small, obvious tool set.

Use the Lua tool layer when the application has a large action catalog and the agent needs to discover, inspect, compose, and execute a small relevant slice without loading every action schema into the model context.
