Lua Tool Layer for Jido/Jidoka Agents

Summary

This proposal explores a Lua-based tool layer for agents that need access to large application capability surfaces without sending every tool schema to the model on every turn.

The core idea is simple:

Developers continue to write normal Jido.Action or Jidoka.Tool modules.
Those actions are collected into an Elixir catalog data structure.
The agent sees a small model-visible interface for querying, describing, and executing catalog capabilities.
Lua is used as a compact, sandboxed language for composing catalog queries and selected host actions.

The goal is not to replace normal tools. Normal tools are still the right fit for small, explicit capability sets. The Lua layer is for agents that need to work with hundreds or thousands of possible actions while keeping model context small.

Problem

Modern tool-calling agents typically send tool names, descriptions, and schemas with each model request. That works well for a handful of tools, but it breaks down as the tool set grows:

tool schemas consume substantial input tokens
the model must choose from too many options
unrelated tools add noise to every turn
multi-step tasks require repeated model/tool round trips
large application APIs are difficult to expose cleanly

For example, an internal operations agent may need access to CRM, billing, support, commerce, identity, reporting, and admin actions. That could easily mean 100 to 1,000+ actions. Loading every schema into every model call is wasteful and likely harms tool selection.

Proposal

Introduce a Lua tool layer backed by a catalog of ordinary Jido/Jidoka actions.

At runtime, the model sees only a small interface:

lua_tools_query - search and plan against the action catalog
lua_tools_describe - fetch exact specs for selected actions
lua_tools_execute - run a short sandboxed Lua script against selected actions

The full catalog remains in Elixir. The model never receives the whole catalog. It asks for a relevant slice, receives compact specs, then executes only the selected capabilities.

Catalog Data Model

The catalog should be a plain Elixir data structure over normal actions:

%Jidoka.LuaTools.Catalog{
  id: "backoffice",
  description: "Backoffice customer, billing, order, and support operations.",
  entries: %{
    "billing.invoice.list_unpaid" => %Jidoka.LuaTools.Entry{
      id: "billing.invoice.list_unpaid",
      action: MyApp.Billing.Tools.ListUnpaidInvoices,
      namespace: ["billing", "invoice"],
      operation: :read,
      entities: ["customer", "invoice"],
      tags: ["billing", "invoice", "unpaid", "collections"],
      aliases: ["open invoices", "past due invoices"],
      description: "List unpaid invoices for a customer.",
      input_fields: ["customer_id", "limit"],
      output_fields: ["invoice_id", "amount_cents", "due_date", "status"],
      mutates?: false,
      risk: :low
    }
  },
  index: %Jidoka.LuaTools.Index{}
}

The action module remains the executable unit. The catalog adds metadata, indexing, policy, discovery, and execution constraints around it.
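
As a rough illustration of the shape above, the structs could be declared as follows. This is a minimal sketch, not the proposed implementation: the `CatalogSketch` module names and the `put_entry/2` and `fetch/2` helpers are hypothetical, and only the field names come from the example entry. The index field is left out.

```elixir
defmodule CatalogSketch.Entry do
  # Hypothetical struct; field names are copied from the example entry.
  @enforce_keys [:id, :action]
  defstruct [
    :id,
    :action,
    :description,
    namespace: [],
    operation: :read,
    entities: [],
    tags: [],
    aliases: [],
    input_fields: [],
    output_fields: [],
    mutates?: false,
    risk: :low
  ]
end

defmodule CatalogSketch do
  # Hypothetical catalog: a plain map of entries keyed by action id.
  defstruct [:id, :description, entries: %{}]

  alias CatalogSketch.Entry

  # Adds an entry under its id and refuses duplicate ids.
  def put_entry(%__MODULE__{entries: entries} = catalog, %Entry{id: id} = entry) do
    if Map.has_key?(entries, id) do
      {:error, {:duplicate_id, id}}
    else
      {:ok, %{catalog | entries: Map.put(entries, id, entry)}}
    end
  end

  # Looks up an entry by id; returns {:ok, entry} or :error.
  def fetch(%__MODULE__{entries: entries}, id), do: Map.fetch(entries, id)
end
```

Keeping the catalog as plain data means it can be built at compile time, inspected in tests, and narrowed per turn without touching the action modules.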

Runtime Catalog Views

The catalog should be adjustable between turns: a running agent does not have to see the same global surface on every turn.

Each turn can operate against a catalog view:

%Jidoka.LuaTools.View{
  catalog_id: "backoffice",
  actor: actor,
  tenant_id: tenant_id,
  allowed_namespaces: ["crm", "billing"],
  denied_operations: [:admin_delete],
  discovered_ids: MapSet.new(),
  described_ids: MapSet.new()
}

The view can be narrowed by:

actor permissions
tenant or workspace
current session context
handoff owner
workflow step
guardrail state
prior search results
per-turn options

This keeps host code in control. The model can only execute actions that the view allows and the turn has selected.
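
One way to picture that check is a small predicate over the view. The sketch below is hypothetical: `ViewSketch` keeps only three of the view fields, and it assumes that "selected for the turn" means the action id is in `described_ids`.

```elixir
defmodule ViewSketch do
  # Hypothetical, trimmed-down view: only the fields needed for the check.
  defstruct allowed_namespaces: [], denied_operations: [], described_ids: MapSet.new()

  # An entry is executable when its top-level namespace is allowed, its
  # operation is not denied, and it was described earlier in the turn.
  def executable?(%__MODULE__{} = view, %{id: id, namespace: [root | _], operation: op}) do
    root in view.allowed_namespaces and
      op not in view.denied_operations and
      MapSet.member?(view.described_ids, id)
  end

  # Anything that does not look like a catalog entry is rejected.
  def executable?(%__MODULE__{}, _entry), do: false

  # Records ids returned by a describe call so later execution can use them.
  def mark_described(%__MODULE__{} = view, ids) do
    %{view | described_ids: MapSet.union(view.described_ids, MapSet.new(ids))}
  end
end
```

Because the predicate runs in host code on every call, narrowing the view between turns takes effect immediately, whatever the model remembers from earlier turns.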

Catalog Querying With Lua

The most important step is catalog discovery. At large scale, a fixed-parameter search API may be too rigid. Instead, lua_tools_query can accept a small Lua query script that composes indexed catalog primitives.

The script should not scan the raw catalog. It should only call safe, host-backed index operations:

local customer_lookup = catalog.search("find customer account by company name", {
  domains = {"crm", "accounts"},
  limit = 100
})

local invoice_lookup = catalog.search("list unpaid invoices for a customer", {
  domains = {"billing"},
  limit = 100
})

local refund_notes = catalog.search("draft refund note without issuing refund", {
  domains = {"billing", "support"},
  include_mutations = false,
  limit = 100
})

return catalog.plan({
  catalog.pick(customer_lookup, {
    needs_input = "company_name",
    needs_output = "customer_id",
    limit = 2
  }),
  catalog.pick(invoice_lookup, {
    needs_input = "customer_id",
    needs_output = "invoice_list",
    mutates = false,
    limit = 3
  }),
  catalog.pick(refund_notes, {
    needs_input = "invoice_id",
    mutates = false,
    limit = 3
  })
})

The Lua query runtime should expose primitives like:

catalog.search(query, opts)
catalog.namespace(names)
catalog.tags(tags)
catalog.entities(entities)
catalog.inputs(fields)
catalog.outputs(fields)
catalog.filter(result_set, opts)
catalog.intersect(a, b)
catalog.union(a, b)
catalog.boost(result_set, opts)
catalog.top(result_set, n)
catalog.facets(result_set)
catalog.pick(result_set, opts)
catalog.plan(steps)

Every primitive is backed by Elixir indexes. Lua provides a compact planning language, not a replacement search engine.
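
To make the host side of a few primitives concrete, here is a minimal sketch of `intersect`, `union`, and `top` over result sets. It assumes a result set is a map of action id to score; that representation, the score-combining rules, and the `ResultSetSketch` module are illustrative choices, not part of the proposal.

```elixir
defmodule ResultSetSketch do
  # Assumption: a result set is %{action_id => score}.

  # Builds a result set from index hits, all with the same score.
  def from_ids(ids, score \\ 1.0), do: Map.new(ids, &{&1, score})

  # Keeps ids present in both sets and adds their scores.
  def intersect(a, b) do
    a
    |> Map.take(Map.keys(b))
    |> Map.new(fn {id, score} -> {id, score + Map.fetch!(b, id)} end)
  end

  # Keeps ids present in either set, using the higher score on overlap.
  def union(a, b), do: Map.merge(a, b, fn _id, s1, s2 -> max(s1, s2) end)

  # Highest score first; ties are broken by id so output is deterministic.
  def top(set, n) do
    set
    |> Enum.sort_by(fn {id, score} -> {-score, id} end)
    |> Enum.take(n)
    |> Enum.map(&elem(&1, 0))
  end
end
```

The Lua script would only ever hold opaque handles to such sets; the maps themselves stay in Elixir.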

Description Step

After querying, the model should fetch exact specs for the smallest useful set of actions:

lua_tools_describe(%{
  ids: [
    "crm.customer.search",
    "billing.invoice.list_unpaid",
    "billing.refund.draft_note"
  ]
})

The response should include only what is needed to call those actions:

billing.invoice.list_unpaid(args) -> list<invoice>

Args:
- customer_id: string, required
- limit: integer, optional, default 25

Returns:
- id: string
- amount_cents: integer
- due_date: string
- status: string

Example:
local invoices = billing.invoice.list_unpaid({
  customer_id = customer.id,
  limit = 10
})

Safety:
read_only

This is where the token savings come from: the model receives a compact slice of the catalog, not every schema.
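
A describe handler could render such compact specs from stored metadata. The sketch below is hypothetical throughout: the spec shape (`args` as `{name, type, required?}` tuples plus a `returns` string) and the `DescribeSketch` module are assumptions made for illustration, and defaults, examples, and safety notes are left out.

```elixir
defmodule DescribeSketch do
  # Hypothetical spec shape: %{args: [{name, type, required?}], returns: type_string}.
  def render(id, %{args: args, returns: returns}) do
    arg_lines =
      Enum.map(args, fn {name, type, required?} ->
        "- #{name}: #{type}, #{if required?, do: "required", else: "optional"}"
      end)

    Enum.join(["#{id}(args) -> #{returns}", "Args:" | arg_lines], "\n")
  end

  # Renders only the requested ids; unknown ids are reported back, never guessed.
  def describe(specs, ids) do
    {known, unknown} = Enum.split_with(ids, &Map.has_key?(specs, &1))
    %{specs: Enum.map(known, &render(&1, Map.fetch!(specs, &1))), unknown: unknown}
  end
end
```

Returning unknown ids explicitly gives the model a clear signal to go back to the query step when it has guessed an id.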

Execution Step

Execution uses a separate Lua sandbox from catalog querying. It can call only the actions that have been selected and allowed for the turn:

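-- Look up the customer first and stop early if nothing matched.
local customers = crm.customer.search({query = "Acme", limit = 1})
local customer = customers[1]
if customer == nil then
  return { error = "no customer matched" }
end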

local invoices = billing.invoice.list_unpaid({
  customer_id = customer.id,
  limit = 10
})

local notes = {}

for _, invoice in ipairs(invoices) do
  table.insert(notes, billing.refund.draft_note({
    invoice_id = invoice.id
  }))
end

return {
  customer = customer,
  invoices = invoices,
  notes = notes
}

Host code validates every action call:

action id is allowed in the current view
arguments match the action schema
actor/context permissions pass
mutation policy passes
result is sanitized before returning to the model
every call is traced
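
The first few checks in that list compose naturally with `with`. The following is a sketch under simplifying assumptions: the view is a plain map with `allowed_ids` and `read_only?`, the entry lists its `required` argument names, and schema validation is reduced to a presence check. Permission checks, result sanitization, and tracing are omitted, and `CallGuardSketch` is a hypothetical name.

```elixir
defmodule CallGuardSketch do
  # Runs the checks in order and returns the first failure, or :ok.
  def check(view, entry, args) do
    with :ok <- check_allowed(view, entry),
         :ok <- check_args(entry, args),
         :ok <- check_mutation(view, entry) do
      :ok
    end
  end

  # The action id must be in the set the view allows for this turn.
  defp check_allowed(view, entry) do
    if MapSet.member?(view.allowed_ids, entry.id), do: :ok, else: {:error, :not_allowed}
  end

  # Simplified stand-in for schema validation: every required key must be present.
  defp check_args(entry, args) do
    case Enum.reject(entry.required, &Map.has_key?(args, &1)) do
      [] -> :ok
      missing -> {:error, {:missing_args, missing}}
    end
  end

  # In read-only mode, any action flagged as mutating is blocked.
  defp check_mutation(%{read_only?: true}, %{mutates?: true}), do: {:error, :mutation_blocked}
  defp check_mutation(_view, _entry), do: :ok
end
```

Returning tagged errors instead of raising lets the host turn each failure into a short, structured message the model can act on.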

Agent Prompting

When the Lua layer is enabled, the agent should receive a short instruction:

You have access to a Lua tool layer for a large catalog of application actions.

When a task may require hidden application actions, call lua_tools_query first.
Write a short Lua query script using the catalog API to search, filter, rank, or
plan over the catalog. Do not guess action ids. Do not scan the full catalog.

After querying, call lua_tools_describe for the smallest set of candidate action
ids needed for the task. Use only described and allowed actions in
lua_tools_execute.

Prefer read-only actions unless the user clearly asked to change application
state. For mutations, follow the action safety notes and ask for confirmation
when authorization is unclear.

The prompt should include a tiny catalog card, not the full action list:

Available catalog domains:
- crm: accounts, customers, contacts, leads
- billing: invoices, payments, refunds, credits
- support: tickets, macros, escalations, SLAs
- commerce: orders, products, inventory, fulfillment

Safety Model

The Lua layer should be conservative:

fresh Lua VM per query or execution
no file, OS, network, package loading, or dynamic loading APIs
no raw catalog iteration
bounded result sets
hard timeout
maximum reduction count (BEAM work budget per script)
memory/process limit
action callback count limit
mutation count limit
read-only mode by default
explicit mutates?: true metadata
policy hooks for every mutation
trace every hidden action call

The model should never gain more authority than the host view grants.
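
Two of these limits, the hard timeout and the memory limit, can be enforced at the BEAM process level regardless of which Lua VM is embedded. The sketch below assumes the script run is wrapped in a zero-arity function; `SandboxLimitsSketch` and its parameters are hypothetical, and it uses only standard process primitives (`spawn_monitor/1` and the `:max_heap_size` process flag).

```elixir
defmodule SandboxLimitsSketch do
  # Runs `fun` in a monitored process with a heap cap (in words) and a
  # wall-clock timeout. The caller is never linked to the worker.
  def run(fun, timeout_ms, max_heap_words) do
    parent = self()
    ref = make_ref()

    {pid, mon} =
      spawn_monitor(fn ->
        # The VM kills this process if its heap grows past the cap.
        Process.flag(:max_heap_size, %{size: max_heap_words, kill: true, error_logger: false})
        send(parent, {ref, fun.()})
      end)

    receive do
      {^ref, result} ->
        Process.demonitor(mon, [:flush])
        {:ok, result}

      {:DOWN, ^mon, :process, ^pid, reason} ->
        {:error, {:crashed, reason}}
    after
      timeout_ms ->
        # Hard stop: the worker gets no chance to clean up or continue.
        Process.exit(pid, :kill)
        Process.demonitor(mon, [:flush])
        {:error, :timeout}
    end
  end
end
```

A fresh process per script also gives the "fresh VM per query or execution" property for free, since nothing survives the worker's exit.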

Scale Considerations

At the upper end, say 100,000 actions, the catalog cannot be searched by scanning every entry on each request. It needs prebuilt indexes:

namespace index
tag index
entity index
operation index
input field index
output field index
name and alias inverted index
description lexical index
policy/context filters
optional usage/popularity ranking

Search should be deterministic and local. The model provides a query plan; the host executes that plan against indexes.
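
As an example of one such index, a tag index can be built in a single pass and then answered with set intersections. This is a minimal sketch: `IndexSketch` is a hypothetical module, and entries are assumed to be a map of id to a map with a `tags` list.

```elixir
defmodule IndexSketch do
  # Builds %{tag => MapSet of entry ids} in one pass over the entries.
  def build_tag_index(entries) do
    Enum.reduce(entries, %{}, fn {id, entry}, index ->
      Enum.reduce(entry.tags, index, fn tag, acc ->
        Map.update(acc, tag, MapSet.new([id]), &MapSet.put(&1, id))
      end)
    end)
  end

  # Ids that carry every requested tag; an empty tag list matches nothing.
  def with_all_tags(_index, []), do: MapSet.new()

  def with_all_tags(index, tags) do
    tags
    |> Enum.map(&Map.get(index, &1, MapSet.new()))
    |> Enum.reduce(&MapSet.intersection/2)
  end
end
```

The same pattern would extend to namespaces, entities, and input or output fields, so a query touches only the posting sets it names and never the full entry map.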

Hard runtime caps are important:

query returns at most 20 compact hits
describe accepts at most 10 action ids
execute allows at most 20 action ids
script timeout around 500-1,500 ms
host callback count around 25-50
mutation count low unless explicitly approved
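
These caps are easy to centralize in one place. The sketch below uses the numbers from the list, picking 25 from the suggested callback range; `CapsSketch` and its function names are hypothetical.

```elixir
defmodule CapsSketch do
  # Values taken from the caps listed in this section; 25 is the low end
  # of the suggested callback range.
  @caps %{max_hits: 20, max_describe_ids: 10, max_execute_ids: 20, max_callbacks: 25}

  def caps, do: @caps

  # Query results are truncated, never rejected.
  def clamp_hits(hits), do: Enum.take(hits, @caps.max_hits)

  # Describe requests over the cap are rejected outright.
  def check_describe(ids) do
    if length(ids) <= @caps.max_describe_ids, do: {:ok, ids}, else: {:error, :too_many_ids}
  end

  # Counts one host callback; fails once the cap has been reached.
  def count_callback(used) do
    if used < @caps.max_callbacks, do: {:ok, used + 1}, else: {:error, :callback_limit}
  end
end
```

Truncating query hits while rejecting oversized describe requests is a design choice here: a shorter hit list is still useful, whereas silently dropping requested ids would hide information the model explicitly asked for.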

Measurement

This idea should be validated against a baseline where actions are exposed as normal model-visible tools.

Measure:

total input tokens
total output tokens
tool schema tokens
query/describe tokens
number of model turns
number of host action calls
success rate
wrong-action rate
invalid-argument rate
Lua runtime error rate
unsafe mutation attempts
latency
tokens per successful task

The Lua layer is worth the complexity only if it lowers token cost and improves tool selection quality without hurting task success or safety.
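
The headline comparison metric, tokens per successful task, can be computed the same way for the baseline and the Lua layer. This is a sketch with an assumed run record shape (`input_tokens`, `output_tokens`, `success?`); `MetricsSketch` is a hypothetical module.

```elixir
defmodule MetricsSketch do
  # runs: list of %{input_tokens: integer, output_tokens: integer, success?: boolean}
  # Tokens spent on failed runs still count toward the total, so a layer that
  # fails cheaply but often does not look artificially good.
  def tokens_per_success(runs) do
    total = Enum.reduce(runs, 0, fn r, acc -> acc + r.input_tokens + r.output_tokens end)

    case Enum.count(runs, & &1.success?) do
      0 -> :no_successes
      n -> total / n
    end
  end
end
```

Running it over the same task set for both configurations gives a single number that reflects token cost and success rate together.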

Expected wins:

fewer tool schemas in model context
fewer irrelevant capabilities competing for attention
fewer model/tool round trips for compositional tasks
better ability to expose large application APIs

Expected costs:

extra query/describe steps
Lua script generation errors
more runtime/debugging infrastructure
stronger need for tracing and inspection

Developer Experience

The setup should feel like exposing a domain API, not managing a Lua runtime:

defmodule MyApp.BackofficeCatalog do
  use Jidoka.LuaTools.Catalog,
    id: "backoffice",
    description: "Backoffice CRM, billing, and support operations."

  action MyApp.CRM.Tools.SearchCustomers,
    id: "crm.customer.search",
    namespace: ["crm", "customer"],
    tags: ["crm", "customer", "search"],
    aliases: ["find customer", "lookup account"],
    operation: :read

  action MyApp.Billing.Tools.ListUnpaidInvoices,
    id: "billing.invoice.list_unpaid",
    namespace: ["billing", "invoice"],
    tags: ["billing", "invoice", "unpaid"],
    operation: :read
end

Then attach it to an agent:

capabilities do
  lua_tools MyApp.BackofficeCatalog,
    prefix: "backoffice",
    mode: :read_only
end

The catalog remains compatible with normal Jido/Jidoka action execution. The Lua layer adds discovery, composition, policy, and tracing around existing actions.

Open Questions

Should query and describe be separate tools, or one combined discovery tool?
Should execution use Lua only, or also support a structured action-plan form?
How should repeated successful scripts be promoted into first-class workflows?
What metadata is required for high-quality catalog search?
Should ranking include prior successful executions from traces?
Where should this live: Jidoka core, an optional package, or eventually Jido.AI?

Positioning

This is not a default agent feature. It is a scale feature for large capability surfaces.

Use normal tools when the agent has a small, obvious tool set.

Use the Lua tool layer when the application has a large action catalog and the agent needs to discover, inspect, compose, and execute a small relevant slice without loading every action schema into the model context.

mikehostetler/lua-tool-layer-proposal.md
