Here’s a reference end-to-end architecture for giving LLMs seamless, MCP-powered access to ClickHouse data and rendering results in a rich web UI.


1. High-level Components

  1. ClickHouse Cluster

    • Your OLAP datastore, hosting raw event or business data.

  2. MCP Server (“ClickHouse Tool”)

    • Exposes a standard MCP “tool” interface (e.g. /mcp/tools discovery and /mcp/invoke).
    • Implements operations like listTables(), getSchema(table), runQuery(sql).
    • Connects directly to ClickHouse via the native HTTP or TCP driver.
  3. LLM Agent (OpenAI Agents SDK)

    • Configured with an MCP “block” pointing at your MCP Server.
    • Receives user prompts (“Show me monthly revenue by region”), plans calls to your ClickHouse tool, and formats results.
  4. Backend API / Orchestrator

    • Lightweight Node.js/Flask service that proxies between your front-end and the LLM Agent.
    • Maintains session state, user authentication, and optionally a query cache.
  5. Rich Web UI (e.g. React + Recharts / shadcn/ui)

    • Sends natural-language questions via the Backend API.
    • Receives JSON or tabular results from the LLM Agent.
    • Renders interactive charts, tables, drill-downs, etc.
  6. Monitoring & Logging

    • Track usage, query performance, errors.
    • Audit what SQL the LLM actually ran.

2. Implementation Steps

  1. Provision & Secure ClickHouse

    • Stand up a ClickHouse cluster and load your datasets.
    • Configure user accounts and network rules.
  2. Build the MCP Server

    • Scaffold a small web server (e.g. Express/Koa in Node.js or FastAPI in Python).
    • Define the MCP discovery endpoint (GET /mcp/tools) listing your “ClickHouse” tool.
    • Implement the invoke endpoint (POST /mcp/invoke) to accept JSON with { tool, fn, args }.
    • Under the hood, translate runQuery calls into ClickHouse SQL over HTTP/TCP.
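    • A minimal sketch of both endpoints, assuming Express and the ClickHouse HTTP interface on port 8123; the host, the absence of auth, and the toolDefinition object (the JSON from step 3) are placeholders:

    // mcp-server.js -- illustrative discovery and invoke endpoints
    import express from "express";

    const app = express();
    app.use(express.json());

    const CLICKHOUSE_URL = process.env.CLICKHOUSE_URL ?? "http://localhost:8123";

    // Send SQL to ClickHouse over its HTTP interface and return the rows.
    async function clickhouseQuery(sql) {
      const res = await fetch(CLICKHOUSE_URL, { method: "POST", body: `${sql} FORMAT JSON` });
      if (!res.ok) throw new Error(await res.text());
      return (await res.json()).data;
    }

    // Discovery: advertise the "clickhouse" tool (definition shown in step 3).
    app.get("/mcp/tools", (_req, res) => res.json([toolDefinition]));

    // Invocation: dispatch { tool, fn, args } to the matching ClickHouse operation.
    // In production, validate table names and restrict runQuery to SELECTs.
    app.post("/mcp/invoke", async (req, res) => {
      const { fn, args } = req.body;
      try {
        if (fn === "listTables") return res.json(await clickhouseQuery("SHOW TABLES"));
        if (fn === "getSchema") return res.json(await clickhouseQuery(`DESCRIBE TABLE ${args.tableName}`));
        if (fn === "runQuery") return res.json(await clickhouseQuery(args.sql));
        res.status(400).json({ error: `unknown function ${fn}` });
      } catch (err) {
        res.status(500).json({ error: String(err) });
      }
    });

    app.listen(3000);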
  3. Define Tool Schema

    // Example MCP tool definition
    {
      "name": "clickhouse",
      "description": "Run analytics on ClickHouse tables",
      "functions": [
        {
          "name": "listTables",
          "description": "Return all table names",
          "parameters": {}
        },
        {
          "name": "getSchema",
          "description": "Get column metadata for a table",
          "parameters": { "tableName": "string" }
        },
        {
          "name": "runQuery",
          "description": "Execute an arbitrary SQL SELECT",
          "parameters": { "sql": "string" }
        }
      ]
    }
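    • With that definition in place, a call to the invoke endpoint looks like this (the SQL and result rows are purely illustrative):

    // Example POST /mcp/invoke request body
    {
      "tool": "clickhouse",
      "fn": "runQuery",
      "args": { "sql": "SELECT region, sum(revenue) AS revenue FROM sales GROUP BY region" }
    }

    // Example response: rows returned as JSON objects
    [
      { "region": "EMEA", "revenue": 120000 },
      { "region": "APAC", "revenue": 95000 }
    ]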
  4. Configure Your LLM Agent

    // Illustrative sketch using the OpenAI Agents SDK for JavaScript (@openai/agents);
    // the server label and URL are placeholders for your MCP deployment.
    import { Agent, hostedMcpTool } from "@openai/agents";

    const agent = new Agent({
      name: "clickhouse-analyst",
      model: "gpt-4o-mini",
      instructions: "Answer analytics questions by querying ClickHouse through the MCP tool.",
      tools: [
        hostedMcpTool({
          serverLabel: "clickhouse",
          serverUrl: "https://your-domain.com/mcp"
        })
      ]
    });
    • The agent will fetch your tool list at startup and know how to call runQuery.
  5. Build the Orchestrator API

    • Endpoint /ask: receives { question }, runs the agent on it, and returns the agent’s final answer plus any chart-ready data (a sketch follows).
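    • A minimal sketch of that endpoint, assuming Express and the run helper from @openai/agents; module paths and the response shape are illustrative, and auth, sessions, and caching are omitted:

    // orchestrator.js -- bridge between the web UI and the LLM agent
    import express from "express";
    import { run } from "@openai/agents";
    import { agent } from "./agent.js"; // the agent configured in step 4 (hypothetical module)

    const app = express();
    app.use(express.json());

    app.post("/ask", async (req, res) => {
      const { question } = req.body;
      const result = await run(agent, question);
      // finalOutput is the agent's textual answer; chart-ready rows could be
      // extracted from the tool results and returned under a separate "data" key.
      res.json({ answer: result.finalOutput });
    });

    app.listen(4000);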
  6. Develop the Rich UI

    • Use React + shadcn/ui for layout and headless components.
    • Provide an input box for NL queries.
    • When results come back with structured data (e.g. JSON arrays), pass them to chart components (Recharts, D3, etc.) for bar, line, or pie charts, as in the sketch below.
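    • A minimal sketch of the client-side flow, assuming the /ask endpoint returns chart-ready rows under a data key and Recharts is installed; component and field names are illustrative:

    // RevenueChart.jsx -- send a question to /ask and plot the returned rows
    import { useState } from "react";
    import { BarChart, Bar, XAxis, YAxis, Tooltip } from "recharts";

    export function RevenueChart() {
      const [rows, setRows] = useState([]);

      async function ask(question) {
        const res = await fetch("/ask", {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify({ question })
        });
        const { data } = await res.json(); // chart-ready rows from the orchestrator
        setRows(data ?? []);
      }

      return (
        <div>
          <input
            placeholder="Ask a question, e.g. monthly revenue by region"
            onKeyDown={(e) => e.key === "Enter" && ask(e.currentTarget.value)}
          />
          <BarChart width={600} height={300} data={rows}>
            <XAxis dataKey="region" />
            <YAxis />
            <Tooltip />
            <Bar dataKey="revenue" fill="#8884d8" />
          </BarChart>
        </div>
      );
    }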
  7. Add Monitoring & Auth

    • Protect your /mcp/invoke with API keys or mTLS so only your agent can call it.
    • Log every runQuery call for audit and performance.
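    • A minimal sketch of both concerns as Express middleware on the MCP server; the header name and environment variable are illustrative:

    // auth-and-audit.js -- protect /mcp/invoke and log every runQuery call
    export function requireApiKey(req, res, next) {
      if (req.get("x-api-key") !== process.env.MCP_API_KEY) {
        return res.status(401).json({ error: "unauthorized" });
      }
      next();
    }

    export function auditLog(req, _res, next) {
      if (req.body?.fn === "runQuery") {
        // In production this could go to a structured log store instead of stdout.
        console.log(JSON.stringify({ at: new Date().toISOString(), sql: req.body.args?.sql }));
      }
      next();
    }

    // Usage on the MCP server: app.post("/mcp/invoke", requireApiKey, auditLog, invokeHandler);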
  8. Iterate & Extend

    • Add more tool functions (e.g. explainQuery, previewTable(limit)).
    • Plug in other data sources (e.g. S3, Redis) as additional MCP tools.
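    • For instance, previewTable could be added to the tool definition from step 3; the function name and parameters are illustrative:

    {
      "name": "previewTable",
      "description": "Return the first N rows of a table",
      "parameters": { "tableName": "string", "limit": "number" }
    }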

3. Architecture Diagram

flowchart LR
  subgraph DataLayer["Data Layer"]
    CH[ClickHouse Cluster]
  end

  subgraph MCPServer["MCP Server"]
    M1[(GET /mcp/tools)]
    M2[(POST /mcp/invoke)]
    M1 --> M2
    M2 --> CH
  end

  subgraph LLMLayer["LLM Layer"]
    A["OpenAI Agent<br/>(gpt-4o-mini)"]
    A -- MCP calls --> M1
    A -- MCP calls --> M2
  end

  subgraph BackendAPI["Backend API"]
    B["Orchestrator<br/>(Node.js/Flask)"]
    B --> A
  end

  subgraph FrontendUI["Frontend UI"]
    U["React App<br/>& Visualization"]
    U --> B
  end

  CH --> M2
  M2 --> A
  A --> B
  B --> U

This setup gives you:

  • Clean separation between data, tool interface, agent logic, and UI
  • Scalability, since ClickHouse and your MCP server scale independently
  • Flexibility, because you can add new MCP tools (e.g. for other databases or APIs) without changing your UI.