@decagondev
Created April 28, 2026 18:20
Draft: High-Level ARCHITECTURE.md Summary (≈500 words)

High-Level Architecture – Clinical Co-Pilot

The Clinical Co-Pilot will be a verified, observable, agentic chatbot embedded directly inside OpenEMR via a custom module. It solves the 90-second physician context problem while respecting every hard constraint in the requirements.

Core Decisions & Tradeoffs:

  • Deployment boundary: Same Vultr VPS + Docker Compose. Add a lightweight agent service (Node.js/Python + LangChain/LlamaIndex or equivalent) as a fourth container. This keeps everything under our control and simplifies observability.
  • Embedding strategy: Custom OpenEMR module (using official skeleton). It registers a new UI panel/sidebar that loads the agent iframe or React component. The module reuses OpenEMR’s session/ACL so the agent inherits exact user permissions — no separate auth.
  • Data access: Agent never queries the DB directly. It calls OpenEMR’s existing REST/FHIR API (authenticated via current user token). This enforces authorization and keeps PHI inside the trusted OpenEMR boundary.
  • Verification layer: a post-generation step. Every agent response is parsed into claims → each claim is re-queried via the API → citations are attached, or the claim is softened/removed. Clinical rules (dosage, interactions) are enforced via simple lookup tables or a lightweight rules engine. Tradeoff: adds ~200–500 ms of latency, but it stops unverified claims from reaching the clinician — non-negotiable per the PDF.
  • Agentic flow: Multi-turn conversational interface (only where USERS.md requires it). Tool-calling limited to patient-specific read operations. No write actions in MVP.
  • Observability: Structured logging (OpenTelemetry or simple JSON) for every step: user → agent → tool call → verification → response. Token usage, latency, failures all captured. Meets the exact questions the PDF requires us to answer from logs.
  • Failure modes: Graceful degradation baked in (e.g., “I cannot find that record — here is what I do have”). Transparent error messages to user.
  • Scalability & cost: Starts cheap (one VPS). At 100/1K/10K users we add Redis caching, read replicas, and rate limiting. Cost analysis will be in the final deliverable.
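The data-access rule above can be sketched as follows. This is a minimal illustration, not OpenEMR's actual client code: the base URL and `Patient` resource path are assumptions based on OpenEMR's standard FHIR endpoint layout, and the host name is hypothetical.

```python
# Sketch of the data-access boundary: the agent only reaches PHI through
# OpenEMR's FHIR API, forwarding the signed-in user's bearer token so
# OpenEMR's own ACLs decide what comes back. Host and path are assumed.

OPENEMR_FHIR_BASE = "https://emr.example.org/apis/default/fhir"  # hypothetical host

def build_patient_request(patient_id: str, user_token: str) -> dict:
    """Return the URL and headers for a patient read, reusing the
    current user's token so the agent inherits their exact permissions."""
    return {
        "url": f"{OPENEMR_FHIR_BASE}/Patient/{patient_id}",
        "headers": {
            "Authorization": f"Bearer {user_token}",
            "Accept": "application/fhir+json",
        },
    }

# The agent service would hand this straight to its HTTP client, e.g.:
# requests.get(req["url"], headers=req["headers"], timeout=5)
```

Because every call carries the user's own token, there is no service account with blanket PHI access to secure separately.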
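The lookup-table side of the verification layer might look like the sketch below. The claim-extraction regex and the dose ceilings are illustrative placeholders; a real build would re-query each claim through the FHIR API and draw limits from a vetted clinical rules source.

```python
# Sketch of the post-generation dosage check: scan a draft response for
# "drug NNN mg" patterns and flag anything over the lookup-table ceiling
# so the claim can be softened or removed before the user sees it.

import re

MAX_DAILY_DOSE_MG = {"acetaminophen": 4000, "ibuprofen": 3200}  # illustrative table

def verify_dosage_claims(response_text: str) -> list[str]:
    """Return a finding per dosage claim that exceeds the table ceiling."""
    findings = []
    for drug, dose in re.findall(r"(\w+)\s+(\d+)\s*mg", response_text.lower()):
        limit = MAX_DAILY_DOSE_MG.get(drug)
        if limit is not None and int(dose) > limit:
            findings.append(f"{drug} {dose} mg exceeds the {limit} mg/day ceiling")
    return findings
```

An empty findings list lets the response through; any hit triggers the soften/remove path before delivery.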
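The graceful-degradation behavior can also be sketched: when some records are missing, the agent answers with what it did retrieve rather than erroring out or guessing. The message wording and dict shape are illustrative only.

```python
# Sketch of the failure-mode rule: transparent partial answers.
# `found` maps record names to retrieved values; `missing` lists
# lookups that failed. Both are hypothetical shapes for illustration.

def degrade(found: dict, missing: list[str]) -> str:
    """Compose a transparent partial answer when some records are missing."""
    lines = []
    if missing:
        lines.append("I cannot find: " + ", ".join(missing) + ".")
    if found:
        lines.append("Here is what I do have:")
        lines += [f"- {name}: {value}" for name, value in found.items()]
    return "\n".join(lines) or "No records were retrievable for this request."
```

The key design choice is that the fallback is composed from verified retrievals only, so degradation never becomes a path for unverified content.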