High-Level Architecture – Clinical Co-Pilot
The Clinical Co-Pilot will be a verified, observable, agentic chatbot embedded directly inside OpenEMR via a custom module. It solves the 90-second physician context problem while respecting every hard constraint in the requirements.
Core Decisions & Tradeoffs:
- Deployment boundary: Same Vultr VPS + Docker Compose. Add a lightweight agent service (Node.js/Python + LangChain/LlamaIndex or equivalent) as a fourth container. This keeps everything under our control and simplifies observability.
- Embedding strategy: Custom OpenEMR module (built from the official skeleton). It registers a new UI panel/sidebar that loads the agent as an iframe or React component. The module reuses OpenEMR’s session/ACL, so the agent inherits the user’s exact permissions with no separate auth.
- Data access: Agent never queries the DB directly. It calls OpenEMR’s existing REST/FHIR API (authenticated via current user token). This enforces authorization and keeps PHI inside the trusted OpenEMR boundary.
- Verification layer: Post-generation step. Every agent response is parsed for claims → each claim is re-queried via the API → citations are attached, or the claim is softened or removed. Clinical rules (dosage, interactions) are enforced via simple lookup tables or a lightweight rules engine. Tradeoff: adds an estimated 200–500 ms of latency, but no claim reaches the user without being checked against the record. This is non-negotiable per the PDF.
- Agentic flow: Multi-turn conversational interface (only where USERS.md requires it). Tool-calling limited to patient-specific read operations. No write actions in MVP.
- Observability: Structured logging (OpenTelemetry or simple JSON) for every step: user → agent → tool call → verification → response. Token usage, latency, and failures are all captured, so the exact questions the PDF requires us to answer can be answered from logs.
- Failure modes: Graceful degradation baked in (e.g., “I cannot find that record — here is what I do have”). Transparent error messages to user.
- Scalability & cost: Starts cheap (one VPS). As usage grows through 100, 1K, and 10K users, we add Redis caching, read replicas, and rate limiting. Cost analysis will be in the final deliverable.
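
The verification and observability decisions above can be sketched roughly as follows. This is a minimal Python sketch under stated assumptions, not the planned implementation: `Claim`, `verify_claims`, `fake_fetch`, and the record shape are all hypothetical names introduced for illustration, claim extraction is assumed to have already produced structured claims, and the injected fetcher stands in for an authenticated call to OpenEMR's REST/FHIR API.

```python
import json
import time
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Claim:
    """One factual statement extracted from an agent response (hypothetical shape)."""
    text: str        # sentence produced by the agent
    resource: str    # record type the claim relies on (illustrative value)
    record_id: str   # identifier of the record the agent cited
    field: str       # field inside the record that backs the claim
    expected: str    # value the agent asserted for that field


# A fetcher takes (resource, record_id) and returns the record, or None if absent.
# Assumption: the real service would wrap the user-authenticated API call here.
Fetcher = Callable[[str, str], Optional[dict]]


def verify_claims(claims: List[Claim], fetch: Fetcher,
                  log: Callable[[str], None] = print) -> List[str]:
    """Re-query each claim; attach a citation if supported, otherwise omit it."""
    output = []
    for claim in claims:
        started = time.perf_counter()
        record = fetch(claim.resource, claim.record_id)
        supported = (record is not None
                     and str(record.get(claim.field)) == claim.expected)
        if supported:
            # Supported claim: keep the text and append its source as a citation.
            output.append(f"{claim.text} [{claim.resource}/{claim.record_id}]")
        else:
            # Unsupported claim: drop the text and say so transparently.
            output.append("I could not verify a statement about "
                          f"{claim.resource}/{claim.record_id}; it has been omitted.")
        # One structured JSON log line per verification step.
        log(json.dumps({
            "step": "verification",
            "resource": claim.resource,
            "record_id": claim.record_id,
            "supported": supported,
            "latency_ms": round((time.perf_counter() - started) * 1000, 2),
        }))
    return output


# In-memory stand-in for the chart so the sketch runs without a server.
FAKE_CHART = {("MedicationRequest", "123"): {"dose": "10 mg"}}


def fake_fetch(resource: str, record_id: str) -> Optional[dict]:
    return FAKE_CHART.get((resource, record_id))


claims = [
    Claim("Lisinopril is prescribed at 10 mg.", "MedicationRequest", "123", "dose", "10 mg"),
    Claim("Metformin is prescribed at 500 mg.", "MedicationRequest", "999", "dose", "500 mg"),
]
logs: List[str] = []
result = verify_claims(claims, fake_fetch, log=logs.append)
```

Passing the fetcher in as a parameter keeps the verification step free of direct database or HTTP code, which matches the decision that the agent only reaches data through the API, and it lets the step be tested against a stub. Here the first claim comes back with a citation, while the second, whose record does not exist in the stub, is replaced by a transparent notice.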