HMS Reference Architecture

Executive Summary

This chapter synthesizes the concepts, patterns, and architectural decisions from the preceding chapters into a complete, deployable AI platform reference architecture for a Hospital Management System (HMS). It is the flagship document of this repository: the single place where a principal engineer or hospital CIO can see how AI strategy, governance, EHR integration, clinical RAG, clinical decision support, observability, cost management, security, and change management fit together into a cohesive clinical AI platform. The architecture is not a proof of concept. It is a production-grade reference for a hospital deploying 7 AI use cases that serve 300+ clinical and administrative users under HIPAA and Joint Commission compliance requirements.

Learning Objectives

After reading this chapter, you will be able to:

Describe the complete HMS AI platform architecture at component, integration, and data flow levels
Identify the dependencies between platform components and the sequence in which they should be implemented
Apply the reference architecture to evaluate gaps in an existing hospital AI deployment
Use this reference architecture as the basis for a hospital AI platform RFP or technical design review

Business Problem

The Reference Healthcare Organization has deployed its first three clinical AI use cases independently, each with its own EHR integration, LLM API access, prompt management, and observability approach. Governance operates through individual project reviews. Costs are attributed to per-project budget lines that the CFO cannot reconcile against actual AI platform costs.

This architecture solves the sprawl problem: it defines the shared infrastructure that each AI use case leverages, the governance structures that span all use cases, and the integration patterns that connect the AI platform to the EHR and clinical workflow. It is designed as an incremental build — the platform can be implemented use-case by use-case, with each piece of infrastructure added as it becomes the binding constraint on the next use case.

Why This Technology Exists

The HMS reference architecture exists because healthcare AI platforms are not commodity infrastructure — the combination of HIPAA compliance requirements, EHR integration complexity, clinical workflow constraints, FDA regulatory considerations, and patient safety obligations creates architectural requirements that are materially different from general enterprise AI platforms. This reference architecture encodes the design decisions that address healthcare-specific requirements so that engineers building clinical AI systems do not need to rediscover them.

Conceptual Explanation

The HMS AI platform is organized around three principles that govern every architectural decision:

Clinical workflow primacy: AI must serve the clinical workflow, not require the workflow to adapt to AI. Every component that touches a clinician's daily workflow — CDS alerts, documentation tools, prior auth assistance — is designed around the clinical interaction pattern, not around the AI system's technical convenience.

Defense in depth for PHI: No single control is sufficient for PHI protection. The architecture layers HIPAA controls — network isolation, encryption at rest and in transit, access control, audit logging, BAA coverage — so that failure of any single control does not expose patient data.

Governance at every boundary: Every boundary in the architecture — between the clinical application and the AI gateway, between the AI gateway and the LLM vendor, between the AI output and the EHR medical record — is a governance control point. Governance is not a post-deployment review; it is embedded in the technical architecture.

Core Architecture

graph TD
    subgraph "Clinical Users"
        U1["Physicians\n(Hospitalists, Specialists)"]
        U2["Nurses\n(Floor, ICU, ED)"]
        U3["Care Coordinators\nCase Management"]
        U4["Coders\nRevenue Cycle"]
    end
    subgraph "EHR Layer - Epic FHIR R4"
        EHR["Epic EHR\nSMART on FHIR\nCDS Hooks\nHL7 v2 ADT"]
        FHIR["FHIR R4 API\nPatient · Encounter\nCondition · Medication\nObservation · Document"]
        ADT["HL7 v2 ADT\nAdmit · Transfer\nDischarge Events"]
    end
    subgraph "AI Platform - Control Plane"
        GW["AI Gateway\nAuth · Rate Limit\nRoute · Audit"]
        IE["Integration Engine\nHL7 v2 Ingestion\nFHIR Subscription"]
        PR["Prompt Registry\nVersioned Prompts\nClinical Validation"]
        MR["Model Registry\nApproved Models\nBAA Status"]
    end
    subgraph "AI Use Cases"
        UC1["Discharge Summary AI\nSMART App + FHIR"]
        UC2["Prior Auth Agent\nAgentic Workflow"]
        UC3["Clinical Knowledge\nRAG Search"]
        UC4["CDS - Sepsis Alert\nCDS Hooks Service"]
        UC5["Medical Coding\nAdministrative AI"]
        UC6["Care Gap Analysis\nBackground Process"]
        UC7["Patient Chatbot\nPatient-facing"]
    end
    subgraph "Shared AI Services"
        ES["Embedding Service\nClinical Domain Model"]
        VS["Clinical Vector Store\nGuidelines · Formulary\nPrior Auth Criteria"]
        EP["Evaluation Pipeline\nCI/CD for AI Quality"]
    end
    subgraph "LLM Vendors - BAA Signed"
        LLM1["Anthropic API\nClaude - Tier 1 Clinical"]
        LLM2["Azure OpenAI\nGPT - Tier 2 Admin"]
    end
    subgraph "Observability"
        OT["OpenTelemetry\nCollector"]
        QS["Quality Scorer\nAsync Eval"]
        DD["Drift Detector\n7d vs 30d baseline"]
        DASH["Clinical AI Dashboard\nQuality · Cost · Adoption"]
    end
    subgraph "Governance"
        MB["Model Review Board\nCMIO · AI Architect\nClinical Champions"]
        AL["Audit Log\nHashed IDs\nImmutable"]
        RT["Risk Tier Registry\nTier 1 / 2 / 3 Use Cases"]
    end
    U1 & U2 & U3 & U4 --> EHR
    EHR --> FHIR & ADT
    ADT --> IE --> GW
    FHIR --> UC1 & UC2 & UC3 & UC4 & UC5 & UC6
    EHR --> UC7
    UC1 & UC2 & UC3 & UC4 & UC5 & UC6 & UC7 --> GW
    GW --> PR & MR
    GW --> LLM1 & LLM2
    GW --> ES --> VS
    GW --> AL
    GW --> OT
    OT --> QS --> DD --> DASH
    AL --> MB
    DD --> MB
    EP --> MR & PR
    MB --> RT

Architecture Diagram

The high-level architecture diagram is shown in the Core Architecture section above. Standalone .mmd files:

architecture/mermaid/07-hms-full-architecture.mmd — Full system diagram
architecture/mermaid/07-hms-ehr-integration-sequence.mmd — EHR integration sequence
architecture/mermaid/07-hms-governance-flow.mmd — Governance decision flow

Enterprise Considerations

Platform Implementation Timeline: The HMS AI platform described here is a 12–18 month implementation for an organization starting from no dedicated AI infrastructure. The Phase A foundation takes 8–12 weeks. The first use case can go live in weeks 16–20 (including governance review, clinical validation, and champion training). Platform payback accelerates with each additional use case.

Team Structure: The HMS AI platform requires a dedicated AI platform team of 4–6 people: an AI platform architect, 2 AI/ML engineers, a healthcare integration engineer (FHIR/HL7 specialist), a clinical informatics specialist, and a DevSecOps engineer. The clinical informatics specialist bridges the AI platform team and clinical operations, and this is the role most often understaffed.

Budget Model: The Reference Healthcare Organization allocates its HMS AI platform operating budget across three cost categories (illustrative; verify current vendor pricing):

LLM API usage: scales with use case volume; model tier routing reduces this by 35–50% vs. all-premium routing
Infrastructure: AI gateway hosting, vector store, integration engine, observability stack
Team: AI platform team personnel cost, which is the dominant cost center
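
The routing effect on LLM API spend can be sanity-checked with simple arithmetic. The sketch below compares all-premium routing with tiered routing for two use cases. Every number in it is a placeholder chosen for illustration (request volumes, token counts, and the two per-million-token prices are assumptions, not vendor pricing), so the resulting percentage only shows how the calculation works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UseCaseLoad:
    name: str
    monthly_requests: int
    tokens_per_request: int  # input + output combined (hypothetical)
    needs_premium: bool      # True when the use case must stay on the premium tier


# Hypothetical blended prices in USD per million tokens. NOT vendor pricing.
PREMIUM_PRICE = 30.0
STANDARD_PRICE = 5.0


def monthly_cost(loads: list[UseCaseLoad], tiered: bool) -> float:
    """Sum monthly LLM spend; without tiering every request uses the premium price."""
    total = 0.0
    for load in loads:
        tokens = load.monthly_requests * load.tokens_per_request
        use_premium = load.needs_premium or not tiered
        price = PREMIUM_PRICE if use_premium else STANDARD_PRICE
        total += tokens / 1_000_000 * price
    return total


LOADS = [
    UseCaseLoad("discharge_summary", 6_000, 8_000, needs_premium=True),
    UseCaseLoad("medical_coding", 20_000, 2_000, needs_premium=False),
]

all_premium = monthly_cost(LOADS, tiered=False)
tiered = monthly_cost(LOADS, tiered=True)
savings = 1 - tiered / all_premium

print(f"all-premium: ${all_premium:,.0f}  tiered: ${tiered:,.0f}  savings: {savings:.0%}")
```

With these placeholder inputs the saving falls inside the 35–50% range quoted above. The real figure depends on the share of traffic that can safely leave the premium tier.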

Vendor Lock-In Risk: The architecture uses the AI gateway to abstract LLM vendor choice. The FHIR R4 and CDS Hooks integrations use HL7 standards and are EHR-portable. The clinical vector store and evaluation pipeline are not vendor-locked. The highest lock-in risk is the EHR SMART on FHIR registration — switching EHR platforms requires re-registering and re-validating all SMART applications.

Healthcare Example

Educational Example — Illustrative Workflow. Not intended for clinical decision making.

A hospitalist physician at the Reference Healthcare Organization begins the discharge process for a patient. The complete AI platform workflow:

Discharge workflow initiated: The physician clicks "Begin Discharge" in Epic. Epic fires the encounter-discharge CDS Hook. The Discharge Assistance CDS service returns a card: "AI Discharge Summary draft available — click to open."
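
As a rough illustration of this step, the response from the Discharge Assistance CDS service could look like the sketch below. It follows the general CDS Hooks card shape (`summary`, `indicator`, `source`, `links`), but the launch URL, labels, and function name are hypothetical, and the exact fields Epic requires should be confirmed against its documentation.

```python
import json


def build_discharge_card(app_launch_url: str) -> dict:
    """Build a CDS Hooks response with one card that links to a SMART app."""
    card = {
        # CDS Hooks limits the summary to a short one-line message.
        "summary": "AI Discharge Summary draft available",
        "indicator": "info",
        "source": {"label": "Discharge Assistance CDS service"},
        "links": [
            {
                "label": "Open draft",
                "url": app_launch_url,  # hypothetical SMART launch URL
                "type": "smart",        # tells the EHR to perform a SMART launch
            }
        ],
    }
    return {"cards": [card]}


response = build_discharge_card("https://ai-platform.example.org/discharge/launch")
print(json.dumps(response, indent=2))
```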

SMART app launch: The physician clicks the card. Epic initiates a SMART EHR launch for the Discharge Summary AI application. The application receives the patient ID, encounter ID, and a SMART access token scoped to patient/Patient.read, patient/Condition.read, patient/MedicationRequest.read, patient/Observation.read, and patient/DocumentReference.write.
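
A small pre-flight check can catch scope gaps before the first FHIR call fails. The sketch below compares the scope string returned with the access token against the calls the application plans to make. It assumes SMART v1-style scope strings, includes patient/Patient.read because the workflow also reads the Patient resource, and does not handle wildcard scopes. The function names are illustrative.

```python
def required_scope(resource: str, interaction: str) -> str:
    """Return the SMART v1-style patient-level scope for one FHIR interaction."""
    return f"patient/{resource}.{interaction}"


# FHIR interactions used by the discharge summary workflow.
PLANNED_CALLS = [
    ("Patient", "read"),
    ("Condition", "read"),
    ("MedicationRequest", "read"),
    ("Observation", "read"),
    ("DocumentReference", "write"),
]


def missing_scopes(granted_scope: str, planned_calls=PLANNED_CALLS) -> list[str]:
    """List the scopes the planned calls need that the token does not grant."""
    granted = set(granted_scope.split())
    needed = [required_scope(resource, action) for resource, action in planned_calls]
    return [scope for scope in needed if scope not in granted]


# Hypothetical scope string from a token response.
token_scope = (
    "launch patient/Patient.read patient/Condition.read "
    "patient/MedicationRequest.read patient/Observation.read "
    "patient/DocumentReference.write"
)
print(missing_scopes(token_scope))
```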

FHIR data retrieval: The application calls the FHIR R4 API: GET /Patient/{id}, GET /Condition (active, encounter-scoped), GET /MedicationRequest (active), GET /Observation (vital signs and labs, encounter-scoped). The FHIR responses are assembled into the clinical context bundle.
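
The four requests can be sketched as URL construction. The search parameters used here (`patient`, `encounter`, `clinical-status`, `status`, `category`) are standard FHIR R4 search parameters, but which of them a given Epic endpoint supports is an assumption to verify. The base URL and IDs are placeholders.

```python
from urllib.parse import urlencode


def build_context_queries(base_url: str, patient_id: str, encounter_id: str) -> dict[str, str]:
    """Build the FHIR R4 request URLs for the discharge summary context bundle."""
    base = base_url.rstrip("/")

    def search(resource: str, params: dict[str, str]) -> str:
        # Keep commas literal: FHIR uses them as the OR separator in search values.
        return f"{base}/{resource}?{urlencode(params, safe=',')}"

    return {
        "patient": f"{base}/Patient/{patient_id}",
        "conditions": search(
            "Condition",
            {"patient": patient_id, "encounter": encounter_id, "clinical-status": "active"},
        ),
        "medications": search(
            "MedicationRequest", {"patient": patient_id, "status": "active"}
        ),
        "observations": search(
            "Observation",
            {
                "patient": patient_id,
                "encounter": encounter_id,
                "category": "vital-signs,laboratory",
            },
        ),
    }


urls = build_context_queries("https://fhir.example.org/r4/", "p1", "e1")
for name, url in urls.items():
    print(name, url)
```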

AI inference: The application calls the AI gateway with the clinical context and the use case identifier "discharge_summary". The gateway: validates the virtual key, retrieves the current production prompt version from the Prompt Registry, routes to the Premium tier (Claude Opus) per the Model Registry configuration, records the audit log entry with hashed patient ID, and forwards the request to the Anthropic API.
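
The gateway's sequence of checks can be sketched with in-memory stand-ins. Nothing below reflects the actual gateway implementation: the registries are plain dictionaries, the model name and key names are hypothetical, the keyed hash (HMAC-SHA256) is one possible way to produce a hashed patient ID, and the vendor call is left out.

```python
import hashlib
import hmac

# Hypothetical in-memory stand-ins for the gateway's dependencies.
VIRTUAL_KEYS = {"vk-discharge-app": {"discharge_summary"}}
PROMPT_REGISTRY = {
    "discharge_summary": {"version": "v3", "template": "Draft a discharge summary from:\n{context}"}
}
MODEL_REGISTRY = {
    "discharge_summary": {"tier": "premium", "model": "premium-clinical-model", "baa_signed": True}
}
AUDIT_LOG: list[dict] = []
AUDIT_HASH_KEY = b"replace-with-a-managed-secret"  # assumption: kept in a secrets manager


def hash_patient_id(patient_id: str) -> str:
    """Keyed hash so the audit log never stores the raw patient identifier."""
    return hmac.new(AUDIT_HASH_KEY, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


def handle_request(virtual_key: str, use_case: str, patient_id: str, context: str) -> dict:
    """Validate, resolve prompt and model, audit, and return the outbound request."""
    if use_case not in VIRTUAL_KEYS.get(virtual_key, set()):
        raise PermissionError("virtual key is not authorized for this use case")
    prompt = PROMPT_REGISTRY[use_case]
    model = MODEL_REGISTRY[use_case]
    if not model["baa_signed"]:
        raise RuntimeError("model vendor has no signed BAA")
    AUDIT_LOG.append(
        {
            "use_case": use_case,
            "patient_hash": hash_patient_id(patient_id),
            "prompt_version": prompt["version"],
            "model": model["model"],
        }
    )
    # The vendor API call would happen here; this sketch only returns the payload.
    return {"model": model["model"], "prompt": prompt["template"].format(context=context)}


outbound = handle_request("vk-discharge-app", "discharge_summary", "patient-abc", "(context bundle)")
print(outbound["model"], AUDIT_LOG[-1]["prompt_version"])
```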

Response delivery: The Anthropic API returns the draft discharge summary. The gateway returns it to the application. The application renders the draft in a side panel within the Epic workflow.

Physician review: The physician reviews the draft, makes three modifications in the UI, and clicks "Save to Epic." The application issues a FHIR POST /DocumentReference request with the finalized summary. Epic saves it to the patient's medical record. The application records the override flag and change count in the audit log.
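
For orientation, a minimal FHIR R4 DocumentReference body for the finalized summary might be assembled as below. This is a sketch of the generic R4 resource shape, with the summary sent as base64-encoded plain text and typed with the LOINC discharge summary code. Epic may require additional elements or a specific profile, so treat the field selection as an assumption.

```python
import base64
import json


def build_document_reference(patient_id: str, encounter_id: str, summary_text: str) -> dict:
    """Build a minimal FHIR R4 DocumentReference carrying the finalized summary."""
    encoded = base64.b64encode(summary_text.encode("utf-8")).decode("ascii")
    return {
        "resourceType": "DocumentReference",
        "status": "current",
        "type": {
            "coding": [
                {"system": "http://loinc.org", "code": "18842-5", "display": "Discharge summary"}
            ]
        },
        "subject": {"reference": f"Patient/{patient_id}"},
        # In R4, context.encounter is a list of references.
        "context": {"encounter": [{"reference": f"Encounter/{encounter_id}"}]},
        "content": [{"attachment": {"contentType": "text/plain", "data": encoded}}],
    }


doc = build_document_reference("p1", "e1", "Finalized discharge summary text.")
print(json.dumps(doc, indent=2))
```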

Observability: The gateway emits an OpenTelemetry trace with: request ID, use case, model version, prompt version, input tokens, output tokens, latency. The trace does not contain the prompt text or response content (no PHI in traces). The quality scorer runs async evaluation on the summary structure and completeness. The drift detector updates the 7-day rolling quality average for the discharge summary use case.
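
Two parts of this step lend themselves to a short sketch: keeping PHI out of traces with an attribute allowlist, and comparing a 7-day quality average against a 30-day baseline. The attribute names, the 0.05 threshold, and both function names are hypothetical. A real deployment would attach the attributes to OpenTelemetry spans rather than return a dictionary.

```python
from statistics import mean

# Metadata-only allowlist; prompt text and response content are deliberately absent.
ALLOWED_SPAN_ATTRIBUTES = {
    "request_id", "use_case", "model_version", "prompt_version",
    "input_tokens", "output_tokens", "latency_ms",
}


def build_span_attributes(**attributes) -> dict:
    """Reject any attribute that is not on the metadata allowlist."""
    unexpected = set(attributes) - ALLOWED_SPAN_ATTRIBUTES
    if unexpected:
        raise ValueError(f"non-allowlisted span attributes: {sorted(unexpected)}")
    return dict(attributes)


def detect_drift(daily_scores: list[float], threshold: float = 0.05) -> dict:
    """Flag drift when the 7-day mean falls below the 30-day mean by more than threshold."""
    if len(daily_scores) < 30:
        raise ValueError("need at least 30 daily scores")
    recent = mean(daily_scores[-7:])
    baseline = mean(daily_scores[-30:])
    return {
        "recent_7d": recent,
        "baseline_30d": baseline,
        "drifted": (baseline - recent) > threshold,
    }


attrs = build_span_attributes(request_id="r-1", use_case="discharge_summary", input_tokens=5200)
stable = detect_drift([0.9] * 30)
degraded = detect_drift([0.9] * 23 + [0.7] * 7)
print(attrs, stable["drifted"], degraded["drifted"])
```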

Common Mistakes

Deploying the Platform in Full Before Any Use Case Is Live. Organizations that spend 9 months building the complete platform before deploying a single use case cannot validate platform design decisions against real clinical workflows. Build incrementally: the platform emerges from use case requirements, not from design documents.

FHIR API Rate Limit Discovery at Production Scale. Epic's FHIR API rate limits are not published; they are negotiated with the health system. Organizations that design AI systems without confirming API rate limits discover the constraint when they go to production with high-volume use cases. Confirm rate limits with the EHR vendor early.
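
Confirming limits early is the main defense, but clients should still degrade gracefully when a limit is hit. The sketch below retries on HTTP 429 with exponential backoff and honors a numeric Retry-After header when one is present. Whether a particular EHR endpoint returns 429 or Retry-After is an assumption to verify, and `send` is a hypothetical callable that wraps the real HTTP request.

```python
import time
from typing import Callable


def call_with_backoff(
    send: Callable[[], tuple[int, dict, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> tuple[int, str]:
    """Call send() until it stops returning HTTP 429 or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt == max_attempts - 1:
            break  # no point sleeping after the final attempt
        retry_after = headers.get("Retry-After")
        if retry_after is not None and retry_after.isdigit():
            delay = float(retry_after)          # server-provided wait in seconds
        else:
            delay = base_delay * 2 ** attempt   # exponential backoff fallback
        sleep(delay)
    raise RuntimeError("rate limit persisted after retries")
```

Passing `sleep` as a parameter keeps the function testable without real waiting. In a production client the same wrapper would usually add jitter so that many workers do not retry in lockstep.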

Governance Without Accountability. A Model Review Board that approves use cases without defined accountability for post-deployment quality is governance theater. The MRB must own the monitoring responsibility: who reviews the quality dashboard weekly, who escalates override rate anomalies, who initiates incident response when a quality event occurs.

Best Practices

Build the AI gateway and prompt registry before the first use case — retrofitting governance infrastructure after use cases are live is significantly harder
Register CDS Hooks services and SMART on FHIR applications with the EHR vendor before technical build — the registration process takes weeks and may require EHR vendor review
Sign HIPAA BAAs with all LLM vendors before any clinical data is transmitted
Maintain the clinical vector store as a shared resource — never allow individual use cases to maintain their own copies of institutional clinical knowledge
Design the governance model around the CMIO, not the IT department — clinical AI governance requires clinical leadership to be credible

Alternatives

The HMS reference architecture combines a custom, LiteLLM-based AI gateway with independent LLM vendors. Alternative architectural approaches:

Approach	Trade-off
Azure OpenAI Service (all-in)	Single vendor, simplified BAA, but constrained to Microsoft's model release schedule
Epic-native AI (Cognitive Computing)	No separate integration; constrained to Epic's AI capabilities
AWS HealthLake + Bedrock	AWS-native HIPAA infrastructure; strong compliance posture, higher cloud commitment
Google Vertex AI + CCAI	Google Cloud-native; strong for NLP and structured data AI use cases

Trade-offs

Dimension	Centralized Platform (this architecture)	Decentralized (per-use-case)
Governance auditability	High	Low
Use case delivery speed (after platform built)	High	Low
Use case delivery speed (before platform built)	Low	High
Cost attribution accuracy	High	Partial
HIPAA control surface	Concentrated (easier to audit)	Distributed (harder to audit)
Platform maintenance overhead	Medium	Low initially, high at scale

Interview Questions

Q: A hospital CIO asks you to design the AI architecture for a hospital deploying 7 clinical AI use cases over the next 18 months. Walk me through your design.

Category: System Design | Difficulty: Principal | Role: AI Architect / FDE

Answer Framework:

Start with the governance structure, because the technical architecture serves the governance requirements, not the other way around. The first deliverable is a risk tier classification of all 7 use cases: which are Tier 1 (directly influence patient care), which are Tier 2 (operational), which are Tier 3 (administrative). This classification determines what governance approval each use case requires, what evaluation criteria apply, and what the oversight model looks like post-deployment.

Then the shared infrastructure: AI gateway (before use case 1), prompt registry (before use case 1), HIPAA BAAs signed with all LLM vendors (before any clinical data is transmitted), EHR SMART on FHIR registration (weeks of lead time with Epic). These are the blocking infrastructure items that must precede use case development.

The EHR integration pattern depends on use case type: SMART on FHIR applications for workflow-embedded tools (discharge summary, clinical knowledge search); CDS Hooks services for in-workflow recommendations (medication safety, care gap alerts); HL7 v2 ADT feed via integration engine for event-driven background processing (admission event → discharge planning context enrichment).

Shared clinical infrastructure: a single clinical vector store serving all knowledge retrieval use cases (guidelines, formulary, prior auth criteria), populated and maintained by the AI platform team. Individual use cases query the shared store — they do not maintain their own knowledge bases.

Observability is not optional for clinical AI: OpenTelemetry tracing (metadata only — no PHI in traces), a quality scorer that evaluates AI outputs against golden datasets, and a clinical AI dashboard visible to the CMIO and Model Review Board. Override rate monitoring for all clinical use cases.
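
Override rate monitoring reduces to a simple aggregation over audit records. The sketch below assumes each record carries a use case identifier and an override flag. The record shape and field names are hypothetical.

```python
from collections import defaultdict


def override_rates(records: list[dict]) -> dict[str, float]:
    """Return the fraction of AI outputs that clinicians overrode, per use case."""
    totals: dict[str, int] = defaultdict(int)
    overrides: dict[str, int] = defaultdict(int)
    for record in records:
        totals[record["use_case"]] += 1
        if record["overridden"]:
            overrides[record["use_case"]] += 1
    return {use_case: overrides[use_case] / totals[use_case] for use_case in totals}


# Hypothetical audit records; no patient identifiers are needed for this metric.
records = [
    {"use_case": "discharge_summary", "overridden": True},
    {"use_case": "discharge_summary", "overridden": False},
    {"use_case": "discharge_summary", "overridden": False},
    {"use_case": "discharge_summary", "overridden": False},
    {"use_case": "sepsis_alert", "overridden": True},
    {"use_case": "sepsis_alert", "overridden": True},
]
rates = override_rates(records)
print(rates)
```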

Key Points to Hit:

Start with governance and risk tier classification
Shared infrastructure first — gateway, prompt registry, BAAs, EHR registration
EHR integration pattern depends on use case type — SMART vs. CDS Hooks vs. ADT
Single shared clinical vector store
Observability with no PHI in traces
Override rate monitoring as the post-deployment quality signal

Key Takeaways

The HMS AI platform is not a single system — it is a set of shared infrastructure components that enable multiple clinical AI use cases without rebuilding foundational capabilities for each one
Build order matters: AI gateway, prompt registry, and BAAs before the first use case; shared vector store and CDS Hooks before use case 3
The CMIO and Model Review Board are governance requirements, not optional oversight — clinical AI governance needs clinical leadership to be credible
PHI flows through the AI gateway under BAA coverage to LLM vendors; audit logs use hashed identifiers; observability traces contain no PHI
FHIR R4, SMART on FHIR, and CDS Hooks are the integration standards that connect AI capabilities to the clinical workflow — standard-based integration is non-negotiable for EHR-embedded AI
The platform pays for itself through use case delivery speed: use case 1 takes months; use case 7 takes weeks
