HMS Reference Architecture
Executive Summary
This chapter synthesizes every concept, pattern, and architectural decision from the preceding chapters into a complete, deployable Hospital Management System (HMS) AI platform reference architecture. It is the flagship document of this repository: the single place where a principal engineer or hospital CIO can see how AI strategy, governance, EHR integration, clinical RAG, clinical decision support, observability, cost management, security, and change management fit together into a cohesive clinical AI platform. The architecture described here is not a proof of concept; it is a production-grade reference for a hospital deploying 7 AI use cases serving 300+ clinical and administrative users under HIPAA and Joint Commission compliance requirements.
Learning Objectives
After reading this chapter, you will be able to:
- Describe the complete HMS AI platform architecture at component, integration, and data flow levels
- Identify the dependencies between platform components and the sequence in which they should be implemented
- Apply the reference architecture to evaluate gaps in an existing hospital AI deployment
- Use this reference architecture as the basis for a hospital AI platform RFP or technical design review
Business Problem
The Reference Healthcare Organization has deployed its first three clinical AI use cases independently, each with its own EHR integration, its own LLM API access, its own prompt management, and its own observability approach. Governance operates through individual project reviews. Cost attribution is per-project budget lines that the CFO cannot reconcile against actual AI platform costs.
This architecture solves the sprawl problem: it defines the shared infrastructure that each AI use case leverages, the governance structures that span all use cases, and the integration patterns that connect the AI platform to the EHR and clinical workflow. It is designed as an incremental build: the platform can be implemented use case by use case, with each piece of infrastructure added as it becomes the binding constraint on the next use case.
Why This Technology Exists
The HMS reference architecture exists because healthcare AI platforms are not commodity infrastructure: the combination of HIPAA compliance requirements, EHR integration complexity, clinical workflow constraints, FDA regulatory considerations, and patient safety obligations creates architectural requirements that are materially different from general enterprise AI platforms. This reference architecture encodes the design decisions that address healthcare-specific requirements so that engineers building clinical AI systems do not need to rediscover them.
Conceptual Explanation
The HMS AI platform is organized around three principles that govern every architectural decision:
Clinical workflow primacy: AI must serve the clinical workflow, not require the workflow to adapt to AI. Every component that touches a clinician's daily workflow (CDS alerts, documentation tools, prior auth assistance) is designed around the clinical interaction pattern, not around the AI system's technical convenience.
Defense in depth for PHI: No single control is sufficient for PHI protection. The architecture layers HIPAA controls (network isolation, encryption at rest and in transit, access control, audit logging, BAA coverage) so that failure of any single control does not expose patient data.
Governance at every boundary: Every boundary in the architecture (between the clinical application and the AI gateway, between the AI gateway and the LLM vendor, between the AI output and the EHR medical record) is a governance control point. Governance is not a post-deployment review; it is embedded in the technical architecture.
Core Architecture
Components
EHR Integration Layer
Epic FHIR R4 API: The primary data source for all clinical AI use cases. The Reference Healthcare Organization uses Epic as its EHR. All AI use cases that require patient clinical context retrieve it via authenticated FHIR R4 API calls, scoped by SMART on FHIR access tokens. The EHR FHIR API is the authoritative source; the AI platform does not maintain a parallel patient data store.
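As a rough illustration of what "authenticated FHIR R4 API calls" can look like for one encounter, the sketch below only builds the request URLs and headers; it does not call an EHR. The base URL and function names are hypothetical, and the search parameters (`patient`, `encounter`, `clinical-status`, `status`, `category`) are standard FHIR R4 search parameters whose support should be confirmed against the EHR's published capability statement.

```python
from urllib.parse import urlencode

# Hypothetical base URL; the real endpoint comes from the EHR's SMART configuration.
FHIR_BASE = "https://ehr.example.org/fhir/r4"


def _search(resource: str, params: dict[str, str]) -> str:
    """Build a FHIR search URL, keeping commas literal for multi-value parameters."""
    return f"{FHIR_BASE}/{resource}?{urlencode(params, safe=',')}"


def context_queries(patient_id: str, encounter_id: str) -> dict[str, str]:
    """Return the read/search URLs used to assemble one encounter's clinical context."""
    return {
        "patient": f"{FHIR_BASE}/Patient/{patient_id}",
        "conditions": _search(
            "Condition",
            {"patient": patient_id, "encounter": encounter_id, "clinical-status": "active"},
        ),
        "medications": _search(
            "MedicationRequest", {"patient": patient_id, "status": "active"}
        ),
        "observations": _search(
            "Observation",
            {"patient": patient_id, "encounter": encounter_id,
             "category": "vital-signs,laboratory"},
        ),
    }


def auth_headers(access_token: str) -> dict[str, str]:
    """Headers for a request authorized by a SMART on FHIR access token."""
    return {
        "Authorization": f"Bearer {access_token}",
        "Accept": "application/fhir+json",
    }
```

Because the EHR stays the authoritative source, a sketch like this fetches context per request rather than copying it into a platform-side patient store.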
HL7 v2 ADT Feed: Real-time patient event notification. Patient admits, transfers, and discharges are communicated via HL7 v2 ADT messages routed through the integration engine to the AI platform. Admission events trigger background context enrichment for discharge summary AI; discharge events trigger care gap analysis and follow-up scheduling.
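The event-to-action mapping described above can be sketched as a small router. This is a minimal sketch under stated assumptions: the action names are hypothetical, the message uses the default `|` field and `^` component separators, and a real deployment would rely on the integration engine's parser rather than string splitting. A01 (admit), A02 (transfer), and A03 (discharge) are standard HL7 v2 ADT trigger events.

```python
# Hypothetical action names for the background jobs described in the text.
ADT_ACTIONS = {
    "A01": ["discharge_context_enrichment"],               # admit
    "A03": ["care_gap_analysis", "follow_up_scheduling"],  # discharge
}


def adt_trigger_event(message: str) -> str:
    """Return the trigger event (for example 'A01') from MSH-9 of an HL7 v2 message."""
    msh = message.replace("\n", "\r").split("\r")[0]
    if not msh.startswith("MSH|"):
        raise ValueError("message does not start with an MSH segment")
    fields = msh.split("|")
    # MSH-1 is the field separator itself, so after splitting on "|"
    # MSH-9 (message type) sits at index 8.
    if len(fields) < 9:
        raise ValueError("MSH segment is too short to contain MSH-9")
    parts = fields[8].split("^")
    if parts[0] != "ADT" or len(parts) < 2:
        raise ValueError(f"not an ADT message: {fields[8]!r}")
    return parts[1]


def route_adt(message: str) -> list[str]:
    """Map an ADT message to the background actions it should trigger (none if unmapped)."""
    return list(ADT_ACTIONS.get(adt_trigger_event(message), []))


# Synthetic example message; it contains no real patient data.
EXAMPLE_DISCHARGE = "\r".join([
    r"MSH|^~\&|EHR|HOSP|AIPLATFORM|HOSP|20240101120000||ADT^A03|MSG00001|P|2.5",
    "EVN|A03|20240101120000",
    "PID|1||TEST-ID-0001",
])
```

Transfers (A02) are deliberately left unmapped here because the text assigns no background action to them.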
CDS Hooks Services: The Reference Healthcare Organization registers three CDS Hooks services with Epic:
- patient-view: Care Gap CDS (fires when clinician opens chart)
- order-sign: Medication Safety CDS (fires when medication order is signed)
- encounter-discharge: Discharge Assistance CDS (fires when discharge workflow starts)
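CDS Hooks services are advertised to the EHR through a discovery document listing each service's `hook`, `id`, and `description`. The sketch below shows what that document might contain for the three services above; the `id` values and description wording are hypothetical, and prefetch templates are omitted.

```python
import json

# Hypothetical service ids and descriptions; the hook names come from the list above.
CDS_SERVICES = [
    {
        "hook": "patient-view",
        "id": "care-gap-cds",
        "title": "Care Gap CDS",
        "description": "Surfaces open care gaps when a clinician opens the chart.",
    },
    {
        "hook": "order-sign",
        "id": "medication-safety-cds",
        "title": "Medication Safety CDS",
        "description": "Checks medication orders at signing time.",
    },
    {
        "hook": "encounter-discharge",
        "id": "discharge-assistance-cds",
        "title": "Discharge Assistance CDS",
        "description": "Offers an AI discharge summary draft when discharge starts.",
    },
]


def discovery_document() -> str:
    """Serialize the body a CDS Hooks discovery endpoint would return."""
    return json.dumps({"services": CDS_SERVICES}, indent=2)
```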
SMART on FHIR Applications: Three AI use cases are registered as SMART on FHIR apps in Epic App Orchard:
- Discharge Summary AI: EHR launch from discharge documentation workflow
- Clinical Knowledge Search: EHR launch from clinician order or documentation context
- Patient Engagement Chatbot: Patient-facing, MyChart integration
AI Platform Control Plane
AI Gateway: Deployed as a LiteLLM Proxy instance with custom clinical middleware, hosted in a HIPAA-compliant cloud VPC. All LLM inference requests from all AI use cases flow through the gateway. The gateway provides: SMART token validation, virtual key management, model tier routing (Claude for Tier 1 clinical use cases; Azure OpenAI for Tier 2 administrative use cases), insertion of the current production prompt from the registry, cost attribution by use case and department, and immutable audit logging with hashed patient identifiers.
Integration Engine: A healthcare integration engine (Microsoft Azure Health Data Services or equivalent) that receives HL7 v2 ADT messages from the EHR, deduplicates, and publishes them to the AI platform event bus. Also handles FHIR Subscription notifications for use cases that need sub-minute event latency.
Prompt Registry: A version-controlled store of all production prompts, implemented as a Git repository with a lightweight API. Each prompt version is associated with: model compatibility, clinical validation status, evaluation metrics, approval timestamp, and approver identity. The AI gateway retrieves the current production prompt for each use case on each request; prompts are never hardcoded in application code.
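One way to picture a registry entry and the gateway's per-request lookup is the sketch below. The class, field, and status names are assumptions made for illustration; the chapter specifies only which attributes each prompt version carries, not how they are stored.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class PromptVersion:
    """One prompt version with the metadata the registry tracks (names are illustrative)."""
    use_case: str
    version: str
    template: str
    compatible_models: tuple
    clinically_validated: bool
    status: str  # assumed lifecycle values: "draft", "production", "retired"
    evaluation_metrics: dict = field(default_factory=dict)
    approved_at: Optional[datetime] = None
    approver: Optional[str] = None


def current_production_prompt(registry: list, use_case: str) -> PromptVersion:
    """Return the single approved production prompt for a use case, or fail closed."""
    live = [p for p in registry if p.use_case == use_case and p.status == "production"]
    if len(live) != 1:
        raise LookupError(
            f"expected exactly one production prompt for {use_case!r}, found {len(live)}"
        )
    prompt = live[0]
    if not (prompt.clinically_validated and prompt.approved_at and prompt.approver):
        raise LookupError(f"production prompt for {use_case!r} lacks validation or approval")
    return prompt
```

Failing closed when zero or several production versions exist keeps an ambiguous registry state from silently reaching clinicians.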
Model Registry: A catalog of approved model versions with HIPAA BAA status, applicable use cases, evaluation results, and governance approval. The gateway enforces that only registered models can be called, preventing individual use case teams from calling unapproved model versions.
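The enforcement rule can be sketched as a check the gateway runs before forwarding any request. The types and names below are hypothetical, and evaluation results are left out for brevity; this is not LiteLLM's API, only an illustration of the policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisteredModel:
    """A model registry entry (field names are illustrative)."""
    model_id: str
    baa_in_place: bool
    governance_approved: bool
    approved_use_cases: frozenset


class UnapprovedModelError(Exception):
    """Raised when a request names a model the registry does not allow."""


def authorize_model(registry: dict, model_id: str, use_case: str) -> RegisteredModel:
    """Allow a call only if the model is registered, covered by a BAA, and approved."""
    model = registry.get(model_id)
    if model is None:
        raise UnapprovedModelError(f"{model_id!r} is not in the model registry")
    if not model.baa_in_place:
        raise UnapprovedModelError(f"{model_id!r} has no BAA on record")
    if not model.governance_approved:
        raise UnapprovedModelError(f"{model_id!r} lacks governance approval")
    if use_case not in model.approved_use_cases:
        raise UnapprovedModelError(f"{model_id!r} is not approved for {use_case!r}")
    return model
```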
AI Use Cases
| Use Case | Integration Pattern | AI Category | Risk Tier | LLM Tier |
|---|---|---|---|---|
| Discharge Summary AI | SMART on FHIR + FHIR R4 | Clinical Documentation | Tier 2 | Premium (Claude Opus) |
| Prior Auth Agent | Agentic workflow + payer APIs | Administrative | Tier 2 | Standard (Claude Sonnet) |
| Clinical Knowledge RAG | SMART on FHIR + CDS Hooks | Clinical CDS | Tier 2 | Standard (Claude Sonnet) |
| CDS: Medication Safety | CDS Hooks order-sign | Clinical CDS | Tier 1 | Rule engine + Standard |
| Medical Coding AI | Batch processing | Administrative | Tier 3 | Economy (Haiku) |
| Care Gap Analysis | Background batch | Administrative CDS | Tier 2 | Economy (Haiku) |
| Patient Chatbot | Patient-facing MyChart | Patient Engagement | Tier 2 | Standard (Claude Sonnet) |
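The LLM Tier column of the table can be read as gateway routing configuration. A minimal sketch follows, assuming hypothetical use case identifiers (only "discharge_summary" appears elsewhere in this chapter) and treating the Medication Safety row as the Standard tier invoked alongside a rule engine.

```python
# LLM tier per use case, transcribed from the table above. Identifiers other than
# "discharge_summary" are hypothetical names chosen for this sketch.
USE_CASE_LLM_TIER = {
    "discharge_summary": "premium",
    "prior_auth": "standard",
    "clinical_knowledge_rag": "standard",
    "medication_safety_cds": "standard",  # used together with a rule engine, per the table
    "medical_coding": "economy",
    "care_gap_analysis": "economy",
    "patient_chatbot": "standard",
}


def llm_tier_for(use_case: str) -> str:
    """Return the configured LLM tier, rejecting use cases that are not registered."""
    try:
        return USE_CASE_LLM_TIER[use_case]
    except KeyError:
        raise ValueError(f"unknown use case: {use_case!r}") from None
```

Rejecting unknown identifiers, rather than defaulting to a tier, keeps unregistered use cases from reaching any model.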
Shared AI Services
Embedding Service: A dedicated inference endpoint for clinical text embedding, using a clinical-domain embedding model. All clinical vector store indexing and query-time embedding flows through this service. Centralized embedding ensures consistent model version across all vector indexes.
Clinical Vector Store: A single vector database instance partitioned by knowledge category. Partitions: clinical guidelines, hospital formulary, prior auth criteria (per payer), ICD-10/CPT codes, clinical protocols. The store is updated on the knowledge source update schedule (see Chapter 4).
Evaluation Pipeline: A CI/CD pipeline that runs against golden datasets for each use case when a new model version or prompt version is proposed. Results are published to the Model Review Board governance dashboard and gate deployment promotion.
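The "gate deployment promotion" step might reduce to a comparison like the one below. The metric names, the higher-is-better convention, and the 0.02 regression tolerance are assumptions for illustration; the chapter does not specify thresholds.

```python
def promotion_gate(candidate: dict, baseline: dict, max_regression: float = 0.02):
    """Compare golden-dataset scores for a candidate against the production baseline.

    Assumes every metric is higher-is-better. Returns (passed, failures).
    """
    failures = []
    for metric, base_score in baseline.items():
        if metric not in candidate:
            failures.append(f"{metric}: missing from candidate run")
        elif base_score - candidate[metric] > max_regression:
            failures.append(
                f"{metric}: {candidate[metric]:.3f} vs baseline {base_score:.3f}"
            )
    return (not failures, failures)
```

A CI job could call this after the golden-dataset run and publish `failures` to the Model Review Board dashboard when the gate does not pass.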
Governance Structures
Model Review Board: Chaired by the CMIO. Standing members: Chief AI Architect, Clinical Informatics Director, Privacy Officer, Risk Manager, and two physician champions (rotating). Meeting cadence: bi-weekly for standard agenda items; urgent session within 72 hours for safety events. Responsibilities: approve Tier 1 AI deployments, review safety events, review override rate trends, approve model and prompt version promotions.
Risk Tier Registry: A maintained catalog of all AI use cases with their assigned risk tier (Tier 1 = clinical patient care impact, Tier 2 = clinical operations, Tier 3 = administrative). Tier classification determines: governance approval requirements, evaluation criteria, override monitoring thresholds, and incident response priority.
Audit Log: An immutable append-only log of every AI inference request, stored in a HIPAA-compliant log store with 6-year retention. Each record: request ID, timestamp, hashed patient ID (SHA-256 of MRN + system salt), use case, model version, prompt version, input and output token counts, override flag. No raw PHI in the log.
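A minimal sketch of the record shape follows, assuming the salt is appended to the MRN before hashing (the concatenation order is not specified in the text) and using hypothetical field names. A keyed construction such as HMAC-SHA-256 is a common alternative; the sketch follows the salted SHA-256 the chapter describes.

```python
import hashlib
import uuid
from datetime import datetime, timezone


def hash_patient_id(mrn: str, system_salt: str) -> str:
    """SHA-256 over MRN + system salt; the concatenation order is an assumption."""
    return hashlib.sha256(f"{mrn}{system_salt}".encode("utf-8")).hexdigest()


def audit_record(mrn: str, system_salt: str, use_case: str, model_version: str,
                 prompt_version: str, input_tokens: int, output_tokens: int,
                 override: bool = False) -> dict:
    """Build one audit entry that carries no raw PHI, only the hashed identifier."""
    return {
        "request_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "patient_id_hash": hash_patient_id(mrn, system_salt),
        "use_case": use_case,
        "model_version": model_version,
        "prompt_version": prompt_version,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "override": override,
    }
```

The salt must be stored outside the log store; anyone holding both the salt and a list of MRNs could otherwise re-derive the hashes.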
Architecture Diagram
The high-level architecture diagram is shown in the Core Architecture section above. Standalone .mmd files:
- architecture/mermaid/07-hms-full-architecture.mmd: Full system diagram
- architecture/mermaid/07-hms-ehr-integration-sequence.mmd: EHR integration sequence
- architecture/mermaid/07-hms-governance-flow.mmd: Governance decision flow
Implementation Patterns
HMS Platform Build Sequence
The HMS AI platform should be built incrementally, not as a complete system before any use case is deployed. The recommended sequence:
Phase A: Foundation (before use case 1)
1. AI Gateway (LiteLLM Proxy with custom middleware)
2. Prompt Registry (Git + lightweight API)
3. Model Registry (database + API)
4. Audit Log infrastructure
5. HIPAA BAAs signed with LLM vendor and cloud provider
Phase B: First Use Case (Discharge Summary AI)
6. SMART on FHIR registration with Epic
7. FHIR R4 client (clinical context assembly)
8. Discharge Summary SMART application
9. Evaluation pipeline (initial golden dataset)
10. Model Review Board charter and first meeting
Phase C: Platform Expansion (before use case 3)
11. Embedding Service
12. Clinical Vector Store (initial: clinical guidelines, formulary)
13. Integration Engine (HL7 v2 ADT ingestion)
14. CDS Hooks services registration
15. Observability stack (OpenTelemetry + dashboards)
Phase D: Scale (use cases 4-7)
16. Prior Auth agentic workflow
17. Medical Coding batch pipeline
18. Care Gap background processor
19. Patient Chatbot (patient-facing SMART app)
20. Full clinical champion network
PHI Data Flow Summary
Patient PHI enters the AI platform via:
- FHIR R4 API calls (application-level, SMART-authenticated)
- CDS Hooks prefetch payloads (at hook fire events)
- HL7 v2 ADT messages (patient event notifications)
PHI exits the AI platform to:
- LLM vendor API (in inference request payloads; BAA required)
- Clinical Vector Store (if clinical knowledge includes patient-derived content; BAA required)
- EHR via FHIR DocumentReference (AI-generated clinical documents)
- Audit log (hashed patient IDs only, not raw PHI)
PHI does not exit to:
- Observability traces (scrubbed at gateway; metadata only)
- Evaluation datasets (de-identified via Safe Harbor before use)
- LLM training data (confirmed via BAA provisions)
Enterprise Considerations
Platform Implementation Timeline: The HMS AI platform described here is a 12-18 month implementation for an organization starting from no dedicated AI infrastructure. The Phase A foundation takes 8-12 weeks. The first use case can go live in weeks 16-20 (including governance review, clinical validation, and champion training). Platform payback accelerates with each additional use case.
Team Structure: The HMS AI platform requires a dedicated AI platform team of 4-6 engineers: AI platform architect, 2 AI/ML engineers, healthcare integration engineer (FHIR/HL7 specialist), clinical informatics specialist, and DevSecOps engineer. The clinical informatics specialist bridges the AI platform team and clinical operations; this is the role most often understaffed.
Budget Model: The Reference Healthcare Organization projects an HMS AI platform operating budget allocation across three cost categories (illustrative; verify current vendor pricing):
- LLM API usage: scales with use case volume; model tier routing reduces this by 35-50% vs. all-premium routing
- Infrastructure: AI gateway hosting, vector store, integration engine, observability stack
- Team: AI platform team personnel cost, which is the dominant cost center
Vendor Lock-In Risk: The architecture uses the AI gateway to abstract LLM vendor choice. The FHIR R4 and CDS Hooks integrations use HL7 standards and are EHR-portable. The clinical vector store and evaluation pipeline are not vendor-locked. The highest lock-in risk is the EHR SMART on FHIR registration: switching EHR platforms requires re-registering and re-validating all SMART applications.
Security Considerations
Network Architecture:
- The AI gateway and all platform services are deployed in a HIPAA-eligible VPC
- LLM vendor API calls are routed over private link where available (Azure Private Link for Azure OpenAI; no equivalent currently available for direct Anthropic API)
- Clinical vector store access is network-restricted to the AI platform VPC
- No direct internet access to the EHR FHIR API from AI platform components; all FHIR access is via authenticated API calls
Authentication:
- AI use case applications authenticate to the AI gateway via virtual keys
- FHIR API access is via SMART on FHIR access tokens scoped to the requesting application
- CDS Hooks services authenticate incoming requests using EHR-provided JSON Web Tokens (JWTs)
Encryption:
- All data in transit: TLS 1.2+ (TLS 1.3 preferred)
- All PHI at rest: AES-256
- Audit log: write-once storage with cryptographic integrity verification
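The chapter does not say how the audit log's cryptographic integrity verification is implemented. One common technique is a hash chain, in which each entry commits to the hash of the previous one so that any later edit is detectable. The sketch below illustrates that technique with hypothetical names; it is not a description of the reference deployment.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with a canonical form of the record."""
    payload = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(chain: list, record: dict) -> dict:
    """Append a record, linking it to the hash of the entry before it."""
    prev_hash = chain[-1]["entry_hash"] if chain else GENESIS_HASH
    entry = {
        "record": record,
        "prev_hash": prev_hash,
        "entry_hash": _entry_hash(prev_hash, record),
    }
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every link; any edited, reordered, or removed entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in chain:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["entry_hash"] != _entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["entry_hash"]
    return True
```

A chain like this detects tampering but does not prevent it, which is why the text pairs integrity verification with write-once storage.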
Healthcare Example
Educational Example: Illustrative Workflow. Not intended for clinical decision making.
A hospitalist physician at the Reference Healthcare Organization begins the discharge process for a patient. The complete AI platform workflow:
- Discharge workflow initiated: The physician clicks "Begin Discharge" in Epic. Epic fires the encounter-discharge CDS Hook. The Discharge Assistance CDS service returns a card: "AI Discharge Summary draft available. Click to open."
- SMART app launch: The physician clicks the card. Epic initiates a SMART EHR launch for the Discharge Summary AI application. The application receives the patient ID, encounter ID, and a SMART access token scoped to patient/Condition.read, patient/MedicationRequest.read, patient/Observation.read, and patient/DocumentReference.write.
- FHIR data retrieval: The application calls the FHIR R4 API: GET /Patient/{id}, GET /Condition (active, encounter-scoped), GET /MedicationRequest (active), GET /Observation (vital signs and labs, encounter-scoped). The FHIR responses are assembled into the clinical context bundle.
- AI inference: The application calls the AI gateway with the clinical context and the use case identifier "discharge_summary". The gateway: validates the virtual key, retrieves the current production prompt version from the Prompt Registry, routes to the Premium tier (Claude Opus) per the Model Registry configuration, records the audit log entry with hashed patient ID, and forwards the request to the Anthropic API.
- Response delivery: The Anthropic API returns the draft discharge summary. The gateway returns it to the application. The application renders the draft in a side panel within the Epic workflow.
- Physician review: The physician reviews the draft, makes modifications (documents 3 changes in the UI), and clicks "Save to Epic." The application calls FHIR POST /DocumentReference with the finalized summary. Epic saves it to the patient's medical record. The application records the override flag and change count in the audit log.
- Observability: The gateway emits an OpenTelemetry trace with: request ID, use case, model version, prompt version, input tokens, output tokens, latency. The trace does not contain the prompt text or response content (no PHI in traces). The quality scorer runs async evaluation on the summary structure and completeness. The drift detector updates the 7-day rolling quality average for the discharge summary use case.
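The "metadata only" rule in the observability step can be enforced with an allowlist applied before any attribute is attached to a span. The sketch below is independent of the OpenTelemetry SDK; the attribute names mirror the fields listed above, and `latency_ms` is an assumed name for the latency field.

```python
# Attributes permitted on traces. Anything else (prompt text, response content,
# patient identifiers) is dropped before the span is emitted.
ALLOWED_TRACE_ATTRIBUTES = frozenset({
    "request_id",
    "use_case",
    "model_version",
    "prompt_version",
    "input_tokens",
    "output_tokens",
    "latency_ms",
})


def scrub_trace_attributes(attributes: dict) -> dict:
    """Keep only allowlisted metadata so no PHI-bearing field can reach a trace."""
    return {k: v for k, v in attributes.items() if k in ALLOWED_TRACE_ATTRIBUTES}
```

An allowlist fails safe: a field added to the request later stays out of traces until someone explicitly approves it, which a blocklist would not guarantee.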
Common Mistakes
Deploying the Platform in Full Before Any Use Case Is Live. Organizations that spend 9 months building the complete platform before deploying a single use case cannot validate platform design decisions against real clinical workflows. Build incrementally: the platform emerges from use case requirements, not from design documents.
FHIR API Rate Limit Discovery at Production Scale. Epic's FHIR API rate limits are not published; they are negotiated with the health system. Organizations that design AI systems without confirming API rate limits discover the constraint when they go to production with high-volume use cases. Confirm rate limits with the EHR vendor early.
Governance Without Accountability. A Model Review Board that approves use cases without defined accountability for post-deployment quality is governance theater. The MRB must own the monitoring responsibility: who reviews the quality dashboard weekly, who escalates override rate anomalies, who initiates incident response when a quality event occurs.
Best Practices
- Build the AI gateway and prompt registry before the first use case; retrofitting governance infrastructure after use cases are live is significantly harder
- Register CDS Hooks services and SMART on FHIR applications with the EHR vendor before technical build; the registration process takes weeks and may require EHR vendor review
- Sign HIPAA BAAs with all LLM vendors before any clinical data is transmitted
- Maintain the clinical vector store as a shared resource; never allow individual use cases to maintain their own copies of institutional clinical knowledge
- Design the governance model around the CMIO, not the IT department; clinical AI governance requires clinical leadership to be credible
Alternatives
The HMS reference architecture uses a custom AI gateway (LiteLLM-based) + independent LLM vendors. Alternative architectural approaches:
| Approach | Trade-off |
|---|---|
| Azure OpenAI Service (all-in) | Single vendor, simplified BAA, but constrained to Microsoft's model release schedule |
| Epic-native AI (Cognitive Computing) | No separate integration; constrained to Epic's AI capabilities |
| AWS HealthLake + Bedrock | AWS-native HIPAA infrastructure; strong compliance posture, higher cloud commitment |
| Google Vertex AI + CCAI | Google Cloud-native; strong for NLP and structured data AI use cases |
Trade-offs
| Dimension | Centralized Platform (this architecture) | Decentralized (per-use-case) |
|---|---|---|
| Governance auditability | High | Low |
| Use case delivery speed (after platform built) | High | Low |
| Use case delivery speed (before platform built) | Low | High |
| Cost attribution accuracy | High | Partial |
| HIPAA control surface | Concentrated (easier to audit) | Distributed (harder to audit) |
| Platform maintenance overhead | Medium | Low initially, high at scale |
Interview Questions
Q: A hospital CIO asks you to design the AI architecture for a hospital deploying 7 clinical AI use cases over the next 18 months. Walk me through your design.
Category: System Design | Difficulty: Principal | Role: AI Architect / FDE
Answer Framework:
Start with the governance structure, because the technical architecture serves the governance requirements, not the other way around. The first deliverable is a risk tier classification of all 7 use cases: which are Tier 1 (directly influence patient care), which are Tier 2 (operational), which are Tier 3 (administrative). This classification determines what governance approval each use case requires, what evaluation criteria apply, and what the oversight model looks like post-deployment.
Then the shared infrastructure: AI gateway (before use case 1), prompt registry (before use case 1), HIPAA BAAs signed with all LLM vendors (before any clinical data is transmitted), EHR SMART on FHIR registration (weeks of lead time with Epic). These are the blocking infrastructure items that must precede use case development.
The EHR integration pattern depends on use case type: SMART on FHIR applications for workflow-embedded tools (discharge summary, clinical knowledge search); CDS Hooks services for in-workflow recommendations (medication safety, care gap alerts); HL7 v2 ADT feed via integration engine for event-driven background processing (an admission event triggers discharge planning context enrichment).
Shared clinical infrastructure: a single clinical vector store serving all knowledge retrieval use cases (guidelines, formulary, prior auth criteria), populated and maintained by the AI platform team. Individual use cases query the shared store; they do not maintain their own knowledge bases.
Observability is not optional for clinical AI: OpenTelemetry tracing (metadata only; no PHI in traces), a quality scorer that evaluates AI outputs against golden datasets, and a clinical AI dashboard visible to the CMIO and Model Review Board. Override rate monitoring for all clinical use cases.
Key Points to Hit:
- Start with governance and risk tier classification
- Shared infrastructure first: gateway, prompt registry, BAAs, EHR registration
- EHR integration pattern depends on use case type: SMART vs. CDS Hooks vs. ADT
- Single shared clinical vector store
- Observability with no PHI in traces
- Override rate monitoring as the post-deployment quality signal
Key Takeaways
- The HMS AI platform is not a single system; it is a set of shared infrastructure components that enable multiple clinical AI use cases without rebuilding foundational capabilities for each one
- Build order matters: AI gateway, prompt registry, and BAAs before the first use case; shared vector store and CDS Hooks before use case 3
- The CMIO and Model Review Board are governance requirements, not optional oversight; clinical AI governance needs clinical leadership to be credible
- PHI flows through the AI gateway under BAA coverage to LLM vendors; audit logs use hashed identifiers; observability traces contain no PHI
- FHIR R4, SMART on FHIR, and CDS Hooks are the integration standards that connect AI capabilities to the clinical workflow; standards-based integration is non-negotiable for EHR-embedded AI
- The platform pays for itself through use case delivery speed: use case 1 takes months; use case 7 takes weeks
Glossary
HMS (Hospital Management System): The integrated information system supporting clinical, operational, and administrative functions of a hospital. In this repository, refers to the Reference Healthcare Organization's complete clinical AI platform.
CMIO (Chief Medical Information Officer): The senior physician executive responsible for clinical informatics, EHR governance, and (typically) clinical AI governance.
Model Review Board: The governance body, chaired by the CMIO, that approves clinical AI use cases for production deployment and monitors post-deployment quality.
Risk Tier: The classification of an AI use case based on its proximity to patient care decisions (Tier 1 = direct clinical impact, Tier 2 = operational, Tier 3 = administrative).
Further Reading
- Chapter 1: Healthcare AI Landscape (FDA SaMD classification for HMS use cases)
- Chapter 2: HIPAA and AI (PHI controls embedded in this architecture)
- Chapter 3: EHR Integration (FHIR, SMART, and CDS Hooks details)
- Chapter 4: Clinical RAG (clinical vector store design)
- Chapter 5: Clinical Decision Support (CDS Hooks services in this architecture)
- Enterprise AI: AI Platform Architecture (platform infrastructure layer this architecture builds on)
- Enterprise AI: AI Governance (governance framework applied to HMS context)