HMS Reference Architecture

Executive Summary

This chapter synthesizes every concept, pattern, and architectural decision from the preceding chapters into a complete, deployable Hospital Management System (HMS) AI platform reference architecture. It is the flagship document of this repository: the single place where a principal engineer or hospital CIO can see how AI strategy, governance, EHR integration, clinical RAG, clinical decision support, observability, cost management, security, and change management fit together into a cohesive clinical AI platform. The architecture described here is not a proof of concept; it is a production-grade reference for a hospital deploying 7 AI use cases serving 300+ clinical and administrative users under HIPAA and Joint Commission compliance requirements.

Learning Objectives

After reading this chapter, you will be able to:

  • Describe the complete HMS AI platform architecture at component, integration, and data flow levels
  • Identify the dependencies between platform components and the sequence in which they should be implemented
  • Apply the reference architecture to evaluate gaps in an existing hospital AI deployment
  • Use this reference architecture as the basis for a hospital AI platform RFP or technical design review

Business Problem

The Reference Healthcare Organization has deployed its first three clinical AI use cases independently, each with its own EHR integration, its own LLM API access, its own prompt management, and its own observability approach. Governance operates through individual project reviews. Cost attribution relies on per-project budget lines that the CFO cannot reconcile against actual AI platform costs.

This architecture solves the sprawl problem: it defines the shared infrastructure that each AI use case leverages, the governance structures that span all use cases, and the integration patterns that connect the AI platform to the EHR and clinical workflow. It is designed as an incremental build: the platform can be implemented use case by use case, with each piece of infrastructure added as it becomes the binding constraint on the next use case.

Why This Technology Exists

The HMS reference architecture exists because healthcare AI platforms are not commodity infrastructure: the combination of HIPAA compliance requirements, EHR integration complexity, clinical workflow constraints, FDA regulatory considerations, and patient safety obligations creates architectural requirements that are materially different from general enterprise AI platforms. This reference architecture encodes the design decisions that address healthcare-specific requirements so that engineers building clinical AI systems do not need to rediscover them.

Conceptual Explanation

The HMS AI platform is organized around three principles that govern every architectural decision:

Clinical workflow primacy: AI must serve the clinical workflow, not require the workflow to adapt to AI. Every component that touches a clinician's daily workflow (CDS alerts, documentation tools, prior auth assistance) is designed around the clinical interaction pattern, not around the AI system's technical convenience.

Defense in depth for PHI: No single control is sufficient for PHI protection. The architecture layers HIPAA controls (network isolation, encryption at rest and in transit, access control, audit logging, BAA coverage) so that failure of any single control does not expose patient data.

Governance at every boundary: Every boundary in the architecture (between the clinical application and the AI gateway, between the AI gateway and the LLM vendor, between the AI output and the EHR medical record) is a governance control point. Governance is not a post-deployment review; it is embedded in the technical architecture.

Core Architecture

Components

EHR Integration Layer

Epic FHIR R4 API: The primary data source for all clinical AI use cases. The Reference Healthcare Organization uses Epic as its EHR. All AI use cases that require patient clinical context retrieve it via authenticated FHIR R4 API calls, scoped by SMART on FHIR access tokens. The EHR FHIR API is the authoritative source; the AI platform does not maintain a parallel patient data store.
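As a minimal sketch of this retrieval pattern, the snippet below builds the FHIR R4 read and search URLs an application might issue to assemble clinical context on demand, plus the bearer-token headers for a SMART-authorized call. The base URL and function names are hypothetical. The search parameters (`patient`, `encounter`, `clinical-status`, `status`, `category`) are standard FHIR R4 parameters, but which ones a given Epic environment supports is an assumption to confirm with the EHR team.

```python
from urllib.parse import urlencode

# Hypothetical base URL; a real deployment uses the endpoint issued by the EHR.
FHIR_BASE = "https://fhir.example.org/api/FHIR/R4"


def build_context_queries(patient_id: str, encounter_id: str) -> dict[str, str]:
    """Return the FHIR R4 URLs for one patient's clinical context."""
    searches = {
        "Condition": {"patient": patient_id, "encounter": encounter_id,
                      "clinical-status": "active"},
        "MedicationRequest": {"patient": patient_id, "status": "active"},
        "Observation": {"patient": patient_id, "encounter": encounter_id,
                        "category": "vital-signs,laboratory"},
    }
    # The Patient resource is a direct read; the rest are searches.
    urls = {"Patient": f"{FHIR_BASE}/Patient/{patient_id}"}
    for resource, params in searches.items():
        urls[resource] = f"{FHIR_BASE}/{resource}?{urlencode(params)}"
    return urls


def auth_headers(access_token: str) -> dict[str, str]:
    """Headers for a SMART-authorized FHIR request."""
    return {"Authorization": f"Bearer {access_token}",
            "Accept": "application/fhir+json"}


if __name__ == "__main__":
    for name, url in build_context_queries("abc", "enc1").items():
        print(name, url)
```

Because every call goes back to the EHR, nothing in this sketch persists patient data between requests, which is consistent with treating the FHIR API as the authoritative source.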

HL7 v2 ADT Feed: Real-time patient event notification. Patient admits, transfers, and discharges are communicated via HL7 v2 ADT messages routed through the integration engine to the AI platform. Admission events trigger background context enrichment for discharge summary AI; discharge events trigger care gap analysis and follow-up scheduling.
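The event routing described above can be sketched as follows. This is a simplified illustration that assumes the standard ADT trigger events A01 (admit), A02 (transfer), and A03 (discharge); the sample message, the job names, and the helper functions are hypothetical, and a production feed would rely on the integration engine's parser rather than string splitting.

```python
# Trigger events for the three patient movements named in the text.
ADT_EVENTS = {"A01": "admit", "A02": "transfer", "A03": "discharge"}

# Hypothetical sample message; segments are separated by carriage returns.
SAMPLE = "\r".join([
    "MSH|^~\\&|EPIC|RHO|AIPLATFORM|RHO|20250101120000||ADT^A01|MSG0001|P|2.5",
    "EVN|A01|20250101120000",
    "PID|1||MRN12345^^^RHO^MR||DOE^JANE",
])


def parse_adt(message: str) -> dict[str, str]:
    """Extract the trigger event and patient identifier from an ADT message."""
    segments: dict[str, list[str]] = {}
    for line in message.strip().replace("\n", "\r").split("\r"):
        if line:
            fields = line.split("|")
            segments.setdefault(fields[0], fields)
    msh, pid = segments["MSH"], segments["PID"]
    # After splitting on "|", MSH-9 (message type) is at index 8 because
    # MSH-1 is the field separator character itself.
    msg_type = msh[8].split("^")
    if msg_type[0] != "ADT":
        raise ValueError(f"not an ADT message: {msh[8]}")
    trigger = msg_type[1] if len(msg_type) > 1 else ""
    # PID-3 is the patient identifier list; use the ID of the first repetition.
    patient_id = pid[3].split("~")[0].split("^")[0]
    return {"event": ADT_EVENTS.get(trigger, "other"), "patient_id": patient_id}


def route(event: dict[str, str]) -> str:
    """Map an event to a background job (job names are hypothetical)."""
    jobs = {"admit": "enrich_discharge_context",
            "discharge": "run_care_gap_analysis"}
    return jobs.get(event["event"], "ignore")


if __name__ == "__main__":
    print(route(parse_adt(SAMPLE)))
```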

CDS Hooks Services: The Reference Healthcare Organization registers three CDS Hooks services with Epic:

  • patient-view → Care Gap CDS (fires when clinician opens chart)
  • order-sign → Medication Safety CDS (fires when medication order is signed)
  • encounter-discharge → Discharge Assistance CDS (fires when discharge workflow starts)
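To make the registration concrete, here is a hedged sketch of the payloads such services exchange with the EHR: a discovery document listing the three services, and a card of the kind the discharge service might return. The field names follow the CDS Hooks specification as I understand it (`services`, `hook`, `id`, `cards`, `summary`, `indicator`, `source`, `links`); the service ids, descriptions, and launch URL are hypothetical.

```python
import json

# Hook names come from the list above; ids and descriptions are hypothetical.
SERVICES = [
    {"hook": "patient-view", "id": "care-gap-cds", "title": "Care Gap CDS",
     "description": "Surfaces open care gaps when a clinician opens the chart."},
    {"hook": "order-sign", "id": "medication-safety-cds",
     "title": "Medication Safety CDS",
     "description": "Checks a medication order when it is signed."},
    {"hook": "encounter-discharge", "id": "discharge-assistance-cds",
     "title": "Discharge Assistance CDS",
     "description": "Offers an AI discharge summary draft at discharge."},
]


def discovery_response() -> dict:
    """Body a CDS service returns from its discovery endpoint."""
    return {"services": SERVICES}


def discharge_card(app_launch_url: str) -> dict:
    """Body returned when the encounter-discharge hook fires."""
    return {"cards": [{
        "summary": "AI Discharge Summary draft available",
        "indicator": "info",
        "source": {"label": "Discharge Assistance CDS"},
        # A "smart" link asks the EHR to launch the SMART app in context.
        "links": [{"label": "Open draft", "url": app_launch_url,
                   "type": "smart"}],
    }]}


if __name__ == "__main__":
    print(json.dumps(discovery_response(), indent=2))
```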

SMART on FHIR Applications: Three AI use cases are registered as SMART on FHIR apps through Epic's developer program (App Orchard, since succeeded by Epic Showroom):

  • Discharge Summary AI: EHR launch from discharge documentation workflow
  • Clinical Knowledge Search: EHR launch from clinician order or documentation context
  • Patient Engagement Chatbot: Patient-facing, MyChart integration

AI Platform Control Plane

AI Gateway: Deployed as a LiteLLM Proxy instance with custom clinical middleware, hosted in a HIPAA-compliant cloud VPC. All LLM inference requests from all AI use cases flow through the gateway. The gateway provides: SMART token validation, virtual key management, model tier routing (Claude for Tier 1 clinical use cases; Azure OpenAI for Tier 2 administrative use cases), server-side insertion of the approved production prompt from the Prompt Registry, cost attribution by use case and department, and immutable audit logging with hashed patient identifiers.

Integration Engine: A healthcare integration engine (Microsoft Azure Health Data Services or equivalent) that receives HL7 v2 ADT messages from the EHR, deduplicates them, and publishes them to the AI platform event bus. It also handles FHIR Subscription notifications for use cases that need sub-minute event latency.

Prompt Registry: A version-controlled store of all production prompts, implemented as a Git repository with a lightweight API. Each prompt version is associated with: model compatibility, clinical validation status, evaluation metrics, approval timestamp, and approver identity. The AI gateway retrieves the current production prompt for each use case on each request; prompts are never hardcoded in application code.
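A minimal in-memory sketch of the metadata attached to each prompt version and the lookup the gateway might perform per request. The class, field, and function names are hypothetical, and the selection rule (highest approved, validated version compatible with the target model) is an assumption rather than something the registry design specifies.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class PromptVersion:
    use_case: str
    version: int
    template: str
    compatible_models: tuple[str, ...]
    clinically_validated: bool = False
    evaluation_metrics: dict = field(default_factory=dict)
    approved_at: Optional[datetime] = None
    approver: Optional[str] = None

    @property
    def is_production_ready(self) -> bool:
        # A version needs both clinical validation and a recorded approval.
        return (self.clinically_validated
                and self.approved_at is not None
                and self.approver is not None)


def current_production(versions: list[PromptVersion], use_case: str,
                       model: str) -> PromptVersion:
    """Return the newest production-ready prompt for a use case and model."""
    candidates = [v for v in versions
                  if v.use_case == use_case
                  and v.is_production_ready
                  and model in v.compatible_models]
    if not candidates:
        raise LookupError(f"no production prompt for {use_case!r} on {model!r}")
    return max(candidates, key=lambda v: v.version)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    registry = [
        PromptVersion("discharge_summary", 1, "v1 text", ("model-a",),
                      True, {"completeness": 0.91}, now, "cmio"),
        PromptVersion("discharge_summary", 2, "v2 text", ("model-a",)),
    ]
    # Version 2 is unapproved, so version 1 is still served.
    print(current_production(registry, "discharge_summary", "model-a").version)
```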

Model Registry: A catalog of approved model versions with HIPAA BAA status, applicable use cases, evaluation results, and governance approval. The gateway enforces that only registered models can be called, which prevents individual use case teams from calling unapproved model versions.
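The enforcement rule can be sketched as a check the gateway runs before forwarding a request. The registry shape, model identifiers, and exception name below are hypothetical; the point is only that an unregistered model, a model without a BAA, or a model not approved for the requesting use case is rejected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegisteredModel:
    model_id: str
    baa_signed: bool
    approved_use_cases: frozenset[str]


class ModelNotApproved(Exception):
    """Raised when a request names a model the registry does not allow."""


def authorize(registry: dict[str, RegisteredModel], model_id: str,
              use_case: str) -> RegisteredModel:
    """Return the registry entry or raise if the call must be blocked."""
    entry = registry.get(model_id)
    if entry is None:
        raise ModelNotApproved(f"{model_id} is not registered")
    if not entry.baa_signed:
        raise ModelNotApproved(f"{model_id} has no BAA on file")
    if use_case not in entry.approved_use_cases:
        raise ModelNotApproved(f"{model_id} is not approved for {use_case}")
    return entry


if __name__ == "__main__":
    # Hypothetical model ids, not real vendor version strings.
    registry = {
        "premium-model-v1": RegisteredModel(
            "premium-model-v1", True, frozenset({"discharge_summary"})),
    }
    print(authorize(registry, "premium-model-v1", "discharge_summary").model_id)
```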

AI Use Cases

| Use Case | Integration Pattern | AI Category | Risk Tier | LLM Tier |
|---|---|---|---|---|
| Discharge Summary AI | SMART on FHIR + FHIR R4 | Clinical Documentation | Tier 2 | Premium (Claude Opus) |
| Prior Auth Agent | Agentic workflow + payer APIs | Administrative | Tier 2 | Standard (Claude Sonnet) |
| Clinical Knowledge RAG | SMART on FHIR + CDS Hooks | Clinical CDS | Tier 2 | Standard (Claude Sonnet) |
| CDS: Medication Safety | CDS Hooks order-sign | Clinical CDS | Tier 1 | Rule engine + Standard |
| Medical Coding AI | Batch processing | Administrative | Tier 3 | Economy (Haiku) |
| Care Gap Analysis | Background batch | Administrative CDS | Tier 2 | Economy (Haiku) |
| Patient Chatbot | Patient-facing MyChart | Patient Engagement | Tier 2 | Standard (Claude Sonnet) |

Shared AI Services

Embedding Service: A dedicated inference endpoint for clinical text embedding, using a clinical-domain embedding model. All clinical vector store indexing and query-time embedding flows through this service. Centralizing embedding ensures that every vector index is built and queried with the same model version.

Clinical Vector Store: A single vector database instance partitioned by knowledge category. Partitions: clinical guidelines, hospital formulary, prior auth criteria (per payer), ICD-10/CPT codes, clinical protocols. The store is updated on the knowledge source update schedule (see Chapter 4).

Evaluation Pipeline: A CI/CD pipeline that runs against golden datasets for each use case when a new model version or prompt version is proposed. Results are published to the Model Review Board governance dashboard and gate deployment promotion.
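A minimal sketch of the deployment gate, assuming each golden-dataset case yields a score between 0 and 1. The two criteria (an absolute floor and a maximum regression against the current production baseline) and their default thresholds are hypothetical; the actual criteria would be set per use case by the Model Review Board.

```python
from statistics import mean


def gate_promotion(scores: list[float], baseline_mean: float,
                   min_mean: float = 0.85,
                   max_regression: float = 0.02) -> tuple[bool, str]:
    """Decide whether a candidate model or prompt version may be promoted."""
    if not scores:
        return False, "no golden-dataset results"
    candidate = mean(scores)
    if candidate < min_mean:
        return False, f"mean score {candidate:.3f} is below floor {min_mean:.3f}"
    if baseline_mean - candidate > max_regression:
        return False, (f"mean score {candidate:.3f} regresses more than "
                       f"{max_regression:.3f} from baseline {baseline_mean:.3f}")
    return True, f"mean score {candidate:.3f} meets promotion criteria"


if __name__ == "__main__":
    ok, reason = gate_promotion([0.90, 0.92, 0.88], baseline_mean=0.91)
    print(ok, reason)
```

In a CI/CD setting the boolean would decide the pipeline's exit status, and the reason string would be published alongside the scores for the board to review.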

Governance Structures

Model Review Board: Chaired by the CMIO. Standing members: Chief AI Architect, Clinical Informatics Director, Privacy Officer, Risk Manager, and two physician champions (rotating). Meeting cadence: bi-weekly for standard agenda items; urgent session within 72 hours for safety events. Responsibilities: approve Tier 1 AI deployments, review safety events, review override rate trends, approve model and prompt version promotions.

Risk Tier Registry: A maintained catalog of all AI use cases with their assigned risk tier (Tier 1 = clinical patient care impact, Tier 2 = clinical operations, Tier 3 = administrative). Tier classification determines: governance approval requirements, evaluation criteria, override monitoring thresholds, and incident response priority.

Audit Log: An immutable append-only log of every AI inference request, stored in a HIPAA-compliant log store with 6-year retention. Each record: request ID, timestamp, hashed patient ID (SHA-256 of MRN + system salt), use case, model version, prompt version, input and output token counts, override flag. No raw PHI in the log.
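The record layout can be sketched as follows, using the SHA-256 of salt plus MRN described above. The function and field names are hypothetical, and the salt is assumed to be a secret held by the platform; because MRNs come from a small space, a leaked salt would make the hashes guessable.

```python
import hashlib
import uuid
from datetime import datetime, timezone


def hash_patient_id(mrn: str, system_salt: bytes) -> str:
    """Pseudonymize an MRN as SHA-256(salt + MRN), hex encoded."""
    return hashlib.sha256(system_salt + mrn.encode("utf-8")).hexdigest()


def build_audit_record(mrn: str, system_salt: bytes, use_case: str,
                       model_version: str, prompt_version: str,
                       input_tokens: int, output_tokens: int,
                       override: bool = False) -> dict:
    """Assemble one audit entry; the raw MRN never enters the record."""
    return {
        "request_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "patient_id_hash": hash_patient_id(mrn, system_salt),
        "use_case": use_case,
        "model_version": model_version,
        "prompt_version": prompt_version,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "override": override,
    }


if __name__ == "__main__":
    salt = b"example-salt-not-for-production"  # placeholder value
    print(build_audit_record("MRN12345", salt, "discharge_summary",
                             "model-v1", "prompt-v3", 4200, 650))
```

A keyed construction such as HMAC-SHA-256 is a common alternative to plain salted hashing; which one to use is a design decision this sketch does not settle.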

Architecture Diagram

The high-level architecture diagram is shown in the Core Architecture section above. Standalone .mmd files:

  • architecture/mermaid/07-hms-full-architecture.mmd: Full system diagram
  • architecture/mermaid/07-hms-ehr-integration-sequence.mmd: EHR integration sequence
  • architecture/mermaid/07-hms-governance-flow.mmd: Governance decision flow

Implementation Patterns

HMS Platform Build Sequence

The HMS AI platform should be built incrementally, not as a complete system before any use case is deployed. The recommended sequence:

Phase A: Foundation (before use case 1)
  1. AI Gateway (LiteLLM Proxy with custom middleware)
  2. Prompt Registry (Git + lightweight API)
  3. Model Registry (database + API)
  4. Audit Log infrastructure
  5. HIPAA BAAs signed with LLM vendor and cloud provider

Phase B: First Use Case (Discharge Summary AI)
  6. SMART on FHIR registration with Epic
  7. FHIR R4 client (clinical context assembly)
  8. Discharge Summary SMART application
  9. Evaluation pipeline (initial golden dataset)
  10. Model Review Board charter and first meeting

Phase C: Platform Expansion (before use case 3)
  11. Embedding Service
  12. Clinical Vector Store (initial: clinical guidelines, formulary)
  13. Integration Engine (HL7 v2 ADT ingestion)
  14. CDS Hooks services registration
  15. Observability stack (OpenTelemetry + dashboards)

Phase D: Scale (use cases 4–7)
  16. Prior Auth agentic workflow
  17. Medical Coding batch pipeline
  18. Care Gap background processor
  19. Patient Chatbot (patient-facing SMART app)
  20. Full clinical champion network

PHI Data Flow Summary

Patient PHI enters the AI platform via:
  - FHIR R4 API calls (application-level, SMART-authenticated)
  - CDS Hooks prefetch payloads (at hook fire events)
  - HL7 v2 ADT messages (patient event notifications)

PHI exits the AI platform to:
  - LLM vendor API (in inference request payloads; BAA required)
  - Clinical Vector Store (if clinical knowledge includes patient-derived content; BAA required)
  - EHR via FHIR DocumentReference (AI-generated clinical documents)
  - Audit log (hashed patient IDs only, not raw PHI)

PHI does not exit to:
  - Observability traces (scrubbed at gateway; metadata only)
  - Evaluation datasets (de-identified via Safe Harbor before use)
  - LLM training data (confirmed via BAA provisions)

Enterprise Considerations

Platform Implementation Timeline: The HMS AI platform described here is a 12–18 month implementation for an organization starting from no dedicated AI infrastructure. The Phase A foundation takes 8–12 weeks. The first use case can go live in weeks 16–20 (including governance review, clinical validation, and champion training). Platform payback accelerates with each additional use case.

Team Structure: The HMS AI platform requires a dedicated AI platform team of 4–6 engineers: AI platform architect, 2 AI/ML engineers, healthcare integration engineer (FHIR/HL7 specialist), clinical informatics specialist, and DevSecOps engineer. The clinical informatics specialist bridges the AI platform team and clinical operations; it is the role most often understaffed.

Budget Model: The Reference Healthcare Organization projects an HMS AI platform operating budget allocation across three cost categories (illustrative; verify current vendor pricing):

  • LLM API usage: scales with use case volume; model tier routing reduces this by 35–50% vs. all-premium routing
  • Infrastructure: AI gateway hosting, vector store, integration engine, observability stack
  • Team: AI platform team personnel cost, which is the dominant cost center

Vendor Lock-In Risk: The architecture uses the AI gateway to abstract LLM vendor choice. The FHIR R4 and CDS Hooks integrations use HL7 standards and are EHR-portable. The clinical vector store and evaluation pipeline are not vendor-locked. The highest lock-in risk is the EHR SMART on FHIR registration: switching EHR platforms requires re-registering and re-validating all SMART applications.

Security Considerations

Network Architecture:

  • The AI gateway and all platform services are deployed in a HIPAA-eligible VPC
  • LLM vendor API calls are routed over private link where available (Azure Private Link for Azure OpenAI; no equivalent currently available for direct Anthropic API)
  • Clinical vector store access is network-restricted to the AI platform VPC
  • No direct internet access to the EHR FHIR API from AI platform components; all FHIR access is via authenticated API calls

Authentication:

  • AI use case applications authenticate to the AI gateway via virtual keys
  • FHIR API access is via SMART on FHIR access tokens scoped to the requesting application
  • CDS Hooks services authenticate incoming requests using EHR-provided JSON Web Tokens (JWTs)

Encryption:

  • All data in transit: TLS 1.2+ (TLS 1.3 preferred)
  • All PHI at rest: AES-256
  • Audit log: write-once storage with cryptographic integrity verification
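The text does not say how integrity verification is implemented. One common approach, shown here purely as an assumption, is a hash chain in which each entry's hash covers the previous entry's hash, so altering or removing any earlier record invalidates everything after it. All names are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder predecessor for the first entry


def _digest(prev_hash: str, record: dict) -> str:
    # Canonical JSON so the same record always produces the same bytes.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list[dict], record: dict) -> dict:
    """Append a record, chaining it to the hash of the previous entry."""
    prev_hash = log[-1]["entry_hash"] if log else GENESIS_HASH
    entry = {"record": record, "prev_hash": prev_hash,
             "entry_hash": _digest(prev_hash, record)}
    log.append(entry)
    return entry


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash; any edit or deletion breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["entry_hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["entry_hash"]
    return True


if __name__ == "__main__":
    log: list[dict] = []
    for i in range(3):
        append_entry(log, {"request_id": f"req-{i}", "use_case": "discharge_summary"})
    print(verify_chain(log))
    log[1]["record"]["use_case"] = "tampered"
    print(verify_chain(log))
```

A chain like this detects tampering but does not prevent it, which is why the text pairs it with write-once storage.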

Healthcare Example

⊕ Healthcare Example

Educational Example: Illustrative Workflow. Not intended for clinical decision making.

A hospitalist physician at the Reference Healthcare Organization begins the discharge process for a patient. The complete AI platform workflow:

  1. Discharge workflow initiated: The physician clicks "Begin Discharge" in Epic. Epic fires the encounter-discharge CDS Hook. The Discharge Assistance CDS service returns a card: "AI Discharge Summary draft available. Click to open."
  2. SMART app launch: The physician clicks the card. Epic initiates a SMART EHR launch for the Discharge Summary AI application. The application receives the patient ID, encounter ID, and a SMART access token scoped to patient/Condition.read, patient/MedicationRequest.read, patient/Observation.read, and patient/DocumentReference.write.
  3. FHIR data retrieval: The application calls the FHIR R4 API: GET /Patient/{id}, GET /Condition (active, encounter-scoped), GET /MedicationRequest (active), GET /Observation (vital signs and labs, encounter-scoped). The FHIR responses are assembled into the clinical context bundle.
  4. AI inference: The application calls the AI gateway with the clinical context and the use case identifier "discharge_summary". The gateway: validates the virtual key, retrieves the current production prompt version from the Prompt Registry, routes to the Premium tier (Claude Opus) per the Model Registry configuration, records the audit log entry with hashed patient ID, and forwards the request to the Anthropic API.
  5. Response delivery: The Anthropic API returns the draft discharge summary. The gateway returns it to the application. The application renders the draft in a side panel within the Epic workflow.
  6. Physician review: The physician reviews the draft, makes modifications (documents 3 changes in the UI), and clicks "Save to Epic." The application calls FHIR POST /DocumentReference with the finalized summary. Epic saves it to the patient's medical record. The application records the override flag and change count in the audit log.
  7. Observability: The gateway emits an OpenTelemetry trace with: request ID, use case, model version, prompt version, input tokens, output tokens, latency. The trace does not contain the prompt text or response content (no PHI in traces). The quality scorer runs async evaluation on the summary structure and completeness. The drift detector updates the 7-day rolling quality average for the discharge summary use case.
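The "no PHI in traces" rule from the observability step can be sketched as an allowlist applied before any attribute is attached to a span. The attribute keys are hypothetical names for the metadata listed in that step, and the sketch works on a plain dictionary rather than calling the OpenTelemetry SDK, so it shows the filtering rule only.

```python
# Hypothetical keys for the metadata the workflow says a trace may carry.
ALLOWED_TRACE_ATTRIBUTES = frozenset({
    "request.id",
    "ai.use_case",
    "ai.model_version",
    "ai.prompt_version",
    "ai.input_tokens",
    "ai.output_tokens",
    "ai.latency_ms",
})


def scrub_trace_attributes(attributes: dict) -> dict:
    """Keep allowlisted metadata only; anything unknown is dropped by default."""
    return {key: value for key, value in attributes.items()
            if key in ALLOWED_TRACE_ATTRIBUTES}


if __name__ == "__main__":
    raw = {
        "request.id": "req-123",
        "ai.use_case": "discharge_summary",
        "ai.input_tokens": 4200,
        "ai.prompt_text": "Patient Jane Doe ...",  # must never reach a trace
        "patient.mrn": "MRN12345",                 # must never reach a trace
    }
    print(scrub_trace_attributes(raw))
```

An allowlist fails closed: a new attribute added by a use case team stays out of traces until someone deliberately approves it, whereas a blocklist would let unanticipated PHI fields through.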

Common Mistakes

Deploying the Platform in Full Before Any Use Case Is Live. Organizations that spend 9 months building the complete platform before deploying a single use case cannot validate platform design decisions against real clinical workflows. Build incrementally: the platform emerges from use case requirements, not from design documents.

FHIR API Rate Limit Discovery at Production Scale. Epic's FHIR API rate limits are not published; they are negotiated with the health system. Organizations that design AI systems without confirming API rate limits discover the constraint when they go to production with high-volume use cases. Confirm rate limits with the EHR vendor early.

Governance Without Accountability. A Model Review Board that approves use cases without defined accountability for post-deployment quality is governance theater. The MRB must own the monitoring responsibility: who reviews the quality dashboard weekly, who escalates override rate anomalies, who initiates incident response when a quality event occurs.

Best Practices

  • Build the AI gateway and prompt registry before the first use case; retrofitting governance infrastructure after use cases are live is significantly harder
  • Register CDS Hooks services and SMART on FHIR applications with the EHR vendor before technical build; the registration process takes weeks and may require EHR vendor review
  • Sign HIPAA BAAs with all LLM vendors before any clinical data is transmitted
  • Maintain the clinical vector store as a shared resource; never allow individual use cases to maintain their own copies of institutional clinical knowledge
  • Design the governance model around the CMIO, not the IT department; clinical AI governance requires clinical leadership to be credible

Alternatives

The HMS reference architecture uses a custom AI gateway (LiteLLM-based) + independent LLM vendors. Alternative architectural approaches:

| Approach | Trade-off |
|---|---|
| Azure OpenAI Service (all-in) | Single vendor, simplified BAA, but constrained to Microsoft's model release schedule |
| Epic-native AI (Cognitive Computing) | No separate integration; constrained to Epic's AI capabilities |
| AWS HealthLake + Bedrock | AWS-native HIPAA infrastructure; strong compliance posture, higher cloud commitment |
| Google Vertex AI + CCAI | Google Cloud-native; strong for NLP and structured data AI use cases |

Trade-offs

| Dimension | Centralized Platform (this architecture) | Decentralized (per-use-case) |
|---|---|---|
| Governance auditability | High | Low |
| Use case delivery speed (after platform built) | High | Low |
| Use case delivery speed (before platform built) | Low | High |
| Cost attribution accuracy | High | Partial |
| HIPAA control surface | Concentrated (easier to audit) | Distributed (harder to audit) |
| Platform maintenance overhead | Medium | Low initially, high at scale |

Interview Questions

Q: A hospital CIO asks you to design the AI architecture for a hospital deploying 7 clinical AI use cases over the next 18 months. Walk me through your design.

Category: System Design | Difficulty: Principal | Role: AI Architect / FDE

Answer Framework:

Start with the governance structure, because the technical architecture serves the governance requirements, not the other way around. The first deliverable is a risk tier classification of all 7 use cases: which are Tier 1 (directly influence patient care), which are Tier 2 (operational), which are Tier 3 (administrative). This classification determines what governance approval each use case requires, what evaluation criteria apply, and what the oversight model looks like post-deployment.

Then the shared infrastructure: AI gateway (before use case 1), prompt registry (before use case 1), HIPAA BAAs signed with all LLM vendors (before any clinical data is transmitted), EHR SMART on FHIR registration (weeks of lead time with Epic). These are the blocking infrastructure items that must precede use case development.

The EHR integration pattern depends on use case type: SMART on FHIR applications for workflow-embedded tools (discharge summary, clinical knowledge search); CDS Hooks services for in-workflow recommendations (medication safety, care gap alerts); HL7 v2 ADT feed via integration engine for event-driven background processing (admission event → discharge planning context enrichment).

Shared clinical infrastructure: a single clinical vector store serving all knowledge retrieval use cases (guidelines, formulary, prior auth criteria), populated and maintained by the AI platform team. Individual use cases query the shared store; they do not maintain their own knowledge bases.

Observability is not optional for clinical AI: OpenTelemetry tracing (metadata only, no PHI in traces), a quality scorer that evaluates AI outputs against golden datasets, and a clinical AI dashboard visible to the CMIO and Model Review Board. Override rate monitoring applies to all clinical use cases.

Key Points to Hit:

  • Start with governance and risk tier classification
  • Shared infrastructure first: gateway, prompt registry, BAAs, EHR registration
  • EHR integration pattern depends on use case type: SMART vs. CDS Hooks vs. ADT
  • Single shared clinical vector store
  • Observability with no PHI in traces
  • Override rate monitoring as the post-deployment quality signal

Key Takeaways

  • The HMS AI platform is not a single system; it is a set of shared infrastructure components that enable multiple clinical AI use cases without rebuilding foundational capabilities for each one
  • Build order matters: AI gateway, prompt registry, and BAAs before the first use case; shared vector store and CDS Hooks before use case 3
  • The CMIO and Model Review Board are governance requirements, not optional oversight; clinical AI governance needs clinical leadership to be credible
  • PHI flows through the AI gateway under BAA coverage to LLM vendors; audit logs use hashed identifiers; observability traces contain no PHI
  • FHIR R4, SMART on FHIR, and CDS Hooks are the integration standards that connect AI capabilities to the clinical workflow; standards-based integration is non-negotiable for EHR-embedded AI
  • The platform pays for itself through use case delivery speed: use case 1 takes months; use case 7 takes weeks

Glossary

HMS (Hospital Management System): The integrated information system supporting clinical, operational, and administrative functions of a hospital. In this repository, refers to the Reference Healthcare Organization's complete clinical AI platform.

CMIO (Chief Medical Information Officer): The senior physician executive responsible for clinical informatics, EHR governance, and (typically) clinical AI governance.

Model Review Board: The governance body, chaired by the CMIO, that approves clinical AI use cases for production deployment and monitors post-deployment quality.

Risk Tier: The classification of an AI use case based on its proximity to patient care decisions (Tier 1 = direct clinical impact, Tier 2 = operational, Tier 3 = administrative).

Further Reading