Enterprise AI Operations — Quick Reference

Last Updated: 2026-06-30 Full Chapters: docs/03-Enterprise-AI/

AI Strategy — Use Case Scoring

Dimension	Weight	Score 1–5
Clinical / business impact	30%	Incremental → Transformational
Technical feasibility	25%	Low confidence → High confidence
Data readiness	20%	No data → Clean, labeled, available
Regulatory / compliance risk	15%	High risk → Low risk
Time to value	10%	> 18 months → < 3 months

Priority Tiers: Score ≥ 4.0 → Tier 1 (Strategic); 3.0–3.9 → Tier 2 (Tactical); < 3.0 → Defer

AI Governance — Model Risk Tiers

Tier	Description	Examples	Governance Requirement
1 — Clinical	Directly influences patient care	Discharge summary AI, drug interaction alerts	Model Review Board approval, clinical panel validation, explicit override logging
2 — Administrative	Influences clinical operations	Prior auth, scheduling, coding assist	Department manager approval, automated quality evaluation
3 — Informational	No direct care influence	Staff policy search, training material summarization	Standard software change management

Tier 1 AI must have: Model card, bias evaluation across demographic subgroups, clinical validation study, signed-off training dataset lineage

Production Deployment — Rollout Stages

Stage	Traffic	Proceed When	Abort If
Shadow mode	0% delivered	—	Quality score < 0.90 vs. baseline
Canary	5%	48h, zero critical errors	Error rate > 2× baseline
Blue-green	50%	7d stable	P95 latency > 2× SLA
Full production	100%	14d stable	Any Tier 1 safety event

Rollback trigger: Quality score drops > 15% from 30-day baseline → automated rollback to previous model version

Cost Management — Token Economics

Layer	Action	Typical Savings
Prompt caching	Cache stable system prompt prefix	60–80% cost reduction on cached tokens
Model tier routing	Economy for classification, Premium for clinical synthesis	40–60% blended cost reduction
Output length control	`max_tokens` per use case, not global default	15–25% reduction
Batch processing	Async batch API where latency allows	25–50% reduction on batch-eligible workloads

Token budget alert trigger: Burn rate > 110% of daily budget for 3 consecutive days → alert to engineering lead

Observability — Key Metrics

Metric	Warning Threshold	Critical Threshold	Owner
Quality score (7d rolling vs. 30d baseline)	> 10% drop	> 20% drop	AI Platform
Override rate (clinical)	> 30%	> 40%	Clinical Informatics
P95 latency vs. SLA	> 120%	> 150%	AI Platform
Hallucination rate (NLI score)	> 5%	> 10%	AI Platform + Governance
Daily cost vs. budget	> 110%	> 130%	AI Platform + Finance
Human review flag rate	> 5%	> 10%	Clinical Informatics

AI Platform — Gateway Virtual Key Checklist

Before issuing a virtual key for a new clinical AI use case:

[ ] Application has signed use case intent (department, clinical owner)
[ ] Use case classified in model risk tier registry
[ ] Allowed model tiers documented (economy / standard / premium)
[ ] Monthly token budget approved by department budget owner
[ ] Rate limit per minute set (default: 60 RPM for Tier 2, custom for Tier 1)
[ ] HIPAA BAA covers this application's data scope
[ ] PHI handling review complete (no raw PHI in logs — hashed identifiers only)

Vendor Evaluation — Qualification Gate

Must pass ALL criteria before technical evaluation:

Criterion	Requirement
HIPAA BAA	Signed BAA available; covers the specific service
PHI used for training	Confirmed NOT used for training by default
Data residency	Inference in required region (US-only if applicable)
SOC 2 Type II	Current certification (within 12 months)
Data retention	Confirmed retention policy; PHI not retained for training

After qualification: Evaluate model quality on use-case-specific de-identified test set (minimum 100 cases), latency (P50 and P95), and cost at production scale.

AI Platform — Architecture Components

Component	Purpose	Build vs. Buy
AI Gateway	Auth, rate limit, routing, audit log	Buy (LiteLLM) or Build (FastAPI)
Prompt Registry	Versioned prompts, governance lifecycle	Build (version-controlled YAML + API)
Model Registry	Approved models, BAA status, eval results	Build (lightweight DB + API)
Embedding Service	Shared clinical vector store	Build (wrapper) + Buy (vendor model)
Evaluation Pipeline	CI/CD for AI quality	Build on CI/CD platform
Observability	Traces, metrics, dashboards	Buy (OpenTelemetry + vendor backend)

Change Management — Adoption Health Indicators

Signal	Healthy	Investigate
Override rate	5–20% (use case dependent)	< 2% (rubber-stamping?) or > 30% (rejection?)
AI literacy completion	> 95% before access granted	< 80% after first 30 days
Feedback submissions	Steady low volume	Zero (feedback channel broken?) or spike (quality event?)
Champion engagement	Monthly check-in, active	No feedback in 4 weeks
Shadow AI tool use	None reported	Any reported use of non-approved AI for clinical tasks

Change Management — Rollout Phase Gates

Phase	Duration	Advance When
Champion pilot	≥ 14 days	Quality meets baseline; champions satisfied; ≥ 1 integration design issue resolved
Department rollout	≥ 21 days	Override rate stable; satisfaction ≥ 3.5/5.0; zero unresolved safety flags
Hospital-wide	≥ 30 days	All department metrics met; Model Review Board sign-off

Interview Quick Reference

AI Strategy:

Build/Buy/Partner decision turns on: strategic differentiation value, time-to-value, internal ML engineering capacity
Use case scoring must include regulatory risk — high-risk clinical AI changes the ROI denominator

AI Governance:

Tier 1 AI requires Model Review Board approval, not just engineering sign-off
Audit records must use hashed patient identifiers — never raw PHI in AI logs

Production Deployment:

Shadow mode first, always — validate before clinicians see output
Rollback policy must be defined in advance, not in response to an incident

Cost Management:

Prompt caching: cache_control with type="ephemeral" on stable system prompt prefix
Model tier routing: classify request complexity first, then route — don't send everything to the premium model

Observability:

Quality drift detection: compare 7-day rolling average to 30-day baseline, not to a fixed threshold
Override rate is a quality signal — both very high and very low rates are problems

AI Platform:

AI gateway is the security boundary; enforce at network layer, not by convention
Prompt registry is a governance requirement, not a developer convenience

Vendor Evaluation:

BAA signed before PHI can be transmitted — this is a legal prerequisite
Model training opt-out must be confirmed in writing, not assumed

Change Management:

3% override rate on a clinical document tool = rubber-stamping risk, not success
Alert fatigue is a design failure — lower volume, higher specificity

Enterprise AI Operations — Quick Reference#

AI Strategy — Use Case Scoring#

AI Governance — Model Risk Tiers#

Production Deployment — Rollout Stages#

Cost Management — Token Economics#

Observability — Key Metrics#

AI Platform — Gateway Virtual Key Checklist#

Vendor Evaluation — Qualification Gate#

AI Platform — Architecture Components#

Change Management — Adoption Health Indicators#

Change Management — Rollout Phase Gates#

Interview Quick Reference#

See Also#