Clinical Decision Support
Executive Summary
Clinical Decision Support (CDS) is the technology category that delivers the right information to the right person at the right time to improve clinical decisions and patient outcomes. It is also the category with the longest history in healthcare informatics, the most studied failure mode (alert fatigue), and the most complex regulatory classification picture of any healthcare AI category. Modern LLM-based CDS extends classical rule-based CDS into natural language interaction, nuanced multi-factor reasoning, and proactive identification — but it inherits all the challenges of classical CDS plus new ones. This chapter covers CDS architecture, the alert fatigue problem and its engineering mitigations, the clinical workflow integration patterns that determine whether CDS is used or ignored, and the role of LLMs in extending classical CDS.
Learning Objectives
After reading this chapter, you will be able to:
- Design a CDS system architecture that distinguishes rule-based, retrieval-augmented, and LLM-based CDS and routes queries appropriately
- Identify the alert fatigue failure mode and apply specificity, severity stratification, and workflow integration design principles to reduce it
- Implement a CDS Hooks service that delivers CDS at the appropriate workflow trigger with an appropriate response time
- Evaluate CDS quality using override rate, action rate, and patient outcome metrics
Business Problem
CDS in healthcare is simultaneously one of the highest-value and most poorly executed capabilities in clinical AI. The value case is clear: drug-drug interaction alerts prevent adverse drug events; sepsis early warning systems enable earlier intervention; care gap reminders close preventive-care gaps that would otherwise raise population health costs. The execution problem is equally clear: a seminal study found that emergency physicians overrode 91% of drug-drug interaction alerts in a major academic medical center. When clinicians override 9 of 10 alerts, the alert system is not a decision support tool; it is an interruption that the organization has learned to ignore.
The engineering challenge of CDS is not producing alerts. It is producing alerts specific enough that the 9 non-actionable alerts never fire, while ensuring that the 1 actionable alert is seen and acted upon. This is a precision problem, not a recall problem, and it requires architectural choices about alert thresholds, severity stratification, workflow placement, and clinical context that go far beyond adding a rule engine to the EHR.
Why This Technology Exists
Rule-based CDS systems have existed since the 1970s. MYCIN (1974), developed at Stanford, was an expert system for diagnosing bacterial infections — the ancestor of modern CDS. Commercial CDS content (drug-drug interaction databases, clinical order sets, dosing calculators) became standard EHR components in the 1990s and 2000s.
The limitation of rule-based CDS is that rules cannot express clinical nuance: a drug-drug interaction alert fires based on a drug pair without knowing the patient's renal function, the indication for the drug, the dose, or whether the prescribing physician is aware of the interaction and managing it. This context-blindness is the root cause of alert fatigue — the alert system fires for interactions that an informed clinician would judge acceptable, creating alert noise that trains clinicians to dismiss alerts without reading them.
LLMs enter CDS at the nuance layer: they can reason about whether a drug-drug interaction is clinically significant given the patient's specific context, they can synthesize multi-factor risk scores with explanatory text, and they can interact with clinicians in natural language to clarify the clinical reasoning behind a recommendation.
Conceptual Explanation
CDS Intervention Types
CDS is not a single technology; it encompasses a spectrum of intervention types that vary in their timing, intrusiveness, and required clinical action:
| Intervention Type | Timing | Intrusiveness | Example |
|---|---|---|---|
| Alerting / Notification | Real-time, workflow-blocking | High | Drug allergy alert at order sign |
| Advisory | Real-time, non-blocking | Medium | Sepsis score display on patient chart |
| Order facilitation | Triggered, non-blocking | Low | Pre-populated order sets |
| Relevant information | On-demand | None | Knowledge retrieval (clinical RAG) |
| Expert systems | Asynchronous, proactive | None | Care gap identification |
The Alert Fatigue Problem
Alert fatigue is the state in which clinicians systematically dismiss CDS alerts without reading them because past experience has taught them that most alerts are not actionable for their specific patient and situation. It is a conditioned behavior, not a character flaw — it is the rational response to a high-noise, low-signal alert environment.
Alert fatigue has two components:
- Notification fatigue: Too many alerts interrupt clinical workflow
- Relevance fatigue: Too many alerts are not relevant to the specific patient's situation
The engineering solutions address both:
- Threshold calibration: Raise alert thresholds to fire only for interactions/situations above a clinical significance threshold
- Context enrichment: Add patient context to the alert logic (renal function, existing medications, indication) to filter out alerts that are non-actionable given the patient's situation
- Severity stratification: Distinguish truly dangerous alerts (which warrant workflow interruption) from informational alerts (which can be asynchronous)
- Actionability design: Every alert must offer a specific action the clinician can take; alerts without an actionable response are noise
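Threshold calibration and context enrichment can be sketched as a filter that runs after a rule matches and before the alert is delivered. The sketch below is illustrative only: the `InteractionRule` and `PatientContext` fields, the 1 to 5 significance scale, and both cutoffs are hypothetical placeholders, not clinical guidance.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InteractionRule:
    drug_a: str
    drug_b: str
    significance: int              # hypothetical scale: 1 (minor) to 5 (severe)
    renal_dependent: bool = False  # matters mainly when renal function is impaired


@dataclass
class PatientContext:
    egfr: Optional[float] = None                 # None if no recent result
    acknowledged_pairs: frozenset = frozenset()  # pairs already reviewed by a clinician


MIN_SIGNIFICANCE = 3         # assumed threshold; calibrate from override data
EGFR_IMPAIRED_BELOW = 60.0   # assumed cutoff, for illustration only


def should_fire(rule: InteractionRule, ctx: PatientContext) -> bool:
    """Return True only if the matched rule is likely actionable for this patient."""
    # Threshold calibration: drop low-significance interactions.
    if rule.significance < MIN_SIGNIFICANCE:
        return False
    # Context enrichment: skip pairs already reviewed for this patient.
    if frozenset((rule.drug_a, rule.drug_b)) in ctx.acknowledged_pairs:
        return False
    # Context enrichment: renal-dependent rules are suppressed only when a
    # known eGFR is at or above the cutoff. An unknown eGFR still fires.
    if (rule.renal_dependent and ctx.egfr is not None
            and ctx.egfr >= EGFR_IMPAIRED_BELOW):
        return False
    return True
```

Treating an unknown eGFR as "fire anyway" is a deliberate choice in this sketch: missing context should never silently suppress an alert.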
Core Architecture
Components
Rule Engine (Classical CDS)
The rule engine handles structured, high-confidence CDS functions: drug-drug interactions, drug-allergy contraindications, dosing range violations, and contraindicated order combinations. These are deterministic: the alert condition is a logical expression over structured EHR data (drug codes, allergy codes, lab values, order codes).
The rule engine remains the appropriate component for CDS that must fire consistently and immediately at order entry — LLMs add latency and non-determinism that are unacceptable for mandatory safety alerts.
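A deterministic rule of this kind is a pure predicate over structured fields. The following is a minimal sketch, assuming a simplified `Order` record; the drug code `DRUG-X`, the dose limit, and the rule identifiers are invented for illustration and are not dosing guidance.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Order:
    drug_code: str                     # placeholder; a real system would use coded terminology
    daily_dose_mg: float
    patient_allergy_codes: frozenset   # codes of documented allergies


@dataclass(frozen=True)
class Rule:
    rule_id: str
    severity: str                       # "critical" | "warning" | "informational"
    condition: Callable[[Order], bool]  # logical expression over structured data
    message: str


# Hypothetical rule set: codes and limits are invented for this sketch.
RULES = [
    Rule("allergy-match", "critical",
         lambda o: o.drug_code in o.patient_allergy_codes,
         "Ordered drug matches a documented allergy."),
    Rule("dose-range", "warning",
         lambda o: o.drug_code == "DRUG-X" and o.daily_dose_mg > 4000,
         "Daily dose exceeds the configured maximum."),
]


def evaluate(order: Order) -> list[Rule]:
    """Return every rule whose condition holds, in catalog order.

    The same order always produces the same result, which is the property
    mandatory safety alerts depend on.
    """
    return [rule for rule in RULES if rule.condition(order)]
```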
ML-Based Predictive CDS
Predictive CDS models (sepsis early warning, readmission risk, deterioration prediction) use ML models trained on historical patient data to generate risk scores for current patients. These models are specific to the institution's patient population and EHR data structure, which is both their strength (population-specific calibration) and their operational challenge (retraining, drift detection, and performance monitoring are ongoing requirements).
Predictive CDS models require the bias evaluation discipline described in Chapter 2 (AI Governance): risk scores that perform differently across demographic subgroups (race, sex, age, insurance status) create health equity risks. This is not hypothetical: pulse oximeters overestimating oxygen saturation in patients with darker skin pigmentation is one documented example of a clinically deployed technology with embedded demographic bias.
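One way to make the subgroup check concrete is to compute the same performance metric for each subgroup and compare the results. The sketch below computes sensitivity per group from labeled predictions; the record layout and function names are assumptions for illustration, and a real evaluation would also look at calibration and specificity.

```python
from collections import defaultdict


def sensitivity_by_group(records):
    """records: iterable of (group, predicted_positive, actual_positive) tuples.

    Returns {group: sensitivity} for groups with at least one actual positive.
    """
    true_pos = defaultdict(int)
    actual_pos = defaultdict(int)
    for group, predicted, actual in records:
        if actual:
            actual_pos[group] += 1
            if predicted:
                true_pos[group] += 1
    return {g: true_pos[g] / n for g, n in actual_pos.items()}


def max_sensitivity_gap(per_group):
    """Largest difference in sensitivity between any two subgroups."""
    values = list(per_group.values())
    return max(values) - min(values) if values else 0.0


# Toy data: group A has 3 of 4 true cases detected, group B has 1 of 2.
records = [
    ("A", True, True), ("A", True, True), ("A", False, True), ("A", True, True),
    ("B", True, True), ("B", False, True), ("B", False, False), ("B", True, False),
]
per_group = sensitivity_by_group(records)   # {"A": 0.75, "B": 0.5}
gap = max_sensitivity_gap(per_group)        # 0.25
```

What gap is acceptable is a governance decision, not something the code can settle.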
LLM-Based Nuanced CDS
LLMs add value in CDS contexts where the alert condition cannot be expressed as a rule because it requires reasoning over unstructured clinical context. Examples:
- A drug-drug interaction that is clinically significant at one dose combination but manageable at another, given the patient's renal function — a rule fires or doesn't; an LLM can explain the nuance
- A clinical guideline recommendation that applies differently to a patient with the specific comorbidity pattern in the chart — RAG retrieves the guideline; an LLM synthesizes the recommendation against the patient's context
- A discharge planning recommendation that considers the patient's home situation, support network, and chronic disease management needs — a rule cannot incorporate unstructured social history; an LLM can
LLM-based CDS requires the context enrichment described in Chapter 4 (Clinical RAG) and the latency and workflow integration design described in Chapter 3 (EHR Integration).
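Routing across the three components can be sketched as a dispatch on the kind of CDS request. The `CDSRequestKind` categories and the component labels below are hypothetical names chosen for this sketch; the one rule taken from the text is that anything workflow-blocking stays with the rule engine.

```python
from enum import Enum


class CDSRequestKind(Enum):
    SAFETY_CHECK = "safety_check"          # interactions, allergies, dose limits
    RISK_PREDICTION = "risk_prediction"    # sepsis, readmission, deterioration
    CONTEXT_QUESTION = "context_question"  # needs reasoning over unstructured context


# Component names are labels for this sketch, not real services.
ROUTES = {
    CDSRequestKind.SAFETY_CHECK: "rule_engine",
    CDSRequestKind.RISK_PREDICTION: "ml_model",
    CDSRequestKind.CONTEXT_QUESTION: "llm_with_retrieval",
}


def route(kind: CDSRequestKind, blocking: bool) -> str:
    """Pick the component that should handle a request.

    Anything that must block the workflow goes to the rule engine regardless
    of kind, because it has to be fast and deterministic.
    """
    if blocking:
        return "rule_engine"
    return ROUTES[kind]
```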
Severity Stratification
Every CDS alert must be classified by severity before delivery:
| Severity | Delivery Mode | Override Mechanism | Example |
|---|---|---|---|
| Critical | Workflow-blocking alert | Requires explicit acknowledgment | Active allergy to prescribed drug |
| Warning | Non-blocking card | Dismissible with reason required | Drug-drug interaction requiring monitoring |
| Informational | Non-blocking card or async | Dismissible without reason | Recommended preventive care gap |
| Advisory | Dashboard / background | No acknowledgment required | Population risk stratification |
Critical alerts must be rare — reserved for genuinely dangerous situations where clinical error would cause immediate patient harm. Every critical alert must be reviewed periodically: if clinicians are overriding > 20% of critical alerts, the severity classification is wrong.
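The severity table and the 20% review rule can be expressed as data plus one check. This is a sketch under assumptions: the `DeliveryPolicy` fields and the channel and override labels are illustrative names, not part of any EHR or CDS Hooks API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeliveryPolicy:
    blocking: bool
    channel: str    # illustrative label for where the alert appears
    override: str   # illustrative label for how it can be dismissed


# Mirrors the severity table above.
DELIVERY = {
    "critical": DeliveryPolicy(True, "interruptive_alert", "explicit_acknowledgment"),
    "warning": DeliveryPolicy(False, "card", "dismiss_with_reason"),
    "informational": DeliveryPolicy(False, "card_or_async", "dismiss"),
    "advisory": DeliveryPolicy(False, "dashboard", "none"),
}

CRITICAL_OVERRIDE_LIMIT = 0.20  # review limit stated in the text


def needs_reclassification(severity: str, fired: int, overridden: int) -> bool:
    """Flag a critical alert whose override rate exceeds the review limit."""
    if severity != "critical" or fired == 0:
        return False
    return overridden / fired > CRITICAL_OVERRIDE_LIMIT
```

Keeping the policy in a table rather than in branching code makes it something a governance committee can read and sign off on directly.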
Implementation Patterns
CDS Hooks Service with Context-Aware Severity
# Educational example: CDS Hooks service with severity classification
# Illustrates rule-based + LLM hybrid CDS delivery via CDS Hooks
# Educational disclaimer: Not intended for clinical use

from dataclasses import dataclass
from typing import Optional

import anthropic
from fastapi import FastAPI, Request

app = FastAPI(title="Clinical CDS Service")
# Async client so the LLM call does not block the FastAPI event loop.
anthropic_client = anthropic.AsyncAnthropic()

# Illustrative interaction table, not a clinical reference. "NSAIDs" is a drug
# class, so those entries only match a medication literally named "NSAIDs";
# a real service would map each drug to its class before the lookup.
_RAW_INTERACTION_SEVERITIES = {
    ("warfarin", "aspirin"): "warning",
    ("warfarin", "NSAIDs"): "warning",
    ("methotrexate", "NSAIDs"): "critical",
    ("clopidogrel", "omeprazole"): "informational",
}
# Normalize keys to lowercase, sorted tuples so lookups are independent of
# ordering and case (the lookup below builds its key the same way).
DRUG_INTERACTION_SEVERITIES = {
    tuple(sorted(name.lower() for name in pair)): severity
    for pair, severity in _RAW_INTERACTION_SEVERITIES.items()
}


@dataclass
class CDSAlert:
    severity: str  # "critical" | "warning" | "informational"
    summary: str
    detail: str
    source_label: str
    indicator: str  # CDS Hooks indicator value: "info" | "warning" | "critical"
    suggestions: list[dict]


def to_card(alert: CDSAlert) -> dict:
    """Convert an internal alert into a CDS Hooks card."""
    return {
        "summary": alert.summary,
        "indicator": alert.indicator,
        "detail": alert.detail,
        "source": {"label": alert.source_label},
        "suggestions": alert.suggestions,
    }


def check_drug_interactions(
    new_drug: str,
    current_medications: list[str],
) -> list[CDSAlert]:
    """
    Check new drug against current medication list for known interactions.
    Returns CDS alerts for significant interactions.
    """
    alerts = []
    new_drug_lower = new_drug.lower()
    for current_med in current_medications:
        pair = tuple(sorted([new_drug_lower, current_med.lower()]))
        severity = DRUG_INTERACTION_SEVERITIES.get(pair)
        if severity:
            alerts.append(
                CDSAlert(
                    severity=severity,
                    summary=f"Drug Interaction: {new_drug} + {current_med}",
                    detail=(
                        f"Potential interaction between {new_drug} and {current_med}. "
                        f"Severity: {severity.capitalize()}. Review before proceeding."
                    ),
                    source_label="Clinical Drug Interaction Database",
                    # CDS Hooks uses "info" where this service says "informational".
                    indicator=severity if severity in ("warning", "critical") else "info",
                    # Simplified placeholder; a production "create" action
                    # would also carry the FHIR resource to create.
                    suggestions=[
                        {
                            "label": "View Interaction Details",
                            "actions": [{"type": "create", "description": "Open interaction reference"}],
                        }
                    ],
                )
            )
    return alerts


async def get_llm_clinical_context_recommendation(
    new_order: dict,
    patient_context: dict,
    relevant_alerts: list[CDSAlert],
) -> Optional[CDSAlert]:
    """
    Use LLM to generate a nuanced clinical recommendation when
    the patient context changes the clinical significance of alerts.
    Only invoked when rule-based alerts have been found, not for every order.
    """
    if not relevant_alerts:
        return None

    alert_summary = "; ".join([a.summary for a in relevant_alerts])
    patient_summary = (
        f"Renal function: {patient_context.get('creatinine', 'unknown')} mg/dL creatinine, "
        f"eGFR {patient_context.get('egfr', 'unknown')}. "
        f"Active diagnoses: {', '.join(patient_context.get('diagnoses', [])[:5])}."
    )

    response = await anthropic_client.messages.create(
        model="claude-sonnet-4-6",  # verify current model IDs
        max_tokens=300,
        system=(
            "You are a clinical pharmacist CDS assistant. "
            "Given a new medication order, detected drug interactions, and patient context, "
            "provide a brief clinical recommendation (2-3 sentences) about whether the "
            "interaction is clinically significant given this specific patient's situation. "
            "Do not make a prescribing decision; advise the clinician on the relevant factors."
        ),
        messages=[{
            "role": "user",
            "content": (
                f"New order: {new_order.get('drug', 'Unknown')} "
                f"{new_order.get('dose', '')} {new_order.get('route', '')}\n"
                f"Detected interactions: {alert_summary}\n"
                f"Patient context: {patient_summary}"
            ),
        }],
    )
    recommendation_text = response.content[0].text

    return CDSAlert(
        severity="informational",
        summary="Clinical Context Assessment",
        detail=recommendation_text,
        source_label="Clinical AI Assistant",
        indicator="info",
        suggestions=[],
    )


@app.post("/cds-services/medication-safety")
async def medication_safety_hook(request: Request):
    """
    Handles order-sign CDS Hook for medication safety checking.
    Combines rule-based drug interaction checking with LLM context assessment.
    """
    body = await request.json()
    context = body.get("context") or {}
    # Prefetch keys are whatever this service declares in its discovery
    # response; "medications", "labs", and "conditions" are assumed here.
    # A prefetch value may be null, hence the "or {}" guards.
    prefetch = body.get("prefetch") or {}

    # draftOrders is a FHIR Bundle; guard against a missing or empty entry list.
    draft_entries = (context.get("draftOrders") or {}).get("entry") or [{}]
    new_order = draft_entries[0].get("resource", {})
    drug_name = new_order.get("medicationCodeableConcept", {}).get("text", "")

    current_meds = [
        entry.get("resource", {}).get("medicationCodeableConcept", {}).get("text", "")
        for entry in (prefetch.get("medications") or {}).get("entry", [])
    ]

    # Simplification: "labs" is treated as a flat dict. Real prefetch data
    # would arrive as FHIR Observation resources that need parsing.
    labs = prefetch.get("labs") or {}
    patient_context = {
        "creatinine": labs.get("creatinine"),
        "egfr": labs.get("egfr"),
        "diagnoses": [
            entry.get("resource", {}).get("code", {}).get("text", "")
            for entry in (prefetch.get("conditions") or {}).get("entry", [])
        ],
    }

    rule_alerts = check_drug_interactions(drug_name, current_meds)
    cards = [to_card(alert) for alert in rule_alerts]

    # Only invoke LLM for nuanced context assessment if rule alerts were found
    if rule_alerts:
        try:
            llm_alert = await get_llm_clinical_context_recommendation(
                new_order={"drug": drug_name},
                patient_context=patient_context,
                relevant_alerts=rule_alerts,
            )
        except anthropic.APIError:
            # The rule-based cards must still be returned if the LLM call fails.
            llm_alert = None
        if llm_alert:
            cards.append(to_card(llm_alert))

    return {"cards": cards}

Enterprise Considerations
CDS Governance Committee: Every new CDS alert requires a clinical governance review before deployment: Who should this alert fire for? What severity is appropriate? What is the expected override rate? Who will monitor performance? Organizations that deploy CDS alerts without governance review accumulate alert debt — a large catalog of poorly calibrated alerts that collectively produce the alert fatigue state.
Alert Override Documentation: When a clinician overrides a CDS alert, capturing the override reason (structured selection, not free text) creates a dataset for alert calibration. Override reasons cluster: if 70% of overrides for a drug interaction alert select "Patient already on this regimen — clinician aware," the alert is firing for a population that is not the intended target and the threshold needs adjustment.
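Because override reasons are structured selections, the clustering described above reduces to counting. A minimal sketch, assuming the reasons for one alert type are available as a list of strings; the function name and the short reason labels are hypothetical, and the default threshold reuses the 70% figure from the example above.

```python
from collections import Counter


def dominant_override_reason(reasons, threshold=0.70):
    """Return (reason, share) if one structured reason accounts for at least
    `threshold` of the overrides for an alert, otherwise None.

    `reasons` is a list with one entry per override event.
    """
    counts = Counter(reasons)
    if not counts:
        return None
    reason, n = counts.most_common(1)[0]
    share = n / len(reasons)
    return (reason, share) if share >= threshold else None


# Toy data: 7 of 10 overrides cite the same reason.
sample = ["already_on_regimen"] * 7 + ["not_significant"] * 2 + ["other"]
result = dominant_override_reason(sample)   # ("already_on_regimen", 0.7)
```

A dominant reason points at which part of the alert logic to change; the absence of one suggests the alert is noisy for several unrelated reasons.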
CDS Performance Monitoring: Track alert fire rate, override rate, action rate (the clinician took the suggested action), and, where measurable, patient outcome correlation. As working thresholds: an alert fire rate above 10 per provider per shift is a fatigue risk; an override rate above 70% indicates insufficient specificity; an action rate below 10% indicates that the suggested action is not relevant to the alert population.
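The monitoring thresholds in this section translate directly into a periodic check. The sketch below assumes alert events have already been aggregated into counts for a reporting period; the `AlertStats` structure and the flag names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class AlertStats:
    fires: int            # alerts delivered in the period
    overrides: int        # dismissed without the suggested action
    actions: int          # suggested action taken
    provider_shifts: int  # provider-shifts covered by the period


def monitoring_flags(s: AlertStats) -> list[str]:
    """Return the names of any thresholds from this section that are breached."""
    flags = []
    if s.provider_shifts and s.fires / s.provider_shifts > 10:
        flags.append("fire_rate")       # more than 10 per provider per shift
    if s.fires:
        if s.overrides / s.fires > 0.70:
            flags.append("override_rate")
        if s.actions / s.fires < 0.10:
            flags.append("action_rate")
    return flags


# Toy period: 12 fires per shift, 75% overridden, 5% acted on.
noisy = monitoring_flags(AlertStats(fires=1200, overrides=900, actions=60, provider_shifts=100))
# A healthier period breaches nothing.
healthy = monitoring_flags(AlertStats(fires=500, overrides=100, actions=350, provider_shifts=100))
```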
Security Considerations
- CDS Hooks services receive patient clinical data in the prefetch payload — they are PHI receivers requiring BAA coverage
- CDS Hooks services must not store patient data beyond what is necessary for the immediate response (no accumulation of patient clinical data in the CDS service)
- CDS Hooks endpoints must authenticate every request, because the request body and prefetch carry protected patient data; the CDS Hooks specification has the EHR send a signed JWT as a bearer token with each call, and the service should validate it before processing
Healthcare Example
Educational Example — Illustrative Workflow. Not intended for clinical decision making.
The Reference Healthcare Organization deploys a sepsis early warning CDS system integrated with the nursing workflow:
Architecture:
- An ML model trained on historical HMS patient data calculates a modified version of the National Early Warning Score 2 (NEWS2) every 15 minutes for all inpatients
- Scores are displayed on the nursing station dashboard (non-interrupting, advisory)
- When a patient's score crosses 7 (clinical threshold validated against HMS patient population), an alert is generated and delivered to the assigned nurse's workstation
Alert design:
- Severity: Warning (not Critical — the score does not confirm sepsis, it identifies patients requiring assessment)
- Display: Non-blocking card in the nursing EHR workflow with three action options: "Assess patient now," "Physician notified," "Score inconsistent — document reason"
- No alert is delivered for scores 5–6 (elevated but sub-threshold) — these appear on the dashboard only
Override rate monitoring:
- Month 1 post-deployment: override rate 41% (too high)
- Root cause analysis: 38% of overrides selected "Patient recently assessed — no change"
- Adjustment: suppress alert if nurse accessed patient chart within 60 minutes
- Month 3: override rate 22% (acceptable)
- Action rate (nurse assessed and documented): 74% of fires
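The alert logic after the month 1 adjustment can be sketched as a single decision function. Two details are assumptions not stated in the example: "crosses 7" is read as a score of 7 or higher, and "within 60 minutes" is read as inclusive. The function and constant names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional

ALERT_THRESHOLD = 7                      # assumed to mean score >= 7
SUPPRESS_WINDOW = timedelta(minutes=60)  # recent chart access suppresses the alert


def should_alert(score: int, now: datetime,
                 last_chart_access: Optional[datetime]) -> bool:
    """Decide whether to deliver a nurse alert for one scoring cycle.

    Every score still appears on the dashboard; this only controls the
    non-blocking card sent to the assigned nurse.
    """
    if score < ALERT_THRESHOLD:
        return False  # includes the elevated but sub-threshold 5-6 band
    if last_chart_access is not None and now - last_chart_access <= SUPPRESS_WINDOW:
        return False  # nurse has looked at this patient recently
    return True
```

The suppression rule trades a small delay in re-alerting for a large drop in "patient recently assessed" overrides, which is the trade the month 3 numbers reflect.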
Common Mistakes
Building Blocking Alerts for Non-Critical Situations. Workflow-blocking alerts that require acknowledgment before the clinician can proceed are justified only for genuinely dangerous situations. Organizations that implement blocking alerts for "warning" level interactions produce exactly the dismissal behavior that causes alert fatigue — and train clinicians to acknowledge-without-reading as fast as possible.
No Alert Review Process. CDS alert libraries that are never reviewed accumulate obsolete alerts, miscalibrated thresholds, and retired rules that continue to fire. Establish an annual alert review process that evaluates every active alert for current clinical evidence, override rate, and action rate.
Delivering All Alerts Through One Channel. An alert that requires a decision must be delivered at the point of decision (order sign, discharge); an informational observation does not need to interrupt the workflow. Mixing all alert severity levels into the same workflow interruption channel produces the signal-to-noise ratio that causes fatigue.
Best Practices
- Restrict blocking alerts to genuinely dangerous situations; use non-blocking cards for warnings
- Capture structured override reasons — they are the primary calibration signal for alert improvement
- Monitor override rate and action rate continuously; investigate deviations from targets promptly
- Review the full alert catalog annually; retire alerts that have consistently high override rates and low action rates
- Use LLM reasoning for nuanced context assessment only when rule-based checks have found something worth reasoning about — not as the first-line check for every order
- Govern every new CDS alert through a clinical review process before deployment
Trade-offs
| CDS Design | Alert Specificity | Alert Volume | Implementation Complexity | Alert Fatigue Risk |
|---|---|---|---|---|
| Rule-based only | Low-Medium | High | Low | High |
| Rule + context filtering | Medium | Medium | Medium | Medium |
| Rule + ML severity scoring | High | Low-Medium | Medium-High | Low |
| Rule + ML + LLM nuance | Highest | Lowest | High | Lowest |
Interview Questions
Q: A hospital's CMO tells you that clinicians are overriding 85% of drug-drug interaction alerts in the EHR. What is your diagnosis and what would you recommend?
Category: System Design / Clinical Informatics | Difficulty: Senior | Role: AI Architect / FDE
Answer Framework:
An 85% override rate is a textbook alert fatigue indicator. The alerts are firing but not being read — clinicians have learned that most alerts for their specific patient population are non-actionable, so the rational response is to dismiss them quickly and continue the workflow.
The diagnosis has three possible root causes, each with different remedies:
Root cause 1 — Threshold too low: The alert fires for drug interactions that are clinically significant in theory but rarely clinically significant in practice given the hospital's patient population. Remedy: analyze the override data — what reason do clinicians select when they override? If the dominant reason is "not clinically significant for this patient," the threshold needs calibration against the actual patient population.
Root cause 2 — Missing patient context: The alert fires without considering patient context (renal function, dose, indication, current monitoring). A drug interaction that requires dose adjustment in renal impairment should not fire the same alert for a patient with normal renal function. Remedy: add patient-context filters to the rule logic before the alert fires.
Root cause 3 — Alert catalog too broad: The alert catalog contains too many interaction pairs, including low-severity interactions that trained clinicians would manage without an alert. Remedy: audit the alert catalog against override rates by alert type; retire alerts with > 80% override rate and limited evidence of clinical consequence.
The recommendation: run a 90-day alert performance analysis with structured override reason capture (if not already in place), then prioritize root cause remediation based on the override reason distribution. Do not add new alerts until the existing catalog is calibrated.
Key Points to Hit:
- 85% override = alert fatigue, not clinical disagreement with legitimate alerts
- Three root causes: threshold, context, catalog breadth
- Override reason capture is the diagnostic tool — without structured reasons, calibration is guesswork
- Retire before adding — the catalog is already over-alerting
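The catalog audit in root cause 3 can be sketched as a filter over per-alert counts. The input layout and the `min_fires` floor are assumptions for illustration; the 80% cutoff comes from the answer above. The second retirement criterion, limited evidence of clinical consequence, needs clinical review and is not something this code can decide.

```python
def retirement_candidates(catalog, max_override=0.80, min_fires=50):
    """catalog: {alert_id: (fires, overrides)} for the analysis period.

    Returns alert ids whose override rate exceeds `max_override`, sorted from
    highest to lowest rate. Alerts with fewer than `min_fires` fires are
    skipped because their rate is too noisy to act on.
    """
    floor = max(min_fires, 1)  # also avoids dividing by zero
    rates = {
        alert_id: overrides / fires
        for alert_id, (fires, overrides) in catalog.items()
        if fires >= floor
    }
    flagged = [alert_id for alert_id, rate in rates.items() if rate > max_override]
    return sorted(flagged, key=lambda alert_id: rates[alert_id], reverse=True)


# Toy catalog: ids and counts are invented.
catalog = {
    "ddi-1": (200, 190),  # 95% overridden
    "ddi-2": (200, 100),  # 50% overridden
    "ddi-3": (10, 10),    # too few fires to judge
    "ddi-4": (100, 85),   # 85% overridden
}
candidates = retirement_candidates(catalog)   # ["ddi-1", "ddi-4"]
```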
Key Takeaways
- CDS spans a spectrum from deterministic rule-based alerts to nuanced LLM reasoning — use the simplest appropriate mechanism for each alert type
- Alert fatigue is the primary failure mode of clinical CDS; it is driven mainly by low specificity (alerts that fire when nothing is actionable), with raw alert volume a secondary contributor
- Severity stratification is the most important CDS design decision: blocking alerts must be rare, reserved for genuinely dangerous situations
- Override reason capture is the primary calibration signal; alert review should be a continuous process, not a one-time deployment activity
- LLMs add value to CDS at the nuance layer — reasoning about whether a rule-triggered alert is clinically significant given the specific patient's context
Glossary
Alert fatigue: The state in which clinicians dismiss CDS alerts without reading them because past experience has established that most alerts are not actionable.
Override rate: The percentage of CDS alerts that are dismissed without the suggested action being taken. A key CDS quality metric.
Action rate: The percentage of CDS alerts that result in the suggested action being taken by the clinician.
CDS Hooks: An HL7 specification in which the EHR calls an external CDS service at defined workflow events (hooks, such as order-sign) and displays the cards the service returns.
Early Warning Score: A composite clinical score that summarizes a patient's vital sign and clinical status trends to identify patients at risk of deterioration (e.g., NEWS2, MEWS).
Further Reading
- Chapter 1: Healthcare AI Landscape — FDA SaMD classification for CDS systems
- Chapter 3: EHR Integration — CDS Hooks implementation for EHR-embedded CDS
- Chapter 4: Clinical RAG — Clinical knowledge retrieval as the guideline layer for CDS
- Chapter 6: HMS Reference Architecture — CDS as a component in the complete HMS AI architecture