Version 1.0.0 Healthcare Edition

Enterprise AI Interview Companion

Architecture Patterns, Trade-offs, and Interview Questions

Interview preparation guide: core concepts, patterns, trade-offs, and structured interview questions per topic

84 Chapters
76K Words
6h 22m Reading Time
41 Diagrams
70 Tables
50 Code Examples
Part I AI Foundations
01
LLM Fundamentals
Section: 01-AI-Foundations · Status: COMPLETE · Last Updated: 2026-06-30 · Difficulty: Foundational
| Model Tier | Examples | Token Cost | Latency | Best For |…
1,237w · 6 min
02
Embeddings and Vector Spaces
Section: 01-AI-Foundations · Status: COMPLETE · Last Updated: 2026-06-30 · Difficulty: Foundational
Higher-dimensional embeddings capture more nuanced semantic relationships…
1,066w · 5 min
03
Retrieval-Augmented Generation
Section: 01-AI-Foundations · Status: COMPLETE · Last Updated: 2026-06-30 · Difficulty: Intermediate
| Failure Mode | Cause | Detection | Mitigation |…
1,029w · 5 min
04
Prompt Engineering
Section: 01-AI-Foundations · Status: COMPLETE · Last Updated: 2026-06-30 · Difficulty: Intermediate
System prompts are cached by Claude and many other providers after the…
857w · 4 min
05
Fine-Tuning vs RAG
Section: 01-AI-Foundations · Status: COMPLETE · Last Updated: 2026-06-30 · Difficulty: Intermediate
Note on cost figures: Specific pricing is not quoted here because AI…
787w · 4 min
06
Evaluation and Benchmarking
Section: 01-AI-Foundations · Status: COMPLETE · Last Updated: 2026-06-30 · Difficulty: Advanced
| Pitfall | Description | Mitigation |
|---------|-------------|-----------|…
946w · 5 min
07
Context Window Management
Section: 01-AI-Foundations · Status: COMPLETE · Last Updated: 2026-06-30 · Difficulty: Intermediate
With 200K+ context windows available, the tempting design is to put…
555w · 3 min
08
Multimodal AI
Section: 01-AI-Foundations · Status: COMPLETE · Last Updated: 2026-06-30 · Difficulty: Intermediate
| Failure Mode | Description | Mitigation |…
637w · 3 min
Part II Agentic AI
01
Agent Architecture Fundamentals
An agent is an LLM equipped with: 1. Tools — functions the model can call (search, database query, API call, file write) 2. Memory — access to prior context beyond the current…
1,686w · 8 min
02
Tool Design Patterns
A tool has three components visible to the LLM: 1. Name — what the LLM calls in its tool invocation 2. Description — the primary signal the LLM uses for tool selection 3. Input…
1,352w · 7 min
03
Memory Systems
Memory in agent systems is not a single system — it is a stack of complementary stores, each operating at a different time horizon: No agent needs all five simultaneously. The…
1,375w · 7 min
04
Multi-Agent Systems
Three conditions justify the coordination overhead of multi-agent systems: 1. Work can be parallelized: Independent subtasks that could proceed simultaneously are being…
1,141w · 6 min
05
LangGraph Deep Dive
LangGraph models a workflow as a StateGraph: a directed graph where: - Nodes are Python functions (or agent invocations) that transform state - Edges define transitions between…
1,339w · 7 min
06
CrewAI Patterns
CrewAI's model has four concepts: Agent: An LLM-powered entity with a defined role, goal, and backstory. The role and goal form the agent's system prompt; the backstory provides…
978w · 5 min
07
Human-in-the-Loop (HITL) Design
Confidence-based triggers: The agent's uncertainty exceeds a defined threshold. Relevant when agents produce confidence scores (via log-probabilities, self-assessment, or explicit…
1,216w · 6 min
08
Agent Observability
Traces: End-to-end records of a single agent execution. A trace captures the ordered sequence of LLM calls, tool calls, and routing decisions for one workflow invocation, linked…
874w · 4 min
09
Model Context Protocol (MCP)
MCP defines three primitives: Tools: Functions the AI model can call with arguments to perform actions or retrieve data. Tools are the MCP equivalent of LLM function/tool calls —…
1,118w · 6 min
10
Agentic Security
Prompt Injection: Adversarial instructions embedded in inputs the agent processes (user messages, retrieved documents, tool results) that attempt to override the agent's system…
1,427w · 7 min
Part III Enterprise AI
01
Enterprise AI Strategy
Enterprise AI strategy operates at three levels that must be aligned for sustained value creation. Level 1 — Use Case Strategy: Which problems should AI solve? Not all problems…
2,231w · 11 min
02
AI Governance
AI governance operates at three layers that correspond to different organizational functions. Policy Layer: The principles, standards, and requirements that define acceptable AI…
1,692w · 8 min
03
Production Deployment of AI Systems
AI system deployments differ from traditional software deployments in four properties that each require adapted engineering patterns: Non-determinism: The same input to an LLM…
1,758w · 9 min
04
AI Cost Management
The cost of an LLM-based AI system consists of five components, each with different optimization levers: Input Token Cost: The cost of sending context to the model — system…
1,629w · 8 min
05
AI Observability and Monitoring
LLM observability operates across three layers that must work together to provide production confidence in clinical AI systems. Inference-Level Observability: Capturing the…
1,495w · 7 min
06
AI Platform Architecture
An AI platform is not a monolith. It is a set of shared services, each providing a specific capability, that individual AI applications use through well-defined interfaces. The…
1,584w · 8 min
07
AI Vendor Evaluation
Vendor evaluation in enterprise AI has two distinct phases that organizations frequently conflate: Phase 1 — Qualification: Determining which vendors are eligible for clinical AI…
1,435w · 7 min
08
AI Change Management
Clinical AI adoption follows a characteristic pattern that differs from enterprise software adoption in three ways: Professional autonomy: Physicians operate under a licensure…
1,682w · 8 min
Part IV AI Infrastructure
01
Vector Databases
An embedding model converts a piece of text (a sentence, a paragraph, a document) into a vector of floating-point numbers — typically 768 to 3072 dimensions. Two texts that are…
1,922w · 10 min
02
LLM Serving Infrastructure
During transformer inference, each attention layer computes key-value pairs for every token in the context. These KV pairs are reused for all subsequent tokens in the same…
1,234w · 6 min
03
Cloud AI Platforms
All three platforms provide the same core service: managed LLM inference with enterprise packaging. The differentiation is in: Model selection: Which models are available, when,…
1,123w · 6 min
04
Data Pipelines for AI
1. Chunking at fixed token sizes without regard for semantic boundaries. Fixed-size chunking frequently splits a clinical recommendation in the middle, creating chunks that lack…
791w · 4 min
05
Orchestration and Workflow Automation for AI
1. Using Airflow for long-running workflows. Airflow tasks time out and occupy slots for their entire duration. A workflow waiting for physician approval occupies an Airflow…
754w · 4 min
06
Caching Strategies for AI Systems
1. Setting similarity threshold too low. A threshold of 0.85 will return cached responses for semantically similar but meaningfully different queries ("What is the first-line…
641w · 3 min
07
GPU Infrastructure for AI Inference
Quantization reduces the numerical precision of model weights and/or activations to reduce memory footprint and potentially improve throughput. Each quantization scheme makes…
1,019w · 5 min
08
Networking and AI API Gateway Design
1. Request-level rate limiting for LLM traffic. Limiting to 100 requests per minute ignores that a single request may consume 10,000 tokens while another consumes 100.…
682w · 3 min
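The contrast between request-level and token-level limits in the gateway teaser above can be illustrated with a token bucket that charges each request by its token count. This is a minimal sketch under stated assumptions, not the chapter's implementation: the class name `TokenBudgetLimiter`, the per-minute budget, and the injectable `clock` parameter are all hypothetical, and a real gateway would also need per-tenant state and a token estimator.

```python
import time


class TokenBudgetLimiter:
    """Token bucket that charges each request by its LLM token count.

    Hypothetical illustration: a request-count limiter would treat a
    10,000-token request and a 100-token request as equal; this one does not.
    """

    def __init__(self, tokens_per_minute, clock=time.monotonic):
        self.capacity = float(tokens_per_minute)
        self.refill_per_second = tokens_per_minute / 60.0
        self.available = float(tokens_per_minute)
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.last_refill = clock()

    def _refill(self):
        # Add back budget in proportion to elapsed time, capped at capacity.
        now = self.clock()
        elapsed = now - self.last_refill
        self.available = min(
            self.capacity, self.available + elapsed * self.refill_per_second
        )
        self.last_refill = now

    def try_acquire(self, estimated_tokens):
        """Deduct the budget and return True if the request fits, else False."""
        self._refill()
        if estimated_tokens <= self.available:
            self.available -= estimated_tokens
            return True
        return False
```

With a budget of 12,000 tokens per minute, one 10,000-token request leaves room for small requests but blocks a second large one until the bucket refills, which a flat requests-per-minute limit cannot express.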
Part V Enterprise Integration
01
Enterprise Integration Patterns for AI
1. Using synchronous calls for long-running AI operations. A 30-second AI response on a synchronous call holds an HTTP connection and a thread for the entire duration. Under load,…
279w · 1 min
02
API Design for AI Services
1. Exposing model names in the API. model: "gpt-4o" in the request schema forces every consumer to know and specify the model. When the platform upgrades the model, all consumers…
192w · 1 min
03
Event-Driven AI
1. Auto-commit of Kafka offsets. With enable_auto_commit=True, Kafka commits the offset as soon as the message is polled, before processing. If the AI worker crashes during…
201w · 1 min
04
EHR Integration Patterns
1. FHIR requests in series during CDS Hook. Making FHIR API calls sequentially within a CDS Hook (Patient → Conditions → Medications → Labs → Allergies) consumes 1–2 seconds per…
208w · 1 min
05
Data Warehouse Integration for AI
1. Loading the entire dataset into a Pandas DataFrame. A population health dataset of 500,000 patients does not fit in a typical application server's memory as a DataFrame. Use…
217w · 1 min
06
Identity and Access for AI Systems
1. Hardcoding API keys in application configuration. API keys in config files, environment variables, or code are committed to version control, appear in container images, and…
356w · 2 min
07
Middleware and Enterprise Service Bus for AI
1. Bypassing the ESB to call AI services directly. Development teams often bypass the ESB to avoid "integration overhead." This creates ungoverned AI data flows that bypass…
244w · 1 min
08
Webhook and Callback Patterns for AI
1. Not verifying webhook signatures. A webhook receiver that does not verify the HMAC signature will process fabricated payloads from any source that knows the endpoint URL.…
266w · 1 min
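The signature check mentioned in the webhook teaser above can be sketched with Python's standard `hmac` module. This is an illustrative sketch only: the function names, the choice of SHA-256, and the hex encoding of the signature are assumptions, since each webhook provider defines its own header name and signing scheme.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature over the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_webhook(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Return True only if the received signature matches the expected one.

    compare_digest runs in constant time, which avoids leaking how many
    leading characters of a forged signature were correct.
    """
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected.encode(), received_signature.encode())
```

The check must run over the raw bytes of the request body, before any JSON parsing or re-serialization, because even a whitespace change alters the digest.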
Part VI Security & Compliance
01
AI Security Fundamentals
1. Treating LLM security as identical to traditional injection defense. SQL injection defenses (parameterized queries) do not translate to prompt injection. LLMs process natural…
213w · 1 min
02
Prompt Injection Defense
1. Relying on a single defense layer. No single prompt injection defense is complete. A defense stack that relies solely on input pattern matching will be bypassed by novel…
202w · 1 min
03
HIPAA Compliance for AI Systems
1. Deploying clinical AI without confirming BAA coverage. Organizations deploy clinical AI features that include PHI in LLM prompts without confirming that the LLM provider has…
234w · 1 min
04
Data Privacy Architecture for AI
1. Assuming Safe Harbor de-identification is sufficient for AI training. Safe Harbor removes explicit identifiers but does not prevent memorization of clinical content or…
210w · 1 min
05
Zero Trust Architecture for AI Systems
1. Implementing network perimeter security and calling it Zero Trust. Placing AI services behind a VPN or private subnet does not implement Zero Trust. Zero Trust requires…
245w · 1 min
06
Audit and Logging for AI Systems
1. Logging request and response content for PHI-handling AI features. Even "debug" logs that include prompt content contain PHI for clinical AI features. Every log store that…
244w · 1 min
07
Model Security
- Models fine-tuned on clinical data can memorize and reproduce training data — conduct memorization audits before deployment - Membership inference allows adversaries to…
102w · 1 min
08
Regulatory Compliance for Enterprise AI
1. Assuming SOC 2 covers HIPAA. SOC 2 is a security framework; HIPAA is a privacy and security law. An organization can be SOC 2 Type II certified while being out of HIPAA…
264w · 1 min
Part VII Healthcare AI
01
Healthcare AI Landscape
Healthcare AI can be organized along two axes: Axis 1 — Clinical Function: What does the AI do in the clinical workflow? Diagnostic functions (helping identify disease states),…
1,570w · 8 min
02
HIPAA and AI
Protected Health Information is individually identifiable health information held or transmitted by a Covered Entity or Business Associate. The definition has three components: 1.…
1,430w · 7 min
03
EHR Integration
HL7 v2 vs. FHIR R4: Different Purposes
HL7 v2 and FHIR R4 are not competing standards; they address different integration scenarios: - HL7 v2 ADT feeds: Real-time event…
1,357w · 7 min
04
Clinical RAG
Clinical RAG differs from general-domain RAG in three important ways: Terminology density: Medical text uses precise, domain-specific vocabulary where term choice is clinically…
1,088w · 5 min
05
Clinical Decision Support
CDS Intervention Types
CDS is not a single technology; it encompasses a spectrum of intervention types that vary in their timing, intrusiveness, and required clinical action: |…
1,241w · 6 min
06
HMS Reference Architecture
The HMS AI platform is organized around three principles that govern every architectural decision: Clinical workflow primacy: AI must serve the clinical workflow, not require the…
1,464w · 7 min
07
Medical Imaging AI
DICOM (Digital Imaging and Communications in Medicine)
DICOM is the international standard for medical imaging data and communications. It defines: - Image storage format: A DICOM…
1,088w · 5 min
08
Patient Engagement AI
Patient engagement AI use cases span three clinical phases: Pre-visit: Appointment reminders, pre-visit intake (medical history, current medications, chief complaint),…
1,048w · 5 min
09
Clinical Documentation AI
Ambient Documentation
Ambient documentation captures the patient-physician encounter in real time, typically through a microphone in the clinical examination room or worn by the…
1,185w · 6 min
10
AI Safety in Clinical Settings
The Four Clinical AI Safety Dimensions
Clinical Harm: The AI produces an output (a recommendation, a diagnosis, a risk score, a drug interaction assessment) that is clinically…
1,180w · 6 min
Part VIII Forward Deployed Engineering
01
The Forward Deployed Engineer: Role and Responsibilities
The FDE role is best understood by contrast with adjacent roles that it superficially resembles: Sales Engineer (SE): Demonstrates product capabilities to prospects. The SE's…
2,247w · 11 min
02
Client Discovery Framework
Discovery has three layers, each building on the previous: Layer 1 — Technical environment: What systems exist, how they connect, what data is available, what integration…
2,099w · 10 min
03
AI Readiness Assessment
Readiness has three independent dimensions. An organization can be highly mature on one dimension and poorly prepared on another: Dimension 1 — Data Maturity: Can the AI system…
2,528w · 13 min
04
Demo Engineering
A demo has three layers that must be engineered independently: Layer 1 — The Environment: Where does the demo run? Is it isolated from production? Is it reproducible? What happens…
1,924w · 10 min
05
POC to Production
A well-designed POC has three properties that are often in tension: Feasible: Can be executed with the available time, data, and people. A 6-week POC scope for a 4-week engagement…
1,892w · 9 min
06
Architecture Review Facilitation
An architecture review has two phases that must be kept separate: Phase 1 — Current State Elicitation: The FDE maps the client's actual current-state architecture. This requires…
2,265w · 11 min
07
Value Engineering
Value engineering for AI deployments follows a standard structure: The complexity lies in benefit quantification. AI benefits typically span four categories: 1. Time savings:…
1,949w · 10 min
08
Client Communication
Effective client communication follows three principles: Bottom Line Up Front (BLUF): The most important information goes in the first sentence — not after the context. Executives…
1,045w · 5 min
09
Common Objections
Every objection response follows the same four-step structure: --- 1. Addressing the surface statement instead of the root. "Our data is too sensitive" can mean PHI-in-cloud…
913w · 5 min
10
Healthcare Client Playbook
Healthcare AI engagements follow the same eight-phase lifecycle as general FDE engagements (Discovery → Assessment → POC Design → POC Execution → Architecture Review → Production…
3,653w · 18 min
Part IX Interview Preparation
Quick Reference
01
AI Foundations — Quick Reference
Last Updated: 2026-06-30 · Full Chapters: docs/01-AI-Foundations/ (../01-AI-Foundations/)
1. "Why do LLMs hallucinate and how do you mitigate it in a clinical system?" → RAG…
125w · 1 min
02
Agentic AI — Quick Reference
| Dimension | Raw SDK (Anthropic) | LangGraph | CrewAI |
|-----------|---------------------|-----------|--------|
| Control | Maximum | High | Medium |
| Configuration effort |…
311w · 2 min
03
Enterprise AI Operations — Quick Reference
Last Updated: 2026-06-30 · Full Chapters: docs/03-Enterprise-AI/ (../03-Enterprise-AI/)
16w · 1 min
04
AI Infrastructure — Quick Reference
Q: What chunking strategy would you use for clinical guidelines and why? Section-boundary chunking — clinical guidelines are structured with numbered recommendations that are the…
213w · 1 min
05
Enterprise Integration — Quick Reference
Q: A CDS Hook response is timing out. What are the likely causes? (1) FHIR reads in series instead of parallel — fix with asyncio.gather. (2) AI inference too slow — add 4.5s…
195w · 1 min
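The first fix named in the CDS Hook answer above, replacing serial FHIR reads with `asyncio.gather`, can be sketched as follows. This is a simulation under stated assumptions rather than a working FHIR client: `fetch_resource` is a hypothetical stand-in that sleeps instead of issuing an HTTP request, the 50 ms delay is arbitrary, and the resource type names are one plausible mapping of the reads a CDS Hook might need.

```python
import asyncio
import time

# Assumed mapping of patient, conditions, medications, labs, and allergies
# to FHIR resource types; adjust to the reads the hook actually needs.
RESOURCES = [
    "Patient",
    "Condition",
    "MedicationRequest",
    "Observation",
    "AllergyIntolerance",
]


async def fetch_resource(resource_type: str, delay: float = 0.05) -> dict:
    """Stand-in for a FHIR read; a real version would await an HTTP call."""
    await asyncio.sleep(delay)
    return {"resourceType": resource_type}


async def fetch_in_series() -> list:
    # Each read waits for the previous one, so latencies add up.
    return [await fetch_resource(r) for r in RESOURCES]


async def fetch_in_parallel() -> list:
    # All reads start together; total latency is roughly the slowest read.
    return await asyncio.gather(*(fetch_resource(r) for r in RESOURCES))


def timed(coro_factory):
    """Run a coroutine factory to completion and return (result, seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro_factory())
    return result, time.perf_counter() - start
```

`asyncio.gather` returns results in the order the awaitables were passed, so downstream code can rely on positions matching `RESOURCES` even though the reads complete concurrently.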
06
AI Security — Quick Reference
Q: What makes prompt injection harder to prevent than SQL injection? SQL injection is prevented by parameterized queries because SQL has a clear separation between code (query…
223w · 1 min
07
Healthcare AI — Quick Reference
Last Updated: 2026-06-30 · Full Chapters: docs/07-Healthcare-AI/ (../07-Healthcare-AI/)
15w · 1 min
08
Forward Deployed Engineering — Quick Reference
Last Updated: 2026-06-30 · Full Chapters: docs/08-Forward-Deployed-Engineering/ (../08-Forward-Deployed-Engineering/)
16w · 1 min
09
Interview Preparation — Quick Reference
Asking strong questions signals principal-level thinking:
About the AI systems:
- "What is the scale of your current AI workloads? Tokens per day? Concurrent users?"
- "What is…
318w · 2 min