Version 1.0.0
Healthcare Edition
Enterprise AI Interview Companion
Architecture Patterns, Trade-offs, and Interview Questions
Interview preparation guide: core concepts, patterns, trade-offs, and structured interview questions per topic
Part I AI Foundations
LLM Fundamentals
> Section: 01-AI-Foundations > Status: COMPLETE > Last Updated: 2026-06-30 > Difficulty: Foundational --- | Model Tier | Examples | Token Cost | Latency | Best For |…
Embeddings and Vector Spaces
> Section: 01-AI-Foundations > Status: COMPLETE > Last Updated: 2026-06-30 > Difficulty: Foundational --- Higher-dimensional embeddings capture more nuanced semantic relationships…
Retrieval-Augmented Generation
> Section: 01-AI-Foundations > Status: COMPLETE > Last Updated: 2026-06-30 > Difficulty: Intermediate --- | Failure Mode | Cause | Detection | Mitigation |…
Prompt Engineering
> Section: 01-AI-Foundations > Status: COMPLETE > Last Updated: 2026-06-30 > Difficulty: Intermediate --- System prompts are cached by Claude and many other providers after the…
Fine-Tuning vs RAG
> Section: 01-AI-Foundations > Status: COMPLETE > Last Updated: 2026-06-30 > Difficulty: Intermediate --- > Note on cost figures: Specific pricing is not quoted here because AI…
Evaluation and Benchmarking
> Section: 01-AI-Foundations > Status: COMPLETE > Last Updated: 2026-06-30 > Difficulty: Advanced --- | Pitfall | Description | Mitigation | |---------|-------------|-----------|…
Context Window Management
> Section: 01-AI-Foundations > Status: COMPLETE > Last Updated: 2026-06-30 > Difficulty: Intermediate --- With 200K+ context windows available, the tempting design is to put…
Multimodal AI
> Section: 01-AI-Foundations > Status: COMPLETE > Last Updated: 2026-06-30 > Difficulty: Intermediate --- | Failure Mode | Description | Mitigation |…
Part II Agentic AI
Agent Architecture Fundamentals
An agent is an LLM equipped with: 1. Tools — functions the model can call (search, database query, API call, file write) 2. Memory — access to prior context beyond the current…
Tool Design Patterns
A tool has three components visible to the LLM: 1. Name — what the LLM calls in its tool invocation 2. Description — the primary signal the LLM uses for tool selection 3. Input…
Memory Systems
Memory in agent systems is not a single system — it is a stack of complementary stores, each operating at a different time horizon: No agent needs all five simultaneously. The…
Multi-Agent Systems
Three conditions justify the coordination overhead of multi-agent systems: 1. Work can be parallelized: Independent subtasks that could proceed simultaneously are being…
LangGraph Deep Dive
LangGraph models a workflow as a StateGraph: a directed graph where: - Nodes are Python functions (or agent invocations) that transform state - Edges define transitions between…
CrewAI Patterns
CrewAI's model has four concepts: Agent: An LLM-powered entity with a defined role, goal, and backstory. The role and goal form the agent's system prompt; the backstory provides…
Human-in-the-Loop (HITL) Design
Confidence-based triggers: The agent's uncertainty exceeds a defined threshold. Relevant when agents produce confidence scores (via log-probabilities, self-assessment, or explicit…
Agent Observability
Traces: End-to-end records of a single agent execution. A trace captures the ordered sequence of LLM calls, tool calls, and routing decisions for one workflow invocation, linked…
Model Context Protocol (MCP)
MCP defines three primitives: Tools: Functions the AI model can call with arguments to perform actions or retrieve data. Tools are the MCP equivalent of LLM function/tool calls —…
Agentic Security
Prompt Injection: Adversarial instructions embedded in inputs the agent processes (user messages, retrieved documents, tool results) that attempt to override the agent's system…
Part III Enterprise AI
Enterprise AI Strategy
Enterprise AI strategy operates at three levels that must be aligned for sustained value creation. Level 1 — Use Case Strategy: Which problems should AI solve? Not all problems…
AI Governance
AI governance operates at three layers that correspond to different organizational functions. Policy Layer: The principles, standards, and requirements that define acceptable AI…
Production Deployment of AI Systems
AI system deployments differ from traditional software deployments in four properties that each require adapted engineering patterns: Non-determinism: The same input to an LLM…
AI Cost Management
The cost of an LLM-based AI system consists of five components, each with different optimization levers: Input Token Cost: The cost of sending context to the model — system…
AI Observability and Monitoring
LLM observability operates across three layers that must work together to provide production confidence in clinical AI systems. Inference-Level Observability: Capturing the…
AI Platform Architecture
An AI platform is not a monolith. It is a set of shared services, each providing a specific capability, that individual AI applications use through well-defined interfaces. The…
AI Vendor Evaluation
Vendor evaluation in enterprise AI has two distinct phases that organizations frequently conflate: Phase 1 — Qualification: Determining which vendors are eligible for clinical AI…
AI Change Management
Clinical AI adoption follows a characteristic pattern that differs from enterprise software adoption in three ways: Professional autonomy: Physicians operate under a licensure…
Part IV AI Infrastructure
Vector Databases
An embedding model converts a piece of text (a sentence, a paragraph, a document) into a vector of floating-point numbers — typically 768 to 3072 dimensions. Two texts that are…
LLM Serving Infrastructure
During transformer inference, each attention layer computes key-value pairs for every token in the context. These KV pairs are reused for all subsequent tokens in the same…
Cloud AI Platforms
All three platforms provide the same core service: managed LLM inference with enterprise packaging. The differentiation is in: Model selection: Which models are available, when,…
Data Pipelines for AI
1. Chunking at fixed token sizes without regard for semantic boundaries. Fixed-size chunking frequently splits a clinical recommendation in the middle, creating chunks that lack…
Orchestration and Workflow Automation for AI
1. Using Airflow for long-running workflows. Airflow tasks time out and occupy slots for their entire duration. A workflow waiting for physician approval occupies an Airflow…
Caching Strategies for AI Systems
1. Setting similarity threshold too low. A threshold of 0.85 will return cached responses for semantically similar but meaningfully different queries ("What is the first-line…
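The threshold effect described above can be illustrated with a minimal sketch. The `SemanticCache` class, the toy two-dimensional vectors, and the stricter 0.95 value are all assumptions made for illustration; only the 0.85 figure comes from the text, and a real system would compare model-generated embeddings.

```python
import math
from typing import List, Optional, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Hypothetical in-memory cache keyed by embedding similarity."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self._entries: List[Tuple[List[float], str]] = []

    def put(self, embedding: List[float], response: str) -> None:
        self._entries.append((embedding, response))

    def get(self, embedding: List[float]) -> Optional[str]:
        # Find the closest stored entry, then apply the threshold to it.
        best_score, best_response = 0.0, None
        for stored, response in self._entries:
            score = cosine_similarity(embedding, stored)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


if __name__ == "__main__":
    cached_query = [1.0, 0.0]      # toy embedding of the cached question
    similar_query = [0.9, 0.436]   # toy embedding, cosine similarity of about 0.90
    for threshold in (0.85, 0.95):
        cache = SemanticCache(threshold)
        cache.put(cached_query, "cached answer")
        print(threshold, cache.get(similar_query))
```

With the looser threshold the similar-but-different query is served the cached answer; with the stricter one it falls through to a fresh model call. The right value depends on the embedding model and should be tuned against labeled query pairs.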
GPU Infrastructure for AI Inference
Quantization reduces the numerical precision of model weights and/or activations to reduce memory footprint and potentially improve throughput. Each quantization scheme makes…
Networking and AI API Gateway Design
1. Request-level rate limiting for LLM traffic. Limiting to 100 requests per minute ignores that a single request may consume 10,000 tokens while another consumes 100.…
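One way to meter tokens rather than requests is a sliding-window budget. The sketch below is illustrative only: `TokenBudgetLimiter`, its parameter names, and the 10,000-token budget are assumptions, and a gateway would normally keep this state in a shared store rather than in process memory.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TokenBudgetLimiter:
    """Hypothetical sliding-window limiter that meters LLM tokens, not requests."""

    tokens_per_minute: int
    window_seconds: float = 60.0
    _events: List[Tuple[float, int]] = field(default_factory=list)

    def _used(self, now: float) -> int:
        # Drop usage that has aged out of the window, then sum what remains.
        cutoff = now - self.window_seconds
        self._events = [(t, n) for t, n in self._events if t > cutoff]
        return sum(n for _, n in self._events)

    def try_acquire(self, tokens: int, now: Optional[float] = None) -> bool:
        """Admit the request only if its token cost fits the remaining budget."""
        now = time.monotonic() if now is None else now
        if self._used(now) + tokens > self.tokens_per_minute:
            return False
        self._events.append((now, tokens))
        return True


if __name__ == "__main__":
    limiter = TokenBudgetLimiter(tokens_per_minute=10_000)
    print(limiter.try_acquire(10_000, now=0.0))  # one large request uses the whole budget
    print(limiter.try_acquire(100, now=1.0))     # a small request is now rejected
    print(limiter.try_acquire(100, now=61.0))    # admitted once the window has moved on
```

Under a request-count limit, the first two calls would each count as one request, which hides the hundredfold difference in cost between them.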
Part V Enterprise Integration
Enterprise Integration Patterns for AI
1. Using synchronous calls for long-running AI operations. A 30-second AI response on a synchronous call holds an HTTP connection and a thread for the entire duration. Under load,…
API Design for AI Services
1. Exposing model names in the API. model: "gpt-4o" in the request schema forces every consumer to know and specify the model. When the platform upgrades the model, all consumers…
Event-Driven AI
1. Auto-commit of Kafka offsets. With enable_auto_commit=True, the consumer can commit an offset after the message is polled but before processing finishes. If the AI worker crashes during…
EHR Integration Patterns
1. FHIR requests in series during CDS Hook. Making FHIR API calls sequentially within a CDS Hook (Patient → Conditions → Medications → Labs → Allergies) consumes 1–2 seconds per…
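The difference between serial and concurrent reads can be sketched with `asyncio.gather`. Everything here is a stand-in: `fetch_resource` is a hypothetical placeholder that simulates network latency with `asyncio.sleep`, and the resource type names and the 50 ms delay are assumptions, not measurements from a real FHIR server.

```python
import asyncio
import time
from typing import Dict, List

# Assumed FHIR resource types corresponding to the reads named in the text.
RESOURCES = ["Patient", "Condition", "MedicationRequest", "Observation", "AllergyIntolerance"]


async def fetch_resource(resource_type: str, delay: float = 0.05) -> Dict[str, str]:
    # Hypothetical stand-in for a FHIR read; the sleep simulates network latency.
    await asyncio.sleep(delay)
    return {"resourceType": resource_type}


async def fetch_all_series(resource_types: List[str]) -> List[Dict[str, str]]:
    # Each read waits for the previous one, so total latency is the sum.
    return [await fetch_resource(r) for r in resource_types]


async def fetch_all_parallel(resource_types: List[str]) -> List[Dict[str, str]]:
    # All reads are in flight at once, so total latency tracks the slowest one.
    return await asyncio.gather(*(fetch_resource(r) for r in resource_types))


if __name__ == "__main__":
    for fetch in (fetch_all_series, fetch_all_parallel):
        start = time.perf_counter()
        results = asyncio.run(fetch(RESOURCES))
        elapsed = time.perf_counter() - start
        print(f"{fetch.__name__}: {len(results)} resources in {elapsed:.2f}s")
```

`asyncio.gather` returns results in the order the awaitables were passed, so downstream code can rely on position even though the reads complete in any order.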
Data Warehouse Integration for AI
1. Loading the entire dataset into a Pandas DataFrame. A population health dataset of 500,000 patients does not fit in a typical application server's memory as a DataFrame. Use…
Identity and Access for AI Systems
1. Hardcoding API keys in application configuration. API keys in config files, environment variables, or code are committed to version control, appear in container images, and…
Middleware and Enterprise Service Bus for AI
1. Bypassing the ESB to call AI services directly. Development teams often bypass the ESB to avoid "integration overhead." This creates ungoverned AI data flows that bypass…
Webhook and Callback Patterns for AI
1. Not verifying webhook signatures. A webhook receiver that does not verify the HMAC signature will process fabricated payloads from any source that knows the endpoint URL.…
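Signature verification can be sketched with the Python standard library. The choice of HMAC-SHA256, the hex encoding, and the function names are assumptions for illustration; the actual algorithm, header name, and encoding depend on the sender's documentation.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature over the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_webhook(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Return True only if the supplied signature matches the body."""
    expected = sign_payload(secret, body)
    # compare_digest runs in constant time, which avoids leaking how many
    # leading characters of a forged signature were correct.
    return hmac.compare_digest(expected.encode(), signature_header.encode())


if __name__ == "__main__":
    secret = b"shared-secret"  # hypothetical value; load from a secrets manager in practice
    body = b'{"event": "inference.completed"}'
    signature = sign_payload(secret, body)
    print(verify_webhook(secret, body, signature))         # genuine payload
    print(verify_webhook(secret, body + b" ", signature))  # tampered payload
```

The signature must be computed over the raw bytes received, before any JSON parsing or re-serialization, because re-encoding can change whitespace or key order and invalidate an otherwise genuine signature.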
Part VI Security & Compliance
AI Security Fundamentals
1. Treating LLM security as identical to traditional injection defense. SQL injection defenses (parameterized queries) do not translate to prompt injection. LLMs process natural…
Prompt Injection Defense
1. Relying on a single defense layer. No single prompt injection defense is complete. A defense stack that relies solely on input pattern matching will be bypassed by novel…
HIPAA Compliance for AI Systems
1. Deploying clinical AI without confirming BAA coverage. Organizations deploy clinical AI features that include PHI in LLM prompts without confirming that the LLM provider has…
Data Privacy Architecture for AI
1. Assuming Safe Harbor de-identification is sufficient for AI training. Safe Harbor removes explicit identifiers but does not prevent memorization of clinical content or…
Zero Trust Architecture for AI Systems
1. Implementing network perimeter security and calling it Zero Trust. Placing AI services behind a VPN or private subnet does not implement Zero Trust. Zero Trust requires…
Audit and Logging for AI Systems
1. Logging request and response content for PHI-handling AI features. Even "debug" logs that include prompt content contain PHI for clinical AI features. Every log store that…
Model Security
- Models fine-tuned on clinical data can memorize and reproduce training data — conduct memorization audits before deployment - Membership inference allows adversaries to…
Regulatory Compliance for Enterprise AI
1. Assuming SOC 2 covers HIPAA. SOC 2 is a security framework; HIPAA is a privacy and security law. An organization can be SOC 2 Type II certified while being out of HIPAA…
Part VII Healthcare AI
Healthcare AI Landscape
Healthcare AI can be organized along two axes: Axis 1 — Clinical Function: What does the AI do in the clinical workflow? Diagnostic functions (helping identify disease states),…
HIPAA and AI
Protected Health Information is individually identifiable health information held or transmitted by a Covered Entity or Business Associate. The definition has three components: 1.…
EHR Integration
HL7 v2 vs. FHIR R4: Different Purposes HL7 v2 and FHIR R4 are not competing standards; they address different integration scenarios: - HL7 v2 ADT feeds: Real-time event…
Clinical RAG
Clinical RAG differs from general-domain RAG in three important ways: Terminology density: Medical text uses precise, domain-specific vocabulary where term choice is clinically…
Clinical Decision Support
CDS Intervention Types CDS is not a single technology; it encompasses a spectrum of intervention types that vary in their timing, intrusiveness, and required clinical action: |…
HMS Reference Architecture
The HMS AI platform is organized around three principles that govern every architectural decision: Clinical workflow primacy: AI must serve the clinical workflow, not require the…
Medical Imaging AI
DICOM (Digital Imaging and Communications in Medicine) DICOM is the international standard for medical imaging data and communications. It defines: - Image storage format: A DICOM…
Patient Engagement AI
Patient engagement AI use cases span three clinical phases: Pre-visit: Appointment reminders, pre-visit intake (medical history, current medications, chief complaint),…
Clinical Documentation AI
Ambient Documentation Ambient documentation captures the patient-physician encounter in real time — typically through a microphone in the clinical examination room or worn by the…
AI Safety in Clinical Settings
The Four Clinical AI Safety Dimensions Clinical Harm: The AI produces an output (a recommendation, a diagnosis, a risk score, a drug interaction assessment) that is clinically…
Part VIII Forward Deployed Engineering
The Forward Deployed Engineer: Role and Responsibilities
The FDE role is best understood by contrast with adjacent roles that it superficially resembles: Sales Engineer (SE): Demonstrates product capabilities to prospects. The SE's…
Client Discovery Framework
Discovery has three layers, each building on the previous: Layer 1 — Technical environment: What systems exist, how they connect, what data is available, what integration…
AI Readiness Assessment
Readiness has three independent dimensions. An organization can be highly mature on one dimension and poorly prepared on another: Dimension 1 — Data Maturity: Can the AI system…
Demo Engineering
A demo has three layers that must be engineered independently: Layer 1 — The Environment: Where does the demo run? Is it isolated from production? Is it reproducible? What happens…
POC to Production
A well-designed POC has three properties that are often in tension: Feasible: Can be executed with the available time, data, and people. A 6-week POC scope for a 4-week engagement…
Architecture Review Facilitation
An architecture review has two phases that must be kept separate: Phase 1 — Current State Elicitation: The FDE maps the client's actual current-state architecture. This requires…
Value Engineering
Value engineering for AI deployments follows a standard structure: The complexity lies in benefit quantification. AI benefits typically span four categories: 1. Time savings:…
Client Communication
Effective client communication follows three principles: Bottom Line Up Front (BLUF): The most important information goes in the first sentence — not after the context. Executives…
Common Objections
Every objection response follows the same four-step structure: --- 1. Addressing the surface statement instead of the root. "Our data is too sensitive" can mean PHI-in-cloud…
Healthcare Client Playbook
Healthcare AI engagements follow the same eight-phase lifecycle as general FDE engagements (Discovery → Assessment → POC Design → POC Execution → Architecture Review → Production…
Part IX Interview Preparation
AI Architect Interview Guide
AI System Design Problems
Architecture Questions — Senior and Principal Level
ML Fundamentals for AI Architects
Behavioral Interview Questions
Quick Reference
AI Foundations — Quick Reference
> Last Updated: 2026-06-30 > Full Chapters: docs/01-AI-Foundations/ --- 1. "Why do LLMs hallucinate and how do you mitigate it in a clinical system?" → RAG…
Agentic AI — Quick Reference
| Dimension | Raw SDK (Anthropic) | LangGraph | CrewAI | |-----------|---------------------|-----------|--------| | Control | Maximum | High | Medium | | Configuration effort |…
Enterprise AI Operations — Quick Reference
> Last Updated: 2026-06-30 > Full Chapters: docs/03-Enterprise-AI/ ---
AI Infrastructure — Quick Reference
Q: What chunking strategy would you use for clinical guidelines and why? Section-boundary chunking — clinical guidelines are structured with numbered recommendations that are the…
Enterprise Integration — Quick Reference
Q: A CDS Hook response is timing out. What are the likely causes? (1) FHIR reads in series instead of parallel — fix with asyncio.gather. (2) AI inference too slow — add 4.5s…
AI Security — Quick Reference
Q: What makes prompt injection harder to prevent than SQL injection? SQL injection is prevented by parameterized queries because SQL has a clear separation between code (query…
Healthcare AI — Quick Reference
> Last Updated: 2026-06-30 > Full Chapters: docs/07-Healthcare-AI/ ---
Forward Deployed Engineering — Quick Reference
> Last Updated: 2026-06-30 > Full Chapters: docs/08-Forward-Deployed-Engineering/ ---
Interview Preparation — Quick Reference
Asking strong questions signals principal-level thinking: About the AI systems: - "What is the scale of your current AI workloads? Tokens per day? Concurrent users?" - "What is…