Version 1.0.0 Healthcare Edition

Enterprise AI Interview Companion

Architecture Patterns, Trade-offs, and Interview Questions

Interview preparation guide: core concepts, patterns, trade-offs, and structured interview questions per topic


84 Chapters

76K Words

6h 22m Reading Time

41 Diagrams

70 Tables

50 Code Examples

All Publications


Architect Playbook

Architecture-first depth — WHY, trade-offs, and design decisions

70 chapters · ~112K words · ~9.5 hrs



Technical Reference

Complete depth — all code, all patterns, all failure modes

84 chapters · ~231K words · ~19 hrs



Interview Companion

Core patterns, trade-offs, and Q&A frameworks distilled

84 chapters · ~75K words · ~6.5 hrs


Part I: AI Foundations

LLM Fundamentals

1,237w · 6 min

Embeddings and Vector Spaces

Section: 01-AI-Foundations · Status: COMPLETE · Last Updated: 2026-06-30 · Difficulty: Foundational

Higher-dimensional embeddings capture more nuanced semantic relationships…

1,066w · 5 min

Retrieval-Augmented Generation

1,029w · 5 min

Prompt Engineering

Section: 01-AI-Foundations · Status: COMPLETE · Last Updated: 2026-06-30 · Difficulty: Intermediate

System prompts are cached by Claude and many other providers after the…

857w · 4 min

Fine-Tuning vs RAG

Section: 01-AI-Foundations · Status: COMPLETE · Last Updated: 2026-06-30 · Difficulty: Intermediate

Note on cost figures: Specific pricing is not quoted here because AI…

787w · 4 min

Evaluation and Benchmarking

Section: 01-AI-Foundations · Status: COMPLETE · Last Updated: 2026-06-30 · Difficulty: Advanced

| Pitfall | Description | Mitigation |…

946w · 5 min

Context Window Management

Section: 01-AI-Foundations · Status: COMPLETE · Last Updated: 2026-06-30 · Difficulty: Intermediate

With 200K+ context windows available, the tempting design is to put…

555w · 3 min

Multimodal AI

Section: 01-AI-Foundations · Status: COMPLETE · Last Updated: 2026-06-30 · Difficulty: Intermediate

| Failure Mode | Description | Mitigation |…

637w · 3 min

Part II: Agentic AI

Agent Architecture Fundamentals

An agent is an LLM equipped with: 1. Tools — functions the model can call (search, database query, API call, file write) 2. Memory — access to prior context beyond the current…

1,686w · 8 min

Tool Design Patterns

A tool has three components visible to the LLM: 1. Name — what the LLM calls in its tool invocation 2. Description — the primary signal the LLM uses for tool selection 3. Input…

1,352w · 7 min

Memory Systems

Memory in agent systems is not a single system — it is a stack of complementary stores, each operating at a different time horizon: No agent needs all five simultaneously. The…

1,375w · 7 min

Multi-Agent Systems

Three conditions justify the coordination overhead of multi-agent systems: 1. Work can be parallelized: Independent subtasks that could proceed simultaneously are being…

1,141w · 6 min

LangGraph Deep Dive

LangGraph models a workflow as a StateGraph: a directed graph where: - Nodes are Python functions (or agent invocations) that transform state - Edges define transitions between…

1,339w · 7 min

CrewAI Patterns

CrewAI's model has four concepts: Agent: An LLM-powered entity with a defined role, goal, and backstory. The role and goal form the agent's system prompt; the backstory provides…

978w · 5 min

Human-in-the-Loop (HITL) Design

Confidence-based triggers: The agent's uncertainty exceeds a defined threshold. Relevant when agents produce confidence scores (via log-probabilities, self-assessment, or explicit…

1,216w · 6 min

Agent Observability

Traces: End-to-end records of a single agent execution. A trace captures the ordered sequence of LLM calls, tool calls, and routing decisions for one workflow invocation, linked…

874w · 4 min

Model Context Protocol (MCP)

MCP defines three primitives: Tools: Functions the AI model can call with arguments to perform actions or retrieve data. Tools are the MCP equivalent of LLM function/tool calls —…

1,118w · 6 min

Agentic Security

Prompt Injection: Adversarial instructions embedded in inputs the agent processes (user messages, retrieved documents, tool results) that attempt to override the agent's system…

1,427w · 7 min

Part III: Enterprise AI

Enterprise AI Strategy

Enterprise AI strategy operates at three levels that must be aligned for sustained value creation. Level 1 — Use Case Strategy: Which problems should AI solve? Not all problems…

2,231w · 11 min

AI Governance

AI governance operates at three layers that correspond to different organizational functions. Policy Layer: The principles, standards, and requirements that define acceptable AI…

1,692w · 8 min

Production Deployment of AI Systems

AI system deployments differ from traditional software deployments in four properties that each require adapted engineering patterns: Non-determinism: The same input to an LLM…

1,758w · 9 min

AI Cost Management

The cost of an LLM-based AI system consists of five components, each with different optimization levers: Input Token Cost: The cost of sending context to the model — system…

1,629w · 8 min

AI Observability and Monitoring

LLM observability operates across three layers that must work together to provide production confidence in clinical AI systems. Inference-Level Observability: Capturing the…

1,495w · 7 min

AI Platform Architecture

An AI platform is not a monolith. It is a set of shared services, each providing a specific capability, that individual AI applications use through well-defined interfaces. The…

1,584w · 8 min

AI Vendor Evaluation

Vendor evaluation in enterprise AI has two distinct phases that organizations frequently conflate: Phase 1 — Qualification: Determining which vendors are eligible for clinical AI…

1,435w · 7 min

AI Change Management

Clinical AI adoption follows a characteristic pattern that differs from enterprise software adoption in three ways: Professional autonomy: Physicians operate under a licensure…

1,682w · 8 min

Part IV: AI Infrastructure

Vector Databases

An embedding model converts a piece of text (a sentence, a paragraph, a document) into a vector of floating-point numbers — typically 768 to 3072 dimensions. Two texts that are…

1,922w · 10 min
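As a rough illustration of how two embedding vectors can be compared, the sketch below computes cosine similarity in plain Python. Cosine similarity is an assumed choice of metric (the excerpt above does not name one), and the short vectors in the usage note are toy stand-ins for the 768 to 3072 dimensions mentioned.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Under this metric, `cosine_similarity([1.0, 0.0], [0.0, 1.0])` is 0.0 for orthogonal vectors, while vectors pointing in the same direction score close to 1.0 regardless of their length.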

LLM Serving Infrastructure

During transformer inference, each attention layer computes key-value pairs for every token in the context. These KV pairs are reused for all subsequent tokens in the same…

1,234w · 6 min
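The memory held by those cached key-value pairs can be estimated with simple arithmetic. The sketch below assumes a standard transformer that stores one key vector and one value vector per KV head, per layer, per token. The layer count, head count, and head dimension in the usage note are hypothetical configuration values, not figures from this guide.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, context_tokens,
                   bytes_per_value=2):
    """Estimate KV-cache size for one sequence.

    The leading factor of 2 accounts for storing both a key and a value.
    bytes_per_value=2 assumes 16-bit (FP16/BF16) cache entries.
    """
    per_token = 2 * num_layers * num_kv_heads * head_dim * bytes_per_value
    return per_token * context_tokens


def to_gib(num_bytes):
    """Convert bytes to GiB (2**30 bytes)."""
    return num_bytes / 2**30
```

For a hypothetical model with 32 layers, 8 KV heads, and a head dimension of 128, `kv_cache_bytes(32, 8, 128, 8192)` comes to exactly 1 GiB per sequence, which is why concurrent long-context requests are often limited by cache memory rather than by the model weights.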

Cloud AI Platforms

All three platforms provide the same core service: managed LLM inference with enterprise packaging. The differentiation is in: Model selection: Which models are available, when,…

1,123w · 6 min

Data Pipelines for AI

1. Chunking at fixed token sizes without regard for semantic boundaries. Fixed-size chunking frequently splits a clinical recommendation in the middle, creating chunks that lack…

791w · 4 min
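One alternative to the fixed-size chunking pitfall above is to split at recommendation boundaries. The sketch below is a minimal version that assumes each recommendation starts on its own line with a marker such as `1.` or `2)`. That numbering format, the function name, and the sample text in the usage note are illustrative assumptions, not the guide's own pipeline.

```python
import re

# Assumed marker: a line that begins with digits followed by "." or ")".
RECOMMENDATION_START = re.compile(r"^\d+[.)]\s+", re.MULTILINE)


def chunk_by_recommendation(text):
    """Split guideline text so each numbered recommendation stays whole."""
    starts = [m.start() for m in RECOMMENDATION_START.finditer(text)]
    if not starts:
        stripped = text.strip()
        return [stripped] if stripped else []

    chunks = []
    preamble = text[:starts[0]].strip()
    if preamble:
        # Text before the first recommendation becomes its own chunk.
        chunks.append(preamble)

    bounds = starts + [len(text)]
    for begin, end in zip(bounds, bounds[1:]):
        chunks.append(text[begin:end].strip())
    return chunks
```

With the input `"Scope: adults.\n1. Offer drug A\n   at 10 mg daily.\n2. Review at\n3 months.\n"`, the function returns three chunks, and the wrapped line `3 months.` stays with recommendation 2 because it has no `.` or `)` after the digit.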

Orchestration and Workflow Automation for AI

1. Using Airflow for long-running workflows. Airflow tasks time out and occupy slots for their entire duration. A workflow waiting for physician approval occupies an Airflow…

754w · 4 min

Caching Strategies for AI Systems

641w · 3 min

GPU Infrastructure for AI Inference

Quantization reduces the numerical precision of model weights and/or activations to reduce memory footprint and potentially improve throughput. Each quantization scheme makes…

1,019w · 5 min

Networking and AI API Gateway Design

1. Request-level rate limiting for LLM traffic. Limiting to 100 requests per minute ignores that a single request may consume 10,000 tokens while another consumes 100.…

682w · 3 min
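To make the rate-limiting pitfall above concrete, here is a minimal sketch of a limiter that charges by tokens instead of by request. It is a plain token-bucket algorithm. The class name, the per-minute budget, and the injectable clock are illustrative assumptions, not part of any gateway product.

```python
import time


class TokenBudgetLimiter:
    """Token-bucket limiter that meters LLM tokens rather than requests."""

    def __init__(self, tokens_per_minute, clock=time.monotonic):
        self.capacity = float(tokens_per_minute)
        self.refill_per_second = tokens_per_minute / 60.0
        self.available = float(tokens_per_minute)
        self.clock = clock  # injectable so the limiter can be tested
        self.last_refill = clock()

    def _refill(self):
        now = self.clock()
        elapsed = now - self.last_refill
        self.available = min(
            self.capacity, self.available + elapsed * self.refill_per_second
        )
        self.last_refill = now

    def try_acquire(self, estimated_tokens):
        """Admit the request only if its token estimate fits the budget."""
        self._refill()
        if estimated_tokens <= self.available:
            self.available -= estimated_tokens
            return True
        return False
```

With a budget of 12,000 tokens per minute, one 10,000-token request leaves room for many 100-token requests but not for a second 10,000-token request until the bucket refills. A request counter cannot express that difference.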

Part V: Enterprise Integration

Enterprise Integration Patterns for AI

1. Using synchronous calls for long-running AI operations. A 30-second AI response on a synchronous call holds an HTTP connection and a thread for the entire duration. Under load,…

279w · 1 min

API Design for AI Services

1. Exposing model names in the API. model: "gpt-4o" in the request schema forces every consumer to know and specify the model. When the platform upgrades the model, all consumers…

192w · 1 min
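One way to avoid the model-name pitfall above is to have consumers request a capability and let the platform map it to a model. The sketch below assumes a simple in-process routing table. The capability names, model identifiers, and function names are all hypothetical.

```python
# Hypothetical routing table owned by the platform team, not by consumers.
MODEL_ROUTES = {
    "clinical-summary": "model-a-v1",
    "triage-chat": "model-b-v1",
}


def resolve_model(capability, routes=MODEL_ROUTES):
    """Map a consumer-facing capability to the model currently serving it."""
    try:
        return routes[capability]
    except KeyError:
        raise ValueError(f"unknown capability: {capability!r}") from None


def build_upstream_request(consumer_request, routes=MODEL_ROUTES):
    """Translate a consumer request into the provider-facing request."""
    if "model" in consumer_request:
        raise ValueError("consumers select a capability, not a model")
    return {
        "model": resolve_model(consumer_request["capability"], routes),
        "input": consumer_request["input"],
    }
```

Upgrading the model then means changing one entry in the routing table, and consumers that send `{"capability": "clinical-summary", ...}` need no code change.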

Event-Driven AI

1. Auto-commit of Kafka offsets. With `enable_auto_commit=True`, Kafka commits the offset as soon as the message is polled, before processing. If the AI worker crashes during…

201w · 1 min
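The difference between committing before and after processing can be shown without a broker. The sketch below uses an in-memory stand-in for a single partition; it is not the Kafka client API, and every class and function name in it is hypothetical. It simulates a worker that crashes once on message `"b"` and then restarts.

```python
class InMemoryPartition:
    """Toy stand-in for one partition with a committed offset."""

    def __init__(self, messages):
        self.messages = list(messages)
        self.committed = 0  # offset of the next message to deliver

    def poll_from_committed(self):
        return list(enumerate(self.messages))[self.committed:]

    def commit(self, offset):
        self.committed = offset + 1


def run_worker(partition, handle, commit_before_processing):
    for offset, message in partition.poll_from_committed():
        if commit_before_processing:
            partition.commit(offset)  # offset is marked done too early
        handle(message)
        if not commit_before_processing:
            partition.commit(offset)  # marked done only after success


def simulate(commit_before_processing):
    """Run a worker that crashes once on 'b', then restart it."""
    partition = InMemoryPartition(["a", "b", "c"])
    processed = []
    crashed = {"done": False}

    def handle(message):
        if message == "b" and not crashed["done"]:
            crashed["done"] = True
            raise RuntimeError("simulated worker crash")
        processed.append(message)

    for _attempt in range(2):  # first run, then one restart
        try:
            run_worker(partition, handle, commit_before_processing)
        except RuntimeError:
            continue
    return processed
```

In this simulation `simulate(True)` returns `["a", "c"]`, so message `"b"` is silently lost, while `simulate(False)` returns `["a", "b", "c"]`. Committing after processing gives at-least-once delivery, so handlers should tolerate an occasional duplicate.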

EHR Integration Patterns

1. FHIR requests in series during CDS Hook. Making FHIR API calls sequentially within a CDS Hook (Patient → Conditions → Medications → Labs → Allergies) consumes 1–2 seconds per…

208w · 1 min
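The serial-versus-parallel point above can be sketched with `asyncio.gather`. The example uses a fake client that sleeps instead of calling a real FHIR server; `FakeFhirClient`, its `read` method, and the latency value are assumptions made for illustration only.

```python
import asyncio

# Labels taken from the sequence named in the pitfall above.
RESOURCES = ["Patient", "Conditions", "Medications", "Labs", "Allergies"]


class FakeFhirClient:
    """Hypothetical client that simulates network latency with a sleep."""

    def __init__(self, latency_seconds=0.01):
        self.latency_seconds = latency_seconds
        self.in_flight = 0
        self.peak_in_flight = 0

    async def read(self, resource, patient_id):
        self.in_flight += 1
        self.peak_in_flight = max(self.peak_in_flight, self.in_flight)
        try:
            await asyncio.sleep(self.latency_seconds)
            return {"resource": resource, "patient": patient_id}
        finally:
            self.in_flight -= 1


async def fetch_context_serial(client, patient_id, resources):
    """One request at a time: total latency is the sum of all reads."""
    return {r: await client.read(r, patient_id) for r in resources}


async def fetch_context_parallel(client, patient_id, resources):
    """All requests at once: total latency is roughly the slowest read."""
    results = await asyncio.gather(
        *(client.read(r, patient_id) for r in resources)
    )
    return dict(zip(resources, results))
```

Calling `asyncio.run(fetch_context_parallel(FakeFhirClient(), "p1", RESOURCES))` issues all five reads concurrently. A production version would likely also pass a timeout and decide how to handle one failed read, for example with `return_exceptions=True`.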

Data Warehouse Integration for AI

1. Loading the entire dataset into a Pandas DataFrame. A population health dataset of 500,000 patients does not fit in a typical application server's memory as a DataFrame. Use…

217w · 1 min

Identity and Access for AI Systems

1. Hardcoding API keys in application configuration. API keys in config files, environment variables, or code are committed to version control, appear in container images, and…

356w · 2 min

Middleware and Enterprise Service Bus for AI

1. Bypassing the ESB to call AI services directly. Development teams often bypass the ESB to avoid "integration overhead." This creates ungoverned AI data flows that bypass…

244w · 1 min

Webhook and Callback Patterns for AI

1. Not verifying webhook signatures. A webhook receiver that does not verify the HMAC signature will process fabricated payloads from any source that knows the endpoint URL.…

266w · 1 min
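Signature verification for the webhook pitfall above can be sketched with the standard library. The example assumes the sender computes HMAC-SHA256 over the raw request body and sends it as a hex string. The hash algorithm, the encoding, and the function names are assumptions, since providers differ on all three.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Compute the hex HMAC-SHA256 signature a sender would attach."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_signature(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Check a received signature against the raw body in constant time."""
    expected = sign_payload(secret, body)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(
        expected.encode("utf-8"), received_signature.encode("utf-8")
    )
```

The receiver should verify against the exact bytes it received, before any JSON parsing or re-serialization, because even a whitespace change in the body produces a different signature.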

Part VI: Security & Compliance

AI Security Fundamentals

1. Treating LLM security as identical to traditional injection defense. SQL injection defenses (parameterized queries) do not translate to prompt injection. LLMs process natural…

213w · 1 min

Prompt Injection Defense

1. Relying on a single defense layer. No single prompt injection defense is complete. A defense stack that relies solely on input pattern matching will be bypassed by novel…

202w · 1 min

HIPAA Compliance for AI Systems

1. Deploying clinical AI without confirming BAA coverage. Organizations deploy clinical AI features that include PHI in LLM prompts without confirming that the LLM provider has…

234w · 1 min

Data Privacy Architecture for AI

1. Assuming Safe Harbor de-identification is sufficient for AI training. Safe Harbor removes explicit identifiers but does not prevent memorization of clinical content or…

210w · 1 min

Zero Trust Architecture for AI Systems

1. Implementing network perimeter security and calling it Zero Trust. Placing AI services behind a VPN or private subnet does not implement Zero Trust. Zero Trust requires…

245w · 1 min

Audit and Logging for AI Systems

1. Logging request and response content for PHI-handling AI features. Even "debug" logs that include prompt content contain PHI for clinical AI features. Every log store that…

244w · 1 min

Model Security

- Models fine-tuned on clinical data can memorize and reproduce training data — conduct memorization audits before deployment - Membership inference allows adversaries to…

102w · 1 min

Regulatory Compliance for Enterprise AI

1. Assuming SOC 2 covers HIPAA. SOC 2 is a security framework; HIPAA is a privacy and security law. An organization can be SOC 2 Type II certified while being out of HIPAA…

264w · 1 min

Part VII: Healthcare AI

Healthcare AI Landscape

Healthcare AI can be organized along two axes: Axis 1 — Clinical Function: What does the AI do in the clinical workflow? Diagnostic functions (helping identify disease states),…

1,570w · 8 min

HIPAA and AI

Protected Health Information is individually identifiable health information held or transmitted by a Covered Entity or Business Associate. The definition has three components: 1.…

1,430w · 7 min

EHR Integration

HL7 v2 vs. FHIR R4: Different Purposes. HL7 v2 and FHIR R4 are not competing standards; they address different integration scenarios: - HL7 v2 ADT feeds: Real-time event…

1,357w · 7 min

Clinical RAG

Clinical RAG differs from general-domain RAG in three important ways: Terminology density: Medical text uses precise, domain-specific vocabulary where term choice is clinically…

1,088w · 5 min

Clinical Decision Support

CDS Intervention Types. CDS is not a single technology; it encompasses a spectrum of intervention types that vary in their timing, intrusiveness, and required clinical action: |…

1,241w · 6 min

HMS Reference Architecture

The HMS AI platform is organized around three principles that govern every architectural decision: Clinical workflow primacy: AI must serve the clinical workflow, not require the…

1,464w · 7 min

Medical Imaging AI

DICOM (Digital Imaging and Communications in Medicine). DICOM is the international standard for medical imaging data and communications. It defines: - Image storage format: A DICOM…

1,088w · 5 min

Patient Engagement AI

Patient engagement AI use cases span three clinical phases: Pre-visit: Appointment reminders, pre-visit intake (medical history, current medications, chief complaint),…

1,048w · 5 min

Clinical Documentation AI

Ambient Documentation. Ambient documentation captures the patient-physician encounter in real time, typically through a microphone in the clinical examination room or worn by the…

1,185w · 6 min

AI Safety in Clinical Settings

The Four Clinical AI Safety Dimensions. Clinical Harm: The AI produces an output (a recommendation, a diagnosis, a risk score, a drug interaction assessment) that is clinically…

1,180w · 6 min

Part VIII: Forward Deployed Engineering

The Forward Deployed Engineer: Role and Responsibilities

The FDE role is best understood by contrast with adjacent roles that it superficially resembles: Sales Engineer (SE): Demonstrates product capabilities to prospects. The SE's…

2,247w · 11 min

Client Discovery Framework

Discovery has three layers, each building on the previous: Layer 1 — Technical environment: What systems exist, how they connect, what data is available, what integration…

2,099w · 10 min

AI Readiness Assessment

Readiness has three independent dimensions. An organization can be highly mature on one dimension and poorly prepared on another: Dimension 1 — Data Maturity: Can the AI system…

2,528w · 13 min

Demo Engineering

A demo has three layers that must be engineered independently: Layer 1 — The Environment: Where does the demo run? Is it isolated from production? Is it reproducible? What happens…

1,924w · 10 min

POC to Production

A well-designed POC has three properties that are often in tension: Feasible: Can be executed with the available time, data, and people. A 6-week POC scope for a 4-week engagement…

1,892w · 9 min

Architecture Review Facilitation

An architecture review has two phases that must be kept separate: Phase 1 — Current State Elicitation: The FDE maps the client's actual current-state architecture. This requires…

2,265w · 11 min

Value Engineering

Value engineering for AI deployments follows a standard structure: The complexity lies in benefit quantification. AI benefits typically span four categories: 1. Time savings:…

1,949w · 10 min

Client Communication

Effective client communication follows three principles: Bottom Line Up Front (BLUF): The most important information goes in the first sentence — not after the context. Executives…

1,045w · 5 min

Common Objections

Every objection response follows the same four-step structure: 1. Addressing the surface statement instead of the root. "Our data is too sensitive" can mean PHI-in-cloud…

913w · 5 min

Healthcare Client Playbook

Healthcare AI engagements follow the same eight-phase lifecycle as general FDE engagements (Discovery → Assessment → POC Design → POC Execution → Architecture Review → Production…

3,653w · 18 min

Part IX: Interview Preparation

AI Architect Interview Guide

5w · 1 min

AI System Design Problems

5w · 1 min

Architecture Questions — Senior and Principal Level

8w · 1 min

ML Fundamentals for AI Architects

6w · 1 min

Behavioral Interview Questions

4w · 1 min

Quick Reference

AI Foundations — Quick Reference

Last Updated: 2026-06-30 · Full Chapters: docs/01-AI-Foundations/

1. "Why do LLMs hallucinate and how do you mitigate it in a clinical system?" → RAG…

125w · 1 min

Agentic AI — Quick Reference

| Dimension | Raw SDK (Anthropic) | LangGraph | CrewAI |
|-----------|---------------------|-----------|--------|
| Control | Maximum | High | Medium |
| Configuration effort |…

311w · 2 min

Enterprise AI Operations — Quick Reference

Last Updated: 2026-06-30 · Full Chapters: docs/03-Enterprise-AI/

16w · 1 min

AI Infrastructure — Quick Reference

Q: What chunking strategy would you use for clinical guidelines and why? Section-boundary chunking — clinical guidelines are structured with numbered recommendations that are the…

213w · 1 min

Enterprise Integration — Quick Reference

Q: A CDS Hook response is timing out. What are the likely causes? (1) FHIR reads in series instead of parallel — fix with asyncio.gather. (2) AI inference too slow — add 4.5s…

195w · 1 min

AI Security — Quick Reference

Q: What makes prompt injection harder to prevent than SQL injection? SQL injection is prevented by parameterized queries because SQL has a clear separation between code (query…

223w · 1 min

Healthcare AI — Quick Reference

Last Updated: 2026-06-30 · Full Chapters: docs/07-Healthcare-AI/

15w · 1 min

Forward Deployed Engineering — Quick Reference

Last Updated: 2026-06-30 · Full Chapters: docs/08-Forward-Deployed-Engineering/

16w · 1 min

Interview Preparation — Quick Reference

Asking strong questions signals principal-level thinking: About the AI systems: - "What is the scale of your current AI workloads? Tokens per day? Concurrent users?" - "What is…

318w · 2 min