Demo Engineering

Conceptual Explanation

A demo has three layers, and each one needs its own deliberate engineering:

Layer 1 — The Environment: Where does the demo run? Is it isolated from production? Is it reproducible? What happens if the primary environment fails?

Layer 2 — The Data: What data does the demo process? Synthetic? De-identified client data? Sandbox data from the client's EHR? The data choice determines how convincing the demo is and what regulatory constraints apply.

Layer 3 — The Narrative: What story does the demo tell? Demos that start with features fail. Demos that start with the client's specific problem and show how the product solves it succeed.

All three layers must be designed together. A brilliant narrative on unreliable infrastructure produces a failed demo. Reliable infrastructure with generic data produces a forgettable demo.

Core Architecture: Demo Infrastructure

Environment Design Principles

Isolation: The demo environment must be completely isolated from live client data, production systems, and any shared infrastructure that could introduce failures. A demo that depends on a production LLM endpoint inherits that endpoint's latency spikes, and those tend to surface at exactly the moment the audience's attention is highest.

Reproducibility: Every component of the demo must be reproducible on demand. Scripts, not clicks. Infrastructure as code, not manual setup. A demo that requires 45 minutes of manual setup before each session is not demo-ready.

Fallback tiers: Every demo must have at least two fallbacks behind the primary environment, giving three tiers in total:

  • Tier 1: Live demo against the primary environment
  • Tier 2: Live demo against a local backup environment (no internet dependency)
  • Tier 3: Recorded walkthrough (never the first choice, but always available)
```python
# Demo environment configuration pattern
# Deployment-specific values come from environment variables, never hardcoded in demo scripts

import os
from dataclasses import dataclass
from enum import Enum


class DemoEnvironment(Enum):
    PRIMARY = "primary"     # Cloud environment with optimal performance
    LOCAL_BACKUP = "local"  # Local container fallback, no internet dependency
    RECORDED = "recorded"   # Pre-recorded walkthrough, last resort


def _require_env(name: str) -> str:
    """Fail at startup, not mid-demo, when a required variable is missing."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Required environment variable {name} is not set")
    return value


@dataclass
class DemoConfig:
    environment: DemoEnvironment
    llm_endpoint: str        # Primary: cloud API; Local: Ollama or LM Studio
    fhir_endpoint: str       # Synthetic FHIR server, never production
    demo_data_path: str      # Pre-loaded synthetic patient encounters
    model_id: str            # Specific model version: pinned, not "latest"
    max_response_timeout: float  # Seconds; if the LLM exceeds this, show fallback

    @classmethod
    def from_environment(cls) -> "DemoConfig":
        env = DemoEnvironment(os.getenv("DEMO_ENV", "primary"))

        if env == DemoEnvironment.PRIMARY:
            return cls(
                environment=env,
                llm_endpoint=_require_env("LLM_API_ENDPOINT"),
                fhir_endpoint=_require_env("FHIR_DEMO_ENDPOINT"),
                demo_data_path=os.getenv("DEMO_DATA_PATH", "./demo-data"),
                model_id=_require_env("DEMO_MODEL_ID"),  # Pin exact version
                max_response_timeout=8.0,
            )
        if env == DemoEnvironment.LOCAL_BACKUP:
            return cls(
                environment=env,
                llm_endpoint="http://localhost:11434",  # Ollama
                fhir_endpoint="http://localhost:8080",  # Local HAPI FHIR
                demo_data_path="./demo-data",
                model_id="llama3",  # Local model; use an explicit tag to keep it pinned
                max_response_timeout=30.0,  # Local inference is slower
            )
        # RECORDED has no live endpoints to configure, so fail loudly
        # instead of falling through and returning None.
        raise ValueError("DEMO_ENV=recorded has no live configuration; play the recorded walkthrough")
```
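The tier ordering described above can also be made explicit in code. The following is a minimal sketch, not a prescribed implementation: `select_tier` and the shape of the `health_checks` mapping are hypothetical names introduced here for illustration, and the actual probes (for example, a quick request to the LLM and FHIR endpoints) are left to the caller.

```python
from typing import Callable, Sequence

# Tier names mirror the three tiers listed above; the last one is the recording.
TIERS = ("primary", "local", "recorded")


def select_tier(
    health_checks: dict[str, Callable[[], bool]],
    tiers: Sequence[str] = TIERS,
) -> str:
    """Return the first tier whose health check passes.

    The final tier is treated as the unconditional last resort, so it is
    returned when every earlier tier is unhealthy or has no check registered.
    """
    for tier in tiers[:-1]:
        check = health_checks.get(tier)
        if check is None:
            continue  # No probe registered: do not trust this tier.
        try:
            if check():
                return tier
        except Exception:
            # A probe that crashes counts as an unhealthy tier.
            continue
    return tiers[-1]
```

The returned string matches the `DemoEnvironment` values, so one possible use is to set `DEMO_ENV` from it before building the configuration. Running the selection a few minutes before the session, rather than during it, keeps the tier decision out of the audience's view.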

Pinned versions: Every demo dependency (LLM model version, API version, library version) must be pinned. A "latest" tag in a demo is effectively undefined behavior: a model update that changes output style overnight can render a carefully scripted demo incoherent.
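One way to enforce pinning is a small check that runs before every rehearsal. The sketch below is illustrative only: `find_unpinned` is a hypothetical helper, and the patterns it treats as "floating" (an empty value, the word "latest", wildcards, and range operators) are assumptions that would need adjusting to the version syntax your tooling actually uses.

```python
import re

# Assumed markers of a floating version reference; extend for your own tooling.
_FLOATING = re.compile(r"(^$|latest|\*|^[<>~^]|>=|<=)", re.IGNORECASE)


def find_unpinned(dependencies: dict[str, str]) -> list[str]:
    """Return the names of dependencies whose version is missing or floating."""
    return sorted(
        name
        for name, version in dependencies.items()
        if _FLOATING.search(version.strip())
    )


# Hypothetical manifest: names and versions are placeholders, not recommendations.
manifest = {
    "llm_model": "latest",
    "fhir_api": "R4",
    "http_client": "2.1.0",
    "gateway": ">=1.0",
}
unpinned = find_unpinned(manifest)  # ["gateway", "llm_model"]
```

Failing the rehearsal when this list is non-empty turns the pinning rule into something that is checked rather than remembered.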

Data Strategy

The data strategy for a demo determines its credibility and its regulatory risk:

| Data Type | Credibility | Regulatory Risk | Best For |
| --- | --- | --- | --- |
| Synthetic (generated) | Low (obviously fake) | None | General product demos, early evaluation |
| De-identified client data | High (client recognizes their patterns) | Low (if properly de-identified) | Client-specific demos post-assessment |
| Sandbox EHR data | High (authentic data structure) | Low (sandbox = not real PHI) | Technical integration demos |
| Production data | Maximum | High (HIPAA risk; avoid) | Never in demo contexts |

For healthcare demos, the rule is absolute: no real PHI in demo environments. Synthetic patient data generators (Synthea, custom scripts) can produce realistic clinical encounters that demonstrate the AI's capability without creating HIPAA risk.

```python
# Synthetic patient encounter builder for HMS demo
# Educational example: not intended for clinical use

import random
from datetime import date, timedelta
from typing import Optional


def build_synthetic_discharge_encounter(
    primary_diagnosis: str = "Community-acquired pneumonia",
    age_range: tuple[int, int] = (65, 80),
    los_days_range: tuple[int, int] = (3, 7),
    comorbidities: Optional[list[str]] = None,
    primary_code: str = "J18.9",  # Keep in sync with primary_diagnosis
) -> dict:
    """
    Build a synthetic inpatient encounter for demo purposes.
    All values are synthetic; no real patient data.
    """
    if comorbidities is None:
        comorbidities = ["Type 2 diabetes mellitus", "Hypertension", "Chronic kidney disease, Stage 3"]

    # Discharge is today; admission is a random length of stay before it.
    admit_date = date.today() - timedelta(days=random.randint(*los_days_range))
    discharge_date = date.today()

    return {
        "patient": {
            "id": "DEMO-PT-001",  # Clearly synthetic ID
            "name": "Demo Patient",  # Generic name, never a real name
            "age": random.randint(*age_range),
            "gender": "Male",
            "mrn": "DEMO-00001"  # Clearly demo MRN
        },
        "encounter": {
            "id": "DEMO-ENC-001",
            "type": "Inpatient",
            "admit_date": admit_date.isoformat(),
            "discharge_date": discharge_date.isoformat(),
            "los_days": (discharge_date - admit_date).days,
            "attending_physician": "Demo Attending, MD"
        },
        "diagnoses": [
            {"code": primary_code, "description": primary_diagnosis, "type": "Primary"},
            # Secondary diagnoses carry a placeholder code, not a real ICD-10 code.
            *[{"code": "DEMO", "description": c, "type": "Secondary"} for c in comorbidities]
        ],
        "medications": [
            {"name": "Azithromycin", "dose": "500mg", "route": "PO", "frequency": "Daily"},
            {"name": "Ceftriaxone", "dose": "1g", "route": "IV", "frequency": "Q24H"},
            {"name": "Metformin", "dose": "1000mg", "route": "PO", "frequency": "BID"},
            {"name": "Lisinopril", "dose": "10mg", "route": "PO", "frequency": "Daily"},
        ],
        "vitals_summary": {
            "admit": {"temp": 38.8, "hr": 102, "sbp": 128, "dbp": 78, "o2_sat": 91, "rr": 22},
            "current": {"temp": 37.1, "hr": 82, "sbp": 124, "dbp": 76, "o2_sat": 97, "rr": 16}
        },
        "labs_summary": {
            "wbc_admit": 14.2,
            "wbc_current": 9.1,
            "creatinine_admit": 1.6,
            "creatinine_current": 1.4,
            "cxr_finding": "Right lower lobe infiltrate, improving compared to admission"
        },
        "disclaimer": "SYNTHETIC DEMO DATA - NOT REAL PATIENT INFORMATION"
    }
```
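Because the no-PHI rule is absolute, it can help to refuse to load any record that does not carry the synthetic markers used above. The following sketch assumes the conventions from the builder (a `DEMO-` prefix on IDs and a disclaimer containing the word "SYNTHETIC"); `assert_synthetic` is a hypothetical helper. It checks those markers only and is not a de-identification or PHI-detection tool.

```python
def assert_synthetic(encounter: dict, id_prefix: str = "DEMO-") -> None:
    """Raise ValueError unless the record carries the expected synthetic markers."""
    problems = []

    patient = encounter.get("patient", {})
    for field in ("id", "mrn"):
        if not str(patient.get(field, "")).startswith(id_prefix):
            problems.append(f"patient.{field} lacks the {id_prefix!r} prefix")

    encounter_id = str(encounter.get("encounter", {}).get("id", ""))
    if not encounter_id.startswith(id_prefix):
        problems.append(f"encounter.id lacks the {id_prefix!r} prefix")

    if "SYNTHETIC" not in str(encounter.get("disclaimer", "")).upper():
        problems.append("missing synthetic-data disclaimer")

    if problems:
        raise ValueError("; ".join(problems))


# A record shaped like the builder's output passes silently.
assert_synthetic({
    "patient": {"id": "DEMO-PT-001", "mrn": "DEMO-00001"},
    "encounter": {"id": "DEMO-ENC-001"},
    "disclaimer": "SYNTHETIC DEMO DATA",
})
```

Calling this at load time for every file under the demo data path means a mislabeled or accidentally copied record stops the demo build instead of appearing on screen.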

Demo Script Engineering

A demo script is not a spoken script — it is a structured guide that defines:

  • The problem statement opening
  • The transition from problem to product
  • The specific actions performed in the product (keystrokes, clicks, inputs)
  • The expected outputs and how to narrate them
  • The audience questions anticipated at each stage
  • The fallback response if the expected output does not appear
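The elements listed above map naturally onto a small data structure, which makes gaps in a script easy to detect before rehearsal. This is a sketch under assumptions: `DemoStep`, `DemoScript`, and `missing_fallbacks` are hypothetical names chosen for illustration, and the sample content is placeholder text.

```python
from dataclasses import dataclass, field


@dataclass
class DemoStep:
    title: str
    action: str                 # Exact keystrokes, clicks, or inputs
    expected_output: str        # What should appear on screen
    narration: str              # How to talk through the output
    anticipated_questions: list[str] = field(default_factory=list)
    fallback: str = ""          # What to do or say if the output does not appear


@dataclass
class DemoScript:
    problem_statement: str      # The opening
    transition: str             # Bridge from problem to product
    steps: list[DemoStep]

    def missing_fallbacks(self) -> list[str]:
        """Return titles of steps that have no fallback response defined."""
        return [step.title for step in self.steps if not step.fallback.strip()]


script = DemoScript(
    problem_statement="Placeholder: the client's discharge documentation burden",
    transition="Placeholder: how the product addresses that burden",
    steps=[
        DemoStep(
            title="Open encounter",
            action="Select the synthetic patient",
            expected_output="Encounter summary loads",
            narration="Placeholder narration",
            fallback="Switch to the cached encounter view",
        ),
        DemoStep(
            title="Generate summary",
            action="Click Generate",
            expected_output="Draft discharge summary appears",
            narration="Placeholder narration",
        ),
    ],
)
gaps = script.missing_fallbacks()  # ["Generate summary"]
```

Keeping the script as data rather than free text also lets the same file drive a rehearsal checklist.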

Demo script structure for HMS discharge summary demo:

```markdown
# Demo Script: Discharge Summary AI
# Audience: CMIO + Physician Champion + IT Director
# Duration: 20 minutes + Q&A
# Environment: Demo environment, synthetic patient data
```

Architecture Diagram

```mermaid
graph TD
    subgraph "Demo Environment (Isolated)"
        FHIR["Synthetic FHIR Server\nHAPI FHIR or local"]
        DATA["Demo Data Store\nSynthetic patients (Synthea)"]
        APP["Demo Application\nSMART on FHIR app"]
        GW["Demo AI Gateway\nLiteLLM (demo config)"]
        LLM["LLM API\nPinned model version"]
    end

    subgraph "Fallback Tiers"
        FB1["Tier 1: Cloud Primary\nFull capabilities"]
        FB2["Tier 2: Local Backup\nOllama + local HAPI FHIR"]
        FB3["Tier 3: Recording\nPre-captured walkthrough"]
    end

    subgraph "Audience"
        CMIO["CMIO\nClinical value"]
        PHYS["Physician Champion\nWorkflow realism"]
        IT["IT Director\nIntegration architecture"]
    end

    DATA --> FHIR
    FHIR --> APP
    APP --> GW
    GW --> LLM
    LLM --> APP

    APP --> CMIO & PHYS & IT

    FB1 -.->|"Network failure"| FB2
    FB2 -.->|"Local failure"| FB3
```

Common Mistakes

1. Using "latest" as the model version. A model update the night before the demo can change output style, introduce unexpected formatting, or produce unexpected content. Pin the exact model version used during rehearsal.

2. Depending on live internet during the demo. Hotel WiFi, conference center networks, and even enterprise office networks are unreliable during demos. The local backup environment must be ready and pre-tested.

3. Starting with the product. FDEs who open the demo application as their first action lose the audience members who first needed to understand the problem. The problem statement is the opening.

4. Not running rehearsal on the actual demo network. Rehearsals on the FDE's home network do not reveal corporate proxy issues, DNS resolution failures, or port blocks that appear in the client's environment.

5. Using obviously fake data that breaks clinical realism. "Patient Name: Test User, Diagnosis: Test Diagnosis" destroys the demo's credibility. Synthetic data must look realistic.

6. Not preparing for the "Can you show me X?" question. Having only one pre-built scenario means that any question outside the scripted path produces an "I'll show you that later" deflection — which signals lack of depth.
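The network issues named in mistake 4 (DNS resolution failures and port blocks) can be probed with a short preflight run from the presenting laptop on the client's network. The sketch below uses only the standard library; `check_endpoint` is a hypothetical helper, and the hosts and ports to probe are whatever your demo environment actually depends on. A successful TCP connection does not prove that an intercepting proxy will pass the HTTPS traffic, so treat this as a first filter, not a full rehearsal.

```python
import socket


def check_endpoint(host: str, port: int, timeout: float = 3.0) -> str:
    """Return "ok", "dns_failure", or "connect_failure" for host:port."""
    try:
        # Resolve first so DNS problems are reported separately from port blocks.
        candidates = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failure"

    for family, socktype, proto, _canonname, sockaddr in candidates:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return "ok"
        except OSError:
            # Refused, timed out, or unreachable: try the next resolved address.
            continue
    return "connect_failure"
```

Running this against every endpoint in the demo configuration, and recording the results alongside the rehearsal notes, shows which tier is realistically available on that network.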

Best Practices

  • Pin every version in the demo environment — model, library, API
  • Always have a local backup environment ready and pre-tested on the day
  • Open with the client's specific problem, not the product
  • Synthetic patient data must be clinically coherent — have a clinician review it
  • Prepare 3–5 pre-built scenarios for different clinical contexts
  • Run a full rehearsal on the client's network (or representative network) the day before
  • Have pre-generated fallback outputs for every generation step
  • Never show real PHI in a demo — not even de-identified data that might be traceable
  • End every demo with a specific next step — not "let us know if you have questions"
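The practice of keeping a pre-generated fallback for every generation step can be sketched as a wrapper that enforces the response timeout. This is one possible approach, not the only one: `generate_with_fallback` is a hypothetical helper, `slow_generate` is a stand-in that simulates a stalled LLM call, and a thread-based timeout like this stops waiting for the call without cancelling it.

```python
import concurrent.futures
import time
from typing import Callable


def generate_with_fallback(
    generate: Callable[[], str],
    fallback_text: str,
    timeout_s: float,
) -> tuple[str, bool]:
    """Return (text, used_fallback) for one generation step."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(generate)
    try:
        return future.result(timeout=timeout_s), False
    except Exception:
        # Covers both a timeout and an error raised by the generation call.
        return fallback_text, True
    finally:
        # Do not make the presenter wait for a slow call to finish.
        executor.shutdown(wait=False)


def slow_generate() -> str:
    """Stand-in for an LLM call that stalls (simulated with a short sleep)."""
    time.sleep(0.3)
    return "late response"


text, used_fallback = generate_with_fallback(
    slow_generate, "pre-generated summary", timeout_s=0.05
)
```

In a real demo the timeout would come from the environment configuration (8.0 seconds for the primary tier in the configuration shown earlier), and the `used_fallback` flag can cue the scripted narration for fallback activation.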

Alternatives

| Demo Approach | When to Use | Trade-off |
| --- | --- | --- |
| Live interactive demo (primary) | When audience includes technical stakeholders; when client-specific data is available | Highest credibility; failure risk |
| Recorded walkthrough | When live demo is too risky (key executive meeting, spotty network) | No failure risk; lowest credibility |
| Client-data sandbox demo | When assessment is complete and sandbox access is available | Maximum relevance; setup complexity |
| Collaborative build session | When technical audience wants to see the engineering | Deepest credibility; time-intensive |

Trade-offs

Realism vs. risk: The more realistic the demo (real client data structures, authentic clinical scenarios), the higher the audience engagement — but also the higher the setup complexity and the failure risk. FDEs must calibrate the realism level to the stakes of the meeting.

Depth vs. breadth: A 20-minute demo that goes deep on one use case is more credible than a 20-minute demo that skims five features. Healthcare FDEs should default to depth on the specific use case that matches the client's primary pain point.

Interview Questions

Q: How do you design a demo environment to be reliable for live client presentations?

Category: System Design | Difficulty: Senior | Role: FDE

Answer Framework:

Demo reliability is an engineering problem, not a hope. The design principles are isolation, reproducibility, and fallback tiers.

Isolation means the demo environment has no dependencies on production systems, shared infrastructure, or the client's network that the FDE does not control. All FHIR data is pre-fetched and cached. The LLM model version is pinned. All API credentials are demo-specific.

Reproducibility means everything that runs in the demo was created by a script, not manually. The environment can be torn down and rebuilt in under 30 minutes if something goes wrong.

Fallback tiers mean there are always at least two alternatives if the primary environment fails. A local container environment (Ollama for LLM inference, HAPI FHIR for data) can run the same demo without internet. Pre-recorded walkthroughs exist for catastrophic failure scenarios.

Additionally, all LLM generation steps have timeout handling that gracefully falls back to a pre-generated response. The narration for fallback activation is scripted in advance.

Key Points to Hit:

  • Isolation from production/shared infrastructure
  • Pinned versions for all dependencies
  • Local backup environment always ready
  • Pre-generated fallbacks for all generation steps
  • Rehearsal on client-representative network

Red Flags:

  • "We rely on the live API — it's usually fast enough"
  • Not having a local backup environment

Key Takeaways

  • A demo is an engineered artifact, not a screen share — treat demo reliability as a first-class constraint
  • Three layers each need deliberate engineering and must be designed together: environment, data, and narrative
  • Pin every version — model, library, API — nothing "latest"
  • Always have a local backup environment ready and pre-tested on demo day
  • Open with the client's problem, not the product
  • Synthetic data must be clinically coherent to retain healthcare audience credibility
  • All healthcare demos require a medical disclaimer — verbal and visible
  • Pre-generate fallback outputs for every LLM generation step