Architecture Review Facilitation

Conceptual Explanation

An architecture review has two phases that must be kept separate:

Phase 1 — Current State Elicitation: The FDE maps the client's actual current-state architecture. This requires listening and questioning, not presenting. The FDE's job in this phase is to build an accurate model of what exists, not to evaluate it.

Phase 2 — Gap and Risk Analysis: The FDE compares the current state against the required state for the target AI deployment. Gaps are the differences. Risks are the consequences of not closing those gaps.

Mixing these phases — critiquing the architecture while still eliciting information — causes the client to become defensive and stop sharing accurate information. The phases must be distinct.

Core Architecture: The Review Framework

Pre-Review Preparation (FDE Working Independently)

Before the review session, the FDE builds a pre-review architecture map from discovery and assessment artifacts:

python

from dataclasses import dataclass


@dataclass
class PreReviewArchitectureMap:
    """
    FDE's working model of the client's architecture before the review session.
    All items should be marked as CONFIRMED (verified) or ASSUMED (to confirm in review).
    """
    
    # EHR Layer
    ehr_system: str                    # "Epic FHIR R4, version 2023.1 — CONFIRMED"
    ehr_integration_standard: str     # "FHIR R4 + HL7 v2 ADT — CONFIRMED"
    smart_on_fhir_status: str         # "Available, App Orchard pending — CONFIRMED"
    cds_hooks_availability: str       # "Available in Epic — ASSUMED: confirm in review"
    
    # Cloud / Network
    cloud_provider: str               # "Azure — CONFIRMED"
    phi_in_cloud_policy: str          # "Approved for HIPAA BAA vendors — CONFIRMED"
    llm_outbound_access: str          # "Permitted — ASSUMED: test not yet run"
    ai_gateway_status: str            # "Not deployed — CONFIRMED"
    network_topology: str             # "Single datacenter + Azure region — ASSUMED"
    
    # Security
    tls_policy: str                   # "TLS 1.2+ required — CONFIRMED"
    audit_logging_capability: str     # "Splunk SIEM — CONFIRMED"
    baa_status: str                   # "Azure BAA signed; Anthropic BAA pending — CONFIRMED"
    security_review_process: str      # "IT Security review + Compliance sign-off — CONFIRMED"
    
    # Organizational
    ai_gateway_owner: str             # "IT Infrastructure team — CONFIRMED"
    model_governance: str             # "No formal process — ASSUMED: confirm in review"
    prompt_management: str            # "Ad hoc — ASSUMED"
    
    # Unknown / To Elicit
    open_questions: list[str]         # Items to resolve in the review session
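
One way to put the CONFIRMED / ASSUMED tags to work is to pull every ASSUMED item out of the map and carry it into the session as a question. The sketch below is a minimal illustration, assuming the tags are embedded in the string values as in the comments above. `MiniArchitectureMap` and `assumed_items` are hypothetical names, and the three-field class is only a stand-in for the full map.

```python
from dataclasses import dataclass, fields


@dataclass
class MiniArchitectureMap:
    # Hypothetical three-field stand-in for PreReviewArchitectureMap.
    cloud_provider: str
    llm_outbound_access: str
    model_governance: str


def assumed_items(arch_map) -> dict[str, str]:
    """Return the string fields whose value carries an ASSUMED tag."""
    result = {}
    for f in fields(arch_map):
        value = getattr(arch_map, f.name)
        if isinstance(value, str) and "ASSUMED" in value:
            result[f.name] = value
    return result


sample = MiniArchitectureMap(
    cloud_provider="Azure - CONFIRMED",
    llm_outbound_access="Permitted - ASSUMED: test not yet run",
    model_governance="No formal process - ASSUMED: confirm in review",
)

# Everything returned here still needs to be verified in the review session.
to_confirm = assumed_items(sample)
for name, value in to_confirm.items():
    print(f"{name}: {value}")
```

The same function works on the full map, because it only inspects string fields and skips `open_questions`.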

Review preparation checklist:

text

[ ] Current state architecture map drafted with CONFIRMED / ASSUMED tags
[ ] Discovery Summary and Assessment Report reviewed
[ ] Integration risk patterns identified (from standard pattern library)
[ ] Open questions documented (to resolve in review session)
[ ] Review agenda prepared and shared with client 48h before session
[ ] Whiteboard or diagramming tool ready for live architecture drawing
[ ] Architecture Review Report template prepared (fill during session)

Review Session Structure

Duration: 3–4 hours for a major architecture review. Take a break at least every 90 minutes.

Participants:

FDE (facilitator)
Client: IT Architect, IT Director, Clinical Informatics Engineer (for healthcare)
Optional: Cloud architect, security architect (for infrastructure-heavy reviews)
Not in the room: Executives, sales, non-technical stakeholders (they get the report)

Agenda:

text

BLOCK 1 — Current State Architecture Walk (60–90 min)

Objective: Build an accurate current-state diagram collaboratively.
Method: FDE draws on whiteboard; client corrects and adds.

Opening: "I want to start by making sure I have an accurate picture of your
 current architecture. I've built a working map from our prior conversations —
 let's validate and correct it together."

Technique: Draw the FDE's pre-review map on the whiteboard. Ask the client
to correct it. Incorrect assumptions surface more information than open-ended
questions. "I have it that your integration engine sends ADT^A01 messages 
to Epic — is that right, or does it go the other direction?"

Cover:
  - Data flows for the target use case
  - Authentication and authorization paths
  - Network topology (on-prem / cloud / peering)
  - Security architecture (where PHI flows, what controls exist)
  - Existing integration points relevant to the AI use case
  - Current monitoring and observability

BREAK — 15 minutes

BLOCK 2 — Target State and Gap Identification (60 min)

Objective: Map where the AI system needs to connect and identify the gaps.

Technique: Add the target AI architecture to the current-state diagram.
Draw the connections that need to exist for the AI system to work.
Ask: "What has to be true for this connection to work? Does that condition
currently hold?"

Cover:
  - AI system component placement (SMART app, CDS Hooks service, AI gateway, LLM)
  - Data paths from EHR to AI to LLM and back
  - Authentication path (SMART token, AI gateway virtual key, LLM API key)
  - Security controls on each path
  - Failure modes for each connection

BREAK — 15 minutes

BLOCK 3 — Risk Assessment and Prioritization (45 min)

Objective: Identify, categorize, and prioritize the architectural risks.

Technique: Walk through each gap identified in Block 2. For each gap:
  1. Is it a risk? (Could it cause failure or security incident?)
  2. What is the likelihood? (High / Medium / Low given client's environment)
  3. What is the consequence? (Service unavailability / HIPAA incident / clinical harm)
  4. What is the mitigation?

Cover:
  - PHI data path risks
  - Availability risks (LLM downtime, latency, rate limits)
  - Security risks (authentication gaps, audit log gaps)
  - Governance risks (model updates without re-evaluation)
  - Clinical safety risks (AI failure affecting clinical workflow)

BLOCK 4 — Recommendations and Next Steps (30 min)

Objective: Agree on which risks to mitigate, in what order, and who owns each.

Output: Architecture Review Report action items with owners and dates.
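
The Block 3 ratings can be turned into a sortable score so that Block 4 starts from a ranked list. The sketch below is one possible scheme, not a prescribed one: the numeric weights and the severity ordering of the three consequence types are assumptions, and `GapRisk` and `prioritize` are hypothetical names.

```python
from dataclasses import dataclass

# Assumed ordinal weights: the agenda only names High / Medium / Low.
LIKELIHOOD = {"Low": 1, "Medium": 2, "High": 3}

# Assumed severity ordering of the three consequence types listed in Block 3.
CONSEQUENCE = {
    "Service unavailability": 1,
    "HIPAA incident": 2,
    "Clinical harm": 3,
}


@dataclass
class GapRisk:
    gap: str
    likelihood: str    # a key of LIKELIHOOD
    consequence: str   # a key of CONSEQUENCE
    mitigation: str

    @property
    def score(self) -> int:
        """Likelihood weight multiplied by consequence weight (1 to 9)."""
        return LIKELIHOOD[self.likelihood] * CONSEQUENCE[self.consequence]


def prioritize(risks: list[GapRisk]) -> list[GapRisk]:
    """Highest score first; ties keep their original order (sorted is stable)."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


risks = [
    GapRisk("No AI gateway", "High", "Service unavailability", "Deploy gateway"),
    GapRisk("PHI in traces", "Medium", "HIPAA incident", "Scrub PHI at gateway"),
    GapRisk("No review gate", "Low", "Clinical harm", "Require physician approval"),
]
ranked = prioritize(risks)
for r in ranked:
    print(r.score, r.gap)
```

A score like this is only a starting point for discussion; the client's team should be able to override the ordering when they know something the weights do not capture.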

Risk Pattern Library

Experienced FDEs recognize a small set of recurring architectural risk patterns in enterprise AI systems. Building these into a pattern library allows faster and more consistent risk identification:

python

ARCHITECTURE_RISK_PATTERNS = [
    {
        "pattern": "PHI in observability traces",
        "description": "LLM inference requests containing PHI are logged verbatim in observability systems",
        "detection_question": "Where do LLM inference request and response payloads go? Are they logged?",
        "consequence": "HIPAA Security Rule violation; PHI accessible to any engineer with logging access",
        "mitigation": "AI gateway scrubs PHI from all traces; only hashed patient IDs and metadata in logs"
    },
    {
        "pattern": "No AI gateway — direct LLM API calls",
        "description": "Application code calls LLM APIs directly without a centralized gateway",
        "detection_question": "Where do LLM API keys live? Are they per-application or centralized?",
        "consequence": "No cost attribution; no audit logging; no rate limiting; no prompt versioning",
        "mitigation": "Deploy AI gateway (LiteLLM, Azure AI Foundry) before production"
    },
    {
        "pattern": "CDS Hook with blocking LLM dependency",
        "description": "CDS Hook service makes synchronous LLM call without circuit breaker",
        "detection_question": "What happens in your CDS Hook service if the LLM API is unavailable or slow?",
        "consequence": "EHR workflow blocked when LLM API times out; patient care delay",
        "mitigation": "3-second timeout with circuit breaker; return empty card array on timeout"
    },
    {
        "pattern": "No model version pinning",
        "description": "Production system calls LLM API with 'latest' model or unpinned version",
        "detection_question": "How is the model version specified in production API calls?",
        "consequence": "Unexpected behavior change when vendor updates model; production incident",
        "mitigation": "Pin exact model version; define evaluation and approval process for updates"
    },
    {
        "pattern": "AI output without physician review gate",
        "description": "AI-generated clinical content is written to EHR without physician approval",
        "detection_question": "What is the workflow from AI generation to note appearing in the EHR?",
        "consequence": "AI error enters medical record; potential patient harm; liability",
        "mitigation": "Physician review and approval required before DocumentReference write; no auto-filing"
    },
    {
        "pattern": "No prompt version management",
        "description": "Prompts are modified directly in production without version control or evaluation",
        "detection_question": "How are prompts changed in production? Who approves changes?",
        "consequence": "Undetected quality regression; no rollback capability",
        "mitigation": "Prompt Registry with versioning; evaluation before production deployment"
    },
    {
        "pattern": "FHIR access with over-broad scopes",
        "description": "SMART application requests system-level FHIR scopes instead of patient-level",
        "detection_question": "What FHIR scopes does the application request? system/* or patient/*?",
        "consequence": "HIPAA Minimum Necessary violation; access to all patients instead of current patient",
        "mitigation": "Minimum Necessary scopes: patient/{resource}.read per use case"
    },
    {
        "pattern": "No fallback when AI service is unavailable",
        "description": "Clinical workflow has no fallback path when AI system is unavailable",
        "detection_question": "What do clinicians do if the AI tool is down? Is there a manual fallback?",
        "consequence": "Clinical workflow disruption; workarounds that bypass safety controls",
        "mitigation": "Design fallback workflow before launch; never make AI availability a dependency"
    },
]
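
A library in this shape can drive the session directly. The sketch below, under stated assumptions, checks that every entry has the five keys used above and turns the detection questions into a worksheet with an empty slot for the client's answer. `SAMPLE_PATTERNS` is an abbreviated two-entry copy so the example runs on its own; `missing_keys` and `build_worksheet` are hypothetical helper names.

```python
REQUIRED_KEYS = {
    "pattern", "description", "detection_question", "consequence", "mitigation",
}

# Two entries abbreviated from the library above, so the sketch is self-contained.
SAMPLE_PATTERNS = [
    {
        "pattern": "No model version pinning",
        "description": "Production system calls LLM API with unpinned version",
        "detection_question": "How is the model version specified in production API calls?",
        "consequence": "Unexpected behavior change when vendor updates model",
        "mitigation": "Pin exact model version",
    },
    {
        "pattern": "No prompt version management",
        "description": "Prompts are modified directly in production",
        "detection_question": "How are prompts changed in production? Who approves changes?",
        "consequence": "Undetected quality regression; no rollback capability",
        "mitigation": "Prompt Registry with versioning",
    },
]


def missing_keys(patterns):
    """Map pattern name (or list index) to the required keys it lacks."""
    problems = {}
    for i, entry in enumerate(patterns):
        missing = REQUIRED_KEYS.difference(entry)
        if missing:
            problems[entry.get("pattern", f"#{i}")] = sorted(missing)
    return problems


def build_worksheet(patterns):
    """One row per pattern: the question to ask and an empty slot for the answer."""
    return [
        {"pattern": p["pattern"], "question": p["detection_question"], "finding": None}
        for p in patterns
    ]


assert not missing_keys(SAMPLE_PATTERNS)
worksheet = build_worksheet(SAMPLE_PATTERNS)
for row in worksheet:
    print(f"- {row['pattern']}: {row['question']}")
```

Filling in `finding` during Block 1 and Block 2 leaves a record of which patterns were actually asked about, which supports applying the library systematically rather than from memory.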

Architecture Diagram

mermaid

graph TD
    subgraph "Pre-Review"
        PREP["FDE Preparation\nArchitecture map\nOpen questions\nRisk patterns"]
        AGENDA["Agenda Distributed\n48h before session"]
    end
    subgraph "Review Session"
        B1["Block 1: Current State\nCollaborative mapping\n60–90 min"]
        B2["Block 2: Target + Gaps\nAdd AI system to map\n60 min"]
        B3["Block 3: Risk Assessment\nLikelihood × Consequence\n45 min"]
        B4["Block 4: Recommendations\nPrioritized actions\n30 min"]
    end
    subgraph "Architecture Review Report"
        EXEC["Executive Summary\n(for CIO/CMIO)"]
        ARCH["Architecture Diagram\n(current + target)"]
        RISKS["Risk Register\nPriority / Owner / Date"]
        ACTIONS["Action Plan\nBlocking / Important / Advisory"]
    end
    subgraph "Follow-Up"
        FU1["Architecture Review Report delivered\nWithin 48h of session"]
        FU2["Action item tracking\nWeekly check-in"]
        FU3["Risk closure confirmation\nBefore production go-live"]
    end
    PREP --> AGENDA --> B1 --> B2 --> B3 --> B4
    B4 --> EXEC & ARCH & RISKS & ACTIONS
    EXEC & ARCH & RISKS & ACTIONS --> FU1 --> FU2 --> FU3

Architecture Diagram

[Current state + target state diagram with AI system components added. Mark each connection with its security classification (PHI path / internal / external)]
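
The classification marks on the diagram can also be kept as data, which makes it easy to list PHI paths that have no confirmed control. This is a minimal sketch: `PathClass`, `Connection`, and `uncontrolled_phi_paths` are hypothetical names, and the three example connections are made up for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class PathClass(Enum):
    PHI = "PHI path"
    INTERNAL = "internal"
    EXTERNAL = "external"


@dataclass
class Connection:
    source: str
    target: str
    classification: PathClass
    # Controls the client has confirmed for this connection (empty if none yet).
    confirmed_controls: list[str] = field(default_factory=list)


def uncontrolled_phi_paths(connections):
    """PHI connections for which no control has been confirmed yet."""
    return [
        c for c in connections
        if c.classification is PathClass.PHI and not c.confirmed_controls
    ]


connections = [
    Connection("EHR", "SMART app", PathClass.PHI, ["TLS 1.2+", "SMART token"]),
    Connection("SMART app", "AI gateway", PathClass.PHI),
    Connection("AI gateway", "Metrics dashboard", PathClass.INTERNAL),
]
gaps = uncontrolled_phi_paths(connections)
for c in gaps:
    print(f"{c.source} -> {c.target}: no confirmed control")
```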

Common Mistakes

1. Skipping the current state elicitation and jumping to recommendations. FDEs who arrive at the review with a pre-determined architecture recommendation and spend the session defending it are running a sales presentation, not an architecture review. Current state must be accurately mapped first.

2. Letting executives in the review room. Executive presence changes the dynamic — technical staff become less forthcoming about problems and limitations. Architecture reviews are engineering sessions. Executives get the report.

3. Not documenting CONFIRMED vs. ASSUMED. Architecture maps that do not distinguish confirmed facts from assumptions create false confidence. Assumptions that turn out to be wrong in production are architectural failures that were foreseeable.

4. Missing the AI-specific risk patterns. Reviewers who apply only traditional software architecture risk patterns miss the AI-specific risks (no prompt version management, no circuit breaker on LLM dependency, PHI in traces). The pattern library must include AI-specific patterns.

5. Producing a report that is too long to be read. An Architecture Review Report that is 40 pages long will not be read by the IT Director. The report must be scannable — Risk Register table, Action Plan with owners and dates, Architecture Diagram. Supporting detail goes in appendices.

Best Practices

Prepare a pre-review architecture map with CONFIRMED / ASSUMED tags before the session
Draw on the whiteboard; let the client correct; never present a complete diagram and ask for agreement
Run Block 1 (current state) and Block 2 (gap analysis) as separate blocks; do not evaluate the architecture while still eliciting it
Apply the AI-specific risk pattern library systematically
Produce the Architecture Review Report within 48 hours of the session
Require blocking issues to be resolved before production go-live — not after
Schedule the next architecture review at 30 days post-launch
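
The go-live rule in the list above can be expressed as a simple gate over the action plan. The sketch below assumes action items carry one of the three priorities used in the report (Blocking / Important / Advisory). `ActionItem`, `open_blockers`, and `ready_for_go_live` are hypothetical names, and the owners and dates are illustrative only.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ActionItem:
    title: str
    priority: str      # "Blocking", "Important", or "Advisory"
    owner: str
    due: date
    resolved: bool = False


def open_blockers(items):
    """Blocking items that have not been resolved yet."""
    return [i for i in items if i.priority == "Blocking" and not i.resolved]


def ready_for_go_live(items) -> bool:
    """True only when no Blocking item remains open."""
    return not open_blockers(items)


items = [
    ActionItem("Deploy AI gateway", "Blocking", "IT Infrastructure", date(2025, 3, 1)),
    ActionItem("Pin model version", "Blocking", "FDE", date(2025, 3, 8), resolved=True),
    ActionItem("Add latency dashboard", "Advisory", "IT Infrastructure", date(2025, 4, 1)),
]

for item in open_blockers(items):
    print(f"BLOCKING, still open: {item.title} (owner: {item.owner}, due {item.due})")
print("Ready for go-live:", ready_for_go_live(items))
```

Important and Advisory items do not affect the gate in this sketch; they stay in the weekly tracking instead.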

Trade-offs

Depth vs. breadth: A review that covers only the integration path (narrow) misses organizational and governance risks. A review that covers everything (broad) loses focus. The right scope is determined by the highest-risk dimensions identified in the assessment.

Directness vs. client relationship: Surfacing a significant architectural flaw that the client's team designed can create awkwardness. The FDE's credibility is built on directness — but the delivery must be constructive. Frame risks as "here's what we need to solve together" rather than "this is wrong."

Interview Questions

Q: What AI-specific architectural risks do you look for that traditional software architecture reviews miss?

Category: Architecture | Difficulty: Principal | Role: FDE / AI Architect

Answer Framework:

Traditional software architecture reviews miss several risk patterns that are specific to AI systems:

PHI in observability: LLM requests contain PHI in the prompt payload. Traditional monitoring would log the full request. In healthcare, this creates a HIPAA incident. AI systems need gateway-level PHI scrubbing before logs are written.

Non-determinism and drift: Traditional systems have deterministic behavior. AI systems degrade gradually — model version updates, prompt changes, and data distribution shifts all cause quality drift that is invisible without monitoring. The review must ask: how will you know when the output quality has degraded?

LLM API as a single point of failure in clinical workflows: A synchronous LLM dependency in a clinical workflow (CDS Hook) will block the workflow when the API experiences latency. The circuit breaker pattern is required — but traditional architecture reviews do not ask about it.

Model version as a governance artifact: When an LLM vendor updates a model, the system behavior changes. The architecture must include a model version registry and a re-evaluation process before updating. Traditional software architecture does not have an analog for this.
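
The circuit breaker on a synchronous LLM dependency can be sketched with the standard library alone. This is an illustration under stated assumptions, not a production implementation: `LLMCircuitBreaker` is a hypothetical class, the failure threshold and cooldown are arbitrary defaults, and the card dictionaries are abbreviated stand-ins. The 3-second default and the empty card array follow the mitigation named in the risk pattern library.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def empty_response():
    """A CDS Hooks response with no cards, so the EHR workflow continues."""
    return {"cards": []}


class LLMCircuitBreaker:
    def __init__(self, timeout_s=3.0, failure_threshold=3, cooldown_s=30.0):
        self.timeout_s = timeout_s
        self.failure_threshold = failure_threshold   # assumed value
        self.cooldown_s = cooldown_s                 # assumed value
        self.failures = 0
        self.opened_at = None
        self._pool = ThreadPoolExecutor(max_workers=4)

    def _is_open(self):
        if self.opened_at is None:
            return False
        # After the cooldown, let one trial call through (half-open state).
        return time.monotonic() - self.opened_at < self.cooldown_s

    def call(self, build_cards, request):
        if self._is_open():
            return empty_response()          # skip the LLM entirely
        future = self._pool.submit(build_cards, request)
        try:
            cards = future.result(timeout=self.timeout_s)
        except Exception:                    # timeout or LLM error
            # The worker thread is not killed; the HTTP client used inside
            # build_cards should also set its own timeout.
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()
            return empty_response()
        self.failures = 0
        self.opened_at = None
        return {"cards": cards}


def slow_llm(_request):
    time.sleep(0.2)                          # stand-in for a slow LLM API
    return [{"summary": "late"}]


def fast_llm(_request):
    return [{"summary": "ok"}]


breaker = LLMCircuitBreaker(timeout_s=0.05, failure_threshold=2, cooldown_s=60.0)
first = breaker.call(slow_llm, {})    # times out
second = breaker.call(slow_llm, {})   # times out, breaker opens
third = breaker.call(fast_llm, {})    # not attempted while the breaker is open
print(first, second, third)
```

The demo uses a 50 ms timeout only to keep the run short. Returning an empty card array rather than an error is the design choice that keeps the clinician's workflow unblocked when the LLM is slow or down.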

Key Points to Hit:

PHI scrubbing at gateway level for observability
Quality drift monitoring as a production requirement
Circuit breaker on LLM dependencies in clinical workflows
Model version governance as an architectural requirement

Red Flags:

Not distinguishing AI risks from traditional software risks
Not mentioning PHI/HIPAA considerations

Key Takeaways

Architecture reviews are among the highest-leverage technical activities an FDE performs
Current state elicitation and gap/risk analysis must be run as separate phases
The pre-review architecture map should distinguish CONFIRMED from ASSUMED
Eight AI-specific risk patterns require systematic evaluation in every review
PHI data flow must be explicitly traced and every control confirmed
Architecture Review Report should be delivered within 48 hours of the session
Blocking risks must be resolved before production go-live — not tracked as open items
