AI Security Fundamentals

Executive Summary

AI systems introduce a qualitatively different threat model from traditional software: the model itself is an attack surface, adversarial inputs can produce outputs with arbitrary content, agents with tool access can be manipulated into unauthorized actions, and PHI flowing through inference pipelines creates privacy risks that existing data security controls do not address. Understanding this threat model is the prerequisite for every security decision in an enterprise AI deployment. This chapter establishes the foundational AI security threat model, maps it to the HMS scenario, and introduces the defense-in-depth architecture that subsequent chapters detail.

Learning Objectives

  • Describe the AI-specific threat categories that are absent from traditional application threat models
  • Construct a threat model for an AI system with RAG, tool calling, and PHI access
  • Map the defense-in-depth layers appropriate for each threat category
  • Apply the AI threat model to the Hospital Management System scenario

Business Problem

Traditional application security threat modeling — focused on injection, authentication bypass, broken access control, and data exposure — covers the infrastructure around an AI system but misses the system itself. An AI model can be manipulated through its inputs (prompt injection), can memorize and inadvertently reveal training data (data extraction), and can be induced into actions through its tool calling capability that no explicit authorization check would prevent. Enterprise security teams that apply only traditional threat models to AI systems leave the most significant AI-specific risks unaddressed.

Enterprise Considerations

Threat model as a living document: AI threat models must be updated when: new AI capabilities are added (new tools, new agent workflows), the underlying model is upgraded (new capabilities may introduce new threats), new data sources are added to the RAG pipeline, and when incidents occur that reveal previously unconsidered attack vectors.

Shared responsibility with AI providers: LLM providers (Anthropic, Azure OpenAI, Google) operate the inference infrastructure and are responsible for model-layer security (training data privacy, inference isolation). The enterprise is responsible for input validation, output validation, tool access control, and PHI handling. Know the boundary.

Common Mistakes

1. Treating LLM security as identical to traditional injection defense. SQL injection defenses (parameterized queries) do not translate to prompt injection. LLMs process natural language, not structured queries; input sanitization alone does not prevent prompt injection.

2. No threat model before deployment. AI capabilities are deployed without a systematic threat assessment, leaving significant risks unaddressed. Conduct a structured threat modeling session (STRIDE or an AI-adapted equivalent) before any clinical AI capability goes to production.

3. Assuming the AI provider handles all security. LLM providers handle inference-layer security. The enterprise is responsible for authentication, authorization, input validation, output validation, PHI handling, and audit logging. These are not provided by the LLM API.

Key Takeaways

  • AI systems introduce four threat categories absent from traditional models: prompt injection, data exfiltration, agent privilege escalation, and model-specific DoS
  • Indirect prompt injection (via RAG-retrieved documents) is harder to detect and prevent than direct injection
  • Defense-in-depth for AI requires controls at six layers: perimeter, AI gateway, orchestration, inference, and data
  • Threat models for AI systems must be maintained as living documents and updated when capabilities change
  • The enterprise is responsible for input validation, output validation, PHI handling, and audit logging — the LLM provider handles inference-layer security

Further Reading