Zero Trust Architecture for AI Systems
Executive Summary
Zero Trust security replaces the perimeter-based model ("trust everything inside the network") with continuous verification of every request regardless of network location. AI systems are a natural fit for Zero Trust because they span several trust boundaries at once: they call external LLM APIs, process PHI within internal networks, run on shared cloud infrastructure, and rely on service-to-service communication across multiple components. This chapter applies Zero Trust principles to the enterprise AI architecture: every AI component authenticates explicitly, every data access is authorized to the minimum necessary scope, and every action is logged and monitored.
Learning Objectives
- Apply the three Zero Trust principles (verify explicitly, use least privilege, assume breach) to AI infrastructure
- Design network segmentation for AI components that handles external LLM API calls without exposing the PHI data layer
- Implement mTLS for AI service-to-service communication
- Configure cloud-native Zero Trust controls (AWS Security Groups, Azure Private Link, GCP VPC Service Controls) for AI workloads
Enterprise Considerations
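Egress inspection for LLM API calls: A Zero Trust egress proxy that inspects outbound LLM API calls can enforce the policy that PHI is not sent to providers without a Business Associate Agreement (BAA). Because the payload is encrypted in transit, the proxy must terminate TLS (or originate the outbound call itself) to see the prompt. Pattern matching on LLM prompt payloads at the egress layer then acts as a safety net for configurations where PHI must not leave the internal network; it backs up upstream controls rather than replacing them.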
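The egress check described above can be sketched in a few lines of Python. This is a minimal illustration, not a production PHI detector: the regular expressions, the `BAA_COVERED_HOSTS` allowlist, and the host names are all hypothetical, and a real deployment would rely on a vetted detection ruleset.

```python
import re

# Illustrative patterns only (assumption): real PHI detection needs a vetted
# ruleset and usually more than regular expressions.
PHI_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "mrn": re.compile(r"\bMRN[:#\s]*\d{6,10}\b", re.IGNORECASE),
    "dob": re.compile(
        r"\b(?:DOB|date of birth)[:\s]*\d{1,2}/\d{1,2}/\d{4}\b", re.IGNORECASE
    ),
}

# Hypothetical allowlist of LLM provider hosts covered by a signed BAA.
BAA_COVERED_HOSTS = {"llm.baa-provider.example"}


def find_phi(prompt: str) -> list[str]:
    """Return the names of the patterns that match the prompt text."""
    return sorted(name for name, rx in PHI_PATTERNS.items() if rx.search(prompt))


def egress_decision(host: str, prompt: str) -> tuple[bool, list[str]]:
    """Allow the call unless PHI-like text is headed to a host without a BAA."""
    hits = find_phi(prompt)
    allowed = not hits or host in BAA_COVERED_HOSTS
    return allowed, hits
```

A proxy would call `egress_decision(host, prompt)` on each outbound request, block the request when the first element is `False`, and record the matched pattern names (never the matched text) in the audit log.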
Service mesh for AI microservices: In Kubernetes-based AI deployments, a service mesh (Istio, Linkerd) provides mTLS between all AI platform microservices without requiring each service to implement mTLS itself. A service mesh is the preferred implementation for Zero Trust service-to-service authentication at scale.
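To make concrete what the mesh takes off each service's hands, the sketch below shows the server-side TLS settings a Python service would have to manage itself without a mesh. It uses only the standard `ssl` module; the function names and the certificate file paths passed to them are hypothetical.

```python
import ssl


def require_client_certificates(ctx: ssl.SSLContext) -> ssl.SSLContext:
    """Turn a server-side TLS context into a mutual-TLS one."""
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the handshake fail unless the caller presents a
    # certificate that chains to a CA loaded into this context.
    ctx.verify_mode = ssl.CERT_REQUIRED
    return ctx


def build_mtls_server_context(
    cert_file: str, key_file: str, ca_file: str
) -> ssl.SSLContext:
    """Build a server context; the three paths are supplied by the caller."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # This service's own certificate and private key (its identity).
    ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    # The internal CA that signs peer service certificates.
    ctx.load_verify_locations(cafile=ca_file)
    return require_client_certificates(ctx)
```

Every service would also need matching client-side configuration and a way to reload certificates on rotation, which is the operational burden a mesh sidecar absorbs.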
Common Mistakes
1. Implementing network perimeter security and calling it Zero Trust. Placing AI services behind a VPN or private subnet does not implement Zero Trust. Zero Trust requires identity verification on every request regardless of network location.
2. Not applying Zero Trust principles to AI agent tool calls. The AI agent that calls EHR APIs is itself a component that must authenticate (service account), be authorized (minimum necessary tool ACL), and log every action (audit log). Agents without Zero Trust controls become the most significant lateral movement risk in clinical AI.
3. No certificate rotation for mTLS. mTLS certificates that never expire, or that depend on a manual rotation policy, tend to go unrotated in practice. Automate certificate rotation with cert-manager or Vault PKI; certificates should expire in 90 days or fewer.
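The three controls named in mistake 2 (authenticate, authorize, log) can be sketched as a single default-deny check in front of every agent tool call. This is an illustration under stated assumptions: the service account name, the tool names, and the `TOOL_ACL` structure are hypothetical, and the caller's identity is assumed to have been established upstream (for example by mTLS).

```python
import json
import logging
from dataclasses import dataclass

audit_log = logging.getLogger("agent.audit")

# Hypothetical per-agent tool ACL: service account -> tools it may call.
TOOL_ACL = {
    "svc-discharge-summary-agent": {"ehr.read_encounter", "ehr.read_medications"},
}


@dataclass(frozen=True)
class ToolCall:
    service_account: str  # identity established by upstream authentication
    tool: str
    patient_id: str


def authorize_tool_call(call: ToolCall) -> bool:
    """Default-deny check of one tool call; writes an audit record either way."""
    allowed = call.tool in TOOL_ACL.get(call.service_account, set())
    audit_log.info(
        json.dumps(
            {
                "service_account": call.service_account,
                "tool": call.tool,
                "patient_id": call.patient_id,
                "decision": "allow" if allowed else "deny",
            }
        )
    )
    return allowed
```

Because denials are logged as well as approvals, an agent probing for tools outside its ACL shows up in the audit trail even though it never leaves its network segment.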
Key Takeaways
- Zero Trust replaces network perimeter trust with identity-based, per-request authorization for every AI component
- Agentic AI workflows carry the highest Zero Trust risk: an authenticated agent with multiple tools can move laterally without triggering network-based detection
- mTLS is the Zero Trust authentication mechanism for service-to-service communication; it is more secure than API keys because a certificate is bound to a service identity and its private key is never transmitted, whereas an API key is a bearer secret that any holder can replay
- Egress inspection on LLM API calls is the safety net for the policy that PHI must not be sent to providers without a BAA
- Network segmentation should isolate the PHI data zone from the AI processing zone; the PHI data zone has no outbound (egress) path
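The segmentation takeaway can be expressed as a default-deny flow table. The sketch below is illustrative only: the zone names and the specific allowed flows are assumptions, not a recommended policy, and in practice the same rules would live in security groups or network policies rather than application code.

```python
# Hypothetical zones and flows (source zone, destination zone).
ALLOWED_FLOWS = {
    ("ai-processing", "phi-data"),      # AI services query the PHI store
    ("ai-processing", "egress-proxy"),  # outbound LLM calls go through inspection
    ("egress-proxy", "internet"),       # only the proxy reaches external providers
}


def flow_allowed(src: str, dst: str) -> bool:
    """Default deny: a flow is permitted only if it is explicitly listed."""
    return (src, dst) in ALLOWED_FLOWS


def zones_with_egress() -> set[str]:
    """Zones that are allowed to initiate at least one connection."""
    return {src for src, _ in ALLOWED_FLOWS}
```

In this table the PHI data zone only accepts connections and never initiates one, so `zones_with_egress()` does not contain it, and the only route to an external LLM provider runs through the inspecting proxy.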
Further Reading
- AI Security Fundamentals — Threat model context for Zero Trust design
- Identity and Access — Identity layer that Zero Trust builds on
- Networking and API Gateway — The AI gateway that enforces Zero Trust policies