Zero Trust Architecture for AI Systems

Executive Summary

Zero Trust security replaces the perimeter-based model ("trust everything inside the network") with continuous verification of every request regardless of network location. AI systems are a natural fit for Zero Trust because they call external LLM APIs, process PHI within internal networks, run on shared cloud infrastructure, and involve service-to-service communication across multiple components. This chapter applies Zero Trust principles to the enterprise AI architecture: every AI component authenticates explicitly, every data access is authorized to minimum necessary scope, and every action is logged and monitored.

Learning Objectives

Apply the three Zero Trust principles (verify explicitly, use least privilege, assume breach) to AI infrastructure
Design network segmentation for AI components that handles external LLM API calls without exposing the PHI data layer
Implement mTLS for AI service-to-service communication
Configure cloud-native Zero Trust controls (AWS Security Groups, Azure Private Link, GCP VPC Service Controls) for AI workloads

Enterprise Considerations

Egress inspection for LLM API calls: A Zero Trust egress proxy that inspects outbound LLM API calls can enforce the policy that PHI is not sent to providers without a BAA. Pattern matching on LLM prompt payloads at the egress layer provides a safety net for configurations where PHI must not leave the internal network.
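
The egress check described above can be sketched as follows. This is a minimal illustration, not a production PHI detector: the host allowlist `BAA_COVERED_HOSTS`, the three regular expressions, and the `inspect_egress` function are all hypothetical names introduced here, and real deployments would need much broader pattern coverage (names, addresses, free-text identifiers).

```python
import re
from dataclasses import dataclass, field
from typing import List
from urllib.parse import urlparse

# Hypothetical allowlist: LLM provider hosts covered by a signed BAA.
BAA_COVERED_HOSTS = {"llm.baa-provider.example"}

# Illustrative patterns only; they catch a few obvious identifier formats.
PHI_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "mrn": re.compile(r"\bMRN[:#]?\s*\d{6,10}\b", re.IGNORECASE),
    "dob": re.compile(r"\bDOB:?\s*\d{1,2}/\d{1,2}/\d{4}\b", re.IGNORECASE),
}


@dataclass
class EgressDecision:
    allowed: bool
    reason: str
    matches: List[str] = field(default_factory=list)


def inspect_egress(url: str, prompt: str) -> EgressDecision:
    """Decide whether an outbound LLM prompt may leave the network."""
    host = urlparse(url).hostname or ""
    matches = [name for name, pat in PHI_PATTERNS.items() if pat.search(prompt)]
    if host in BAA_COVERED_HOSTS:
        # PHI may go to a BAA-covered provider; matches are kept for auditing.
        return EgressDecision(True, "destination covered by BAA", matches)
    if matches:
        return EgressDecision(False, "possible PHI to non-BAA destination", matches)
    return EgressDecision(True, "no PHI patterns matched", matches)
```

Because pattern matching misses anything it has no pattern for, this check works as a backstop behind upstream de-identification rather than as the primary control.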

Service mesh for AI microservices: In Kubernetes-based AI deployments, a service mesh (Istio, Linkerd) provides mTLS between all AI platform microservices without requiring each service to implement mTLS itself. Service mesh is the preferred implementation for Zero Trust service-to-service authentication at scale.
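
For a service that runs outside the mesh, or to see what the mesh sidecar is taking on, the server side of mTLS can be sketched with Python's standard `ssl` module. The function name `build_mtls_server_context` and its parameters are illustrative assumptions; the file paths are optional here only so the context can be constructed without certificate files on disk. A context with no client CA loaded will reject every client.

```python
import ssl
from typing import Optional


def build_mtls_server_context(
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
    client_ca_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a TLS server context that requires a client certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # CERT_REQUIRED makes the handshake fail unless the client presents a
    # certificate that chains to a CA loaded below. This is the "mutual" part.
    ctx.verify_mode = ssl.CERT_REQUIRED
    if cert_file:
        # The server's own identity certificate and private key.
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    if client_ca_file:
        # The internal CA that issues certificates to calling services.
        ctx.load_verify_locations(cafile=client_ca_file)
    return ctx
```

Every service would need an equivalent of this plus certificate distribution and renewal, which is the per-service burden a mesh removes.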

Common Mistakes

1. Implementing network perimeter security and calling it Zero Trust. Placing AI services behind a VPN or private subnet does not implement Zero Trust. Zero Trust requires identity verification on every request regardless of network location.

2. Not applying Zero Trust principles to AI agent tool calls. The AI agent that calls EHR APIs is itself a component that must authenticate (service account), be authorized (minimum necessary tool ACL), and log every action (audit log). Agents without Zero Trust controls become the most significant lateral movement risk in clinical AI.

3. No certificate rotation for mTLS. mTLS certificates with no expiry, or with a manual rotation policy, tend to go unrotated in practice. Automate certificate rotation with cert-manager or Vault PKI; certificates should expire in 90 days or fewer.
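
The controls named in mistake 2 (authorize against a minimum-necessary tool ACL, log every action) can be sketched as a gateway that sits between the agent and its tools. This is an assumption-laden illustration: `AgentToolGateway`, `ToolCallDenied`, and the agent and tool names are hypothetical, and `agent_id` is assumed to have been authenticated already (for example, taken from the caller's service account or mTLS certificate).

```python
import json
import time
from typing import Any, Callable, Dict, List, Set


class ToolCallDenied(Exception):
    """Raised when an agent calls a tool outside its ACL."""


class AgentToolGateway:
    def __init__(self, acl: Dict[str, Set[str]], tools: Dict[str, Callable[..., Any]]):
        self._acl = acl      # agent_id -> set of tool names it may call
        self._tools = tools  # tool name -> callable
        self.audit_log: List[str] = []

    def _audit(self, agent_id: str, tool: str, allowed: bool, arg_names: List[str]) -> None:
        # Record argument names only, so PHI values stay out of the log.
        self.audit_log.append(json.dumps({
            "ts": time.time(),
            "agent": agent_id,
            "tool": tool,
            "allowed": allowed,
            "args": arg_names,
        }))

    def call(self, agent_id: str, tool: str, **kwargs: Any) -> Any:
        # Default deny: unknown agents and unlisted tools are both rejected.
        allowed = tool in self._acl.get(agent_id, set()) and tool in self._tools
        self._audit(agent_id, tool, allowed, sorted(kwargs))
        if not allowed:
            raise ToolCallDenied(f"{agent_id} is not authorized to call {tool}")
        return self._tools[tool](**kwargs)
```

A short usage example, with made-up tools standing in for EHR API calls:

```python
gateway = AgentToolGateway(
    acl={"scheduling-agent": {"read_appointments"}},
    tools={
        "read_appointments": lambda patient_id: ["09:00"],
        "write_orders": lambda patient_id: "ok",
    },
)
gateway.call("scheduling-agent", "read_appointments", patient_id="p1")  # permitted
# gateway.call("scheduling-agent", "write_orders", patient_id="p1") raises ToolCallDenied
```

Denied calls are written to the audit log before the exception is raised, so attempted lateral movement is visible even though it never reaches the tool.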

Key Takeaways

Zero Trust replaces network perimeter trust with identity-based, per-request authorization for every AI component
AI agentic workflows are the highest Zero Trust risk: an authenticated agent with multiple tools can move laterally without triggering network-based detection
mTLS is the Zero Trust authentication mechanism for service-to-service communication; it is more secure than API keys because each certificate is bound to a service identity and both sides of the connection are authenticated
Egress inspection on LLM API calls is the safety net for the policy that PHI must not be sent to providers without a BAA
Network segmentation should isolate the PHI data zone from the AI processing zone; the PHI data zone has no egress to external networks
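
The segmentation takeaway can be expressed as a default-deny flow table. The sketch below is one way to model it, assuming four hypothetical zone names (`ai_processing`, `phi_data`, `egress_proxy`, `internet`); in practice the same rules would be enforced by security groups, network policies, or service perimeter controls rather than application code.

```python
from typing import FrozenSet, Tuple

# Each pair is (initiating zone, destination zone). Anything absent is denied.
ALLOWED_FLOWS: FrozenSet[Tuple[str, str]] = frozenset({
    ("ai_processing", "phi_data"),      # AI services query the PHI store
    ("ai_processing", "egress_proxy"),  # outbound LLM calls go through the proxy
    ("egress_proxy", "internet"),       # only the proxy reaches external APIs
})


def is_flow_allowed(src: str, dst: str) -> bool:
    """Default deny: a flow is permitted only if explicitly listed."""
    return (src, dst) in ALLOWED_FLOWS


def zone_has_egress(zone: str) -> bool:
    """True if the zone is allowed to initiate a connection to any other zone."""
    return any(src == zone for src, _ in ALLOWED_FLOWS)
```

In this model the PHI data zone never appears as an initiator, and the AI processing zone can reach external LLM APIs only through the inspecting proxy.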
