Audit and Logging for AI Systems

Executive Summary

Audit logging for AI systems serves three distinct purposes: HIPAA compliance (PHI access audit trail), security incident detection (anomaly detection on AI behavior), and AI quality assurance (tracking model outputs over time for regression detection). Each purpose has different requirements — HIPAA audit logs must be immutable and patient-attributed; security logs must be real-time and queryable; quality logs must include AI-specific metadata (model version, retrieved documents, confidence signals) that traditional observability systems do not capture. This chapter covers the audit and logging architecture for clinical AI systems across all three purposes.

Learning Objectives

  • Design a multi-tier logging architecture that separates PHI audit logs from operational logs
  • Implement HIPAA-compliant PHI access logging for AI systems without logging PHI content
  • Configure SIEM integration for real-time detection of AI-specific anomalies
  • Track AI model output quality metrics over time for regression detection

Enterprise Considerations

Log retention: HIPAA requires audit log retention for 6 years. Ensure the audit log store blocks deletion before the retention period expires (this takes a retention lock, since a lifecycle rule alone only schedules transitions and expiry) and has a lifecycle policy that moves logs to low-cost archival storage (for example S3 Glacier or Azure Archive) after 90 days.
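As a rough illustration of the lifecycle policy described above, the sketch below builds a rule set in the dictionary shape used by S3 lifecycle configuration. The prefix `phi-audit/`, the rule ID, and the helper names are hypothetical, and the 2,192-day figure is simply six years padded for leap days. This is one possible arrangement under those assumptions, not a prescribed configuration.

```python
# Assumed values for illustration: the prefix and rule ID are hypothetical.
SIX_YEARS_DAYS = 6 * 365 + 2      # six years, padded for up to two leap days
ARCHIVE_AFTER_DAYS = 90           # from the text: archive after 90 days


def build_audit_lifecycle(prefix: str = "phi-audit/") -> dict:
    """Build a rule set in the shape S3 lifecycle configuration uses."""
    return {
        "Rules": [
            {
                "ID": "phi-audit-archive-then-expire",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": ARCHIVE_AFTER_DAYS, "StorageClass": "GLACIER"}
                ],
                "Expiration": {"Days": SIX_YEARS_DAYS},
            }
        ]
    }


def expires_too_early(config: dict, minimum_days: int = SIX_YEARS_DAYS) -> list:
    """Return IDs of enabled rules that would expire objects before the minimum."""
    return [
        rule["ID"]
        for rule in config["Rules"]
        if rule.get("Status") == "Enabled"
        and "Expiration" in rule
        and rule["Expiration"].get("Days", minimum_days) < minimum_days
    ]


config = build_audit_lifecycle()
# With boto3 this dict could be passed as:
#   s3.put_bucket_lifecycle_configuration(Bucket=..., LifecycleConfiguration=config)
```

The `expires_too_early` check is the kind of guard that could run in a deployment pipeline, so that a rule shortened by mistake is caught before it reaches the audit bucket.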

Audit log access control: The HIPAA audit log must be readable by compliance and security staff but protected from modification by anyone, including the AI platform team. Implement write-once logging (S3 Object Lock in compliance mode, or immutability policies on Azure Blob Storage) and restrict read access to authorized audit personnel.
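A minimal sketch of the write-once setting, assuming AWS S3 and a boto3-style client: the function below sets a bucket-level default retention in compliance mode. The bucket name and the helper names are hypothetical. The `put_object_lock_configuration` call is the boto3 S3 client method, but the client is passed in as a parameter so the sketch runs without AWS credentials.

```python
from typing import Any


def object_lock_config(years: int = 6) -> dict:
    """Default retention applied to every new object version in the bucket."""
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Years": years}},
    }


def apply_object_lock(s3_client: Any, bucket: str, years: int = 6) -> None:
    """Apply the default retention using a boto3-style S3 client."""
    s3_client.put_object_lock_configuration(
        Bucket=bucket,
        ObjectLockConfiguration=object_lock_config(years),
    )


# Usage with a real client (assumes boto3 is installed and credentials exist):
#   import boto3
#   apply_object_lock(boto3.client("s3"), "example-phi-audit-logs")
```

In compliance mode the retention period cannot be shortened or removed by any user until it expires, which is the property that keeps the platform team from altering the trail. Object Lock also requires versioning to be enabled on the bucket.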

Operational log separation: PHI audit logs and operational logs (latency, throughput, error rates) should be in separate log stores with different access controls. Operations teams need operational logs for debugging but should not have access to the PHI audit logs, which tie patient identifiers to user activity.
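At the application level, the separation can be sketched with Python's standard `logging` module: two named loggers, each with its own handler, and propagation disabled so audit records never reach a shared root handler. The logger names, field names, and in-memory sinks are illustrative assumptions. In a real deployment each handler would point at a different store with its own access policy.

```python
import io
import logging


def make_logger(name: str, stream: io.StringIO) -> logging.Logger:
    """Create a logger that writes only to its own sink."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False      # keep records away from any shared root handler
    logger.handlers.clear()       # avoid duplicate handlers on repeated setup
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(name)s %(message)s"))
    logger.addHandler(handler)
    return logger


# In-memory sinks stand in for two separate log stores.
audit_sink = io.StringIO()
ops_sink = io.StringIO()
audit_log = make_logger("phi_audit", audit_sink)
ops_log = make_logger("ops", ops_sink)

# Patient-attributed event goes only to the audit store.
audit_log.info("user_id=u-17 patient_id=p-204 action=ai_summary_viewed")
# Performance data goes only to the operational store.
ops_log.info("endpoint=/summarize latency_ms=412 status=200")
```

Setting `propagate = False` is the design choice that matters here: without it, a handler attached to the root logger would receive both streams and quietly merge them.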

Common Mistakes

1. Logging request and response content for PHI-handling AI features. In clinical AI features the prompt itself contains PHI, so any log that captures prompt content, including "debug" logs, contains PHI. Every log store that receives AI request/response content becomes a HIPAA data store with full compliance implications.

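2. Not setting log retention to 6 years. Log retention is configurable, and platform defaults rarely match the requirement: many logging platforms default to 30–90 days, while CloudWatch Logs keeps log groups indefinitely unless a retention policy is set, which leaves retention unmanaged rather than enforced. HIPAA requires 6-year retention for audit logs. Set explicit retention policies on creation and audit them quarterly.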

3. Mixing PHI audit logs with operational logs. Operational logs have different retention requirements, different access control requirements, and different compliance implications. Keep them separate.

4. No integrity checking on audit logs. An attacker who can modify the audit log can cover their tracks. Store integrity hashes (SHA-256 of each log entry) separately from the log entries, or use cloud-native tamper-evident logging (AWS CloudTrail with log file integrity validation).
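The integrity check in item 4 can be sketched with the standard library. The version below computes a SHA-256 digest per entry and, as an added assumption beyond what the text requires, chains each digest to the previous one so that deleted or reordered entries are detectable as well as edited ones. The function names and entry fields are hypothetical. The digests would be stored separately from the entries themselves.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous digest" for the first entry


def entry_digest(entry: dict, prev_digest: str) -> str:
    """SHA-256 over the previous digest plus a canonical JSON form of the entry."""
    payload = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_digest + payload).encode("utf-8")).hexdigest()


def build_digests(entries: list) -> list:
    """Compute the chained digest for every entry, in order."""
    digests, prev = [], GENESIS
    for entry in entries:
        prev = entry_digest(entry, prev)
        digests.append(prev)
    return digests


def first_tampered_index(entries: list, digests: list):
    """Return the index of the first mismatch, or None if the log verifies."""
    prev = GENESIS
    for i, (entry, digest) in enumerate(zip(entries, digests)):
        if entry_digest(entry, prev) != digest:
            return i
        prev = digest
    if len(entries) != len(digests):
        # Entries were appended or removed after the digests were recorded.
        return min(len(entries), len(digests))
    return None


log = [
    {"user_id": "u-17", "patient_id": "p-204", "action": "ai_summary_viewed"},
    {"user_id": "u-03", "patient_id": "p-118", "action": "ai_summary_viewed"},
    {"user_id": "u-17", "patient_id": "p-118", "action": "ai_summary_exported"},
]
stored_digests = build_digests(log)
```

Serializing with `sort_keys=True` gives each entry a stable byte representation, so that verification does not fail merely because keys were written in a different order.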

Key Takeaways

  • HIPAA audit logs must include the patient ID and user ID but must never include PHI content, prompt text, or AI response text
  • PHI audit logs must be immutable (write-once), encrypted, and retained for 6 years
  • Operational logs and HIPAA audit logs must be in separate stores with separate access controls
  • AI quality logs capture model-specific metadata (model version, retrieval scores, citation counts, clinician feedback) that traditional observability systems do not
  • Integrity hashing of audit log entries enables detection of tampered audit trails
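The first takeaway can be made concrete with a small sketch of an audit event that records who accessed which patient's data through the AI system without storing what was said. The field names, the sample identifiers, and the model version string are hypothetical. Recording only character counts for the prompt and response is one possible choice, not a requirement from the text.

```python
import json
import uuid
from datetime import datetime, timezone


def build_audit_event(user_id, patient_id, action, model_version, prompt, response):
    """Record who touched which patient's data, never what the data said."""
    return {
        "event_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "user_id": user_id,
        "patient_id": patient_id,
        "action": action,
        "model_version": model_version,
        # Sizes only: the prompt and response text are deliberately dropped.
        "prompt_chars": len(prompt),
        "response_chars": len(response),
    }


event = build_audit_event(
    user_id="u-17",
    patient_id="p-204",
    action="ai_summary_generated",
    model_version="summarizer-2024-05",
    prompt="Summarize the latest cardiology note for this patient.",
    response="Patient reports chest pain on exertion.",
)
line = json.dumps(event, sort_keys=True)  # one JSON line per audit event
```

Because the function accepts the prompt and response but emits only their lengths, content cannot leak into the audit store through a later formatting change at the call site.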

Further Reading