Cloud AI Platforms

Conceptual Explanation

All three platforms provide the same core service: managed LLM inference with enterprise packaging. The differentiation is in:

Model selection: Which models are available, when, and at what tier. Some model families are tied to particular clouds (OpenAI's GPT models on Azure OpenAI; Claude on Bedrock and Vertex AI), while other offerings are multi-model catalogs (Bedrock offers Anthropic, Llama, Cohere, Titan, and Stability models).

Cloud integration depth: How tightly the AI service integrates with other services on the same cloud platform. Azure OpenAI integrates natively with Azure AI Search (formerly Azure Cognitive Search), Azure Data Factory, and Azure Monitor. AWS Bedrock integrates with S3, Lambda, CloudWatch, and its own Bedrock Agents feature. GCP Vertex AI integrates with BigQuery, Dataflow, and Cloud Storage.

Data handling and residency: Where inference requests are processed, whether input data is used for model training, and what privacy controls apply.

Enterprise features: Private networking (VPC endpoints, Private Link), IAM integration, usage monitoring, content filtering, and SLA guarantees.

Core Architecture

Platform Architecture Comparison

Feature Matrix

| Dimension | AWS Bedrock | Azure OpenAI | Google Vertex AI |
|---|---|---|---|
| Key models | Claude (Anthropic), Llama, Cohere, Amazon Titan, Stability | GPT-4o, o1, o3 (OpenAI exclusive) | Gemini, Claude, PaLM, Llama |
| Model exclusivity | Claude is available on Bedrock, Vertex AI, or the Anthropic API | GPT-4/o1/o3 require Azure or OpenAI | Gemini requires GCP or Google AI |
| Private networking | VPC Endpoint (PrivateLink) | Azure Private Link | VPC Service Controls |
| Identity integration | IAM, IAM Identity Center (formerly AWS SSO) | Microsoft Entra ID (formerly Azure AD) | Google Cloud IAM |
| HIPAA BAA | Yes (included in AWS BAA) | Yes (Microsoft Azure BAA) | Yes (Google Cloud BAA) |
| Data residency | By AWS region | By Azure region | By GCP region |
| Training opt-out | Yes (Bedrock doesn't train on inputs by default) | Yes (Azure OpenAI) | Yes (Vertex AI) |
| Content filtering | Bedrock Guardrails | Azure AI Content Safety | Vertex AI Safety Filters |
| Observability | CloudWatch, CloudTrail | Azure Monitor, Azure Log Analytics | Cloud Monitoring, Cloud Logging |
| Batch inference | Bedrock Batch API | Azure Batch Deployments | Vertex AI Batch Prediction |
| Fine-tuning | Bedrock Fine-tuning (select models) | Azure OpenAI Fine-tuning | Vertex AI Fine-tuning |
| Agent/orchestration | Bedrock Agents | Azure AI Foundry / Prompt Flow | Vertex AI Agent Builder |
| SLA | 99.9% (varies by service) | 99.9% (varies by tier) | 99.9% (varies by service) |
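
As a rough worked example of what a 99.9% SLA means in practice, the sketch below converts an availability percentage into a downtime budget. The 30-day period and the function name are assumptions for illustration; actual SLA documents define their own measurement windows and exclusions.

```python
def allowed_downtime_minutes(sla_percent: float, period_days: int = 30) -> float:
    """Return the downtime budget, in minutes, implied by an availability SLA."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


# 99.9% availability over an assumed 30-day month
print(round(allowed_downtime_minutes(99.9), 1))
```

Under these assumptions, 99.9% over 30 days leaves a budget of about 43.2 minutes of downtime.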

Common Mistakes

1. Assuming model availability before verifying. Model availability on each platform changes frequently. Verify that the specific model version required is available in the required region before designing the architecture.

2. Building platform-specific integration without an abstraction layer. Code that calls azure_openai_client.chat.completions.create() directly cannot switch to Bedrock without rewriting. All inference calls should go through a platform-agnostic interface.

3. Not requesting quota increases before production launch. Default quotas on all platforms are typically insufficient for enterprise production workloads. Quota increase requests should be submitted 4–6 weeks before the target launch date.

4. Treating HIPAA BAA as a static guarantee. BAA terms and scope change. Verify current BAA terms annually and upon any significant platform update.

5. Ignoring content filter calibration. Default content filters on all platforms are calibrated for consumer use cases. Clinical content, such as medical descriptions of injuries or anatomy, can trigger false positives. Calibrate and test content filters before clinical deployment.
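
To illustrate the platform-agnostic interface from mistake 2, here is a minimal sketch in which application code depends only on a small protocol. `ChatRequest`, `InferenceClient`, and `EchoBackend` are hypothetical names invented for this example; a real adapter would wrap the relevant platform SDK behind the same method.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass
class ChatRequest:
    model: str   # logical model name, not a platform-specific ID
    prompt: str


class InferenceClient(Protocol):
    """Hypothetical interface that application code depends on."""

    def complete(self, request: ChatRequest) -> str: ...


class EchoBackend:
    """Stand-in backend for the sketch; it only echoes the request."""

    def __init__(self, platform: str) -> None:
        self.platform = platform

    def complete(self, request: ChatRequest) -> str:
        return f"[{self.platform}:{request.model}] {request.prompt}"


def summarize(client: InferenceClient, text: str) -> str:
    # Application code sees only InferenceClient, never a platform SDK.
    return client.complete(ChatRequest(model="summarizer", prompt=f"Summarize: {text}"))


# Swapping platforms means swapping the object passed in, not rewriting summarize().
print(summarize(EchoBackend("azure-openai"), "visit note"))
print(summarize(EchoBackend("bedrock"), "visit note"))
```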

Best Practices

  • Implement an AI gateway abstraction (LiteLLM or equivalent) before committing to any single platform, to preserve optionality
  • Align platform selection with existing cloud provider relationship when technically equivalent
  • Enable private networking for any inference involving PHI
  • Verify training data opt-out is configured and documented before processing PHI
  • Request quota increases 4–6 weeks before production launch
  • Test content filters against a representative clinical content sample before deployment
  • Review platform data handling terms annually — they change
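
One way to make the content filter test concrete is to measure how often a filter blocks benign clinical text. The sketch below is an assumption-laden illustration: `naive_filter` is a hypothetical keyword stand-in rather than any platform's filter, and the sample sentences are invented. In practice `is_blocked` would wrap a call to the platform's filtering service.

```python
from typing import Callable, Iterable


def false_positive_rate(is_blocked: Callable[[str], bool],
                        benign_samples: Iterable[str]) -> float:
    """Fraction of known-benign samples that the filter blocks."""
    samples = list(benign_samples)
    if not samples:
        raise ValueError("need at least one sample")
    blocked = sum(1 for text in samples if is_blocked(text))
    return blocked / len(samples)


def naive_filter(text: str) -> bool:
    # Hypothetical stand-in: flags any text containing a "sensitive" keyword.
    return any(word in text.lower() for word in ("wound", "overdose"))


# Invented examples of benign clinical text
clinical_samples = [
    "Patient presents with a 3 cm laceration wound on the left forearm.",
    "Discussed overdose risk when titrating the opioid dose.",
    "Follow-up in two weeks for a blood pressure check.",
    "Prescribed 500 mg amoxicillin three times daily.",
]

rate = false_positive_rate(naive_filter, clinical_samples)
print(f"{rate:.0%} of benign clinical samples blocked")
```

A rate well above zero on a representative sample suggests the filter thresholds need adjusting before clinical deployment.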

Trade-offs

Model selection vs. ecosystem integration: The best model for a specific use case may not be available on the organization's preferred cloud platform. AWS Bedrock offers the broadest model selection (multi-vendor); Azure OpenAI offers exclusive access to the GPT-4/o1 family; Vertex offers native BigQuery integration.

Managed features vs. lock-in: Platform-specific features (Bedrock Agents, Azure AI Foundry, Vertex Agent Builder) reduce development effort but create deeper lock-in. Generic infrastructure (LiteLLM gateway + vLLM) preserves optionality at higher operational cost.

Interview Questions

Q: An enterprise client wants to use both Claude and GPT-4 in different parts of their AI platform. How would you architect this without creating application-level dependencies on each platform?

Category: Architecture | Difficulty: Senior | Role: AI Architect

Answer Framework:

The answer is an AI gateway pattern — a layer that translates a common API format (typically OpenAI-compatible) into platform-specific API calls. All application code calls the gateway using the common format; the gateway routes to the appropriate platform based on the model name requested.

LiteLLM is a widely used open-source implementation. The gateway configuration maps logical model names ("claude-premium", "gpt4-enterprise") to platform-specific endpoints. Application code references only the logical name. Switching from Claude via the Anthropic API to Claude via Bedrock, or from GPT-4 on Azure OpenAI to GPT-4 on the OpenAI API, requires only gateway configuration changes.

This pattern has additional benefits: centralized cost attribution, unified audit logging, failover routing (if the Claude API is rate-limited, fall back to an alternative), and automatic injection of common request metadata (department attribution, audit trail).
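
The routing and failover behavior described above can be sketched in plain Python. This is not LiteLLM's API; the `Gateway` class, the `RateLimited` exception, the target labels, and the stub backends are all hypothetical and exist only to show logical model names resolving to an ordered list of platform targets.

```python
from typing import Callable

Backend = Callable[[str], str]  # takes a prompt, returns a completion


class RateLimited(Exception):
    """Raised by a backend when the platform rejects the call for rate reasons."""


class Gateway:
    def __init__(self, routes: dict[str, list[tuple[str, Backend]]]) -> None:
        # logical model name -> ordered list of (platform target, backend callable)
        self.routes = routes
        self.audit_log: list[tuple[str, str]] = []

    def complete(self, logical_model: str, prompt: str) -> str:
        for target, backend in self.routes[logical_model]:
            try:
                result = backend(prompt)
            except RateLimited:
                continue  # failover: try the next configured target
            self.audit_log.append((logical_model, target))  # unified audit trail
            return result
        raise RuntimeError(f"all targets failed for {logical_model!r}")


def anthropic_api(prompt: str) -> str:
    raise RateLimited("simulated rate limit")  # stub that always fails


def bedrock(prompt: str) -> str:
    return f"bedrock: {prompt}"  # stub response


def azure_openai(prompt: str) -> str:
    return f"azure: {prompt}"  # stub response


gateway = Gateway({
    "claude-premium": [("anthropic-api", anthropic_api), ("bedrock", bedrock)],
    "gpt4-enterprise": [("azure-openai", azure_openai)],
})

print(gateway.complete("claude-premium", "hello"))   # served by the fallback target
print(gateway.complete("gpt4-enterprise", "hello"))
print(gateway.audit_log)
```

Application code passes only the logical name; changing which platform serves "claude-premium" is an edit to the routes mapping, not to the callers.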

Key Points to Hit:

  • AI gateway as the abstraction layer
  • LiteLLM as the practical implementation
  • Logical model names vs. platform-specific model IDs
  • Benefits beyond model switching: cost, logging, failover

Key Takeaways

  • AWS Bedrock offers the broadest multi-model catalog; Azure OpenAI offers exclusive GPT-4/o1 access; Google Vertex AI offers the deepest BigQuery and analytics integration
  • All three platforms provide HIPAA BAAs and private networking for PHI-in-cloud deployments
  • The AI gateway pattern (LiteLLM) is the architectural mechanism that preserves cross-platform optionality
  • Align cloud AI platform selection with existing cloud provider relationship when technically equivalent
  • Verify model availability in required regions before architecture commitment
  • Request quota increases 4–6 weeks before production launch — default quotas are insufficient for enterprise production
  • Training data opt-out must be verified and documented before processing PHI