Webhook and Callback Patterns for AI
Executive Summary
Asynchronous AI operations such as document analysis, long-form generation, and multi-step agent workflows often cannot return results within a single HTTP request-response cycle. Webhooks and callback patterns provide the delivery mechanism that bridges the gap between a client submitting an AI job and eventually receiving the result, whether that job takes 10 seconds or 10 minutes. This chapter covers the three async result delivery patterns that production clinical AI systems rely on: reliable, secure webhook delivery of AI results, polling APIs as a fallback mechanism, and Server-Sent Events (SSE) for near-real-time streaming.
Learning Objectives
- Design webhook delivery with signature verification, retry policies, and idempotency
- Implement polling APIs as a fallback for clients that cannot receive webhooks
- Apply Server-Sent Events (SSE) for real-time streaming of AI outputs to interactive UIs
- Secure webhook endpoints against spoofing and replay attacks
Business Problem
A Reference Healthcare Organization's clinical AI platform accepts discharge summary generation requests and returns AI-drafted summaries to the requesting clinician's EHR workflow. The AI generation takes 8–15 seconds: too long for a synchronous REST response within a CDS Hooks flow, but too short to justify sending the clinician to a status screen to check manually. The platform needs a delivery mechanism that pushes the result to the clinician's interface as soon as it is ready, without the clinician having to poll or take an additional action.
Webhooks, SSE, and polling each solve this problem in a different context: SSE for interactive browser-based interfaces where the client maintains a persistent connection, webhooks for server-to-server result delivery (the EHR system receives the AI result at a registered callback endpoint), and polling for clients behind firewalls that cannot receive inbound webhook calls.
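For the polling fallback, the client loop can be sketched as follows. This is a minimal illustration, not a prescribed client: the fetch_status callable, the "state" field, and its values ("running", "succeeded", "failed") are assumptions made for the example, and a real client would wrap an HTTP GET against whatever job-status endpoint the platform exposes.

```python
import time
from typing import Any, Callable, Dict


def poll_for_result(
    fetch_status: Callable[[], Dict[str, Any]],  # hypothetical: wraps GET on a job-status URL
    interval_seconds: float = 2.0,
    timeout_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Poll until the job reaches a terminal state or the timeout elapses."""
    deadline = clock() + timeout_seconds
    while True:
        status = fetch_status()
        # "succeeded" / "failed" are assumed terminal state names.
        if status.get("state") in ("succeeded", "failed"):
            return status
        # Stop if the next poll would land after the deadline.
        if clock() + interval_seconds > deadline:
            raise TimeoutError("job did not finish before the polling deadline")
        sleep(interval_seconds)
```

The sleep and clock parameters exist only so the loop can be exercised without real waiting. A fixed interval is the simplest choice for an 8–15 second job; longer jobs may warrant an increasing interval.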
Enterprise Considerations
Webhook endpoint security: Webhook receiving endpoints must validate the HMAC signature on every delivery. Without signature verification, any HTTP client that knows the callback URL can POST a fabricated AI result. Signing a timestamp together with the body, and rejecting deliveries whose timestamp is stale, also limits replay of a captured payload. Clinical systems must never act on an unverified webhook payload.
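A minimal sketch of signature verification with Python's standard hmac module is shown here. The signing scheme (HMAC-SHA256 over the timestamp, a dot, and the raw body), the 300-second freshness window, and the function names are assumptions chosen for illustration; the actual header names and message layout depend on what the sending service defines.

```python
import hashlib
import hmac
import time
from typing import Optional

MAX_SKEW_SECONDS = 300  # assumed replay window, not a standard value


def sign(secret: bytes, timestamp: str, body: bytes) -> str:
    """Hex HMAC-SHA256 over '<timestamp>.<raw body>' (assumed layout)."""
    message = timestamp.encode("utf-8") + b"." + body
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify(secret: bytes, timestamp: str, body: bytes, signature: str,
           now: Optional[float] = None) -> bool:
    """Return True only for a fresh delivery with a matching signature."""
    now = time.time() if now is None else now
    try:
        sent_at = int(timestamp)  # Unix seconds supplied by the sender
    except ValueError:
        return False
    # Reject stale or far-future timestamps to limit replay.
    if abs(now - sent_at) > MAX_SKEW_SECONDS:
        return False
    expected = sign(secret, timestamp, body)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

The signature must be computed over the raw request bytes, before any JSON parsing, because re-serializing the body can change whitespace or key order and invalidate the digest.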
Idempotency at the receiver: Webhook delivery services retry on failure. The receiving endpoint may receive the same job result multiple times, for example when the acknowledgment of the first delivery was lost. Receiving endpoints must therefore be idempotent: processing the same job_id twice must leave the system in the same state as processing it once.
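One common way to get this property is to let a uniqueness constraint on job_id decide whether a delivery is new. The sketch below uses SQLite purely for illustration; the table name, column names, and the record_result helper are hypothetical.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the (hypothetical) table that tracks delivered job results."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS processed_jobs ("
        "job_id TEXT PRIMARY KEY, summary TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def record_result(conn: sqlite3.Connection, job_id: str, summary: str) -> bool:
    """Store a result once. Returns True for a new job_id, False for a duplicate."""
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT OR IGNORE INTO processed_jobs (job_id, summary) VALUES (?, ?)",
            (job_id, summary),
        )
    # rowcount is 1 when the row was inserted and 0 when the primary key already existed.
    return cursor.rowcount == 1
```

A handler would trigger downstream work (for example, attaching the draft to the EHR record) only when record_result returns True, and would still acknowledge a duplicate with a 2xx response so the sender stops retrying.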
SSE connection management: SSE connections are long-lived HTTP connections. Proxies and load balancers often close connections that have been idle for 30–60 seconds. Send heartbeats (the SSE comment line : keep-alive\n\n, which clients ignore) at 15-second intervals to prevent proxy disconnects during long AI generations.
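The heartbeat behavior can be sketched as a generator that emits a comment line whenever no token arrives within the heartbeat interval. This is an illustration under assumptions: tokens arrive on an in-process queue, None marks the end of generation, and the "done" event name is invented for the example. A web framework would stream the yielded strings as the body of a text/event-stream response.

```python
import queue
from typing import Iterator, Optional

HEARTBEAT = ": keep-alive\n\n"  # SSE comment line; clients ignore it


def sse_stream(tokens: "queue.Queue[Optional[str]]",
               heartbeat_seconds: float = 15.0) -> Iterator[str]:
    """Yield SSE frames for tokens, with heartbeats while the queue is idle."""
    while True:
        try:
            token = tokens.get(timeout=heartbeat_seconds)
        except queue.Empty:
            # Nothing arrived within the interval: keep the connection warm.
            yield HEARTBEAT
            continue
        if token is None:
            # Assumed end-of-stream sentinel; the event name is hypothetical.
            yield "event: done\ndata: {}\n\n"
            return
        # Each line of a multi-line payload needs its own "data:" field.
        yield "".join(f"data: {line}\n" for line in token.split("\n")) + "\n"
```

Because the timeout restarts after every token, heartbeats are only sent during genuine gaps in generation rather than on a fixed schedule.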
Common Mistakes
1. Not verifying webhook signatures. A webhook receiver that does not verify the HMAC signature will process fabricated payloads from any source that knows the endpoint URL. Always verify signatures.
2. Processing the webhook before acknowledging it. If the webhook receiver processes the payload synchronously and takes 10 seconds, the sender's timeout may expire before it receives the acknowledgment, and the sender will retry a delivery that actually succeeded. The receiver should: (1) verify the signature and durably store the payload, (2) immediately return a 2xx acknowledgment (200 OK or 202 Accepted), (3) process the payload asynchronously, (4) handle duplicate deliveries via idempotency.
3. No retry policy on webhook delivery. If the receiving server is temporarily unavailable when the AI job completes, the webhook delivery fails and the result is lost. Always implement a retry policy with exponential backoff for webhook delivery.
4. Buffering SSE responses in the web framework or proxy. SSE requires that tokens are flushed to the client immediately. Framework response buffering or nginx proxy buffering accumulates tokens and delivers them in bursts rather than streaming in real time. Always send the X-Accel-Buffering: no response header (which tells nginx not to buffer that response) and disable framework response buffering for SSE endpoints.
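The retry policy from mistake 3 can be sketched on the sender side as exponential backoff with jitter. The attempt count, base delay, and cap below are illustrative assumptions rather than recommended values, and send stands in for whatever function performs the signed HTTP POST and returns the status code.

```python
import random
import time
from typing import Callable, List


def backoff_delays(max_attempts: int = 6, base_seconds: float = 1.0,
                   cap_seconds: float = 300.0,
                   jitter: Callable[[], float] = random.random) -> List[float]:
    """Delays to wait before each retry: a random fraction of a doubling, capped ceiling."""
    delays = []
    for retry in range(max_attempts - 1):
        ceiling = min(cap_seconds, base_seconds * 2 ** retry)
        delays.append(ceiling * jitter())
    return delays


def deliver_with_retry(send: Callable[[], int],  # hypothetical: POSTs the webhook, returns HTTP status
                       sleep: Callable[[float], None] = time.sleep,
                       max_attempts: int = 6,
                       jitter: Callable[[], float] = random.random) -> bool:
    """Attempt delivery until a 2xx response or until attempts run out."""
    delays = backoff_delays(max_attempts, jitter=jitter)
    for attempt in range(max_attempts):
        try:
            status = send()
        except OSError:
            status = 0  # treat connection errors like a failed delivery
        if 200 <= status < 300:
            return True
        if attempt < len(delays):
            sleep(delays[attempt])
    return False
```

When deliver_with_retry returns False, the result should be kept (for example in a dead-letter store) so it can still be retrieved through the polling API rather than being lost.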
Key Takeaways
- Webhooks (server push) are the preferred pattern for server-to-server AI result delivery; polling is the fallback for clients that cannot receive inbound connections
- All webhook deliveries must be signed (HMAC-SHA256) and receiving endpoints must verify the signature
- Webhook receiving endpoints must be idempotent; retry delivery means the same result may arrive multiple times
- SSE is the preferred pattern for real-time token streaming to browser-based clinical interfaces
- SSE connections require heartbeat pings at 15-second intervals to prevent proxy disconnects
Further Reading
- Integration Patterns — Asynchronous pattern that webhooks implement
- API Design for AI — Streaming API patterns (SSE)
- Event-Driven AI — Kafka as an alternative to webhooks for high-throughput event delivery