Why Most Webhook Systems Lose Events
Most webhook systems do not lose events because HTTP is mysterious. They lose events because the endpoint tries to receive, validate, process, call downstream services, and return a provider response in one fragile path.
The fix is not more logging after the fact. The fix is to split receipt from delivery, keep enough state to replay, and make every retry observable.
Where Events Disappear
- The provider times out and retries, but the first attempt had already been partially processed.
- The endpoint returns 200 OK before the event is in durable storage.
- A queue publish fails after the provider has stopped retrying.
- A downstream service returns 500 and the receiver has no replay path.
- A duplicate event is discarded without an audit trail.
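Two of these failure modes, acknowledging before durable storage and dropping duplicates silently, can be avoided in the receive path itself. The following is a minimal sketch, not a definitive implementation. It assumes the provider sends a unique event ID with each delivery, and it uses SQLite only as a stand-in for whatever durable store you run. The `received` table and the `open_store` and `receive` functions are hypothetical names made up for illustration.

```python
import json
import sqlite3
import time

# Hypothetical schema: one row per received request, duplicates included.
SCHEMA = """
CREATE TABLE IF NOT EXISTS received (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    event_id TEXT NOT NULL,
    received_at REAL NOT NULL,
    headers TEXT NOT NULL,
    body BLOB NOT NULL,
    is_duplicate INTEGER NOT NULL
);
"""


def open_store(path=":memory:"):
    """Open the store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def receive(conn, event_id, headers, body):
    """Persist the raw request first, then return the HTTP status to send."""
    try:
        with conn:  # commits on success, rolls back on error
            seen = conn.execute(
                "SELECT 1 FROM received WHERE event_id = ? LIMIT 1",
                (event_id,),
            ).fetchone()
            # A duplicate is still written, flagged, so there is an audit trail.
            conn.execute(
                "INSERT INTO received "
                "(event_id, received_at, headers, body, is_duplicate) "
                "VALUES (?, ?, ?, ?, ?)",
                (event_id, time.time(), json.dumps(headers), body, 1 if seen else 0),
            )
    except sqlite3.Error:
        # Nothing was stored, so do not acknowledge; let the provider retry.
        return 503
    return 200
```

The handler does nothing except store and acknowledge. Validation, routing, and downstream calls happen later, from the stored row, so a slow or failing destination cannot cause a provider timeout.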
The Better Shape
Model
Provider
-> stable ingress
-> stored request
-> routed events
-> destination attempts
-> retry or replay with evidence
Operational Requirements
- Store raw headers and body before processing.
- Track every destination attempt.
- Keep retry and replay actions narrow and auditable.
- Design receiver idempotency before incidents happen.
- Monitor ingress success separately from downstream delivery success.
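The attempt tracking and narrow replay described above could be sketched as follows. This is an illustration under stated assumptions, not a reference design: `StoredEvent`, `Attempt`, `deliver`, and `replay` are hypothetical names, `send` is any caller-supplied function that returns an HTTP-style status code, and state is kept in memory where a real system would persist it. Backoff between retries is left out to keep the sketch short.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Attempt:
    destination: str
    number: int   # 1-based count of attempts to this destination
    status: int   # HTTP-style status returned by the destination
    reason: str   # "initial", "retry", or "replay"


@dataclass
class StoredEvent:
    event_id: str
    body: bytes
    attempts: List[Attempt] = field(default_factory=list)

    def delivered_to(self, destination: str) -> bool:
        """True if any recorded attempt to this destination succeeded."""
        return any(
            a.destination == destination and 200 <= a.status < 300
            for a in self.attempts
        )


def deliver(
    event: StoredEvent,
    destination: str,
    send: Callable[[str, bytes], int],
    max_attempts: int = 3,
    reason: str = "initial",
) -> bool:
    """Try one destination and record every attempt, successful or not."""
    for i in range(max_attempts):
        status = send(destination, event.body)
        number = sum(a.destination == destination for a in event.attempts) + 1
        event.attempts.append(
            Attempt(destination, number, status, reason if i == 0 else "retry")
        )
        if 200 <= status < 300:
            return True
    return False


def replay(event: StoredEvent, destination: str, send: Callable[[str, bytes], int]) -> bool:
    """Operator-triggered replay: one event, one destination, one attempt."""
    return deliver(event, destination, send, max_attempts=1, reason="replay")
```

Because each attempt is recorded per destination, delivery success can be reported from the attempt history while ingress success is reported from the stored requests, which keeps the two metrics separate. A replay only adds a new attempt record, so the receiving service still needs its own idempotency check on the event ID.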