Preventing Downtime: How One Team Detected a Twilio Webhook Schema Change Before Their Handler Broke
Webhooks are deceptively fragile.
When you build a webhook handler, you write it against the payload shape you receive on day one. You add some error handling, test it in staging, and ship it. Then it runs reliably for months or years — and you stop thinking about it.
What you can't see is that the third-party system sending those webhooks may change the payload shape at any time. Sometimes they announce it in a changelog. Sometimes they don't. Either way, your handler has no way to know the contract has changed until it crashes on a field that no longer exists.
This is the story of a team that set up monitoring for their Twilio webhook endpoint and caught a payload change before their handler broke.
The Setup: SMS Notifications at Scale
A logistics and delivery company used Twilio's Programmable Messaging to send real-time delivery status notifications to customers: "Your package has been picked up," "Out for delivery," "Delivered." Customers could reply to opt out or request delivery changes.
The webhook handler processed two kinds of events:
- Delivery status callbacks: fired by Twilio when a message's delivery status changed (`delivered`, `failed`, `undelivered`)
- Inbound SMS: fired when a customer replied to a notification
The handler was a Node.js service that parsed the Twilio webhook payload and:
- Updated internal delivery records with message status
- Processed customer reply commands (`STOP`, `HELP`, reply-to-confirm delivery changes)
- Fired internal events to trigger follow-up actions
The code directly destructured several fields from the Twilio payload:
```javascript
const {
  MessageSid,
  MessageStatus,
  SmsSid,
  SmsStatus,
  From,
  To,
  Body,
  ErrorCode,
  ErrorMessage,
  NumMedia,
  MediaUrl0,
  MediaContentType0,
} = req.body;
```
This handler had been running in production without modification for 14 months.
The Monitoring Gap
Like most webhook integrations, this one had no active schema monitoring. The team's observability stack tracked:
- HTTP response codes returned to Twilio (Twilio requires a 200 response or it retries)
- Error rates in the handler itself
- Message delivery success rates via internal metrics
What it didn't track: whether the fields in the incoming webhook payload matched what the handler expected.
A schema change in the webhook payload would look identical to a handler bug in the team's existing monitoring — elevated error rates and failed deliveries, with no indication whether the root cause was their code or Twilio's payload.
The team added Rumbliq to their monitoring stack after their on-call engineer spent a Saturday debugging what turned out to be a field name change in a different webhook integration (not Twilio). He didn't want to repeat that experience.
Monitoring Webhook Schema with Rumbliq
Monitoring a webhook schema requires a slightly different approach than monitoring a REST API — instead of Rumbliq fetching the endpoint, you want to capture the shape of what Twilio sends to you.
The team's solution: a dedicated "webhook echo" endpoint in their API that logged incoming Twilio webhook payloads to a file, and a paired monitor in Rumbliq that fetched the most recent captured payload to establish a baseline.
For each webhook type (status callbacks, inbound SMS), they:
- Triggered a test event to their staging webhook endpoint
- Had Rumbliq capture the response schema as a baseline
- Set up Rumbliq to monitor the echo endpoint on a 10-minute interval
Any time Twilio sent a webhook with a field set that differed from the baseline, Rumbliq would detect the change on the next poll and fire an alert.
The Change That Almost Broke the Handler
Six weeks after setting up monitoring, Rumbliq fired an alert on a Tuesday afternoon:
```
⚠️ SCHEMA DRIFT DETECTED
Monitor: Twilio SMS Status Callbacks (staging)
Severity: MEDIUM

Field added:   ReferralNumMedia (string)
Field added:   ReferralNumSegments (string)
Field changed: SmsStatus (was: string, now: missing in some responses)
Field note:    MessagingServiceSid now present in 100% of responses (was: optional)
```
Two of these changes were additive and harmless. One wasn't.
The SmsStatus field — historically a redundant alias for MessageStatus included in all SMS webhook payloads — was being phased out. Twilio was transitioning to returning only MessageStatus in some contexts.
The handler had code that checked SmsStatus first, then fell back to MessageStatus:
```javascript
const status = SmsStatus || MessageStatus;
```
This had worked fine when SmsStatus was always present. As Twilio gradually rolled out the change and SmsStatus started disappearing from some payloads, the fallback would keep the handler functional — but some downstream code paths that were written with the assumption SmsStatus would always be set were at risk.
More critically, the team found three places in the codebase where SmsStatus was used directly without the fallback:
```javascript
// In message logging:
logger.info({ smsStatus: SmsStatus, messageId: MessageSid });

// In analytics event:
analytics.track('sms_delivery_update', {
  sms_status: SmsStatus, // would log 'undefined' silently
  message_status: MessageStatus,
});

// In database update:
await db.query(
  'UPDATE messages SET sms_status = $1 WHERE sid = $2',
  [SmsStatus, MessageSid] // would write NULL silently
);
```
None of these would crash the handler; `undefined` would simply flow through. But the database would accumulate NULL `sms_status` values, the analytics events would record incorrect data, and the logs would be misleading.
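This failure mode is easy to reproduce in isolation. Destructuring a field that a payload no longer contains doesn't throw; it yields `undefined`, which most downstream code accepts without complaint (the payload values below are illustrative):

```javascript
// Sketch of the silent-failure mode: a payload without SmsStatus
// produces undefined, not an error.
const payload = { MessageSid: 'SMxxxx', MessageStatus: 'delivered' }; // no SmsStatus

const { SmsStatus, MessageStatus } = payload;

console.log(SmsStatus);          // undefined — no exception raised
console.log(String(SmsStatus));  // "undefined" — what lands in analytics
console.log(SmsStatus ?? null);  // null — what a parameterized query writes
```

Nothing in this chain produces an error signal, which is exactly why the existing error-rate monitoring would never have fired.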
The Fix and Deployment
The engineer who received the Rumbliq alert spent about 90 minutes on the fix:
- Standardized all status reads on `MessageStatus`, removing all direct references to `SmsStatus`
- Added a migration to backfill the `sms_status` column from `message_status` in the database for any affected rows
- Updated the handler to explicitly ignore unknown fields via a schema validation layer (using Zod), so future additive changes wouldn't require code changes
- Updated the Rumbliq baseline to reflect the new expected payload shape
- Wrote a test that simulated a Twilio webhook payload with `SmsStatus` absent
The fix was reviewed and deployed to production the same day. The MessagingServiceSid addition was documented as expected and noted in the team's internal API integration notes.
The Alert Email: What It Looked Like
The team configured email alerts as a backup to Slack. Here's what the alert email contained:
```
Subject: [Rumbliq] Schema drift detected on Twilio SMS Status Callbacks

Monitor:  Twilio SMS Status Callbacks
Endpoint: https://internal-echo.yourdomain.com/webhooks/twilio/status
Detected: 2026-04-02 14:22 UTC

Changed fields:
- SmsStatus: was always present (string), now absent in some payloads
- ReferralNumMedia: new field added (string, optional)
- ReferralNumSegments: new field added (string, optional)
- MessagingServiceSid: changed from optional to always present

Previous baseline: 2026-02-15 09:00 UTC

Action: [View full diff in Rumbliq] [Reset baseline] [Snooze 24h]
```
The diff view in Rumbliq showed exactly which fields had changed, with color-coded additions and removals, and the raw before/after JSON for comparison.
Why Webhook Schema Monitoring Is Underrated
Most teams monitor their outbound API calls. Almost no teams monitor the schema of webhooks they receive.
This is backwards. Webhooks carry the same risk as any other API contract, with an important difference: you don't control when they're called or what shape they arrive in. You can't version-pin a webhook the way you can pin an SDK. You can't test against a mock that stays up-to-date automatically. You're at the mercy of the sender.
Schema drift monitoring treats the incoming webhook payload as a contract and alerts you the moment that contract is violated — even if the violation is subtle, even if it wouldn't cause an immediate crash.
The Twilio change above would have caused silent data corruption in the team's database and analytics events. No 500 errors. No PagerDuty alerts. Just slowly accumulating NULL values and missing data that someone might notice in a quarterly analytics review — or might never notice at all.
Set Up Webhook Monitoring in 2 Minutes
If you have Twilio webhooks in production, you can set up schema monitoring with Rumbliq today:
- Create a webhook echo endpoint in your API that saves incoming Twilio payloads (staging is fine)
- Trigger a test webhook from Twilio's console to populate the echo endpoint
- Add a monitor in Rumbliq pointing to your echo endpoint
- Configure an alert to Slack, email, or your webhook destination
For more detail on the webhook echo pattern and alternative approaches, see our Webhook Monitoring Best Practices guide.
Set up webhook monitoring in Rumbliq free →
Summary
| Without monitoring | With monitoring |
|---|---|
| Schema change goes undetected | Alert fires within 10 minutes |
| Silent data corruption accumulates | Diff shows exactly what changed |
| Debug session triggered by data anomaly | Fix deployed same day |
| Downstream impact unknown | Zero production impact |
Webhook schema changes are silent by nature. The only way to catch them before they cause damage is to monitor the payload shape and alert on deviation.
Don't wait for silent corruption to surface in a quarterly report. Start monitoring your Twilio webhooks free →
Related reading: Webhook Monitoring Guide · How to Monitor Twilio API Changes · API Schema Drift: The Silent Killer of Integrations