How to Design APIs That Never Break Their Clients



Versioning strategies, backward compatibility rules, and real-world patterns from the trenches


The Incident

It’s 11 PM on a Thursday. Priya, a backend engineer at a mid-sized fintech, merges what looks like a routine PR. The change? Renaming a JSON field: user_name → username. Clean, consistent, follows the new style guide. CI passes. Deploy goes out.

By midnight, 3 enterprise clients are paging their on-call engineers. Their mobile apps - which parse user_name - are now showing empty profile screens for every user.

By 1 AM, Priya is rolling back. By 2 AM, she’s drafting a post-mortem. By morning, the BD team is fielding angry calls.

- A story every API engineer eventually lives through

This isn’t a hypothetical. It’s a pattern that plays out at companies of every size - from early-stage startups to Stripe, Twilio, and GitHub. The root cause is always the same: the API was designed to be changed, not to be evolved.

There’s a difference. And this post is about that difference.


What Even Counts as a Breaking Change?

Before we talk solutions, let’s sharpen our intuition. Not every change breaks clients. But some changes that look harmless absolutely do.

❌ Breaking Changes

// Renaming a field
{ "user_name": "anshuman" }
→
{ "username": "anshuman" }

// Removing a field entirely
{ "user_name": "anshuman", "email": "..." }
→
{ "username": "anshuman" }   // email gone!

// Changing a field's type
{ "age": "27" }   // was string
→
{ "age": 27 }     // now number

// Changing HTTP status codes
GET /users/999
was: 200 { "user": null }
now: 404  ← clients checking for 200 break

// Restructuring the response shape
{ "data": { "id": 1 } }
→
{ "result": { "user_id": 1 } }

// Adding required fields to request body
POST /orders
was: { "item_id": 1 }
now: { "item_id": 1, "warehouse": "required!" }

✅ Safe (Non-Breaking) Changes

// Adding optional fields to response
{ "id": 1, "name": "Anshuman" }
→
{ "id": 1, "name": "Anshuman",
  "avatar_url": "https://..." }  // new, optional

// Adding optional fields to request
POST /orders
{ "item_id": 1 }          // old clients still work
{ "item_id": 1, "note": "gift wrap" } // new field optional

// Adding new endpoints entirely
GET /v1/users/:id          // unchanged
POST /v1/users/:id/avatar  // brand new

// New enum values (with caveats)
{ "status": "active" | "inactive" }
→
{ "status": "active" | "inactive" | "pending" }
// ⚠️ only safe if clients use default/fallback

// Relaxing validation rules
was: name max 50 chars
now: name max 200 chars

⚠️ The “new enum value” trap: Adding a new enum value looks safe but breaks clients that use exhaustive switch/if-else matching without a default. Always document that clients must handle unknown enum values gracefully.
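
On the client side, handling unknown enum values gracefully can be as small as the sketch below. The names `KNOWN_STATUSES`, `parseStatus`, and `statusLabel` are made up for illustration; the point is only that every unrecognised value is mapped to an explicit fallback before any branching happens.

```javascript
// Hypothetical set of status values this client was built against.
const KNOWN_STATUSES = new Set(["active", "inactive"]);

// Map any value the client does not recognise to an explicit "unknown"
// bucket instead of throwing or silently falling through.
function parseStatus(raw) {
  return KNOWN_STATUSES.has(raw) ? raw : "unknown";
}

// The switch has a default branch, so a value the server adds later
// degrades to a neutral label rather than an error.
function statusLabel(raw) {
  switch (parseStatus(raw)) {
    case "active":
      return "Active";
    case "inactive":
      return "Inactive";
    default:
      return "Status unavailable";
  }
}

console.log(statusLabel("active"));  // Active
console.log(statusLabel("pending")); // Status unavailable
```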

The mental model: think of your API as a contract. Your clients write code that depends on the exact shape of your response. Every field name, every type, every status code - it’s all load-bearing. Changing any of it is demolishing a wall without checking if it’s structural.


The 4 Versioning Strategies

There are four main ways to version an API. Each is a different answer to the question: “Where does the version live?”

4 Ways to Version Your API

① URL Path: GET /api/v1/users, GET /api/v2/users
  ✓ Most common, easy to test with curl, CDN friendly
  ✗ URL duplication, multiple codepaths
② Query Param: /users?version=1, /users?version=2
  ✓ Clean base URL, easy to default
  ✗ Caching issues, easy to forget, less discoverable
③ Custom Header: API-Version: 2024-01, API-Version: 2025-01
  ✓ URL stays clean, Stripe’s approach
  ✗ Harder to test, not in browser bar, less discoverable
④ Accept Header: Accept: app/vnd.v1+json, Accept: app/vnd.v2+json
  ✓ RESTfully correct, very precise
  ✗ Very verbose, rarely used, weak tooling support

💡 Recommended Approach by Use Case
  • Public REST API: URL Path (/v1/)
  • Internal Microservices: Header or Query Param
  • Developer Platform (Stripe-like): Date-based Header versioning
  • GraphQL APIs: Evolve schema, avoid versions

🏦 How Stripe Does It: Date-Anchored Versions
Each account is pinned to the Stripe version that was current when it made its first API request. Clients opt in to upgrades explicitly. An integration pinned to Stripe-Version: 2020-03-02 still works in 2026.
Fig 1 - The 4 main API versioning strategies and when to use them

Strategy Deep-Dive: URL Path Versioning

This is what most teams use, and for good reason. Let’s look at what it actually means in practice.

# v1 - original
GET /api/v1/users/42
Response:
{
  "user_name": "anshuman",
  "email": "anshuman@example.com"
}

# v2 - new shape, runs in parallel
GET /api/v2/users/42
Response:
{
  "username": "anshuman",
  "contact": {
    "email": "anshuman@example.com",
    "phone": null
  }
}

Both run simultaneously. Old clients keep hitting /v1. New clients use /v2. The URL is the contract, and the version is part of the contract.
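
One way to keep two versions alive without duplicating business logic is to store a single internal model and give each public version its own serializer. The sketch below assumes that layout; the `users` map, the `serializers` table, and `handleGetUser` are illustrative names, not part of any framework.

```javascript
// Internal representation; never exposed directly to clients.
const users = new Map([
  [42, { username: "anshuman", email: "anshuman@example.com", phone: null }],
]);

// One serializer per public version. v1 keeps the original field names.
const serializers = {
  v1: (u) => ({ user_name: u.username, email: u.email }),
  v2: (u) => ({ username: u.username, contact: { email: u.email, phone: u.phone } }),
};

// Resolve "/api/<version>/users/<id>" to a status code and a body.
function handleGetUser(path) {
  const match = /^\/api\/(v\d+)\/users\/(\d+)$/.exec(path);
  if (!match) return { status: 404, body: { error: "not_found" } };

  const [, version, id] = match;
  const serialize = serializers[version];
  const user = users.get(Number(id));
  if (!serialize || !user) return { status: 404, body: { error: "not_found" } };

  return { status: 200, body: serialize(user) };
}

console.log(handleGetUser("/api/v1/users/42").body);
console.log(handleGetUser("/api/v2/users/42").body);
```

Because both serializers read from the same record, a fix to the underlying data reaches v1 and v2 clients at the same time; only the shape differs.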

The Stripe Approach: Header Versioning with Date Pins

Stripe is often held up as a benchmark for API design, and its versioning scheme is a large part of the reason:

POST https://api.stripe.com/v1/charges
Stripe-Version: 2024-06-20
Authorization: Bearer sk_live_...

There’s only one “version” in the URL (/v1/) and it’s been there forever. The actual version is a date, sent in the Stripe-Version header. Every account also has a version anchor - the Stripe version that was current when the account made its first API request - and that anchor applies whenever the header is omitted.

This means:

  • You never have to upgrade unless you want to
  • You opt in to breaking changes deliberately
  • Stripe can evolve aggressively without breaking anyone
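
A common way to implement date-pinned versions is to build every response in the latest shape and then undo, newest first, each breaking change released after the client's pinned date. The sketch below illustrates that idea only; it is not Stripe's implementation, and the `api-version` header name, the `CHANGES` list, and its dates are assumptions made up for the example.

```javascript
// Hypothetical breaking changes, newest first. Each entry knows how to turn
// a response in the newer shape back into the shape that preceded it.
const CHANGES = [
  {
    date: "2025-01-15", // email moved under "contact"
    downgrade: (body) => {
      const { contact, ...rest } = body;
      return { ...rest, email: contact.email };
    },
  },
  {
    date: "2024-06-20", // user_name renamed to username
    downgrade: (body) => {
      const { username, ...rest } = body;
      return { ...rest, user_name: username };
    },
  },
];

// Prefer the explicit header, otherwise fall back to the account's pinned version.
function resolveVersion(headers, account) {
  return headers["api-version"] ?? account.pinnedVersion;
}

// Undo every change released after the version the client is pinned to.
// ISO dates (YYYY-MM-DD) compare correctly as plain strings.
function renderForVersion(latestBody, version) {
  let body = latestBody;
  for (const change of CHANGES) {
    if (change.date > version) body = change.downgrade(body);
  }
  return body;
}

const latest = { username: "anshuman", contact: { email: "anshuman@example.com" } };
const version = resolveVersion({}, { pinnedVersion: "2024-01-01" });
console.log(renderForVersion(latest, version)); // old shape: user_name + email
```

The handler code only ever deals with the latest shape; each compatibility shim is a small, isolated function that can be deleted once no account is pinned before its date.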

The Golden Rules of API Evolution

  1. Never remove a field - deprecate it first

    Mark fields with a deprecated: true flag in your OpenAPI spec. Add a Deprecation response header. Document the sunset date. Run in parallel for at least 6 months.

  2. Only add, never remove or rename

    Additive changes are almost always safe (new enum values are the notable exception, covered in rule 5). Old clients that ignore fields they don’t understand keep working. New clients get richer responses. This is the Robustness Principle in action: “be conservative in what you send, liberal in what you accept.”

  3. Treat your API schema like a database migration

    Would you drop a column with live traffic on it? No - you’d use the Expand/Contract pattern: add the new column, migrate data, dual-write to both, then eventually remove the old one. Same logic applies to API fields.

  4. Never change a field’s type

    Changing "age": "27" to "age": 27 seems harmless. It’s not. Clients that pass the value to typed code can fail at runtime. If you must change it, add a new field (for example age_int) and deprecate the old one.

  5. Extend enums carefully

    Adding a new enum value is a “soft” breaking change. Any client with exhaustive pattern matching (TypeScript discriminated unions, Swift enums, Kotlin sealed classes) can fail to decode the new value or fall through unhandled at runtime, and will fail to compile once its generated types are updated. Document that clients must handle unknown values.

  6. Document your stability guarantees explicitly

    Tell your consumers what they can count on. Kubernetes does this with alpha / beta / stable labels. Twitter’s API has “frozen” endpoints that don’t change. Make the contract explicit so clients know what to trust.

  7. Version from day one, not day one-hundred

    The cost of adding /v1/ to your routes on day one is near zero. The cost of retrofitting versioning after you have 50 production clients is enormous - and painful for everyone.
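
Rules 1, 2, and 4 can be checked mechanically. The sketch below compares two flat response schemas, written here as plain field-to-type maps (an assumption made for brevity; a real check would walk an OpenAPI document), and reports removals and type changes. `findBreakingChanges` is an illustrative name.

```javascript
// Compare two flat response schemas (field name -> JSON type) and list the
// changes that would break an existing client.
function findBreakingChanges(oldSchema, newSchema) {
  const problems = [];
  for (const [field, type] of Object.entries(oldSchema)) {
    if (!Object.hasOwn(newSchema, field)) {
      problems.push(`removed or renamed: ${field}`);
    } else if (newSchema[field] !== type) {
      problems.push(`type changed: ${field} (${type} -> ${newSchema[field]})`);
    }
  }
  // Fields that exist only in newSchema are additive and are not reported.
  return problems;
}

const current = { user_name: "string", age: "string", email: "string" };
const proposed = { username: "string", age: "number", email: "string", avatar_url: "string" };

// Reports the missing user_name and the string -> number change on age.
console.log(findBreakingChanges(current, proposed));
```

Run as a CI step against the previous release's schema, a check like this would have flagged the user_name rename from the opening incident before it was deployed.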


The Expand/Contract Pattern for API Evolution

This is the single most useful pattern for evolving APIs without breakage. It mirrors database migration strategy and it works at the API level too.

Phase 1 - EXPAND: add the new thing alongside the old
──────────────────────────────────────────────────────
Response:
{
  "user_name": "anshuman",      ← old field, still here
  "username": "anshuman",       ← new field, added
  "email": "..."
}

Phase 2 - MIGRATE: move clients to the new field
──────────────────────────────────────────────────────
- Update your SDKs and docs to use "username"
- Alert power users / enterprise clients
- Track usage of "user_name" in your analytics
- Wait until "user_name" usage drops to near zero

Phase 3 - CONTRACT: remove the old thing
──────────────────────────────────────────────────────
- Announce the sunset date well in advance
- Log a warning for any client still relying on "user_name"
- Once usage is negligible, remove "user_name"

✅ Real-world timeline: The expand phase can last days. The contract phase should wait months. Stripe gives at least 6 months’ notice for deprecations. For high-scale public APIs with enterprise clients, a year is not unreasonable.
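
In code, the three phases often come down to a single serializer with a switch for the old field. A minimal sketch, assuming a hypothetical `serializeUser` function and a `phase` value that would normally come from configuration:

```javascript
// phase: "expand" while both names are served, "contract" once the old one is gone.
function serializeUser(user, phase) {
  const body = { username: user.username, email: user.email };
  if (phase === "expand") {
    // Old name, same value, kept for clients that have not migrated yet.
    body.user_name = user.username;
  }
  return body;
}

const user = { username: "anshuman", email: "anshuman@example.com" };
console.log(serializeUser(user, "expand"));   // username, email, and user_name
console.log(serializeUser(user, "contract")); // username and email only
```

Keeping the phase in configuration rather than in code means the contract step can be rolled back with a setting change if a straggling client turns up.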


Deprecation Without Drama

Deprecation is not just about deleting code - it’s a communication strategy. Here’s how to do it well.

In your OpenAPI/Swagger spec:

/users/{id}:
  get:
    responses:
      '200':
        content:
          application/json:
            schema:
              properties:
                user_name:
                  type: string
                  deprecated: true          # ← marks it in the spec
                  description: "Deprecated. Use `username` instead. Will be removed 2027-01-01."
                username:
                  type: string

In your response headers:

HTTP/1.1 200 OK
Deprecation: true
Sunset: Fri, 01 Jan 2027 00:00:00 GMT
Link: <https://api.yourco.com/docs/migration/v2>; rel="deprecation"
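
Generating these headers from one place avoids hand-typed dates. The helper below is a sketch; `deprecationHeaders` is a made-up name. The `Deprecation: true` value mirrors the header block above, whereas RFC 9745 defines the value as a date of the form `@<unix-seconds>`, so check which form your clients and tooling expect.

```javascript
// Build the deprecation-related response headers for a given sunset instant.
function deprecationHeaders(sunsetIso, migrationUrl) {
  return {
    Deprecation: "true",
    // toUTCString() produces the HTTP-date format the Sunset header requires.
    Sunset: new Date(sunsetIso).toUTCString(),
    Link: `<${migrationUrl}>; rel="deprecation"`,
  };
}

const headers = deprecationHeaders(
  "2027-01-01T00:00:00Z",
  "https://api.yourco.com/docs/migration/v2"
);
console.log(headers.Sunset); // Fri, 01 Jan 2027 00:00:00 GMT
```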

In your application logs:

// Log a warning when a deprecated field is accessed.
// `clientUsedDeprecatedField` stands in for however your service detects this.
if (clientUsedDeprecatedField) {
  logger.warn({
    event: 'deprecated_field_access',
    field: 'user_name',
    clientId: req.clientId,
    sunset: '2027-01-01'
  });
}

This gives you the data to know exactly which clients are still using the old field - so you can reach out to them proactively rather than surprising them.
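
Turning those log events into a list of clients to contact can start as a pair of counters. The sketch below keeps them in memory purely for illustration (a real service would query its metrics or log pipeline); `recordRequest`, `clientsToContact`, and `deprecatedShare` are hypothetical names.

```javascript
// clientId -> { total, deprecated }
const usage = new Map();

function recordRequest(clientId, usedDeprecatedField) {
  const entry = usage.get(clientId) ?? { total: 0, deprecated: 0 };
  entry.total += 1;
  if (usedDeprecatedField) entry.deprecated += 1;
  usage.set(clientId, entry);
}

// Clients that still depend on the old field, most affected first.
function clientsToContact() {
  return [...usage.entries()]
    .filter(([, e]) => e.deprecated > 0)
    .sort(([, a], [, b]) => b.deprecated - a.deprecated)
    .map(([clientId, e]) => ({ clientId, deprecated: e.deprecated, total: e.total }));
}

// Share of all requests that still touch the deprecated field.
function deprecatedShare() {
  let total = 0;
  let deprecated = 0;
  for (const e of usage.values()) {
    total += e.total;
    deprecated += e.deprecated;
  }
  return total === 0 ? 0 : deprecated / total;
}

recordRequest("acme", true);
recordRequest("acme", false);
recordRequest("globex", false);
console.log(clientsToContact()); // only "acme" is listed
console.log(deprecatedShare());  // one request in three
```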


Strategy Comparison

Strategy            | Discoverability | Cacheable    | Browser Testable | Used By                     | Best For
URL Path (/v1/)     | ✓ High          | ✓ Yes        | ✓ Yes            | GitHub, Twitter, Reddit     | Public REST APIs
Query Param (?v=1)  | ~ Medium        | ~ Varies     | ✓ Yes            | Google APIs, internal tools | Internal/simple APIs
Custom Header       | ✗ Low           | ✓ Yes (Vary) | ✗ Needs curl     | Stripe, AWS                 | Developer platforms
Accept Header       | ✗ Low           | ~ Varies     | ✗ Needs curl     | GitHub (media types)        | Purist REST
GraphQL Evolution   | ✓ High          | ~ Complex    | ~ GraphiQL       | Facebook, Shopify           | Flexible query APIs

The Versioning Lifecycle Diagram

[Figure: API Version Lifecycle]
Stages: Alpha / Beta → Active / Stable (new clients onboarded) → Deprecated (migration window open) → Sunset / EOL (HTTP 410 Gone)
Milestones: Launch → GA Release → Deprecation Notice → Removal Date
Stable window: months to years. Migration window: 6–12 months minimum.
After sunset: return HTTP 410 Gone, not 404. It tells clients the resource existed but was intentionally removed.
Fig 2 - API version lifecycle from alpha to sunset

💡 Why HTTP 410 and not 404? 404 means “I can’t find this,” and says nothing about whether it ever existed or might come back. 410 means “I know what you’re asking for, and it’s intentionally gone.” This distinction matters for clients that auto-retry on 404 - they have no reason to do the same on 410.
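
The 404 versus 410 decision is easiest to make before routing, from a small lifecycle table. In the sketch below the `VERSIONS` map, its dates, and the error codes in the body are all assumptions for illustration.

```javascript
// Hypothetical lifecycle table: which versions exist and when each was retired.
const VERSIONS = new Map([
  ["v1", { sunset: "2027-01-01", migrationGuide: "https://api.yourco.com/docs/migration/v2" }],
  ["v2", { sunset: null }],
]);

// Unknown versions are 404; versions past their sunset date are 410.
// Dates are ISO strings (YYYY-MM-DD), so string comparison is enough.
function versionStatus(version, today) {
  const info = VERSIONS.get(version);
  if (!info) return { status: 404, body: { error: "unknown_version" } };
  if (info.sunset && today >= info.sunset) {
    return {
      status: 410,
      body: {
        error: "version_retired",
        sunset: info.sunset,
        migration_guide: info.migrationGuide,
      },
    };
  }
  return { status: 200 };
}

console.log(versionStatus("v1", "2026-06-01").status); // 200
console.log(versionStatus("v1", "2027-02-01").status); // 410
console.log(versionStatus("v9", "2027-02-01").status); // 404
```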


Real-World Example: Evolving a User Profile Endpoint

Let’s walk through a concrete evolution scenario end-to-end.

The starting point:

GET /api/v1/users/42
{
  "user_name": "anshuman",
  "full_name": "Anshuman Singh",
  "email": "anshuman@example.com"
}

The desired end state (after product decides to restructure):

{
  "username": "anshuman",
  "profile": {
    "display_name": "Anshuman Singh",
    "contact": {
      "email": "anshuman@example.com"
    }
  }
}

The wrong way: rename everything, merge to main, deploy. Wake up to PagerDuty alerts.

The right way:

Step 1 - EXPAND (Week 1): Serve both shapes simultaneously
───────────────────────────────────────────────────────────
{
  "user_name": "anshuman",          // old, still here
  "username": "anshuman",           // new, added
  "full_name": "Anshuman Singh",    // old
  "email": "anshuman@example.com",  // old
  "profile": {                      // new, added
    "display_name": "Anshuman Singh",
    "contact": {
      "email": "anshuman@example.com"
    }
  },
  "_meta": {
    "deprecated_fields": ["user_name", "full_name", "email"],
    "sunset_date": "2027-03-01"
  }
}

Step 2 - COMMUNICATE (Week 2): Notify clients
───────────────────────────────────────────────────────────
- Update changelog
- Send email to API key holders
- Add Deprecation + Sunset headers
- Track usage of old fields in your metrics

Step 3 - CONTRACT (Month 6+): Remove when usage is <1%
───────────────────────────────────────────────────────────
Remove old fields. Celebrate.

Frequently Asked Questions

Do I need a new version for every change?
No - only breaking changes require a version bump. Additive changes (new optional fields, new endpoints, relaxed validation) are backward-compatible and should be deployed without a new version. Over-versioning is as harmful as under-versioning - it creates unnecessary migration burden for clients.

How many versions should I support at once?
A common practice is 2–3 versions: one current, one previous (still supported), and optionally one in its sunset window. Supporting more than 3 versions simultaneously becomes very expensive to maintain - each version needs separate testing, documentation, and potentially separate code paths. Establish a clear deprecation cadence and stick to it.

How does versioning work with GraphQL?
GraphQL’s design philosophy actively avoids versioning. Instead, you evolve the schema: add new fields, mark old ones with @deprecated, and use the introspection system so clients can discover what’s available. The GitHub GraphQL API, for example, has been running on a single version since launch. The tradeoff is that schema evolution requires more discipline - you can never remove a field until usage drops to zero.

How should I version webhooks?
Webhooks are harder than REST because you’re pushing data to clients, not serving it on demand. The common patterns are: (1) include a schema_version field in every webhook payload so clients can route to the right handler; (2) let clients subscribe to a specific version (Stripe’s approach - your webhook config includes a version pin); (3) maintain a webhook envelope format that stays stable while the data field evolves. Never remove fields from webhook payloads without a long deprecation window.

Is it ever acceptable to ship a breaking change without a new version?
Only in a few narrow circumstances: (a) you’re in alpha/beta and clients have explicitly agreed to instability; (b) the current behavior is a security vulnerability and the risk of not patching outweighs the breakage; (c) you have complete control of all clients (e.g., your own mobile app and your own server). In all other cases, break the contract and you break trust - and trust is much harder to rebuild than code.

Do the same rules apply to internal APIs and microservices?
The same principles apply, but the timelines are shorter and coordination is easier. For internal APIs, you can often do a coordinated deploy - update the producer and consumer simultaneously in a single release. But if services deploy independently (which they should in a proper microservices setup), you still need the Expand/Contract pattern. Treat your internal APIs as if an external team owns the consumer - it keeps you honest.

Interview Questions

These come up regularly in system design rounds, especially at FAANG-tier and late-stage startups.

Q1. What is a breaking change in an API? Give 3 examples. (Easy)
A breaking change is any modification that causes existing clients - without any change on their end - to malfunction. Three key examples: (1) Removing or renaming a field - clients that read user_name get undefined once it’s renamed to username. (2) Changing a field’s type - switching from string to number (or vice versa) causes type errors in typed clients. (3) Adding a required field to a request body - old clients don’t send it, so the new validation rejects their requests. Contrast with additive changes (new optional fields, new endpoints), which are safe.
Q2. Compare URL versioning vs header versioning. When would you choose each? (Medium)
URL versioning (/v1/, /v2/) is highly discoverable, cacheable by CDNs, easy to test in a browser, and the dominant industry choice for public APIs (GitHub, Reddit, Twitter). The downside is URL proliferation and separate codepaths per version. Header versioning (e.g., a dated header such as Stripe’s Stripe-Version: 2024-06-20) keeps URLs clean and is excellent for platforms where clients pin to a version for months or years. The downside is lower discoverability and the fact that you can’t test it from a browser address bar; you need curl or a similar tool. Choose URL versioning for public developer APIs where discoverability matters; choose header versioning for sophisticated developer platforms where stability guarantees over long periods are the primary concern.
Q3. Describe the Expand/Contract pattern for API evolution. (Medium)
Expand/Contract (borrowed from database migration patterns) evolves an API in stages without ever breaking clients. In the Expand phase, you add the new field/structure alongside the old one - the response temporarily carries both. During the migration window between the two, new clients start using the new field while old clients keep reading the old one. In the Contract phase, once you’ve confirmed through metrics that old field usage has dropped to near zero (and after a suitable deprecation window), you remove the old field. This gives clients time to migrate on their own schedule, makes rollback trivial, and avoids the “big bang” version bump entirely.
Q4. How would you design the deprecation and sunset process for a high-traffic public API? (Hard)
A production-grade deprecation process has several layers. Discovery: add deprecated: true to your OpenAPI spec, include Deprecation and Sunset response headers on every affected response, link to a migration guide. Communication: email all API key holders with sunset timeline, post a changelog entry, update SDK docs. Monitoring: log every use of deprecated endpoints/fields with the client ID - this tells you exactly who still needs to migrate. Enforcement: 30 days before sunset, consider returning a soft warning in the response body. On sunset date, return HTTP 410 Gone (not 404) with a helpful error body pointing to the migration guide. Keep the 410 response running for at least 3 months so latecomers get a clear signal. For enterprise clients, offer migration support contracts.
Q5. Why does GraphQL argue against versioning? What’s the tradeoff? (Hard)
GraphQL’s “versionless” philosophy rests on the observation that REST versioning is usually a response to over-fetching - clients receive more fields than they need and any change breaks them. GraphQL solves this at the query level: clients declare exactly what fields they need, so adding new fields is always safe (clients never receive them unless they ask). Removing fields follows the @deprecated → monitor usage → remove pattern. The tradeoff: schema evolution requires much stricter discipline. You can never remove a field until you can prove it’s not being queried - which requires query-level analytics on your schema. You also lose the ability to do clean breaking rewrites behind a new version number. For teams that are willing to invest in schema governance, GraphQL’s approach scales beautifully. For teams that want explicit version boundaries (e.g., “v2 is completely different”), REST versioning is more pragmatic.
Q6. How would you handle backward compatibility for webhooks specifically? (Hard)
Webhooks are push-based, so the producer can’t negotiate with the consumer at delivery time. The key patterns are: (1) Version envelope - always include a stable outer envelope with { "event_type": "...", "schema_version": "2", "data": {...} }. The envelope never changes; only the data object evolves, and it’s versioned separately. (2) Per-subscription version pinning (Stripe’s model) - each webhook subscription is pinned to a schema version; clients opt in to new versions by updating their subscription. (3) Additive-only payload changes - treat webhook payloads like API responses: only add fields, never remove or rename. (4) Idempotency keys - include an event ID so consumers can recognise and skip an event that is delivered more than once when the producer retries. The most important thing: never change the type or name of a field in a webhook payload without the full Expand/Contract cycle.
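
The envelope and event-ID patterns from the last answer can be sketched as follows. The `id` key, the handler table, and the function names are assumptions for illustration; the envelope keys themselves follow the example given in the answer.

```javascript
// Producer side: the envelope keys stay fixed; only `data` evolves.
function buildEvent(id, eventType, schemaVersion, data) {
  return { id, event_type: eventType, schema_version: schemaVersion, data };
}

// Consumer side: one handler per schema version, plus de-duplication by event ID.
const seen = new Set();
const handlers = new Map([
  ["1", (data) => `user ${data.user_name} updated`],
  ["2", (data) => `user ${data.username} updated`],
]);

function receive(event) {
  if (seen.has(event.id)) return "duplicate ignored";
  const handler = handlers.get(event.schema_version);
  if (!handler) return "unsupported schema_version";
  seen.add(event.id); // mark as processed only once a handler exists for it
  return handler(event.data);
}

const event = buildEvent("evt_1", "user.updated", "2", { username: "anshuman" });
console.log(receive(event)); // user anshuman updated
console.log(receive(event)); // duplicate ignored
```

Routing on schema_version lets a consumer add a handler for a new version before switching its subscription over, so the cut-over itself is a configuration change rather than a code deploy.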

The Mental Model to Take Away

Priya’s 2 AM incident could have been prevented with one shift in thinking: treat your API as a published contract, not a private implementation detail.

Your clients - whether they’re a mobile app, a third-party integration, or another team’s microservice - write real code against your API. Every field name is a variable name in their codebase. Every status code is a branch in their error handler.

When you evolve an API:

  • Add, don’t replace. New fields can coexist with old ones.
  • Deprecate loudly, remove quietly. Give clients time, data, and a migration guide.
  • Version from day one. /v1/ costs nothing to add on day one and everything to retrofit later.
  • Measure before you remove. Use your own logs to know when it’s safe to cut the old field.
  • Return 410, not 404. Signal intent, not absence.

The best API is one that clients barely notice evolving - because it does so gracefully, predictably, and always with them in mind.