API caching can reduce origin load, improve latency, and smooth traffic spikes, but it can also leak private data or serve stale responses if it is keyed or controlled poorly. This guide explains how to cache APIs safely, which methods and responses are usually good candidates, which HTTP headers matter most, and how to handle edge cases like authentication, query strings, pagination, and invalidation at the CDN or reverse proxy layer.
Overview
If you are working on a public API, a backend-for-frontend, or an application that serves repeated read traffic, caching API responses is often one of the simplest ways to improve performance. Done well, it reduces response time, lowers origin compute usage, and improves resilience under burst traffic. Done carelessly, it can create the worst kind of bug: the system looks fast, but users receive the wrong data.
A safe API caching strategy starts with one principle: not every response should be cached, and not every cache should behave the same way. Browser caches, CDN edge caching, reverse proxy cache layers, and application-level caches all have different roles. For most teams, the practical question is not “should we cache APIs?” but “which responses are safe to cache, for how long, and using which cache key?”
In general, the best candidates for REST API caching are read-heavy, frequently requested, low-sensitivity responses that change on a predictable schedule. Examples include product catalogs, public documentation payloads, location metadata, feature flag manifests intended for public clients, or aggregated content used across many sessions. By contrast, highly personalized responses, sensitive account data, and anything tied to per-user authorization usually require either no shared caching or extremely careful separation.
It helps to think in terms of cache scope:
- Private cache: a browser or client-specific cache. Suitable for user-specific but non-sensitive responses in some cases.
- Shared cache: a CDN, reverse proxy cache, or gateway cache used by many users. This is where most risk lives.
- Application cache: server-side memoization, object caching, or datastore query caching. Useful, but outside standard HTTP semantics.
For edge caching, shared cache safety is the main concern. If a response varies by user, token, locale, tenant, or feature state, the cache key must reflect that variation, or the response should bypass the shared cache entirely. If you remember only one thing from this guide, remember this: cacheability and cache key design matter more than the TTL alone.
Core framework
Here is a practical framework for deciding how to cache APIs safely.
1. Classify the response before you cache it
Before choosing headers, classify each endpoint into one of four buckets:
- Public and stable: same response for everyone, changes infrequently. Good shared-cache candidate.
- Public but fast-changing: same response for everyone, but updates often. Cacheable with shorter TTLs and revalidation.
- User-specific or tenant-specific: varies by identity or account scope. Usually private-cache only, or shared-cache with an explicit segmented key.
- Sensitive or transactional: account balances, checkout state, tokens, health data, admin responses. Usually avoid shared caching.
This first classification prevents a common mistake: treating all GET endpoints as equally cacheable. GET is cacheable by method semantics, but that does not mean every GET response belongs in a CDN cache.
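One way to make the classification explicit is to record it next to each route. The sketch below is illustrative only: the `ResponseClass` and `CacheScope` names and the `segmented_key` flag are assumptions introduced here, not part of any framework.

```python
from enum import Enum


class ResponseClass(Enum):
    PUBLIC_STABLE = "public and stable"
    PUBLIC_FAST_CHANGING = "public but fast-changing"
    USER_OR_TENANT_SPECIFIC = "user-specific or tenant-specific"
    SENSITIVE = "sensitive or transactional"


class CacheScope(Enum):
    SHARED = "shared"    # CDN, reverse proxy, gateway
    PRIVATE = "private"  # browser or other single-client cache
    NONE = "none"        # do not cache


def default_scope(response_class: ResponseClass, segmented_key: bool = False) -> CacheScope:
    """Suggest the widest cache scope that is usually safe for a bucket."""
    if response_class in (ResponseClass.PUBLIC_STABLE, ResponseClass.PUBLIC_FAST_CHANGING):
        return CacheScope.SHARED
    if response_class is ResponseClass.USER_OR_TENANT_SPECIFIC:
        # Shared caching only when the key is explicitly segmented by user or tenant.
        return CacheScope.SHARED if segmented_key else CacheScope.PRIVATE
    # Sensitive, transactional, or anything unrecognized: fail closed.
    return CacheScope.NONE
```

The default for anything that is not clearly public is the narrower scope, so a newly added GET endpoint does not end up in a shared cache by accident.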
2. Use HTTP method semantics carefully
Most API caching happens on GET and sometimes HEAD. These are the natural methods for cacheable reads. POST, PUT, PATCH, and DELETE generally should not be cached in shared layers unless you have a very deliberate design and explicit support in your stack.
Even when only GET responses are cached, write operations still matter because they affect invalidation. If a POST updates product inventory, any related GET response may need purge or revalidation logic. Safe API caching is as much about invalidation after writes as it is about caching reads.
3. Set cache headers that match intent
The core headers for REST API caching are usually:
- Cache-Control
- ETag
- Last-Modified
- Vary
- Surrogate-Control or platform-specific edge controls, where supported
For a public shared-cache response, a pattern like this is often reasonable:
Cache-Control: public, max-age=60, s-maxage=300, stale-while-revalidate=30
This means browsers may keep it for 60 seconds, shared caches can keep it for 300 seconds, and stale content can be served briefly while the cache revalidates. The exact values should match your update frequency and correctness requirements.
For a private user-specific response:
Cache-Control: private, max-age=30
That tells shared caches not to reuse the response across users, while still allowing a client-side cache in some cases.
For highly sensitive responses:
Cache-Control: no-store
Use this when you do not want the response retained by any cache.
Do not confuse no-cache with no-store. no-cache allows storage but requires revalidation before reuse. no-store means do not store it at all.
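To make these directives concrete, here is a minimal sketch of how a cache might read a Cache-Control value. The function names are hypothetical, and the logic is deliberately simplified: it ignores heuristic freshness, the Age header, and stale-while-revalidate, so treat it as an illustration rather than a conforming cache.

```python
from typing import Dict, Optional, Tuple


def parse_cache_control(header: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header.split(","):
        name, sep, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') if sep else None
    return directives


def cache_decision(header: str, shared: bool) -> Tuple[bool, int]:
    """Return (may_store, fresh_seconds) for a shared or private cache."""
    d = parse_cache_control(header)
    if "no-store" in d or (shared and "private" in d):
        return (False, 0)  # this cache must not store the response
    if "no-cache" in d:
        return (True, 0)   # storable, but revalidate before every reuse
    lifetime = d.get("max-age")
    if shared and d.get("s-maxage") is not None:
        lifetime = d["s-maxage"]  # s-maxage overrides max-age in shared caches
    fresh = int(lifetime) if lifetime is not None and lifetime.isdigit() else 0
    return (True, fresh)
```

With the public example above, `cache_decision(value, shared=True)` yields a 300-second lifetime and `shared=False` yields 60 seconds, while `no-cache` is storable with zero freshness and `no-store` is not storable at all.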
4. Design the cache key explicitly
The cache key defines what makes one response distinct from another, and this is where many production caching bugs begin. A safe key usually includes the path and only the query parameters or headers that truly change the response.
Common cache key dimensions include:
- Path
- Normalized query parameters
- Locale or language
- API version
- Tenant or region
- Selected request headers
- Device class, only if the response truly differs
Be very cautious with these:
- Authorization: if a response depends on it, shared caching is risky unless the key is segmented correctly.
- Cookies: many CDNs bypass cache when cookies are present, sometimes wisely.
- All query strings by default: this often destroys cache hit ratio and can create unnecessary fragmentation.
A good rule is to normalize the key to the smallest set of meaningful inputs. If ?utm_source= does not change the payload, it should not create a separate cache object. If ?page=2 or ?sort=price_asc changes the payload, it must be part of the key.
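The normalization rule can be sketched with the Python standard library. The function name and the allow-list of parameters are assumptions for illustration; a real CDN or proxy expresses the same idea in its own configuration language.

```python
from typing import Iterable
from urllib.parse import parse_qsl, urlencode, urlsplit


def normalized_cache_key(url: str, allowed_params: Iterable[str]) -> str:
    """Build a key from the path plus only the parameters that change the payload."""
    allowed = set(allowed_params)
    parts = urlsplit(url)
    # parse_qsl drops parameters with empty values by default.
    pairs = {(k, v) for k, v in parse_qsl(parts.query) if k in allowed}
    # Sorting makes the key independent of parameter order; the set removes duplicates.
    query = urlencode(sorted(pairs))
    return f"{parts.path}?{query}" if query else parts.path


# Hypothetical allow-list for a product listing endpoint.
PRODUCT_PARAMS = ("category", "page", "sort")
key = normalized_cache_key(
    "/api/products?utm_source=news&page=2&category=laptops", PRODUCT_PARAMS
)
# key == "/api/products?category=laptops&page=2"
```

An allow-list is used rather than a deny-list so that a new, unreviewed parameter is ignored by default. That is only safe if the origin also ignores unknown parameters; otherwise the two layers can disagree about what the response should contain.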
For a broader look at bypass conditions and key fragmentation, see CDN Cache Bypass Rules Explained: Cookies, Query Strings, and Headers.
5. Prefer revalidation where freshness matters
Short TTLs are useful, but they are not the only tool. Validators such as ETag and Last-Modified let clients and edge layers revalidate instead of downloading full responses again. This is especially helpful for API responses that are requested often but do not always change.
Conditional requests work like this:
- Origin returns an ETag or Last-Modified.
- Client or cache later sends If-None-Match or If-Modified-Since.
- Origin replies 304 Not Modified if unchanged.
This approach reduces bandwidth and origin work without extending staleness too aggressively.
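The ETag half of that exchange can be sketched as follows. This is a simplified illustration, not a full implementation: the helper names are hypothetical, the validator is derived from a hash of the body (one common choice among several), and If-Modified-Since handling is omitted.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Derive a quoted entity tag from the response body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def _strip_weak(tag: str) -> str:
    return tag[2:] if tag.startswith("W/") else tag


def conditional_get(body: bytes, if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET with an optional If-None-Match."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = [t.strip() for t in if_none_match.split(",")]
        # If-None-Match uses weak comparison, so a W/ prefix is ignored.
        if "*" in candidates or any(_strip_weak(c) == etag for c in candidates):
            return 304, headers, b""
    return 200, headers, body


first = conditional_get(b'{"id": 1}', None)
second = conditional_get(b'{"id": 1}', first[1]["ETag"])
# first[0] == 200 and second[0] == 304
```

Note that in this sketch the origin still builds the body in order to hash it, so the saving is bandwidth. Avoiding origin compute as well requires a validator that is cheaper to obtain, such as a stored version number or update timestamp.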
6. Match invalidation to data volatility
There are only a few ways cached API data becomes fresh again:
- TTL expiry
- Revalidation
- Explicit purge or ban
- Versioned URLs or cache-busting keys
If your API data changes unpredictably and correctness matters immediately, rely on purge or event-driven invalidation rather than a long TTL. If it changes on a schedule, TTL plus revalidation may be enough. If the response is tied to content releases or data snapshots, versioning can simplify everything.
For teams implementing shared edge caching, it is worth documenting exactly which write operations trigger which cache purge paths. If you need a broader purge workflow, see How to Purge CDN Cache Without Breaking Your Site.
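One lightweight way to document the write-to-purge mapping is a small rule table that lives with the API code. The routes and rule shape below are hypothetical, and the function only computes which path prefixes to purge; the actual purge call depends on your CDN or proxy and is left out.

```python
from typing import List, NamedTuple, Tuple


class PurgeRule(NamedTuple):
    methods: Tuple[str, ...]         # write methods that trigger the rule
    written_prefix: str              # path prefix of the write endpoint
    purge_prefixes: Tuple[str, ...]  # cached path prefixes to purge


# Hypothetical routes for illustration only.
PURGE_RULES = [
    PurgeRule(("POST", "PUT", "PATCH", "DELETE"), "/api/products", ("/api/products", "/api/search")),
    PurgeRule(("PUT", "PATCH"), "/api/settings", ("/api/settings",)),
]


def _under(path: str, prefix: str) -> bool:
    # Segment-aware match so "/api/products-archive" does not match "/api/products".
    return path == prefix or path.startswith(prefix + "/")


def purge_targets(method: str, path: str) -> List[str]:
    """List the cached prefixes that a write to `path` should invalidate."""
    targets: List[str] = []
    for rule in PURGE_RULES:
        if method.upper() in rule.methods and _under(path, rule.written_prefix):
            for prefix in rule.purge_prefixes:
                if prefix not in targets:
                    targets.append(prefix)
    return targets
```

Under these assumed rules, a `PATCH /api/products/42` would purge both the product and search prefixes, while a `GET` to the same path triggers nothing.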
Practical examples
These examples show how the framework works in production-style situations.
Public catalog endpoint
GET /api/products?category=laptops&page=1
This is often a strong candidate for edge caching. The response is public, repeated across many users, and usually tolerates brief staleness.
A reasonable setup might be:
- Cache by path plus category, page, and sort
- Ignore tracking parameters
- Use Cache-Control: public, s-maxage=300, max-age=60, stale-while-revalidate=30
- Purge category and product-list endpoints when price or stock changes materially
Serving this type of endpoint from an edge delivery network can reduce TTFB significantly.
Per-user dashboard endpoint
GET /api/me/dashboard
This endpoint usually includes personalized metrics, recent events, or account state. Do not place it into a shared CDN cache keyed only by URL. If cached incorrectly, one user can receive another user’s data.
Safer options:
- Cache-Control: private, max-age=30 for client-side reuse
- Cache-Control: no-store if the data is sensitive or frequently changing
- Application-layer caching behind the API, keyed by user ID, if backend optimization is needed
In most cases, shared caching is the wrong tool here.
Tenant-scoped API in a SaaS product
GET /api/settings
This endpoint is risky for a shared cache because the path is identical for every tenant while the response changes by account. If requests are routed with a tenant header or hostname, the cache key must include that dimension.
Safe pattern:
- Key by hostname or explicit tenant identifier
- Do not vary on arbitrary headers
- Use Vary only on the headers that truly alter the response
- Document invalidation for tenant config changes
If your CDN or reverse proxy cache cannot key cleanly by tenant, bypass shared caching for this endpoint.
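A tenant-segmented key, with a bypass when no tenant dimension is available, might look like the sketch below. The function name and key format are assumptions; keying by hostname is only meaningful when each tenant has its own hostname.

```python
from typing import Optional


def tenant_cache_key(path: str, host: Optional[str] = None,
                     tenant_id: Optional[str] = None) -> Optional[str]:
    """Return a tenant-scoped cache key, or None to signal a shared-cache bypass."""
    tenant = (tenant_id or "").strip()
    if tenant:
        return f"tenant:{tenant}|{path}"
    hostname = (host or "").strip().lower()  # hostnames are case-insensitive
    if hostname:
        return f"host:{hostname}|{path}"
    # No reliable tenant dimension: do not put the response in a shared cache.
    return None
```

Returning `None` instead of falling back to the bare path is deliberate: a key without the tenant dimension is exactly the case that serves one tenant's settings to another.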
Search endpoint
GET /api/search?q=cache&page=2
Search can be cacheable, but cardinality is high. If every unique query becomes a separate object, the cache hit ratio may be poor. Consider caching only popular searches, normalizing whitespace and casing, or setting a short TTL.
Practical approach:
- Normalize q where safe
- Include page in the key
- Keep TTL short
- Do not cache authenticated search results in shared cache unless segmented carefully
This is a case where API caching strategy should be driven by request patterns, not just technical possibility.
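As a rough sketch of the normalization step, the helpers below collapse whitespace and case before building the key. The function names are hypothetical, and case folding is only safe if the search backend itself is case-insensitive.

```python
from urllib.parse import urlencode


def normalize_search_term(q: str) -> str:
    """Collapse runs of whitespace and fold case so equivalent queries share a key."""
    return " ".join(q.split()).casefold()


def search_cache_key(q: str, page: int = 1) -> str:
    """Build a key from the normalized query and the page number."""
    return "/api/search?" + urlencode([("page", page), ("q", normalize_search_term(q))])


key = search_cache_key("  Cache   Headers ", page=2)
# key == "/api/search?page=2&q=cache+headers"
```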
Versioned content API
GET /api/v2/docs/getting-started
Versioned documentation or CMS-delivered content is often ideal for edge caching. Responses are public, consistent, and versioning reduces purge complexity.
A common pattern is:
- Longer edge TTL
- Strong validators with ETag
- Purge on publish, or new versioned URL on major content changes
If you are balancing reverse proxy cache and CDN layers, Reverse Proxy Cache vs CDN: What’s the Difference and When Do You Need Both? provides a useful model for deciding which layer should own what.
Common mistakes
The fastest way to make API caching unsafe is to assume the defaults are protecting you. They often are not. These are the mistakes that deserve explicit review.
Caching authenticated responses in a shared layer without segmentation
If the response depends on a bearer token, session cookie, or user identity, do not let the CDN cache it by URL alone. This is the classic private-data leak.
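A guard for this case can be sketched as a single check before a shared cache stores or reuses a response. The function name is hypothetical. The Authorization rule follows the HTTP caching specification, which lets shared caches reuse responses to authenticated requests only when the response carries `public`, `s-maxage`, or `must-revalidate`; bypassing on any Cookie header is a conservative policy assumption, not a requirement of the standard.

```python
from typing import Dict, Set


def _directive_names(cache_control: str) -> Set[str]:
    return {p.split("=", 1)[0].strip().lower() for p in cache_control.split(",") if p.strip()}


def may_use_shared_cache(request_headers: Dict[str, str], response_cache_control: str) -> bool:
    """Decide whether a shared cache may store and reuse this response."""
    names = _directive_names(response_cache_control)
    if "no-store" in names or "private" in names:
        return False
    request_names = {k.lower() for k in request_headers}
    if "authorization" in request_names:
        # Only reusable when the response explicitly opts in to shared caching.
        return bool(names & {"public", "s-maxage", "must-revalidate"})
    if "cookie" in request_names:
        return False  # conservative assumption: cookies may imply personalization
    return True
```

Passing this check does not make a URL-only key safe. If the body still differs per user, the response should be marked `private` or the key must be segmented.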
Ignoring Vary behavior
If your response changes by Accept-Language, by the Origin header (CORS logic), by compression (Accept-Encoding), or by a custom tenant header, that variation must be represented correctly. Misusing Vary can either leak data or create needless cache fragmentation.
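Conceptually, Vary adds a secondary key built from the listed request headers. The sketch below shows that idea under simplifying assumptions: the function name is hypothetical, header values are compared as plain strings, and real caches may normalize values such as Accept-Encoding before comparing them.

```python
from typing import Dict, Optional, Tuple


def variant_key(vary: str, request_headers: Dict[str, str]) -> Optional[Tuple[Tuple[str, str], ...]]:
    """Build the secondary cache key implied by a Vary response header."""
    names = sorted({n.strip().lower() for n in vary.split(",") if n.strip()})
    if "*" in names:
        return None  # "Vary: *" means a stored response cannot be matched to later requests
    lowered = {k.lower(): v.strip() for k, v in request_headers.items()}
    # A header missing from the request is treated as an empty value.
    return tuple((name, lowered.get(name, "")) for name in names)
```

Two requests share a cached variant only when their `variant_key` results are equal. This also shows how fragmentation arises: every distinct raw value of a listed header creates another stored object.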
Including too many query strings in the key
Tracking parameters, empty filters, parameter order, and duplicate values can explode object count and reduce cache hit ratio. Normalize aggressively where it is safe to do so.
Setting a long TTL without an invalidation plan
Long cache life sounds efficient until urgent updates are needed. If an API powers prices, stock status, feature flags, or compliance content, ask first: how will we force freshness when data changes unexpectedly?
Using no-cache when you mean no-store
These directives are not interchangeable. If storage itself is the problem, use no-store.
Assuming every GET should be cached at the edge
Some GET endpoints are technically cacheable but operationally poor candidates because of sensitivity, volatility, or low reuse. Safety and usefulness matter more than semantics alone.
Skipping observability
Without response headers, logs, or trace tools, it becomes hard to diagnose why an API response was a HIT, MISS, BYPASS, or REVALIDATED. Include cache status headers where your platform supports them, and test with controlled requests before rollout. For troubleshooting patterns, see How to Diagnose a CDN Cache MISS: A Step-by-Step Troubleshooting Checklist and How to Improve Cache Hit Ratio on a CDN.
When to revisit
API caching rules should be reviewed whenever the shape of the response, the identity model, or the edge platform changes. A configuration that was safe six months ago can become wrong after a product feature adds localization, tenancy, personalization, or new query parameters.
Revisit your setup when:
- You add authentication, session cookies, or token-based authorization to an endpoint that was previously public
- You introduce new query parameters, sort orders, filters, or pagination rules
- You move an API behind a CDN, gateway, or reverse proxy cache for the first time
- You change how tenancy, region, or language is resolved
- You add stale serving, background revalidation, or new cache directives
- You adopt new tooling or standards that change edge behavior
- You notice lower cache hit ratio, stale data incidents, or unexplained cache bypasses
A practical review process is simple:
- List your top API endpoints by traffic and business importance.
- Label each one public, tenant-scoped, user-specific, or sensitive.
- Document the intended cache scope: browser, edge, reverse proxy, or none.
- Write the exact cache key inputs for each cacheable endpoint.
- Confirm the Cache-Control, Vary, and validator headers match that intent.
- Define invalidation triggers for each endpoint touched by writes.
- Test with representative requests that vary identity, locale, cookies, and query strings.
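Parts of this review can be captured as data and checked automatically. The sketch below is one possible shape, not a standard: the `EndpointPolicy` fields, the label and scope strings, and the convention that a tenant-scoped key lists an input named "tenant" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List

SHARED_SCOPES = {"edge", "reverse proxy"}


@dataclass
class EndpointPolicy:
    path: str
    label: str  # "public", "tenant-scoped", "user-specific", or "sensitive"
    scope: str  # "browser", "edge", "reverse proxy", or "none"
    cache_control: str
    key_inputs: List[str] = field(default_factory=list)
    invalidation_triggers: List[str] = field(default_factory=list)


def policy_problems(policy: EndpointPolicy) -> List[str]:
    """Return the inconsistencies found in one documented endpoint policy."""
    directives = {p.split("=", 1)[0].strip().lower()
                  for p in policy.cache_control.split(",") if p.strip()}
    shared = policy.scope in SHARED_SCOPES
    problems: List[str] = []
    if shared and policy.label in ("user-specific", "sensitive"):
        problems.append("personal or sensitive response assigned to a shared cache")
    if policy.label == "sensitive" and "no-store" not in directives:
        problems.append("sensitive response does not send no-store")
    if shared and directives & {"private", "no-store"}:
        problems.append("headers forbid the shared caching the policy intends")
    if shared and not policy.key_inputs:
        problems.append("no cache key inputs documented")
    if shared and policy.label == "tenant-scoped" and "tenant" not in policy.key_inputs:
        problems.append("tenant-scoped response is not keyed by tenant")
    if shared and not policy.invalidation_triggers:
        problems.append("no invalidation trigger documented")
    return problems
```

A check like this does not replace testing with representative requests, but it can flag a policy that contradicts itself before it reaches the edge configuration.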
If you want a companion reference for TTL planning across content types, including APIs, read Cache TTL Strategy by Content Type: HTML, Images, CSS, JS, and APIs.
The simplest safe rule set is also the most durable: cache public, repeated, low-sensitivity responses aggressively; cache private responses only in private scopes; and treat cache key design as part of application correctness, not just performance tuning. That mindset will help you improve website caching and edge caching without creating subtle data integrity problems later.