CDN and Edge Caching: Serving Content From Where Your Users Are
Your API is hosted in US-East. A user in Singapore requests a product image. The request travels 15,000 kilometers to your server and back. Round-trip time: 250ms. Just for the network. Before your server does anything.
You add Cloudflare. The same user’s request now goes to a Cloudflare edge node in Singapore, 50 kilometers away. Round-trip time: 5ms. The image is served from the edge cache. Your origin server never sees the request.
That is what a CDN does. And it is one of the highest-leverage performance improvements you can make for a globally distributed user base.
What a CDN actually is
A Content Delivery Network is a geographically distributed network of servers (called Points of Presence, or PoPs) that cache and serve content close to users. Instead of all requests going to your origin server, requests go to the nearest PoP.
CDNs work at the HTTP layer. They cache responses based on cache-control headers and serve cached responses to subsequent requests for the same URL.
graph TB
    subgraph without["Without CDN"]
        U1["User in Singapore"] -->|"250ms RTT"| OR1["Origin US-East"]
        U2["User in London"] -->|"180ms RTT"| OR1
        U3["User in Brazil"] -->|"200ms RTT"| OR1
    end
    subgraph with["With CDN"]
        U4["User in Singapore"] -->|"5ms"| SG["CDN PoP Singapore"]
        U5["User in London"] -->|"8ms"| LO["CDN PoP London"]
        U6["User in Brazil"] -->|"10ms"| BR["CDN PoP Sao Paulo"]
        SG -->|"cache miss only 250ms"| OR2["Origin US-East"]
        LO -->|"cache miss only"| OR2
        BR -->|"cache miss only"| OR2
    end
    style OR1 fill:#FCEBEB,stroke:#A32D2D,color:#791F1F
    style OR2 fill:#EEEDFE,stroke:#534AB7,color:#3C3489
    style SG fill:#E1F5EE,stroke:#0F6E56,color:#085041
    style LO fill:#E1F5EE,stroke:#0F6E56,color:#085041
    style BR fill:#E1F5EE,stroke:#0F6E56,color:#085041
How CDN caching works
Cache-Control headers
The origin server tells the CDN (and browsers) how to cache a response using HTTP headers:
Cache-Control: public, max-age=86400
- public - The response can be cached by CDNs and browsers
- private - Only the browser can cache it (not CDNs). Used for user-specific responses.
- max-age=86400 - Cache for 86400 seconds (24 hours)
- s-maxage=3600 - CDN-specific max age (overrides max-age for CDNs)
- no-cache - Must revalidate with origin before serving (not “do not cache”)
- no-store - Do not cache at all
- stale-while-revalidate=60 - Serve stale content for up to 60 seconds while fetching fresh content in the background
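To make the directive list concrete, here is a minimal sketch of how a shared cache might turn a Cache-Control value into a TTL. The function names `parse_cache_control` and `shared_cache_ttl` are made up for this example, and the logic is deliberately simplified: it ignores heuristic freshness, the Expires header, and provider-specific overrides.

```python
from __future__ import annotations


def parse_cache_control(header: str) -> dict[str, str | None]:
    """Split a Cache-Control value into {directive: value or None}."""
    directives: dict[str, str | None] = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') if sep else None
    return directives


def shared_cache_ttl(header: str) -> int:
    """Seconds a shared cache (such as a CDN) may serve the response without revalidating."""
    d = parse_cache_control(header)
    # private and no-store keep the response out of shared caches entirely;
    # no-cache allows storage but forces revalidation, so the usable TTL is 0.
    if "private" in d or "no-store" in d or "no-cache" in d:
        return 0
    # s-maxage takes precedence over max-age for shared caches.
    for name in ("s-maxage", "max-age"):
        value = d.get(name)
        if value is not None and value.isdigit():
            return int(value)
    # Simplification: with no explicit lifetime, treat the response as not cacheable.
    return 0
```

With these rules, `public, max-age=86400, s-maxage=3600` gives the CDN a one-hour TTL while browsers keep the response for a day.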
The CDN request flow
- User requests https://example.com/images/product-123.jpg
- DNS resolves to the CDN’s anycast IP (routes to nearest PoP)
- CDN PoP checks its cache for this URL
- Cache hit: serve the cached response immediately
- Cache miss: forward the request to the origin server, cache the response, serve it to the user
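The hit-or-miss part of this flow can be sketched as a toy in-memory PoP. The `EdgePop` class, the fixed TTL, and the injected `fetch_from_origin` callable are all assumptions for illustration; a real PoP derives the TTL from response headers and handles eviction, concurrency, and revalidation.

```python
from __future__ import annotations

import time
from typing import Callable


class EdgePop:
    """A toy PoP that caches origin responses by URL for a fixed TTL."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: dict[str, tuple[float, bytes]] = {}  # url -> (expires_at, body)

    def handle(self, url: str) -> tuple[str, bytes]:
        now = self._clock()
        entry = self._cache.get(url)
        if entry is not None and entry[0] > now:
            # Cache hit: the origin never sees this request.
            return "HIT", entry[1]
        # Cache miss (or expired entry): fetch from origin, store, then serve.
        body = self._fetch(url)
        self._cache[url] = (now + self._ttl, body)
        return "MISS", body
```

The first request for a URL returns `"MISS"` and calls the origin; repeat requests within the TTL return `"HIT"` without touching it.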
Cache keys
By default, the CDN caches by URL. https://example.com/api/products?sort=price and https://example.com/api/products?sort=name are different cache keys.
You can configure the CDN to vary the cache by headers (e.g., Accept-Language for localized content, Accept-Encoding for compression) or to ignore certain query parameters.
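As a rough sketch of what such a configuration does, the function below builds a cache key from a URL and request headers. The ignored parameter names, the `VARY_HEADERS` tuple, and the key layout are assumptions for this example, not any provider's actual format. Sorting query parameters is also a choice: it only suits APIs where parameter order does not matter.

```python
from __future__ import annotations

from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical configuration: tracking parameters that should not split the cache.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign"}
# Hypothetical configuration: request headers the cached response varies by.
VARY_HEADERS = ("accept-language",)


def cache_key(url: str, request_headers: dict[str, str]) -> str:
    parts = urlsplit(url)
    # Drop ignored parameters and sort the rest so ?a=1&b=2 and ?b=2&a=1 share a key.
    params = sorted(
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k not in IGNORED_PARAMS
    )
    headers = {k.lower(): v for k, v in request_headers.items()}
    vary = "|".join(f"{h}={headers.get(h, '')}" for h in VARY_HEADERS)
    return f"{parts.scheme}://{parts.netloc.lower()}{parts.path}?{urlencode(params)}#{vary}"
```

Under these assumptions, `?sort=price` and `?sort=name` still produce different keys, while `?sort=price&utm_source=mail` collapses onto the `?sort=price` entry.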
Edge computing: beyond caching
Modern CDNs (Cloudflare Workers, Fastly Compute, AWS Lambda@Edge) let you run code at the edge, not just cache static responses. This enables:
- A/B testing at the edge - Route users to different variants without a round trip to origin
- Authentication at the edge - Validate JWTs before forwarding to origin
- Request transformation - Modify headers, rewrite URLs, add CORS headers
- Dynamic content generation - Generate personalized responses at the edge using edge-local data stores
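To illustrate the A/B testing case, here is a minimal sketch of deterministic variant assignment that needs no origin round trip and no shared state. Edge platforms usually run JavaScript or WebAssembly, so Python is used here only to show the logic; the `assign_variant` name and the variant labels are hypothetical.

```python
from __future__ import annotations

import hashlib


def assign_variant(
    user_id: str,
    experiment: str,
    variants: tuple[str, ...] = ("control", "treatment"),
) -> str:
    """Map a user to a variant; the same user and experiment always give the same result."""
    # Hash experiment + user so assignments are independent across experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % len(variants)
    return variants[bucket]
```

Because the assignment is a pure function of its inputs, every PoP computes the same variant for the same user without coordinating with the others.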
Where it breaks or gets interesting
Cache invalidation is the hard part
You deploy a new version of your JavaScript bundle. The CDN has the old version cached with a 24-hour TTL. Users get the old version for up to 24 hours.
Solutions:
Cache busting with content hashing - Include a hash of the file content in the filename: app.a3f8c2d1.js. When the content changes, the hash changes, the URL changes, and the CDN fetches the new version. Old URLs expire naturally. This is the standard approach for static assets.
Short TTLs for dynamic content - For HTML pages that reference versioned assets, use a short TTL (5-60 minutes) or no-cache with revalidation.
CDN purge API - Most CDNs provide an API to invalidate specific URLs or patterns immediately. Use this after deployments. But purges are not instantaneous: depending on the provider, propagation to all PoPs globally can take anywhere from under a second to tens of seconds.
Surrogate keys (cache tags) - Tag cached responses with logical keys (e.g., product:123). When product 123 changes, purge all responses tagged with product:123. Fastly and Cloudflare support this.
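The surrogate key idea can be sketched with an in-memory index from tag to URLs. The `TaggedCache` class is a made-up illustration of the bookkeeping only; it is not how Fastly or Cloudflare implement or expose the feature.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Iterable


class TaggedCache:
    """A toy cache where each response carries logical tags and can be purged by tag."""

    def __init__(self) -> None:
        self._responses: dict[str, bytes] = {}
        self._urls_by_tag: defaultdict[str, set[str]] = defaultdict(set)

    def store(self, url: str, body: bytes, tags: Iterable[str]) -> None:
        self._responses[url] = body
        for tag in tags:
            self._urls_by_tag[tag].add(url)

    def get(self, url: str) -> bytes | None:
        return self._responses.get(url)

    def purge_tag(self, tag: str) -> int:
        """Remove every cached response stored under this tag; return how many were removed."""
        purged = 0
        for url in self._urls_by_tag.pop(tag, set()):
            # A URL may already be gone if it was purged through another tag.
            if self._responses.pop(url, None) is not None:
                purged += 1
        return purged
```

Purging `product:123` removes the product page and any listing page stored with that tag, while responses tagged only with other products stay cached.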
The stale-while-revalidate pattern
Cache-Control: max-age=60, stale-while-revalidate=300
Serve the cached response for up to 60 seconds. For the next 300 seconds, serve the stale response immediately while fetching a fresh one in the background. Within that window the user never waits for a cache miss, and the response is at most 360 seconds old (60 + 300). Once the cached copy is older than 360 seconds, the next request waits for the origin again.
This is excellent for content where brief staleness is acceptable (product listings, blog posts) and you want to remove cache miss latency for frequently requested URLs.
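The three states a cached response moves through under this pattern can be written as a small decision function. The `swr_decision` name and the returned labels are made up for this sketch; the boundaries follow the usual HTTP rule that a response is fresh only while its age is below its freshness lifetime.

```python
def swr_decision(age_seconds: float, max_age: int, stale_while_revalidate: int) -> str:
    """Classify a cached response by age under max-age plus stale-while-revalidate."""
    if age_seconds < max_age:
        return "fresh"  # serve from cache, no origin contact
    if age_seconds < max_age + stale_while_revalidate:
        return "stale-revalidate"  # serve stale now, refresh in the background
    return "expired"  # too old to serve; the request waits for the origin
```

With `max_age=60` and `stale_while_revalidate=300`, an age of 30 seconds is fresh, 200 seconds triggers a background refresh, and 360 seconds or more is a blocking miss.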
CDN bypass for authenticated requests
CDNs should not cache user-specific responses. By default, CDNs typically do not cache responses to requests that carry an Authorization header, responses that include Set-Cookie, or responses marked Cache-Control: private. But misconfiguration can cause CDNs to cache authenticated responses and serve one user’s data to another.
Always set Cache-Control: private, no-store for responses containing user-specific data.
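As a defensive sketch, the check below decides whether a shared cache should store a response at all. The `cdn_may_cache` function is hypothetical and deliberately stricter than the HTTP caching rules require: refusing anything with Set-Cookie and requiring an explicit opt-in are assumptions of this example, and real CDNs each have their own defaults.

```python
from __future__ import annotations


def cdn_may_cache(request_headers: dict[str, str], response_headers: dict[str, str]) -> bool:
    """Conservative decision on whether a shared cache should store this response."""
    req = {k.lower(): v for k, v in request_headers.items()}
    resp = {k.lower(): v for k, v in response_headers.items()}
    directives = {
        d.strip().split("=", 1)[0].lower()
        for d in resp.get("cache-control", "").split(",")
        if d.strip()
    }
    # Explicitly user-specific or uncacheable responses never go into a shared cache.
    if directives & {"private", "no-store"}:
        return False
    # Assumption of this sketch: a response that sets a cookie is treated as user-specific.
    if "set-cookie" in resp:
        return False
    # Authenticated requests are only cacheable when the origin explicitly allows it.
    if "authorization" in req and not directives & {"public", "s-maxage", "must-revalidate"}:
        return False
    # Default deny: cache only when the origin opted in with an explicit directive.
    return bool(directives & {"public", "s-maxage", "max-age"})
```

The default-deny last line is the important design choice: a response with no Cache-Control header at all is not cached, so a forgotten header fails safe.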
Origin shield
Without origin shield, a cache miss at any PoP goes directly to your origin. If you have 200 PoPs and a popular URL expires everywhere at the same time, as many as 200 PoPs each send their own request to your origin at once.
Origin shield adds a second layer of caching: a small number of “shield” nodes that sit between PoPs and the origin. Cache misses from PoPs go to the shield first. Only shield misses go to the origin. This dramatically reduces origin load.
graph LR
    subgraph edge["Edge PoPs (200 locations)"]
        P1["PoP Singapore"]
        P2["PoP London"]
        P3["PoP Sao Paulo"]
    end
    subgraph shield["Origin Shield (3 locations)"]
        S1["Shield US-East"]
    end
    subgraph origin["Origin"]
        O["Your Server"]
    end
    P1 -->|"cache miss"| S1
    P2 -->|"cache miss"| S1
    P3 -->|"cache miss"| S1
    S1 -->|"shield miss only"| O
    style P1 fill:#E1F5EE,stroke:#0F6E56,color:#085041
    style P2 fill:#E1F5EE,stroke:#0F6E56,color:#085041
    style P3 fill:#E1F5EE,stroke:#0F6E56,color:#085041
    style S1 fill:#EEEDFE,stroke:#534AB7,color:#3C3489
    style O fill:#FAEEDA,stroke:#854F0B,color:#633806
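The effect on origin load can be shown with a tiny counting model. It assumes the worst case from the text (every PoP misses on the same URL at the same moment), a single shield node, and a shield that answers later PoPs from the copy it fetched for the first one. The `origin_requests` function is made up for this sketch.

```python
def origin_requests(pop_count: int, use_shield: bool) -> int:
    """Count origin fetches when every PoP misses on the same URL at once."""
    origin_hits = 0
    shield_has_copy = False
    for _ in range(pop_count):
        if not use_shield:
            # No shield: each PoP's miss goes straight to the origin.
            origin_hits += 1
        elif not shield_has_copy:
            # First miss reaching the shield is forwarded; the shield keeps the response.
            origin_hits += 1
            shield_has_copy = True
        # Later PoPs are served from the shield and never reach the origin.
    return origin_hits
```

Under these assumptions, 200 PoPs produce 200 origin requests without a shield and a single origin request with one.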
Real-world systems
Cloudflare - 300+ PoPs globally. Anycast routing. Workers for edge compute. Cache Rules for fine-grained caching control. Used by millions of websites.
AWS CloudFront - Integrated with AWS services. Lambda@Edge for edge compute. Origin shield. Supports custom cache behaviors per path pattern.
Fastly - Varnish-based CDN with powerful VCL (Varnish Configuration Language) for custom caching logic. Surrogate keys for tag-based invalidation. Fastly Compute (formerly Compute@Edge) for edge compute.
Akamai - The original CDN. 4,000+ PoPs. Used by major enterprises and media companies for high-traffic events.
Vercel Edge Network - CDN optimized for Next.js and static sites. Automatic cache invalidation on deployment. Edge Functions for dynamic content.
How to apply it in practice
What to cache at the CDN
Always cache:
- Static assets (images, CSS, JavaScript, fonts) with long TTLs and content-hashed URLs
- Public API responses that do not change per user (product catalog, public blog posts)
- Video and audio files
Cache with short TTLs:
- HTML pages (5-60 minutes)
- API responses for semi-dynamic data (prices, inventory counts for display)
Never cache:
- Authenticated API responses
- Responses with user-specific data
- Payment and checkout flows
- Admin interfaces
Cache-Control header strategy
# Static assets with content hash in URL - cache forever
Cache-Control: public, max-age=31536000, immutable
# HTML pages - short TTL with stale-while-revalidate
Cache-Control: public, max-age=300, stale-while-revalidate=3600
# API responses - short TTL
Cache-Control: public, s-maxage=60, stale-while-revalidate=300
# User-specific responses - no CDN caching
Cache-Control: private, no-store
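One way to apply this strategy consistently is to pick the header in a single place on the origin. The sketch below is an assumption about how that might look: the `cache_control_for` name, the hashed-asset pattern, and the `/api/` prefix are hypothetical, and anything unrecognized falls back to the safest policy.

```python
import re

# Hypothetical pattern: a hex content hash of 8+ characters before a static asset extension.
HASHED_ASSET = re.compile(r"\.[0-9a-f]{8,}\.(?:js|css|woff2|png|jpg|svg)$")


def cache_control_for(path: str, user_specific: bool = False) -> str:
    """Choose a Cache-Control value for a response based on its path and audience."""
    if user_specific:
        return "private, no-store"
    if HASHED_ASSET.search(path):
        # Content-hashed URL: the bytes behind it never change.
        return "public, max-age=31536000, immutable"
    if path.startswith("/api/"):
        return "public, s-maxage=60, stale-while-revalidate=300"
    if path.endswith(".html") or path.endswith("/"):
        return "public, max-age=300, stale-while-revalidate=3600"
    # Unknown response types default to no shared caching.
    return "private, no-store"
```

For example, `/assets/app.a3f8c2d1.js` gets the one-year immutable policy, `/index.html` gets the short HTML policy, and an unrecognized path is not cached by the CDN.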
FAQ
Q: What is the difference between a CDN and a reverse proxy?
A reverse proxy sits in front of your servers and forwards requests to them. It can cache, but it is typically a single location. A CDN is a globally distributed network of reverse proxies. Every CDN is a reverse proxy, but not every reverse proxy is a CDN. nginx as a reverse proxy in your data center is not a CDN. Cloudflare with 300 global PoPs is a CDN.
Q: Does using a CDN mean my origin server gets no traffic?
No. Cache misses, authenticated requests, and non-cacheable responses still go to your origin. The CDN reduces origin traffic for cacheable content, but your origin still needs to handle the uncacheable portion. For a typical website, a CDN might absorb 80-95% of traffic, but the remaining 5-20% still hits your origin.
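The arithmetic behind that range is worth spelling out, because small changes in hit ratio have a large effect on the origin. This is a simple sketch that assumes the hit ratio is measured per request; the `origin_rps` helper is made up for the example.

```python
def origin_rps(total_rps: float, hit_ratio: float) -> float:
    """Requests per second that still reach the origin at a given cache hit ratio."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return total_rps * (1.0 - hit_ratio)
```

At 10,000 requests per second, an 80% hit ratio leaves about 2,000 requests per second for the origin, 90% leaves about 1,000, and 95% leaves about 500. Moving from 90% to 95% halves origin load.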
Q: How do you handle CDN caching for a single-page application?
The HTML file (index.html) should have a short TTL or no-cache because it references the versioned JavaScript and CSS files. The JavaScript and CSS files should have content-hashed URLs and very long TTLs (1 year) because they never change once deployed. When you deploy a new version, the HTML file changes to reference new hashes, the CDN fetches the new HTML, and users get the new JavaScript and CSS. Old JavaScript and CSS files remain cached but are no longer referenced.
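A content-hashed filename is easy to derive at build time. Bundlers normally do this for you; the sketch below only illustrates the idea, and the `hashed_filename` helper, the SHA-256 choice, and the 8-character hash length are assumptions for the example.

```python
import hashlib
from pathlib import PurePosixPath


def hashed_filename(name: str, content: bytes, hash_length: int = 8) -> str:
    """Insert a short content hash before the extension: app.js -> app.<hash>.js."""
    digest = hashlib.sha256(content).hexdigest()[:hash_length]
    path = PurePosixPath(name)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))
```

The same bytes always produce the same filename, and any change to the content produces a new URL that no cache has seen before. That is why these files can safely carry a one-year TTL.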
Interview questions
Q1: You deploy a critical bug fix. How do you ensure users get the new version immediately despite CDN caching?
Strong answer: For static assets (JavaScript, CSS), use content-hashed filenames. The bug fix changes the code, which changes the hash, which changes the URL. The CDN has no cache for the new URL and fetches it from origin. No manual invalidation needed. For HTML files, use a short TTL (5 minutes) or no-cache with revalidation. After deployment, use the CDN’s purge API to immediately invalidate the HTML cache. For API responses, use the CDN’s purge API or surrogate keys to invalidate affected responses. The key insight: design your caching strategy so that deployments automatically invalidate the right things without manual intervention.
Q2: Your CDN is caching a response that should be user-specific. How did this happen and how do you fix it?
Strong answer: This happens when the origin returns a user-specific response without Cache-Control: private or with Cache-Control: public. The CDN caches it and serves it to the next user who requests the same URL. Fix: add Cache-Control: private, no-store to all responses containing user-specific data. Also add Vary: Cookie, Authorization to tell the CDN that responses vary by these request headers (though most CDNs will not cache responses to requests that carry them anyway). Audit your CDN’s cache hit logs to find URLs that are being cached but should not be. Going forward, make Cache-Control: private the default for all API responses and explicitly opt in to public caching for responses you know are safe to cache.
Q3: How would you design the caching strategy for a news website that publishes articles frequently and needs low latency globally?
Strong answer: Use a CDN with origin shield. For article pages: Cache-Control: public, s-maxage=300, stale-while-revalidate=3600. This serves cached content for 5 minutes, then serves stale content for up to an hour while refreshing in the background. Users almost never wait for a cache miss. When an article is published or updated, use the CDN’s purge API (or surrogate keys tagged with the article ID) to immediately invalidate the cached version. For the homepage and category pages: shorter TTL (60 seconds) since they change more frequently. For images and static assets: content-hashed URLs with 1-year TTL. For the API that powers the frontend: s-maxage=60 with surrogate keys for targeted invalidation. The result: global users get sub-10ms response times for cached content, and new articles appear within seconds of publication.