🚀 Caching - Redis, Memcached, & CDNs

The Senior Mindset: There are only two hard things in Computer Science: cache invalidation and naming things. A senior engineer treats the cache as an optimization, not a source of truth, and always designs a system that can function (albeit slowly) if the cache layer disappears.

🏔️ The Caching Layers

1. Client-Side / Browser Caching

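Uses HTTP headers to store assets directly on the user’s device: Cache-Control sets how long a copy stays fresh, and ETag lets the browser revalidate a stored copy without downloading it again.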

Best for: Static assets (JS, CSS, Images).
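A minimal sketch of how a server might answer a revalidation request, assuming a content-hash ETag and a one-hour max-age. The function names (make_etag, respond) are hypothetical and not tied to any framework.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Build a quoted ETag from a hash of the content."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(body: bytes, if_none_match: Optional[str] = None) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET of a static asset."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "public, max-age=3600"}
    if if_none_match == etag:
        # The browser's copy is still current: answer 304 with no body.
        return 304, headers, b""
    return 200, headers, body
```

The browser sends the stored ETag back in If-None-Match once its copy is no longer fresh; a 304 costs a round trip but no payload.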

2. CDN (Content Delivery Network)

Edge servers distributed globally (Cloudflare, Akamai, AWS CloudFront).

Best for: Static and semi-static content. It reduces latency by serving data from the server physically closest to the user.
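One way to tell a CDN what it may keep is the Cache-Control header sent by the origin. The sketch below is illustrative only: the path prefixes and lifetimes are assumptions, while the directives themselves (max-age, s-maxage, immutable, no-store) are standard HTTP.

```python
def cache_control_for(path: str) -> str:
    """Pick a Cache-Control value for a response; the path rules are hypothetical."""
    if path.startswith("/assets/"):
        # Fingerprinted static files: browsers and edges may keep them for a year.
        return "public, max-age=31536000, immutable"
    if path.startswith("/api/"):
        # Assumed to be per-user data: keep it out of every cache.
        return "no-store"
    # Semi-static pages: browsers revalidate every time, the edge may reuse for 60 s.
    return "public, max-age=0, s-maxage=60"
```

s-maxage applies only to shared caches such as CDN edges, so the edge and the browser can follow different lifetimes for the same response.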

3. Distributed Cache (Redis vs. Memcached)

In-memory data stores sitting between your App and your Database.

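Memcached: Simple, multi-threaded key-value store. Best for small, flat objects such as rendered fragments or serialized rows (items are capped at 1 MB by default).
Redis: Executes commands on a single thread but offers rich data structures (Lists, Sets, Hashes, Sorted Sets) plus features such as Pub/Sub. The industry standard for most modern apps.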

🛠️ Caching Strategies

Cache-Aside (Lazy Loading)

The application code checks the cache first. On a miss, it queries the DB, stores the result in the cache (usually with a TTL), and returns it.

Pros: Resilient to cache failure.
Cons: First request is always slow; data can become stale if the DB is updated directly.
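A minimal cache-aside sketch. TTLCache is a tiny in-process stand-in for Redis or Memcached, and get_user / load_from_db are hypothetical names; the point is the read path and the fallback when the cache is unavailable.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """In-process stand-in for a remote cache (illustrative only)."""

    def __init__(self) -> None:
        self._data: Dict[str, Tuple[Any, float]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._data[key] = (value, time.monotonic() + ttl_seconds)


def get_user(user_id: int, cache: Any, load_from_db: Callable[[int], Any],
             ttl_seconds: float = 60.0) -> Any:
    key = f"user:{user_id}"
    try:
        cached = cache.get(key)
    except Exception:
        cached = None  # cache outage: fall through to the DB
    if cached is not None:
        return cached
    user = load_from_db(user_id)  # miss: the DB is the source of truth
    if user is not None:
        try:
            cache.set(key, user, ttl_seconds)
        except Exception:
            pass  # failing to populate the cache must not fail the request
    return user
```

Swallowing cache errors is what makes the pattern resilient: the request gets slower, not broken.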

Write-Through

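The application writes data to the cache and the DB in the same synchronous operation; the write only succeeds once both are updated.

Pros: Cached entries stay in step with the DB, as long as every write goes through this path.
Cons: Higher write latency, and the cache fills with data that may never be read.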
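A write-through sketch using plain dicts as stand-ins for the DB and the cache. The class name is hypothetical, and writing the DB before the cache is one possible ordering, chosen here so a failed DB write leaves the cache untouched.

```python
from typing import Any, Optional


class WriteThroughStore:
    """Every write updates the DB and the cache in one synchronous call."""

    def __init__(self, db: Any, cache: Any) -> None:
        self.db = db
        self.cache = cache

    def write(self, key: str, value: Any) -> None:
        # Source of truth first: if this raises, the cache is not updated.
        self.db[key] = value
        self.cache[key] = value

    def read(self, key: str) -> Optional[Any]:
        if key in self.cache:
            return self.cache[key]
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value  # repopulate after eviction or restart
        return value
```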

Write-Behind (Write-Back)

The application writes to the cache only. A background process syncs the cache to the DB later.

Pros: Very low write latency, because the caller never waits for the DB; repeated writes to the same key can be batched into one.
Cons: High risk of data loss if the cache crashes before the sync.
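A write-behind sketch, again with dicts as stand-ins. WriteBehindStore and flush are hypothetical names; in a real system flush would be driven by a background worker or timer, which is left out here.

```python
from typing import Any, Dict


class WriteBehindStore:
    """Writes land in the cache; a later flush persists them to the DB."""

    def __init__(self, db: Any) -> None:
        self.db = db
        self.cache: Dict[str, Any] = {}
        self._dirty: Dict[str, Any] = {}  # keys not yet persisted

    def write(self, key: str, value: Any) -> None:
        self.cache[key] = value
        self._dirty[key] = value  # repeated writes to one key coalesce

    def flush(self) -> int:
        """Persist pending writes; returns how many keys were written."""
        flushed = 0
        for key in list(self._dirty):
            self.db[key] = self._dirty[key]
            # Only forget the write after the DB accepted it.
            del self._dirty[key]
            flushed += 1
        return flushed
```

Anything still in the dirty set when the process dies is lost, which is the data-loss risk in concrete form.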

⚖️ Managing Cache Invalidation

“Stale data” is the biggest risk in caching. You need a strategy to clear the cache when data changes.

TTL (Time To Live): Setting an expiration time. A simple “safety net” but doesn’t guarantee immediate consistency.
Event-based Invalidation: Using your Event-Driven Architecture to purge or update cache keys when a specific event (e.g., ProductUpdated) occurs.
Versioning: Changing the key name (e.g., user_123_v1 to user_123_v2) to force a fresh fetch.
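The three approaches can be sketched together. ExpiringCache is an in-process stand-in, the event is a plain dict with an assumed product_id field, and the key layout is hypothetical.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class ExpiringCache:
    """Stand-in cache with per-key TTL and explicit delete."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[Any, float]] = {}

    def set(self, key: str, value: Any, ttl_seconds: float) -> None:
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key: str) -> Optional[Any]:
        entry = self._data.get(key)
        if entry is None or self._clock() >= entry[1]:
            self._data.pop(key, None)  # TTL: expired entries read as misses
            return None
        return entry[0]

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def on_product_updated(cache: ExpiringCache, event: Dict[str, Any]) -> None:
    """Event-based invalidation: purge the key the event refers to."""
    cache.delete(f"product:{event['product_id']}")


def versioned_key(kind: str, entity_id: int, version: int) -> str:
    """Versioning: bumping the version makes old entries unreachable."""
    return f"{kind}_{entity_id}_v{version}"
```

With versioning, the old entries are never deleted explicitly; they simply stop being read and age out via TTL or eviction.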

🚩 Senior Challenges: The “Edge Cases”

1. Cache Penetration

When requests for data that does not exist miss the cache every time and always fall through to the DB, whether by accident or as a deliberate attack.

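Solution: Cache “null” results with a short TTL, or put a Bloom Filter in front of the cache/DB. A Bloom Filter can say “definitely not present” or “possibly present”, so negative answers skip the lookup entirely while rare false positives still reach the DB.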
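A sketch of both defenses, assuming a small hand-rolled Bloom filter and dicts as stand-ins. The sizes, the _MISSING sentinel, and the lookup function are illustrative; a production setup would also give the cached "not found" marker a short TTL.

```python
import hashlib
from typing import Any, Iterator, Optional


class BloomFilter:
    """Small Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item: str) -> Iterator[int]:
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


_MISSING = object()  # sentinel stored in the cache for "not in the DB"


def lookup(key: str, cache: dict, db: Any, known_keys: BloomFilter) -> Optional[Any]:
    if not known_keys.might_contain(key):
        return None  # definitely absent: skip cache and DB
    if key in cache:
        value = cache[key]
        return None if value is _MISSING else value
    value = db.get(key)
    # Cache the miss too, so the next request for this key stops here.
    cache[key] = _MISSING if value is None else value
    return value
```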

2. Cache Stampede (Thundering Herd)

When a high-traffic key expires, and thousands of concurrent requests all hit the DB at once to re-cache it.

Solution: Use Locking (only one request goes to the DB, others wait) or External Re-computation (a background job refreshes the cache before it expires).
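The locking option, sketched for a single process with one lock per key. SingleFlightCache is a hypothetical name. Across several app servers the same idea needs a lock in shared storage, for example a Redis key created with SET ... NX and an expiry, which is not shown here.

```python
import threading
from typing import Any, Callable, Dict


class SingleFlightCache:
    """Only one caller per key runs the loader; the rest wait and reuse it."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}
        self._locks: Dict[str, threading.Lock] = {}
        self._guard = threading.Lock()

    def _lock_for(self, key: str) -> threading.Lock:
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        value = self._values.get(key)
        if value is not None:
            return value  # fast path: no locking on a hit
        with self._lock_for(key):
            # Re-check: another thread may have filled the key while we waited.
            value = self._values.get(key)
            if value is None:
                value = loader(key)  # the single trip to the DB
                self._values[key] = value
            return value
```

The re-check inside the lock is the important line: without it, every waiting thread would run the loader once the lock is released.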

3. Hot Keys

When a single cache node is overwhelmed because one specific key (e.g., a viral post) is requested millions of times.

Solution: Use Local In-Memory Caching (in the application process) for the most popular keys or replicate the hot key across multiple cache nodes.
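A sketch of the replication option, assuming a suffix scheme of key#0 through key#3. The helper names and the replica count are hypothetical, and a dict stands in for the sharded cache.

```python
import random
from typing import Any, List, Optional

REPLICAS = 4  # assumed number of copies per hot key


def replica_keys(key: str, replicas: int = REPLICAS) -> List[str]:
    return [f"{key}#{i}" for i in range(replicas)]


def write_hot(cache: dict, key: str, value: Any, replicas: int = REPLICAS) -> None:
    # Every copy must be written (and invalidated) together.
    for replica in replica_keys(key, replicas):
        cache[replica] = value


def read_hot(cache: dict, key: str, replicas: int = REPLICAS) -> Optional[Any]:
    # A random suffix spreads reads; in a sharded cache the suffixed
    # keys usually hash to different nodes.
    return cache.get(f"{key}#{random.randrange(replicas)}")
```

The trade-off is write amplification: each update and each invalidation now touches every copy.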

📊 Comparison: Redis vs. Memcached

| Feature | Redis | Memcached |
| --- | --- | --- |
| Data Types | Strings, Hashes, Lists, Sets, Geospatial | Strings / Blobs only |
| Persistence | Yes (RDB/AOF) | No (In-memory only) |
| Architecture | Single-threaded (mostly) | Multi-threaded |
| Scaling | Redis Cluster | Client-side sharding |
| Best For | Complex logic, Pub/Sub, Persistent state | Simple, high-speed object caching |

💡 Seniority Note: Before implementing a distributed cache, check your Database Indexes. A common junior mistake is trying to “fix” a slow database with Redis, when a simple index on the table would have solved the problem at its source without adding more infrastructure.
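A quick way to run that check, sketched with SQLite from the Python standard library. The orders table and index name are made up for illustration; other databases expose the same idea through their own EXPLAIN output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)


def plan(sql: str) -> str:
    """Return SQLite's query plan as one line of text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the description


query = "SELECT total FROM orders WHERE customer_id = 42"
before = plan(query)  # without an index the plan is a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(query)   # with the index the plan becomes an index search
```

If the plan changes from a scan to an index search and the query is fast enough afterwards, the cache may not be needed at all.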

[[Networks-HTTP-Caching]]
[[Relational-Databases-Indexing]]
[[Architecture-Resilience-Patterns]]