Wednesday, July 16, 2025

🔄 A Curious Case of Redis vs. Caffeine and the Ever-Present Trade-offs

🧠 The Assumption

For years, I treated Redis as the default solution for caching... after all:

  • It's fast.
  • It supports persistence.
  • It’s distributed.
  • Everyone uses it.

It was one of those tools that seemed like an automatic “yes” whenever caching came up in any design discussion.

So naturally, when I started analyzing a high-performance OAuth server, I was fully expecting Redis to be at the heart of its caching layer.

But what I found instead was... Caffeine.

And that changed how I look at caching decisions entirely. I have always known that every choice is a trade-off, but we inherently bias toward our favorite solutions, and we need to learn to stop and think.


🧭 Lessons from Existing Code: Don’t Just Learn by Building

As engineers, we often think the best way to learn is to build from scratch. But this experience reminded me how much there is to learn from studying well-designed production systems — especially ones built by expert teams operating in high-stakes environments like banking and fintech. After all, why reinvent the wheel each time?

I didn't write the OAuth server. But reviewing its code, dissecting the reasoning, and understanding the performance considerations taught me as much as (if not more than) writing one myself.


🔍 Redis vs. Caffeine: Revisiting the Comparison

| Feature | Redis | Caffeine |
|---|---|---|
| Location | Remote (over network) | In-JVM |
| Latency | ~1–3 ms per lookup | Sub-microsecond (in-process) |
| Setup | Needs deployment, scaling, monitoring | Zero setup, embedded |
| Persistence | Yes | No |
| Cross-instance sharing | Yes | No (per instance) |
| Eviction policies | LRU, TTL | Advanced: size, frequency (W-TinyLFU), TTL |
| Serialization overhead | Yes | None |
| Startup state | Warm (centralized) | Cold start |
| Failure handling | Needs retry/backoff logic | JVM-local, no external calls |

🔍 Why Redis Wasn’t Chosen — and Why That Makes Sense

1. The cache wasn’t critical to correctness

The cached data was:

  • Mostly static client configuration
  • Derivable from a database or config service
  • Frequently read, rarely updated

A cache miss wasn’t a system failure — just a slightly slower response.
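That miss-tolerant pattern is easy to sketch. Here is a minimal, hypothetical illustration — the ClientConfigCache class and the loadFromDatabase stand-in are my own names, not taken from the OAuth server's code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ClientConfigCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Stand-in for the real (slower) database or config-service read.
    private String loadFromDatabase(String clientId) {
        return "config-for-" + clientId;
    }

    // On a miss we simply fall back to the source of truth:
    // a miss is a slightly slower response, never an error.
    public String get(String clientId) {
        return cache.computeIfAbsent(clientId, this::loadFromDatabase);
    }
}
```

Because the cached data is derivable, losing the whole cache (say, on restart) only costs a brief burst of database reads while it refills.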

2. Latency mattered more than persistence

In an OAuth server that processes thousands of token validations per second, even 1ms per lookup has measurable impact.

Redis: ~1–3 ms per network round trip
Caffeine: sub-microsecond, in-process

3. Simplicity wins in high-availability setups

Using Redis in a high-performance, business-critical environment means:

  • Setting up secure connections (TLS)
  • Managing replication/failover
  • Ensuring Redis HA is as resilient as the app itself

Caffeine required none of that — it was a zero-dependency, drop-in cache with advanced eviction and tuning capabilities.
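For comparison, wiring up Caffeine really is a few lines. A sketch using Caffeine's builder API — the cache name, sizes, and TTL here are illustrative choices of mine, not values from the OAuth server:

```java
import java.time.Duration;

import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;

public class CaffeineExample {
    public static void main(String[] args) {
        // Size-bounded, TTL-expiring in-JVM cache: no server, no TLS, no failover.
        Cache<String, String> clientConfigs = Caffeine.newBuilder()
                .maximumSize(10_000)                       // size-based eviction (W-TinyLFU)
                .expireAfterWrite(Duration.ofMinutes(10))  // TTL per entry
                .recordStats()                             // hit/miss metrics for tuning
                .build();

        clientConfigs.put("acme", "config-for-acme");
        System.out.println(clientConfigs.getIfPresent("acme"));
    }
}
```

The entire "deployment" is a single dependency on the classpath; eviction, expiry, and stats come from the builder rather than from operational tooling.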

4. Each instance could function independently

The OAuth server didn’t require session sharing or cross-instance consistency. Local in-memory cache was not just acceptable — it was ideal.


🧠 Key Realization: The Best Tool is Use-Case Driven

Before this, I might have pitched Redis on auto-pilot. But this real-world implementation forced me to slow down and ask:

“What does this specific use case actually need?”

And the answer wasn’t persistence, or distributed state, or warm startup.

It was:

  • Speed
  • Simplicity
  • Per-instance safety
  • GC-aware eviction
  • Minimal moving parts

Caffeine was the better fit — not because Redis is bad, but because Caffeine aligned better with the real goals of the system.


🚧 Learning From Existing Systems

This wasn’t something I learned by building a new service from scratch.

It was something I learned by observing what had already been done well, and then understanding why those decisions were made.

We often glorify greenfield projects. But there is incredible value in studying the systems already running reliably in production.

Existing codebases, especially in complex environments, are full of real, hard-earned lessons.


🧭 Final Thought: It’s Always About Trade-offs

This experience brought me back to a fundamental truth of software engineering:

There is no best tool — only the best fit for the problem.

Redis is great — even amazing — when the problem demands:

  • Shared state
  • Central cache
  • Persistence
  • Warm start

But in a high-performance OAuth server, built to run fast, scale horizontally, and tolerate cold starts — Caffeine was the elegant and practical choice.


🔚 Takeaways worth summarizing

  • Don’t default to Redis just because it’s popular.
  • Understand the performance vs. persistence trade-off.
  • Learn from production code — not just personal projects.
  • Choose the caching strategy that fits your use case, not your toolbox.
