Abstract
As AI applications demand faster and more intelligent data access, traditional caching strategies are hitting performance and reliability limits.
This talk presents architecture patterns that shift data access from milliseconds to microseconds using Valkey Cluster, an open-source, Redis-compatible, in-memory datastore. Learn when to use proxy-based versus direct-access caching, how to avoid hidden reliability issues in sharded systems, and how to optimize for high price-performance at scale. Backed by the Linux Foundation, Valkey offers rich data structures and community-driven innovation.
Whether you’re building GenAI services or scaling existing platforms, this session delivers actionable patterns to improve speed, resilience, and efficiency.
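To make the proxy-versus-direct trade-off above concrete, here is a minimal sketch using redis-py, a Redis-compatible client that also works with Valkey; the hostnames are hypothetical placeholders:

```python
import redis
from redis.cluster import RedisCluster

# Proxy-based access: the application sees one stable endpoint and a
# proxy routes each command to the owning shard. Operationally simple,
# but every request pays an extra network hop through the proxy.
proxied = redis.Redis(host="cache-proxy.internal", port=6379)
proxied.set("user:42:profile", "cached-profile-blob")

# Direct access: a cluster-aware client fetches the slot map and talks
# to shard owners directly, removing the proxy hop at the cost of a
# smarter (and chattier) client that must track cluster topology.
direct = RedisCluster(host="valkey-node-1.internal", port=6379)
direct.set("user:42:profile", "cached-profile-blob")
```

The direct path trims a hop from every request; the proxied path centralizes routing and connection management. Which one wins depends on your latency budget and operational appetite, which is exactly the trade-off the session unpacks.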
Interview:
What is your session about, and why is it important for senior software developers?
Imagine your in-memory system is a Formula 1 engine. We've made the engine incredibly fast, but now the racetrack itself is the bottleneck. My talk is about fixing that track. We will look at how extra latency hops are hidden speed bumps, and how expensive abstractions are like adding a heavy new spoiler that just adds drag and cost. We'll dive into the real trade-offs of building for microsecond latency, using patterns you can apply with the open-source, Redis-compatible Valkey. (Only milliseconds were harmed.)
Why is it critical for software leaders to focus on this topic right now, as we head into 2026?
Because speed is a feature, and inefficiency is a tax. When your Redis-style operations take microseconds, your real slowdown comes from network hops, proxies, and client behavior. This is what bloats your latency tail and your cloud bill. Focusing on this now is also a key strategic move. The open governance of Valkey after the 2024 Redis license change makes it a safe, vendor-neutral base to build on, so your efficiency gains aren't locked to a single vendor.
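To see where the time actually goes, here is a minimal sketch (assuming redis-py against a reachable Valkey or Redis-compatible endpoint; the hostname is a placeholder) that measures the round trip as the client experiences it. The server-side work is microseconds, so most of what this records is network hops plus client overhead:

```python
import time
import redis

# Placeholder endpoint; point this at your own cache.
client = redis.Redis(host="valkey.internal", port=6379)
client.set("hot:key", "value")

# Time the full client-observed round trip for a cheap command.
samples = []
for _ in range(1000):
    start = time.perf_counter()
    client.get("hot:key")
    samples.append(time.perf_counter() - start)

samples.sort()
p50 = samples[len(samples) // 2]
p99 = samples[int(len(samples) * 0.99)]
print(f"p50={p50 * 1e6:.0f}us  p99={p99 * 1e6:.0f}us")
```

If p99 comes back at many multiples of p50, the tail is usually coming from the path (hops, proxies, client behavior), not the engine.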
What are the common challenges developers and architects face in this area?
The biggest challenge is that a system on a whiteboard and a system under pressure tell two different stories. On paper, your design looks clean. But in reality, a single busy component can create a traffic jam that backs everything up. We will talk about how to spot these expensive problems before they become a 3 a.m. emergency.
What's one thing you hope attendees will implement immediately after your talk?
Go back and play detective with your system. Draw a map of the journey your code takes to get data. Every stop on that map, every latency hop, and every "simple" abstraction has a price tag in both time and money. Start asking what each step costs. When you see the real price, you will find amazing ways to make things faster and cheaper.
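One lightweight way to start that detective work, sketched below with entirely hypothetical stage names, is to wrap each stop on the data path in a timer so every hop gets an explicit price tag:

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(list)

@contextmanager
def hop(name):
    # Record the wall-clock cost of one stop on the data path.
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name].append(time.perf_counter() - start)

class DictCache:
    """Stand-in for a real cache client, just for this demo."""
    def __init__(self):
        self._data = {}
    def get(self, key):
        return self._data.get(key)
    def set(self, key, value):
        self._data[key] = value

def load_profile(user_id, cache, fetch_from_db):
    # Hypothetical path: cache lookup -> DB fallback -> cache fill.
    with hop("cache_lookup"):
        value = cache.get(f"user:{user_id}:profile")
    if value is None:
        with hop("db_fallback"):
            value = fetch_from_db(user_id)
        with hop("cache_fill"):
            cache.set(f"user:{user_id}:profile", value)
    return value

cache = DictCache()
load_profile(42, cache, fetch_from_db=lambda uid: {"id": uid})
load_profile(42, cache, fetch_from_db=lambda uid: {"id": uid})
for name, samples in timings.items():
    print(name, f"{sum(samples) / len(samples) * 1e6:.1f}us avg")
```

Once each stage reports its own cost, the expensive hops and "simple" abstractions show up in the numbers rather than on the whiteboard.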
What makes QCon stand out as a conference for senior software professionals?
QCon is where you learn from people who have the operational scars to prove their advice works. It's built on practitioner-led content, so you get unfiltered, real-world lessons without the sales pitch. You leave with insights you can actually apply on Monday morning.
Speaker
Dumanshu Goyal
Uber Technical Lead @Airbnb Powering $11B Transactions, Formerly @Google and @AWS
Dumanshu Goyal leads Online Data Priorities at Airbnb, powering its $11B transaction platform and building the next generation of the company’s data systems. Previously, he led in-memory caching for Google Cloud Databases, delivering 10x improvements in scale and price-performance for Google Cloud Memorystore, one of the rare times “10x” was more than a slide promise. Before that, he spent 10 years at AWS, where he was the founding engineer of Amazon Timestream, a serverless time-series database, and architected durability and availability features for Amazon DynamoDB, one of the internet’s foundational services, ensuring data stayed available even when Wi-Fi did not.
An expert in building and operationalizing large-scale distributed systems, Dumanshu holds 20 patents and brings deep experience in architecting the resilient infrastructure that underpins today’s digital world.