From ms to µs: OSS Valkey Architecture Patterns for Modern AI

Abstract

As AI applications demand faster and more intelligent data access, traditional caching strategies are hitting performance and reliability limits.

This talk presents architecture patterns that cut data-access latency from milliseconds to microseconds using Valkey Cluster, an open-source, Redis-compatible, in-memory datastore. Learn when to use proxy-based versus direct-access caching, how to avoid hidden reliability pitfalls in sharded systems, and how to optimize price-performance at scale. Backed by the Linux Foundation, Valkey offers rich data structures and community-driven innovation.

Whether you’re building GenAI services or scaling existing platforms, this session delivers actionable patterns to improve speed, resilience, and efficiency.

Interview:

What is your session about, and why is it important for senior software developers?

Imagine your in-memory system is a Formula 1 engine. We've made the engine incredibly fast, but now the racetrack itself is the bottleneck. My talk is about fixing that track. We will look at how extra latency hops are hidden speed bumps, and how expensive abstractions are like adding a heavy new spoiler that just adds drag and cost. We'll dive into the real trade-offs of building for microsecond latency, using patterns you can apply with the open-source, Redis-compatible Valkey. (Only milliseconds were harmed.)

Why is it critical for software leaders to focus on this topic right now, as we head into 2026?

Because speed is a feature, and inefficiency is a tax. When your Redis-style operations take microseconds, your real slowdown comes from network hops, proxies, and client behavior. This is what bloats your latency tail and your cloud bill. Focusing on this now is also a key strategic move. The open governance of Valkey after the 2024 Redis license change makes it a safe, vendor-neutral base to build on, so your efficiency gains aren't locked to a single vendor.
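The claim above, that the server-side operation is no longer the slow part, is easy to check empirically from the client. A minimal sketch of round-trip latency percentiles, using a plain in-process dict as a stand-in for a real Valkey client call (the client call, key, and sample count are assumptions for illustration, not from the talk):

```python
import time


def measure_latency_us(op, n=10_000):
    """Time n calls of op() and return (p50, p99) in microseconds."""
    samples = []
    for _ in range(n):
        start = time.perf_counter_ns()
        op()
        samples.append((time.perf_counter_ns() - start) / 1_000)  # ns -> µs
    samples.sort()
    return samples[n // 2], samples[int(n * 0.99)]


# Stand-in for a real call such as client.get("user:42") against a Valkey
# node; a dict lookup here so the sketch runs anywhere without a server.
cache = {"user:42": b"profile-blob"}
p50, p99 = measure_latency_us(lambda: cache.get("user:42"))
print(f"p50={p50:.2f}us p99={p99:.2f}us")
```

Swapping the lambda for a real client call against your cluster, with and without a proxy in the path, makes the network and proxy share of the latency tail visible directly.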

What are the common challenges developers and architects face in this area?

The biggest challenge is that a system on a whiteboard and a system under pressure tell two different stories. On paper, your design looks clean. But in reality, a single busy component can create a traffic jam that backs everything up. We will talk about how to spot these expensive problems before they become a 3 a.m. emergency.

What's one thing you hope attendees will implement immediately after your talk?

Go back and play detective with your system. Draw a map of the journey your code takes to get data. Every stop on that map, every latency hop, and every "simple" abstraction has a price tag in both time and money. Start asking what each step costs. When you see the real price, you will find amazing ways to make things faster and cheaper.
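One way to play detective is to write that map down as a literal per-hop latency budget and total it per path. The hop names and microsecond figures below are illustrative placeholders to show the exercise, not measurements from the talk:

```python
# Hypothetical per-hop budget, in microseconds, for one cached read.
PROXIED_PATH = {
    "client -> proxy (network)": 80,
    "proxy processing": 40,
    "proxy -> shard (network)": 80,
    "server-side GET": 10,
    "return trip": 160,
}
DIRECT_PATH = {
    "client -> shard (network)": 80,
    "server-side GET": 10,
    "return trip": 80,
}


def total_us(path):
    """Sum the cost of every hop on a path."""
    return sum(path.values())


for name, path in (("proxied", PROXIED_PATH), ("direct", DIRECT_PATH)):
    print(f"{name:8s} total: {total_us(path)} us")
    for hop, cost in path.items():
        print(f"  {hop}: {cost} us")
```

Even with made-up numbers, the structure of the exercise is the point: the server-side GET is a rounding error next to the hops around it, which is exactly where the "price tag" conversation should start.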

What makes QCon stand out as a conference for senior software professionals?

QCon is where you learn from people who have the operational scars to prove their advice works. It's built on practitioner-led content, so you get unfiltered, real-world lessons without the sales pitch. You leave with insights you can actually apply on Monday morning.


Speaker

Dumanshu Goyal

Uber Technical Lead @Airbnb Powering $11B Transactions, Formerly @Google and @AWS

Dumanshu Goyal leads Online Data Priorities at Airbnb, powering its $11B transaction platform and building the next generation of the company’s data systems. Previously, he led in-memory caching for Google Cloud Databases, delivering 10x improvements in scale and price-performance for Google Cloud Memorystore, one of the rare times “10x” was more than a slide promise. Before that, he spent 10 years at AWS as the founding engineer of AWS Timestream, a serverless time-series database, and architected durability and availability features for one of the internet’s foundational services, AWS DynamoDB, ensuring data stayed available even when Wi-Fi did not.


An expert in building and operationalizing large-scale distributed systems, Dumanshu holds 20 patents and brings deep experience in architecting the resilient infrastructure that underpins today’s digital world.


Date

Wednesday Nov 19 / 02:45PM PST ( 50 minutes )

Location

Ballroom A

Topics

Architecture, Open Source, Microsecond-Scale In-memory Caching, Sharded System Reliability, Staff Plus Engineering, Platform Engineering


From the same track

Session Capacity Planning

How Netflix Shapes our Fleet for Efficiency and Reliability

Wednesday Nov 19 / 11:45AM PST

Netflix runs on a complex multi-layer cloud architecture made up of thousands of services, caches, and databases. As hardware options, workload patterns, cost dynamics and the Netflix products evolve, the cost-optimal hardware and configuration for running our services is constantly changing.


Joseph Lynch

Principal Software Engineer @Netflix Building Highly-Reliable and High-Leverage Infrastructure Across Stateless and Stateful Services


Argha C

Staff Software Engineer @Netflix - Leading Netflix's Cloud Scalability Efforts for Live

Session AI Architecture

Realtime and Batch Processing of GPU Workloads

Wednesday Nov 19 / 01:35PM PST

SS&C Technologies runs 47 trillion dollars of assets on our global private cloud. We offer infrastructure primitives as well as platform-as-a-service building blocks such as Kubernetes, Kafka, NiFi, and databases.


Joseph Stein

Principal Architect of Research & Development @SS&C Technologies, Previous Apache Kafka Committer and PMC Member

Session AI/ML

Producing the World's Cheapest Tokens: A How-to Guide

Wednesday Nov 19 / 10:35AM PST

AI inference is expensive, but it doesn’t have to be. In this talk, we’ll break down how to systematically drive down the cost per token across different types of AI workloads.


Meryem Arik

Co-Founder and CEO @Doubleword (Previously TitanML), Recognized as a Technology Leader in Forbes 30 Under 30, Recovering Physicist

Session Platform Engineering

Write-Ahead Intent Log: A Foundation for Efficient CDC at Scale

Wednesday Nov 19 / 03:55PM PST

As companies grow, so does the complexity of keeping distributed systems in sync. At DoorDash, we tackled this challenge while building a high-throughput, domain-oriented data platform for capturing changes across hundreds of services.


Vinay Chella

Engineering Leader @DoorDash - Specializing in Distributed Systems, Streaming & Storage Platforms, Apache Cassandra Committer, Previously Engineering Leader @Netflix


Akshat Goel

Staff Software Engineer, Core Infra at @DoorDash, Previously Senior Software Engineer @Amazon