How to Build an Exchange: Sub Millisecond Response Times and 24/7 Uptimes in the Cloud

Summary

Disclaimer: This summary has been generated by AI. It is experimental, and feedback is welcomed. Please reach out to info@qconsf.com with any comments or concerns.

This presentation, "How to Build an Exchange: Sub Millisecond Response Times and 24/7 Uptimes in the Cloud," features Frank Yu from Coinbase as the speaker. It addresses the challenges and strategies involved in creating high-performance exchange systems in a cloud environment.

Key Points:

  • Introduction to Exchange Systems: Frank Yu discusses the fundamental operations of an exchange as a platform where orders to buy or sell are processed and updates on pricing and trades are provided efficiently.
  • Architecture and Performance: The talk emphasizes the importance of removing operating system interference and optimizing memory usage to enhance performance. Techniques like pinning threads to CPUs and using single-threaded processes to minimize latency are highlighted.
  • Scalability and Redundancy: The system is designed for high availability and disaster recovery through the use of the Raft consensus algorithm, allowing for fast and reliable order processing even in the event of hardware failures.
  • Upgrades and Deployment: Continuous integration practices and blue-green deployments enable the system to maintain 24/7 uptime without traditional long maintenance windows, which is crucial in a fast-paced financial environment like cryptocurrency trading.
  • Data Management: The architecture leverages deterministic processing for consistency in transactions, ensuring that identical inputs result in predictable outputs, which is critical for handling financial trades.
  • User Interaction: The system prioritizes giving users the best available prices to encourage fair trading. The talk illustrates this with examples of order fulfillment under price-time priority rules.
  • System Enhancements: Frank Yu advocates for reducing unnecessary processes and blocking operations in the hot path of data handling to improve order processing speed and efficiency.

This structured approach ensures that the exchange platform remains resilient, performant, and reliable, meeting the demanding needs of continuous and rapid transaction processing in the cloud.
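
The price-time priority and deterministic processing points above can be illustrated with a small sketch. This is not the speaker's implementation. It is a minimal, single-threaded matching loop written under stated assumptions: the names Order, Trade, OrderBook, and replay are hypothetical, prices are integer ticks, and only limit orders are supported. Incoming orders match against the best opposing price first, and within a price level against the earliest resting order first. Because the logic uses no clocks, randomness, or threads, replaying the same input sequence produces the same trades.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: int
    side: str   # "buy" or "sell"
    price: int  # limit price in integer ticks (assumption for this sketch)
    qty: int


@dataclass(frozen=True)
class Trade:
    resting_id: int
    incoming_id: int
    price: int
    qty: int


class OrderBook:
    """Hypothetical single-threaded book: price level -> FIFO queue of resting orders."""

    def __init__(self):
        self.bids = {}  # price -> deque of [order_id, remaining_qty]
        self.asks = {}

    def submit(self, order):
        trades = []
        remaining = order.qty
        if order.side == "buy":
            own, opposite, best = self.bids, self.asks, min
            crosses = lambda level: level <= order.price
        else:
            own, opposite, best = self.asks, self.bids, max
            crosses = lambda level: level >= order.price

        while remaining > 0 and opposite:
            level = best(opposite)       # price priority: best opposing price first
            if not crosses(level):
                break
            queue = opposite[level]
            resting = queue[0]           # time priority: earliest order at this price
            fill = min(remaining, resting[1])
            trades.append(Trade(resting[0], order.order_id, level, fill))
            resting[1] -= fill
            remaining -= fill
            if resting[1] == 0:
                queue.popleft()
                if not queue:
                    del opposite[level]

        if remaining > 0:
            # Any unfilled quantity rests at the back of the queue for its price.
            own.setdefault(order.price, deque()).append([order.order_id, remaining])
        return trades


def replay(orders):
    """Apply an ordered input log to a fresh book and collect the resulting trades."""
    book = OrderBook()
    trades = []
    for order in orders:
        trades.extend(book.submit(order))
    return trades


ORDERS = [
    Order(1, "sell", 101, 5),
    Order(2, "sell", 100, 5),
    Order(3, "sell", 100, 5),
    Order(4, "buy", 101, 12),
]

if __name__ == "__main__":
    for trade in replay(ORDERS):
        print(trade)
```

In this example the buy order fills against orders 2 and 3 at price 100 (in arrival order) before taking 2 units from order 1 at price 101. A replicated system could, in principle, feed the same ordered log to several such state machines and expect identical results on each, which is the property the determinism point relies on.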

This is the end of the AI-generated content.


Abstract

These days it is possible to achieve fairly good performance on cloud-provisioned systems. We discuss the design of a high-performance, strongly consistent system that maintains constant service in the face of regular updates to core logic.

Interview:

What is your session about, and why is it important for senior software developers?

Due to unique performance and consistency requirements, financial exchanges have traditionally pushed the boundaries of what’s possible on backend services. We can now do a similar exercise on cloud workloads and offer some counterintuitive anecdata that runs against common wisdom about scaling.

Why is it critical for software leaders to focus on this topic right now, as we head into 2026?

If some folks designing distributed systems gain some conviction to try to do more transactions with less complexity, we will have seen critical impact.

What are the common challenges developers and architects face in this area?

Standard patterns in scaling distributed systems in the cloud can lead to proliferation of provisioned resources and an overwhelming cross product of interactions between different systems and tools. There are circumstances where addition by subtraction can be uniquely impactful.

What's one thing you hope attendees will implement immediately after your talk?

Something deterministic would be wonderful.

What makes QCon stand out as a conference for senior software professionals?

QCon has had plenty of impact on the landscape of Software Developers in the Bay. There are talks that have created massive movements, leaving a sprawling wake of refactored and scaled systems across the industry. 

What was one interesting thing that you learned from a previous QCon?

Database people give excellent talks. I learned about fsyncgate, mediocre client-server protocols, and streaming magic, all at QCon. The rigorous tracks are properly rigorous here.


Speaker

Frank Yu

Director of Engineering @Coinbase, Previously Principal Engineer and Director @FairX

Frank is an engineering leader at Coinbase, focusing on distributed low-latency trading platforms. Prior to Coinbase, he served as Principal Engineer and later Director of Software Engineering at FairX, leading the design and build of what would become the Coinbase Derivatives Exchange post-acquisition. Frank has spent over a decade making tradeoffs on mission-critical systems with submillisecond response times and loves chatting about complexity, testing, and performance.

 

From the same track

Session Platform Engineering

Building Resilient Platforms: Insights from 20+ Years in Mission-Critical Infrastructure

Monday Nov 17 / 02:45PM PST

In this talk, Matthew will describe lessons learned from 20+ years of building scalable, secure, and stable infrastructure platforms for software in financial services (electronic trading, credit card processing, etc.). The talk is relevant to anyone building platforms for mission-critic

Matthew Liste

Head of Infrastructure @American Express, Previously @JPMorgan Chase and @Goldman Sachs

Session

Unconference: Architectures You've Always Wondered About

Monday Nov 17 / 05:05PM PST

Session Architecture

Architecting a Centralized Platform for Data Deletion at Netflix

Monday Nov 17 / 03:55PM PST

What does it take to safely delete data at Netflix scale? In large-scale systems, data deletion cuts across infrastructure, reliability, and performance complexities.

Vidhya Arvind

Tech Lead & a Founding Architect for the Data Abstraction Platform @Netflix, Previously @Box and @Verizon

Shawn Liu

Senior Software Engineer @Netflix, Building Reliable and Extensible Systems for Consumer Data Lifecycle at Scale

Session Durability

Compiling Workflows into Databases: The Architecture That Shouldn't Work (But Does)

Monday Nov 17 / 11:45AM PST

What if everything you know about building distributed systems is backwards?

Jeremy Edberg

CEO of DBOS, Creator of Chaos Engineering, Tech Editor for 'AWS for Dummies'; Previously Founding Reliability Engineer @Netflix, and First Engineer @Reddit

Qian Li

Co-founder, Architect @DBOS, Stanford CS Ph.D., Co-organizer of South Bay Systems

Session Architecture

Parting the Clouds: The Rise of Disaggregated Systems

Monday Nov 17 / 01:35PM PST

Cloud systems are undergoing an architectural shift. Traditional shared-nothing designs struggle to deliver the elasticity, availability, and operational simplicity that the cloud demands.

Murat Demirbas

Principal Research Scientist @MongoDB Research, Previously Principal Applied Scientist @AWS and a Professor of Computer Science at the University at Buffalo (SUNY)