Summary
Disclaimer: This summary has been generated by AI. It is experimental, and feedback is welcomed. Please reach out to info@qconsf.com with any comments or concerns.
The presentation "Continuous Delivery for Foundational Platforms" by Ian Nowland covers the challenges of implementing continuous delivery in foundational platforms and strategies for addressing them.
Key Points Discussed:
- Continuous Delivery Challenges: Ian discusses how typical continuous delivery (CD) practices often do not align well with the demands of foundational platforms, emphasizing the unique constraints these systems face compared to traditional application environments.
- Obsolete Technology Dilemma: Platform teams are frequently criticized for running outdated technologies, which frustrates fast-paced product organizations that expect rapid updates and adaptations.
- Outage Risks: Foundational platforms are known to cause significant outages, which necessitates a focus on balancing innovation with stability to prevent disruptions.
- Practical Strategies: Ian shares his experiences at AWS and Datadog, providing insights into how platform teams can efficiently deploy changes while maintaining service reliability and avoiding process pitfalls.
- Importance of Synthetic Monitoring: An essential takeaway is the prioritization of synthetic monitoring to ensure the safety and effectiveness of changes before they impact users.
Conclusion: The talk underscores the need for platform teams to adopt tailored CD practices that align with their specific operational requirements, maintaining a balance between deployment velocity and system reliability to effectively support organizational goals.
This is the end of the AI-generated content.
Abstract
Platform teams frequently inherit systems that were never architected for their current scale, yet are so foundational that downtime can halt the business. Operating on these fragile foundations, teams face the daunting challenge of continuously shipping new features while scaling infrastructure significantly. Continuous delivery can feel risky in such critical scenarios—but avoiding it can stall progress, frustrate internal customers, and trap teams in endless rewrites that never materialize.
Drawing from his experiences leading foundational platform teams at AWS EC2 and Datadog, Ian Nowland will share practical strategies to safely implement continuous delivery, balancing reliability with innovation. Attendees will learn how to scale confidently, enhance developer productivity, and sustainably improve their platforms—even under immense pressure.
Interview:
What is your session about, and why is it important for senior software developers?
This session is about how platform teams can safely implement continuous delivery for foundational infrastructure. Systems like CI/CD, compute, networking, and service discovery are so critical you can’t afford to break them—yet they still need to evolve. These are often legacy systems that were never designed for today’s scale but now sit at the center of everything.
For senior developers—especially those who end up inheriting these systems—it’s a real trap: the pressure to innovate is high, but the blast radius is huge. I’ll share strategies we used at AWS and Datadog to keep delivering change safely, and why that’s essential to avoid stagnation, rewrites, and developer burnout.
Why is it critical for software leaders to focus on this topic right now, as we head into 2026?
We’re entering a phase where AI is accelerating everything. Teams are racing to ship new features, and developers are using AI to generate code faster than ever. The bottleneck is no longer ideation or execution—it’s the platform standing in the way of safely and quickly getting that code into production.
In the past—like during the cloud migration—platform teams could respond by building a new “V2” platform tailored to emerging use cases. But this time is different. With AI-accelerated development, nearly every team wants to move faster. Supporting just a subset of use cases for the first couple of years isn’t enough. Foundational platforms need to incrementally evolve to deliver capabilities for all users, even as they carry foundational load.
That’s why continuous delivery for these systems has become critical. It’s about enabling safe, sustainable iteration without requiring a full rewrite followed by a years-long migration by your users. The goal is to build the internal tooling and processes that allow foundational platform teams to ship, test, and recover quickly while the business keeps moving.
What are the common challenges developers and architects face in this area?
The most common challenges I see are:
- Staging environments don’t match production in either diversity or scale, which makes testing platform changes almost impossible.
- Change becomes scary. Platform teams hesitate to ship even small “quality of life” improvements, because one wrong move could bring everything down.
- Grand rewrites stall out. The team starts building a “V2” but never cuts over, because the risk is too high.
Techniques like blue/green deploys, one-box testing, and traffic shadowing are well-established for stateless microservices—but often seem out of reach for foundational platforms. In this talk, I’ll cover how to bridge that gap, even when you’re working on critical, fragile systems.
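To make the traffic shadowing idea concrete, here is a minimal sketch and not the approach from the talk. It assumes a platform operation can be modeled as a function from a request dict to a comparable result. The names `shadowed`, `primary`, `candidate`, and `mismatches` are hypothetical. Callers only ever receive the primary result, and a failing candidate is recorded rather than surfaced. A real system would likely run the candidate off the request path and would need to keep shadow writes from touching live state.

```python
from typing import Any, Callable, List, Tuple

Handler = Callable[[dict], Any]


def shadowed(
    primary: Handler,
    candidate: Handler,
    mismatches: List[Tuple[dict, Any, Any]],
) -> Handler:
    """Wrap `primary` so every request is also replayed against `candidate`.

    Hypothetical helper: differences and candidate failures are appended to
    `mismatches` as (request, primary_result, candidate_result_or_exception).
    """

    def handle(request: dict) -> Any:
        # The caller only ever sees the primary result.
        response = primary(request)
        try:
            # Pass a shallow copy so the candidate cannot add or replace
            # top-level keys on the live request.
            shadow_response = candidate(dict(request))
        except Exception as exc:
            # A broken candidate must never break production traffic.
            mismatches.append((request, response, exc))
        else:
            if shadow_response != response:
                mismatches.append((request, response, shadow_response))
        return response

    return handle


# Usage: compare a current lookup with a candidate that also strips whitespace.
def current_lookup(request: dict) -> str:
    return request["service"].lower()


def candidate_lookup(request: dict) -> str:
    return request["service"].strip().lower()


diffs: List[Tuple[dict, Any, Any]] = []
lookup = shadowed(current_lookup, candidate_lookup, diffs)
result = lookup({"service": "API "})
```

In this usage, `result` is still the current implementation's answer, while `diffs` holds the one request where the candidate disagreed, which is the evidence a team would review before cutting over.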
What’s one thing you hope attendees will implement immediately after your talk?
Build a path to production that feels safe. That might mean introducing a shadowing mechanism. It might mean running a flaky staging use case behind a flag in production. Or it might just mean adding better observability during rollouts. But the goal is the same: get to a place where it’s safe to ship small changes continuously—even to your scariest systems.
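One way to picture "behind a flag in production" is a deterministic percentage rollout. The sketch below is an illustration under stated assumptions, not a description of the tooling used at AWS or Datadog. The names `in_rollout`, `run_behind_flag`, `flag`, and `caller_id` are hypothetical. It assumes the new code path is safe to retry on the old path when it fails, which does not hold for every operation.

```python
import hashlib
from typing import Callable, TypeVar

T = TypeVar("T")


def in_rollout(flag: str, caller_id: str, percent: float) -> bool:
    """Return True if `caller_id` falls inside the first `percent` of buckets.

    Hashing flag and caller together keeps each caller's decision stable
    across calls and independent between different flags.
    """
    digest = hashlib.sha256(f"{flag}:{caller_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % 10_000  # 0..9999
    return bucket < percent * 100


def run_behind_flag(
    flag: str,
    caller_id: str,
    percent: float,
    new_path: Callable[[], T],
    old_path: Callable[[], T],
) -> T:
    """Run `new_path` for callers in the rollout, falling back to `old_path`."""
    if in_rollout(flag, caller_id, percent):
        try:
            return new_path()
        except Exception:
            # Assumption: the operation is idempotent, so retrying on the
            # old path is safe. A real rollout would also record the failure.
            return old_path()
    return old_path()


# Usage: send roughly 5% of callers through a new scheduler.
choice = run_behind_flag(
    "new-scheduler", "team-payments", 5.0,
    new_path=lambda: "scheduled by v2",
    old_path=lambda: "scheduled by v1",
)
```

Because the bucket depends only on the flag name and the caller, raising `percent` from 5 to 10 keeps every caller who was already on the new path there, so each step widens exposure instead of reshuffling it.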
What makes QCon stand out as a conference for senior software professionals?
As someone who’s run large platform teams and now started a company in the space, I appreciate conferences where you can talk openly about failure modes—not just success stories. QCon consistently gets those conversations right.
What was one interesting thing that you learned from a previous QCon?
In 2019, I caught Bryan Cantrill’s talk, “No Moore Left to Give: Enterprise Computing after Moore’s Law.” He was one of the first to clearly articulate that the “free” gains we’ve relied on (faster chips, more efficient transistors, cheaper compute) were all slowing down. And while it wasn’t the sole focus of his talk, it was one of the first times I saw someone point to GPUs becoming essential for non-graphics (well, and non-blockchain) workloads, which feels prescient today.
Speaker
Ian Nowland
CEO @Junction Labs, Author of O'Reilly's Platform Engineering, Previously SVP Core Engineering at Datadog and Leader of AWS Nitro
Ian Nowland is the CEO and co-founder of Junction Labs, and co-author of O'Reilly’s Platform Engineering. With 25 years in software, Ian previously served as SVP of Core Engineering at Datadog during its hypergrowth phase, and spent eight formative years at AWS (2008–2016), where he led the creation and development of EMR and AWS Nitro, EC2’s virtualization platform.