Designing AI Platforms for Reliability: Tools for Certainty, Agents for Discovery

Abstract

Modern AI platforms don’t have to choose between deterministic precision and probabilistic exploration—they need both. Deterministic tools provide the certainty required for high-stakes operations like transactions, security, and compliance, while probabilistic agents bring adaptability and discovery to complex, evolving problems. In this talk, we’ll explore how to design platforms that combine these modes effectively: long-running agents grounded by frequent truth checks, tools that guarantee reliable outcomes where variability is unacceptable, and hybrid systems that thrive in uncertainty when the right tool for the job is probabilistic reasoning. Using real-world examples—from detecting anomalous clusters to health agents debating diagnostic hypotheses—we’ll show how this dual-layer approach leads to platforms that are not only more capable, but also more trustworthy.

Interview:

What is your session about, and why is it important for senior software developers?

Broadly, my session is about how to weave together traditional software and platforms with new forms of agentic software, which are usually stochastic. This is often framed as an either/or choice, a framing we should reject in favor of “right tool for the job”.

Why is it critical for software leaders to focus on this topic right now, as we head into 2026?

It matters because nearly every company is working to deploy AI, yet many struggle to productionize it, often because of mismatched expectations about what AI does well and what it does less well with current tools. Getting this right is critical.

What are the common challenges developers and architects face in this area?

The common challenges are knowing when to apply a stochastic, AI-driven approach, understanding the patterns of agent development (and how they differ), and interleaving the two well so that the whole is more than the sum of its parts. How, for example, do you build a grounded deep research feature into a product in a way that doesn’t run off into incoherent reasoning chains?

What's one thing you hope attendees will implement immediately after your talk?

One immediate implication is that we should be building tool catalogs for AI agents to use. This goes beyond merely saying “here are all your MCP endpoints”; it is about giving agents the context to select the right tool for the right problem in the first place.
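To make the tool-catalog idea concrete, here is a minimal sketch in Python of a catalog that stores selection guidance alongside each tool rather than only an endpoint list. Everything in it is hypothetical and chosen for illustration: the `ToolEntry` and `ToolCatalog` classes, the `use_when` and `avoid_when` fields, the tag-matching rule, and the two example tools. It is not taken from the talk and does not use any MCP library.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolEntry:
    """One catalog entry: what the tool is, plus guidance on when to pick it."""
    name: str
    description: str
    use_when: str        # problem shapes this tool fits (hypothetical field)
    avoid_when: str      # cases where another tool is a better choice
    deterministic: bool  # True if the same input always gives the same result
    tags: frozenset = field(default_factory=frozenset)


class ToolCatalog:
    """Hypothetical in-memory catalog an agent can consult before acting."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        if entry.name in self._entries:
            raise ValueError(f"duplicate tool name: {entry.name}")
        self._entries[entry.name] = entry

    def shortlist(self, task_tags, require_deterministic=False):
        """Return entries sharing at least one tag with the task, best match first."""
        wanted = set(task_tags)
        matches = [
            e for e in self._entries.values()
            if wanted & e.tags and (e.deterministic or not require_deterministic)
        ]
        return sorted(matches, key=lambda e: (-len(wanted & e.tags), e.name))

    def render_context(self, entries):
        """Format selection guidance as plain text to place in an agent's prompt."""
        lines = []
        for e in entries:
            kind = "deterministic" if e.deterministic else "probabilistic"
            lines.append(f"- {e.name} ({kind}): {e.description}")
            lines.append(f"  Use when: {e.use_when}")
            lines.append(f"  Avoid when: {e.avoid_when}")
        return "\n".join(lines)


catalog = ToolCatalog()
catalog.register(ToolEntry(
    name="ledger_transfer",
    description="Moves funds between two accounts.",
    use_when="The task must produce an exact, auditable outcome.",
    avoid_when="The task is exploratory or the amounts are still unknown.",
    deterministic=True,
    tags=frozenset({"payments", "transactions"}),
))
catalog.register(ToolEntry(
    name="anomaly_explorer",
    description="Proposes candidate explanations for unusual metric clusters.",
    use_when="The problem is open-ended and hypotheses are acceptable output.",
    avoid_when="A single guaranteed answer is required.",
    deterministic=False,
    tags=frozenset({"metrics", "anomalies"}),
))

# High-stakes task: only deterministic tools are offered to the agent.
picked = catalog.shortlist({"payments"}, require_deterministic=True)
print(catalog.render_context(picked))
```

The design choice illustrated here is that the guidance text and the deterministic flag travel with the tool, so the same catalog can both filter what an agent is allowed to use for a high-stakes task and explain the trade-offs when more than one tool qualifies.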

What makes QCon stand out as a conference for senior software professionals?

What makes QCon stand out is that the presenters are builders, not marketers trying to drive hype.


Speaker

Aaron Erickson

Senior Manager and Founder of the DGX Cloud Applied AI Lab @NVIDIA, Previously Engineer @ThoughtWorks, VP of Engineering @New Relic, CEO and Co-Founder @Orgspace

Aaron Erickson founded the Applied AI Lab for DGX Cloud at NVIDIA, which specializes in building foundation models and agentic systems to solve broad industry problems like time series-based anomaly detection. Previously, he held engineering leadership roles at ThoughtWorks and New Relic before founding Orgspace, a startup that pioneered generative AI–driven organizational design. He is the author of The Nomadic Developer and Professional F# 2.0, and most recently launched NVIDIA’s Llo11yPop project, applying AI agents to govern GPU resources at global scale.

From the same track

Session

Keeping the Mainline Green Across Diverse Language Monorepos

Monday Nov 17 / 02:45PM PST

At Uber’s scale, ensuring an always-green mainline while processing hundreds of changes per hour is a massive challenge, especially when those changes span multiple language monorepos supporting dozens of business-critical apps.

Dhruva Juloori

Senior Software Engineer @Uber, Core Contributor to SubmitQueue (Uber's CI System at Scale), Expert in Machine Learning, Distributed Systems, and Developer Productivity

Session

Rust at the Core - Accelerating Polyglot SDK Development

Monday Nov 17 / 03:55PM PST

Developing SDKs for your users in multiple languages can come at a high cost, especially if you need to implement complex logic client-side. Traditionally, options for sharing logic across those languages have been quite limited.

Spencer Judge

Engineering Manager @Temporal Technologies, previously Senior Software Engineer @Transparent Systems, Senior Software Engineer @Tableau Software

Session

Driving Innovation with a Polyglot Platform

Monday Nov 17 / 01:35PM PST

Details coming soon.

Bishwajeet Paul

Architect, Platform Engineering @JPMorgan Chase - Specializing in Solving Complex Challenges for the Developer Community

Session

Confidently Automating Changes Across a Diverse Fleet

Monday Nov 17 / 11:45AM PST

Maintaining up-to-date and secure software across a polyglot fleet is a challenge for any engineering organization. Manual migrations and urgent updates disrupt productivity and require coordination across many teams.

Casey Bleifer

Senior Software Engineer @Netflix

Session

From Monolith to Mosaic: Strategies for a Safe and Successful Polyglot Migration

Monday Nov 17 / 05:05PM PST

Details coming soon.

Adrian Cockcroft

Technology Advisor and Consultant @OrionX.net, Previously VP Open Source and Sustainability @Amazon, Cloud Architect @Netflix, Distinguished Engineer @eBay