SRE

Session Platform Engineering

Building Resilient Platforms: Insights from 20+ Years in Mission-Critical Infrastructure

Monday Nov 17 / 02:45PM PST

In this talk, Matthew will describe lessons learned from more than 20 years of building scalable, secure, and stable infrastructure platforms for software in financial services (electronic trading, credit card processing, etc.). The talk is relevant to anyone building platforms for mission-critic

Matthew Liste

Head of Infrastructure @American Express, Previously @JPMorgan Chase and @Goldman Sachs

Session Incidents

When Incidents Refuse to End

Wednesday Nov 19 / 11:45AM PST

As engineers, we’re used to managing failure, but long-running outages hit differently. They stretch teams, systems, and assumptions about how incidents “should” play out.

Vanessa Huerta Granda

Resiliency Manager @Enova, Co-Author of the Howie Guide on Post Incident Analysis

Session Incident Response

Week-Long Outage: Lifelong Lessons

Wednesday Nov 19 / 02:45PM PST

Routine database upgrades should be straightforward, especially with familiar, well-established technology. We were confident heading into our Elasticsearch upgrade, equipped with a solid plan and excited to see performance gains like we had seen from past upgrades.

Molly Struve

Staff Site Reliability Engineer @Netflix