High-Resolution Platform Observability

Many observability tools fail to provide the insights we need to understand hardware health and utilization. Whether because key components are incompletely instrumented or because the resolution is too coarse to capture brief or intermittent disturbances, we’re often left with gaps in our understanding and unanswered questions. In this talk, we’ll explore techniques and technologies for getting a more detailed and comprehensive understanding of hardware health and utilization. With more comprehensive hardware telemetry, we can pinpoint issues or exonerate components. We can finally answer the questions we have and unlock better performance.


Speaker

Brian Martin

Co-founder and Software Engineer @IOP Systems, Focused on High-Performance Software and Systems, Previously @Twitter

Brian is a software engineer who focuses on performance optimization and distributed systems. He worked at Twitter for 8 years, initially with the Cache Team and later as a member of the newly created Performance Team. After November 2022, Brian joined his teammates from Twitter as a co-founder of IOP Systems and continues to work on improving software and platform performance, efficiency, and reliability.

From the same track

Session Hybrid cloud

Evaluating and Deploying State-of-the-Art Hardware to Meet the Challenges of Modern Workloads

Wednesday Nov 20 / 01:35PM PST

At GEICO, we are on a journey to entirely modernize our infrastructure. We are building an open-source, cloud-agnostic hybrid stack to run across public and on-prem private cloud infrastructure without having to expose vendor-specific stacks to our application developers.

Rebecca Weekly

VP of Infrastructure @GEICO

Session AI HW/SW optimization

Maximizing Deep Learning Performance on CPUs using Modern Architectures

Wednesday Nov 20 / 11:45AM PST

As deep learning continues to drive advancements across various industries, efficiently navigating the landscape of specialized AI hardware has a huge impact on cost and speed of operation.

Bibek Bhattarai

AI Technical Lead @Intel, Computer Scientist Invested in Hardware-Software Optimization, Building Scalable Data Analytics, Mining, and Learning Systems

Session RISC-V

Optimizing Custom Workloads with RISC-V

Wednesday Nov 20 / 02:45PM PST

This talk will explore how RISC-V architecture can accelerate custom workloads, focusing on AI/ML applications. We’ll start by examining the RISC-V ecosystem and its increasing relevance in the software development landscape.

Ludovic Henry

Member of Technical Staff @Rivos, Performance-Minded Engineer, Hardware & Software, Previously @Xamarin, @Microsoft, @Datadog

Session AI/ML

Unleashing Llama's Potential: CPU-Based Fine-Tuning

Wednesday Nov 20 / 10:35AM PST

The generative AI landscape is changing rapidly, with new models appearing on the horizon every few days. However, the hardware and software characteristics of these models share many similar patterns and execution phases.

Anil Rajput

AMD Fellow, Software System Design Eng. Java Committee Chair @SPEC, Architected Industry Standard Benchmarks and Authored Best Practices Guides for Platform Engineering and Cloud

Rema Hariharan

Principal Engineer @AMD, Seasoned Performance Engineer With a Base in Quantitative Sciences and a Penchant for Root-Causing