Supporting Diverse ML Systems at Netflix

Netflix uses data science and machine learning across all facets of the company, powering a wide range of business applications. The Metaflow machine learning platform at Netflix provides an entire user-centric ecosystem of abstractions and integrations that allow practitioners to tackle a diverse set of business problems.

In this talk, we will first introduce Metaflow, which is open-source software, and explain how it accelerates the work of ML practitioners by providing simple and consistent abstractions over core properties of ML pipelines. We will show how components such as data, orchestration, and hosting can be easily combined to iterate quickly from design to production across a wide range of complex data science and ML workflows.

Throughout the talk, we will highlight use cases that demonstrate the breadth of ML applications developed on Metaflow, and we will point out the user-design, technical, and operational principles that helped us scale to hundreds of users and several hundred workflows while maintaining a small team focused on customer service and interactions.

The presentation will cover: 

  • How to build foundational components that can be combined to create novel ML applications across diverse use cases.
  • How to design user-centric systems that cater to a wide range of ML practitioners.
  • Technical and operational lessons about scaling and maintaining a large platform with a small team. 

Speaker

Romain Cledat

Senior Software Engineer @Netflix, Metaflow Core Contributor, Previously @Facebook and @Intel

Romain Cledat is an engineer at Netflix, where he has played a pivotal role in the development and open-sourcing of Metaflow since 2019. As the technical lead of the project at Netflix, Romain has continued making significant contributions to the open-source community, driving innovation and excellence in a platform that is used by many companies. Before joining Netflix, Romain was part of the Applied Machine Learning group at Facebook (now Meta), where he specialized in building foundational ML infrastructure for the company's training platform. His career began in the high-performance computing (HPC) domain and compiler technology, working on exascale computing initiatives at Intel. Romain holds a Ph.D. in Computer Science from the Georgia Institute of Technology, awarded in 2011, with a dissertation focused on compilers and operating systems.

From the same track

Session Architecture

Optimizing Search at Uber Eats

Monday Nov 18 / 11:45AM PST

Uber has an in-house search engine called Search In Action (SIA). As the backbone of the feed and search capabilities of Uber's Delivery business, SIA plays a crucial role in seamlessly expanding selection for customers, which is a strategic advantage for the business.

Janani Narayanan

Applied ML Engineer @Uber, Previously Tech Lead on DynamoDB Control Plane (Early Stage), 10+ Years Tech Industry Experience

Karthik Ramasamy

Senior Staff Software Engineer @Uber, 15 Years of Experience in Design and Implementation of Web Applications, Distributed Systems, Search and Analytics Infrastructure

Session HTTP

How GitHub Copilot Serves 400 Million Completion Requests a Day

Monday Nov 18 / 03:55PM PST

GitHub Copilot is the largest LLM-powered code completion service in the world, serving hundreds of millions of requests per day with an average response time of under 200ms. This is the story of the architecture that powers this product.

David Cheney

Lead, Copilot Proxy @GitHub, Open Source Contributor and Project Member for Go Programming Language, Previously @VMware

Session Architecture

Unified Grid: How We Re-Architected Slack for our Largest Customers

Monday Nov 18 / 01:35PM PST

Slack’s enterprise solution allows users to join multiple workspaces within the same organization. However, for years, users could only view channels, messages, and other content from a single workspace at a time.

Ian Hoffman

Staff Software Engineer @Slack, Previously @Chairish

Session

Unconference: Architectures You've Always Wondered About

Monday Nov 18 / 02:45PM PST

Session

Legacy Modernization: Architecting Real-Time Systems Around a Mainframe

Monday Nov 18 / 05:05PM PST

Designing systems that take advantage of modern platforms, tools, and techniques is critical for building scalable, evolvable applications that underpin businesses of all stripes. Leveraging those when your data is captured in a mainframe, which does not scale well, is challenging.

Jason Roberts

Lead Software Consultant @Thoughtworks, 15+ years in Software Development, Azure Solutions Architect Expert

Sonia Mathew

Director, Product Engineering @National Grid, 20+ Years in Tech