Self-hosted language models are going to power the next generation of applications in critical industries such as financial services, healthcare, and defense. Self-hosting LLMs, as opposed to using API-based models, comes with its own set of challenges: in addition to solving business problems, engineers must wrestle with the intricacies of model inference, deployment, and infrastructure. In this talk we will discuss best practices in model optimisation, serving, and monitoring, with practical tips and real case studies.
Speaker
Meryem Arik
Co-Founder @TitanML, Recognized as a Technology Leader in Forbes 30 Under 30, Recovering Physicist
Meryem is a recovering physicist and the co-founder of TitanML, an NLP development platform focused on the deployability of LLMs. The TitanML platform automates much of the difficult MLOps and inference-optimisation work, allowing businesses to build and deploy smaller, cheaper, state-of-the-art language models with ease. She has been recognised as a technology leader in the Forbes 30 Under 30 list.