Open-source foundation models, such as Mixtral, Llama-2, and Yi, are rapidly improving in performance and will enable the next generation of LLM applications to be fine-tuned and hosted in your own cloud account or data center. For that, you will need LLM infrastructure support, also known as LLMOps. Hopsworks is a scalable ML platform that supports both fine-tuning of LLMs and retrieval-augmented generation (RAG) through its Feature Store with built-in vector database support.
In this webinar, we will walk through the three programs you need to write to productionize a fine-tuned RAG LLM: a feature pipeline that creates the training data for fine-tuning, a training pipeline that fine-tunes the model, and an inference pipeline that integrates RAG with your LLM application. We will also look at how Hopsworks solves the data scaling problem with HopsFS/S3, offering 30X the performance of S3 for fast reading and writing of LLM models. A minimal sketch of the inference pipeline follows below.
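As an illustration of the third program, here is a minimal sketch of a RAG inference pipeline, assuming the feature pipeline has already embedded your documents into a Hopsworks feature group with a vector index. The feature group name "documents", the sentence-transformers encoder, and the `generate` helper wrapping the fine-tuned LLM are all hypothetical placeholders, not part of the webinar material.

```python
# Minimal RAG inference sketch against the Hopsworks Feature Store's
# built-in vector index. Assumes a feature group "documents" (hypothetical)
# was created with an embedding index by the feature pipeline.
import hopsworks
from sentence_transformers import SentenceTransformer

project = hopsworks.login()           # authenticate against the Hopsworks cluster
fs = project.get_feature_store()

# Use the same encoder the feature pipeline used to embed the documents.
encoder = SentenceTransformer("all-MiniLM-L6-v2")

docs_fg = fs.get_feature_group("documents", version=1)


def generate(prompt: str) -> str:
    # Hypothetical call to your fine-tuned LLM's serving endpoint;
    # replace with the client for your own model deployment.
    raise NotImplementedError


def rag_answer(question: str, k: int = 3) -> str:
    # Embed the query and retrieve the k nearest document chunks from
    # the feature store's vector index.
    neighbors = docs_fg.find_neighbors(encoder.encode(question), k=k)
    # Assumes the document text is the last column of each returned row.
    context = "\n".join(str(row[-1]) for row in neighbors)
    prompt = (
        f"Answer using only this context:\n{context}\n\n"
        f"Question: {question}"
    )
    return generate(prompt)
```

The key design point is that retrieval and generation share one platform: the same feature group the feature pipeline writes to is queried at inference time, so no separate vector database needs to be deployed or kept in sync.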