Only four years ago, building a production ML system meant having tens of engineers build and maintain an ML platform, such as Michelangelo at Uber. Now, with serverless ML platforms, you can build and operate an ML system in minimal time.
In this tutorial, we will build a serverless ML system out of three Python programs that, when plugged together, make up a production ML system. The programs are:
- a feature pipeline that we will schedule to run at an interval using Modal (a sketch follows after this list)
- a training pipeline that we run on demand (Colab)
- a batch inference pipeline that produces predictions that we show on a dashboard (Gradio/Hugging Face; see the second sketch below)
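To make the first pipeline concrete, here is a minimal sketch of a Modal program that runs a feature pipeline on a daily schedule. The names (`feature-pipeline`, `run_feature_pipeline`) and the pandas dependency are placeholders for this tutorial, and the exact decorator names can differ between Modal versions:

```python
import modal

app = modal.App("feature-pipeline")  # placeholder app name

# Placeholder container image; swap pandas for whatever the real
# feature engineering actually needs.
image = modal.Image.debian_slim().pip_install("pandas")

@app.function(image=image, schedule=modal.Period(days=1))
def run_feature_pipeline():
    # 1. read the latest raw data from its source (omitted here)
    # 2. compute the features
    # 3. write them somewhere the training and batch inference
    #    pipelines can read them (e.g. a feature store)
    ...

@app.local_entrypoint()
def main():
    # One ad-hoc run for testing: `modal run feature_pipeline.py`
    run_feature_pipeline.remote()
```

Deploying the program with `modal deploy feature_pipeline.py` registers the schedule, after which Modal invokes the function daily with no server for you to manage.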
The example is drawn from our course on serverless ML: https://github.com/featurestoreorg/serverless-ml-course
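For the third pipeline, here is a similarly minimal sketch of a Gradio dashboard that displays the batch predictions; the `predictions.csv` file is an assumed stand-in for wherever the batch inference pipeline writes its output:

```python
import gradio as gr
import pandas as pd

def load_predictions() -> pd.DataFrame:
    # Assumed location: the batch inference pipeline is presumed to
    # write its latest predictions to this CSV file.
    return pd.read_csv("predictions.csv")

with gr.Blocks() as demo:
    gr.Markdown("## Latest batch predictions")
    # Passing a callable as the value makes Gradio re-run it on each
    # page load, so the dashboard picks up fresh predictions.
    gr.Dataframe(value=load_predictions)

demo.launch()
```

A script like this can be pushed to a Hugging Face Space so the dashboard is hosted serverlessly as well.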