When an organization needs to productionize machine learning models built using data from its data platforms, it often requires a feature store as a central repository and collaboration layer for its data teams. But feature stores are more than just a data warehouse for features: they also power online AI applications, providing the key operational machine learning infrastructure that feeds data to online models.
In this article, we explain why feature stores are now widely considered the backbone of modern machine learning systems, and we identify the challenges to consider when you start thinking about implementing one in your organization.
First, a short history of the origins of, and need for, feature stores. The first feature store was announced in late 2017 as a platform developed internally by Uber. Uber had difficulty scaling and operationalizing machine learning models; engineering teams were building bespoke, often one-off systems to put each model into production. Those early ML systems produced numerous anti-patterns and technical debt that organizations would incur over time. Simply put, there is a gap between the systems needed to develop a model and the systems needed to operationalize a model that valuable business systems will depend on.
As companies mature enough to understand their business historically with business intelligence tools and data, they often look to AI to build intelligent applications or services. Business intelligence examines current and past data to build meaningful insights into the current state of the business and to advise on potential actions. Machine learning is the logical next step: it helps build products and make meaningful predictions; it is a look into the future.
Companies are increasingly focusing on getting AI into operation as quickly as possible, and using feature stores and ML platforms to accelerate that journey is the approach chosen by industry-leading companies such as Robinhood, Stripe, DoorDash, and more.
Here are a few considerations that can help you identify if and when your organization has reached the threshold for operational ML and might start implementing a feature store and ML platform:
In addition to solving those issues, implementing a feature store enforces best practices in managing AI assets and helps eliminate silos in organizations where different business units or teams are responsible for a portion of the process. As a collaborative platform, the feature store empowers each team as part of a cohesive process for productionizing AI and reduces the friction of establishing collaboration across teams.
There is also a strong argument that the feature store is not just a tool for big corporations, but for any organization that wants to build an AI-enabled system or application. Feature stores remove the need to build operational AI infrastructure from scratch, reducing headcount and operational costs for small teams and enabling them to focus on the more valuable work of building ML pipelines and AI-enabled applications or systems.
A feature store is a central, governed platform that enables data to be discovered for training AI models, and that enables operational systems to use that data to make predictions (inference). The feature store provides a data catalog describing the available data (the features) along with metadata, used for discovery but also to define the constraints under which data may be used in AI models to ensure compliance. The feature store also needs to provide security and SLAs (service-level agreements) so that data is highly available to models in operational systems, ensuring critical business systems do not suffer downtime.
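To make the catalog idea concrete, here is a minimal sketch of the kind of metadata a feature store might keep for each feature, supporting both discovery and compliance constraints. The field names (`pii`, `allowed_uses`, etc.) are illustrative assumptions, not the API of any particular product:

```python
from dataclasses import dataclass

@dataclass
class FeatureMetadata:
    name: str
    description: str
    owner: str
    dtype: str
    pii: bool              # compliance flag: does this feature contain personal data?
    allowed_uses: tuple    # constraints on which use cases may consume it

# A tiny catalog with one registered feature (hypothetical example).
catalog = {
    "avg_txn_30d": FeatureMetadata(
        name="avg_txn_30d",
        description="Average customer transaction amount over 30 days",
        owner="risk-team",
        dtype="float",
        pii=False,
        allowed_uses=("fraud_detection", "credit_scoring"),
    ),
}

# Discovery with a compliance constraint: find non-PII features
# that are approved for fraud detection models.
usable = [f for f in catalog.values()
          if not f.pii and "fraud_detection" in f.allowed_uses]
print([f.name for f in usable])
```

Real feature stores expose this kind of query through a catalog UI and an SDK, but the principle is the same: metadata drives both discovery and governance.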
Where does the feature data stored in feature stores and used by AI for training and predictions come from? Typically, feature data is created from data in existing enterprise data sources (databases, data warehouses, data lakes, lakehouses, message queues). Ultimately, this data is generated by business products and services (financial results, customer behaviors, click events…), and the predictions made by AI are used by those same products and services.
Feature data itself is often slightly different from the data in your existing enterprise data platforms. Feature data often consists of concentrated signals derived from existing data (e.g., the moving average of a stock price every minute, rather than all the individual stock prices that change many times per second), but it can also include things like customer demographics, product prices, or website traffic data. Feature data may be computed at regular intervals (hourly, daily, etc.) or it might be real-time data (the current Nasdaq Composite Index price). The variety of feature data and its varying cadences create complexities in computing, storing, and making that data available to operational models.
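The stock example above can be sketched in a few lines: raw tick data arriving many times per second is aggregated into a 1-minute feature. This is a toy illustration with made-up prices, not a production pipeline:

```python
import pandas as pd

# Hypothetical raw tick data: several price updates within each minute.
ticks = pd.DataFrame({
    "ts": pd.to_datetime([
        "2024-01-02 09:30:00.1", "2024-01-02 09:30:20.5",
        "2024-01-02 09:30:59.9", "2024-01-02 09:31:05.0",
        "2024-01-02 09:31:40.2",
    ]),
    "price": [101.0, 101.5, 102.0, 101.8, 102.2],
})

# Aggregate the raw ticks into a concentrated signal:
# the average price per 1-minute window.
features = (
    ticks.set_index("ts")["price"]
         .resample("1min").mean()
         .rename("avg_price_1min")
         .reset_index()
)
print(features)
```

In a real feature pipeline, this aggregation would run continuously (in a stream processor or on a schedule) and write its output to the feature store rather than printing it.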
Delving into the technical specifics, a feature store is a unified platform that consists of three internal data systems: an offline store that holds large volumes of historical feature data, used to create training data and for batch inference; an online store that serves the latest feature values at low latency to models in operational systems; and a metadata layer (or feature registry) that stores feature definitions, versions, and statistics for discovery and governance.
Feature stores solve two major problems in operational machine learning: how to deliver the right feature data in the right format for training or prediction, and how to compute and manage feature data from disparate and disjointed sources.
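The first of these problems is essentially the dual-store design: the same feature write must feed both a full history (for training) and a latest-value view (for low-latency prediction). The toy class below sketches that idea in memory; it is an illustration of the concept, not any real feature store's API:

```python
from dataclasses import dataclass, field

@dataclass
class ToyFeatureStore:
    offline: list = field(default_factory=list)   # full history, for training data
    online: dict = field(default_factory=dict)    # latest row per entity, for serving

    def write(self, entity_id, ts, features):
        """One write feeds both stores."""
        row = {"entity_id": entity_id, "ts": ts, **features}
        self.offline.append(row)                  # append-only history
        current = self.online.get(entity_id)
        if current is None or ts >= current["ts"]:
            self.online[entity_id] = row          # keep only the freshest row

    def training_data(self):
        return list(self.offline)                 # everything, for model training

    def serve(self, entity_id):
        return self.online[entity_id]             # latest values, for inference

store = ToyFeatureStore()
store.write("cust_1", ts=1, features={"avg_txn": 10.0})
store.write("cust_1", ts=2, features={"avg_txn": 12.5})
print(store.serve("cust_1"))
```

A production feature store replaces the list with a columnar lakehouse table and the dict with a low-latency key-value database, and adds point-in-time-correct joins so training data never leaks future values — but the split shown here is the core of the design.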
Since their inception, feature stores have widely extended their capabilities and are the foundation of most operational machine learning platforms in companies that provide operational AI services. Feature stores now support collaboration across data and operations teams, data re-use, versioning features and models, governance, monitoring, and more.
After you have identified the need to get your models into production faster, you need to choose a framework for doing so. Through helping and collaborating with hundreds of professionals in Fortune 500 companies, we have identified the most efficient model: create an end-to-end minimum viable product (MVP).
The MVP approach allows you to be more nimble in making decisions and changes to your pipelines, and with the appropriate technology it empowers your organization to generate value faster and helps onboard stakeholders in the process.
Beyond the framework you choose, here are other considerations that are important once you have decided on implementing (or even building!) a feature store:
Before moving on to your implementation of a feature store, you might want to tackle a few common misconceptions about feature stores. Some are benign misunderstandings that a deeper look at the technology and its challenges will resolve, while others might lead your organization down a rabbit hole from which it will take months, or even years, to emerge.
Here we cover some real-world examples of companies that successfully implemented feature stores. In particular, we look at how Hopsworks has been leveraged to make a positive impact on the machine learning operations of a few organizations, and at the general lessons we learned along the way while assisting those customers.
At AFCU, Hopsworks’ feature store powers the Enterprise AI services and integrates all operational and analytical data into machine learning processes, allowing for faster development cycles, integration with existing data warehouses, and providing a flexible environment for data scientists.
Outcome: AFCU achieved significant gains over its previous process for training models, seeing a 3–4x productivity gain while simplifying its machine learning codebase and pipelines. New features became easier to test and data science workflows improved; AFCU was able to reduce complexity and increase the readability of new features, with improved visibility and reusability of features across models and use cases.
The Swedish national employment agency, Arbetsförmedlingen, needed a highly available production environment for AI and was looking for a feature store capable not only of working as a unified data layer but also of managing and orchestrating the workflows and processes around AI, including GPUs for model training and model serving.
Outcome: Arbetsförmedlingen used Hopsworks to quickly serve real-time predictions of suitable job postings. Hopsworks also helped identify discriminatory text in job announcements. The platform's centralization and collaboration capabilities allowed data scientists to work with modern libraries, creating feature pipelines and developing AI models in a structured manner.
HEAP requires large-scale processing of genomic data on Apache Spark, along with deep learning, to analyze large datasets of human exposome data. HEAP's activities include identifying novel viruses, performing large cohort studies, and identifying genetic mutations that cause disease.
Outcome: HEAP used Hopsworks to lead the delivery of the Informatics Platform and Knowledge Engine. This enabled data warehousing, stream processing, and deep learning with advanced analytics. As a result, HEAP saw a 90% cost reduction, faster data processing, and an integrated data science platform.
In examining many of our customers’ journeys in implementing a feature store, and as one of the leaders in this new field, we have gleaned valuable insights into how data for AI has emerged as a core capability of operational machine learning systems. Our main conclusion is that companies that implement a feature store iterate faster and get models into production faster; a feature store also allows them to scale and to move to real-time machine learning use cases with greater ease.
There has also been an obvious increase in market interest in feature stores in 2023 (the Snowflake Summit had four tracks dedicated to feature stores in summer 2023, and Microsoft became the last of the four main cloud players to announce its own lightweight implementation of one), which only reinforces the position of the feature store as a core technology in the ML systems and MLOps spaces.
Here are some additional lessons we believe are valuable for anyone considering machine learning systems as a whole, and even more so through the lens of a feature store:
Ultimately, you need a feature store not because we say so, but because your models need one to run in production. You need a feature store to generate value from your models.