Introducing the Feature Store – the Data Warehouse for Machine Learning

January 21, 2019

Logical Clocks, the enterprise vendor for Hopsworks (a data platform for scale-out data science and AI), today announced the release of the first Enterprise Feature Store for Machine Learning. The Feature Store solves the problem of ad-hoc and siloed machine learning pipelines, where features, the training data for such pipelines, tend to become disorganized, disjointed, and duplicated, leading to correctness problems and redundant work.

Today, Logical Clocks AB is announcing the release of a Feature Store as part of Hopsworks version 0.8.0. The Feature Store is a central vault for documented, curated, and access-controlled features. In-house Feature Stores are already in production at companies such as Uber, LinkedIn, Airbnb, and Comcast. Now, for the first time, a Feature Store is available as open source in an Enterprise Data platform: Hopsworks.

With the increasing adoption of machine learning in the Enterprise, organizations are looking to reduce the cost of developing and deploying AI by increasing the productivity of their Data Scientists. According to Uber, “dealing with data access, integration, feature management, and pipelines can often waste a huge amount of a data scientist’s time”. The Feature Store solves the data access and feature management problem for Data Science: Data Scientists no longer need to constantly re-implement the feature pipelines that collect and transform data to feed their machine learning models. Instead, they can select features from the Feature Store to generate clean training data that machine learning models can consume directly. Hopsworks’ Feature Store builds on Apache Spark and Apache Hive, which lets it scale to massive data volumes.

“As part of the Hopsworks platform, the Feature Store also gives Enterprises full Machine Learning Governance – the exercise of authority and control (access, monitoring, auditing, and provenance) over the management of machine learning assets. Repeatable experiments, features, and models are now all governed and managed by Hopsworks,” Dr. Jim Dowling (CEO) said.