Hopsworks research paper at SIGMOD 2024 Industrial Track
Today, June 12th, our research paper, "The Hopsworks Feature Store for Machine Learning," will be presented at the SIGMOD 2024 Industrial Track. It is the first feature store paper to appear at a top-tier database or systems conference. In peer-reviewed benchmarks, Hopsworks was the class-leading feature store, with an order of magnitude higher throughput and lower latency than Databricks, AWS SageMaker, and GCP Vertex, enabling the most challenging real-time AI systems, from personalized recommendations to financial trading.
Research Paper Highlights:
- Data Transformation Taxonomy for ML:
A new model for understanding how and where to transform data for AI systems, based on whether the transformed data will be reused across models, be specific to one model, or require real-time inputs.
- High-Performance Hopsworks Feature Query Service:
Hopsworks introduced a new end-to-end Arrow-based query service for historical feature data that avoids the row-to-column pivots and excessive serialization/deserialization found in the other feature stores evaluated in the paper.
- Feature Store Workload Benchmarks:
Hopsworks Feature Store outperformed Vertex, Databricks, and SageMaker. For the offline feature store, when reading 10M rows, Hopsworks achieved 11, 10, and 17 times the throughput of SageMaker, Vertex, and Databricks, respectively. For the online feature store, Hopsworks' p99 latency was 15% of SageMaker's and 11% of Vertex's.
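As a rough illustration of the transformation taxonomy in the highlights above, the three cases can be sketched as a small classifier. This is a minimal sketch, not code from the paper or the Hopsworks API: the names `TransformKind` and `classify`, the rule that a real-time input takes precedence over reuse, and the example features are all hypothetical.

```python
from enum import Enum


class TransformKind(Enum):
    """Hypothetical labels for the three cases named in the taxonomy."""
    REUSABLE = "reused across models"
    MODEL_SPECIFIC = "specific to one model"
    REAL_TIME = "requires real-time inputs"


def classify(reused_across_models: bool, needs_real_time_input: bool) -> TransformKind:
    """Pick a category for a transformation.

    Assumption: a transformation that needs real-time inputs is treated as
    REAL_TIME regardless of reuse, because it cannot be fully precomputed.
    """
    if needs_real_time_input:
        return TransformKind.REAL_TIME
    if reused_across_models:
        return TransformKind.REUSABLE
    return TransformKind.MODEL_SPECIFIC


# Illustrative (made-up) transformations and the category each falls into.
examples = {
    "7-day average purchase amount": classify(True, False),
    "scaling fitted on one model's training data": classify(False, False),
    "distance from the user's current location": classify(False, True),
}

for name, kind in examples.items():
    print(f"{name}: {kind.value}")
```

The value of the classification is that it suggests where each transformation should run: work that is reused can be computed once and stored, while work tied to one model or to request-time data has to be applied closer to training or inference.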
About the SIGMOD/PODS Conference:
The annual SIGMOD/PODS Conference is the leading international forum for database researchers, practitioners, developers, and users to explore cutting-edge ideas and results, and to exchange techniques, tools, and experiences. The conference includes a technical program with research and industrial talks, tutorials and workshops.
We are honored to present our research paper at SIGMOD 2024 today, contributing to the dialogue on advancing ML data management technologies.
About Hopsworks:
Hopsworks is a leading Lakehouse for AI, offering an end-to-end solution for developing, deploying, and monitoring AI/ML models at scale. Recognized for its innovative feature store and comprehensive toolset, Hopsworks empowers organizations to unlock the full potential of their data.