In this tutorial, we walk you through how we install Hopsworks in air-gapped environments and how we extended this process to test air-gapped installations as well.
Nearly a year ago, the Hopsworks team embarked on a journey to migrate its infrastructure to Kubernetes. In this article we describe three main pillars of our Kubernetes migration.
In this blog post, we have selected the 25 most-read dictionary entries from the MLOps and LLMOps Dictionary to highlight key trends and lessons in AI.
We describe the capabilities that need to be added to Lakehouse to make it an AI Lakehouse that can support building and operating AI-enabled batch and real-time applications as well as LLM applications.
We present how Hopsworks leverages its time-travel capabilities for feature groups to support reproducible creation of training datasets using metadata.
In this article, we cover the added value of a feature store over a data warehouse when managing offline data for AI.
In this article we introduce the snowflake schema data model for feature stores, and show how it helps you include more features to make better predictions.
We present a unified software architecture for batch, real-time, and LLM AI systems that is based on a shared storage layer and a decomposition of machine learning pipelines.
In this post, we will look at how to put feature pipelines into production using Hopsworks.
This article covers the different aspects of Job Scheduling in Hopsworks, including how simple jobs can be scheduled through the Hopsworks UI by non-technical users.
The decision to build versus buy a feature store has both strategic and technical components to consider, as it impacts both cost and technical debt.
Redis will no longer be open source. Our own project, RonDB, will continue being open source in order to uphold the principles that keep the technology advancing.
In this article we describe the software factory approach to building and maintaining AI systems.
Hopsworks has added support for Delta Lake to accelerate our mission to build the Python-Native Data for AI platform.
A tutorial of the Hopsworks Feature Query Service, which efficiently queries and joins features from multiple platforms such as Snowflake, BigQuery and Hopsworks without any data duplication.
The rapid pace of development in AI has given rise to many misconceptions about ML and MLOps. In this post we debunk a few common myths about MLOps, LLMs and machine learning in production.
A comparison of the online feature serving performance for Hopsworks and Feast feature stores, contrasting the approaches to building a feature store.
This blog explores MLOps principles, with a focus on versioning, and provides a practical example using Hopsworks for both data and model versioning.
A tutorial of how to use our latest Bring Your Own Kafka (BYOK) capability in Hopsworks. It allows you to connect your existing Kafka clusters to your Hopsworks cluster.
We explain a new framework for ML systems as three independent ML pipelines: feature pipelines, training pipelines, and inference pipelines, creating a unified MLOps architecture.
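The three-pipeline decomposition described above can be sketched in plain Python. This is a minimal illustration, not the Hopsworks API: a dict stands in for the shared feature store, and all names and data are hypothetical.

```python
# A minimal sketch of the three-pipeline decomposition: feature, training,
# and inference pipelines communicating only through a shared store.
feature_store = {}  # stand-in for the shared feature store

def feature_pipeline(raw_rows):
    """Transform raw data into features and write them to the shared store."""
    features = [{"id": r["id"], "amount_usd": r["amount"] * r["fx_rate"]}
                for r in raw_rows]
    feature_store["transactions"] = features
    return features

def training_pipeline():
    """Read features from the store and 'train' a trivial threshold model."""
    features = feature_store["transactions"]
    mean_amount = sum(f["amount_usd"] for f in features) / len(features)
    return {"threshold": mean_amount}  # stand-in for a trained model

def inference_pipeline(model, feature_row):
    """Score one feature vector against the trained model."""
    return feature_row["amount_usd"] > model["threshold"]

raw = [{"id": 1, "amount": 10.0, "fx_rate": 1.0},
       {"id": 2, "amount": 50.0, "fx_rate": 1.0}]
feature_pipeline(raw)
model = training_pipeline()
print(inference_pipeline(model, {"amount_usd": 40.0}))  # True (40.0 > mean of 30.0)
```

Because each pipeline reads from and writes to the shared store rather than calling the others directly, the three can be developed, scheduled, and scaled independently.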
Unlock the power of Apache Airflow in the context of feature engineering. We will delve into building a feature pipeline using Airflow, focusing on two tasks: feature binning and aggregations.
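The two tasks mentioned above, binning and aggregation, can each be expressed as a small Pandas transformation; in Airflow, each would typically become its own task in the DAG. A minimal sketch with hypothetical column names:

```python
import pandas as pd

# Hypothetical transaction data; column names are illustrative.
df = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 2],
    "amount": [5.0, 120.0, 40.0, 300.0, 60.0],
})

# Task 1: feature binning -- bucket amounts into labelled ranges.
df["amount_bin"] = pd.cut(df["amount"],
                          bins=[0, 50, 200, float("inf")],
                          labels=["low", "mid", "high"])

# Task 2: aggregations -- per-customer summary features.
agg = df.groupby("customer_id")["amount"].agg(["sum", "mean", "count"])
print(agg.loc[2, "sum"])  # 400.0
```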
An ML model’s ability to learn and read data patterns largely depends on feature quality. With frameworks such as FeatureTools, ML practitioners can automate the feature engineering process.
Discover the power of feature stores in modern machine learning systems and how they bridge the gap between model development and production.
In this article, we outline how we leveraged ArrowFlight with DuckDB to build a new service that massively improves the performance of Python clients reading from lakehouse data in the Feature Store.
Find out how to use Flink to compute real-time features and make them available to online models within seconds using Hopsworks.
Explore the power of feature engineering for categorical features using Pandas. Learn essential techniques for handling categorical variables, and creating new features.
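Two of the most common Pandas techniques for categorical features are one-hot encoding and integer codes. A short sketch, with illustrative data:

```python
import pandas as pd

# Illustrative categorical data.
df = pd.DataFrame({"city": ["Stockholm", "London", "Stockholm", "Paris"]})

# One-hot encoding: one binary column per category.
one_hot = pd.get_dummies(df["city"], prefix="city")

# Integer encoding: map each category to a stable integer code.
df["city_code"] = df["city"].astype("category").cat.codes

print(sorted(one_hot.columns))  # ['city_London', 'city_Paris', 'city_Stockholm']
```

One-hot encoding suits low-cardinality columns and linear models; integer codes are more compact and work well with tree-based models.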
In this blog, we discuss the state-of-the-art in data management and machine learning pipelines (within the wider field of MLOps) and present the first open-source feature store, Hopsworks.
In this blog we present an end-to-end Git-based workflow to test and deploy feature engineering, model training and inference pipelines.
In this blog, we introduce the Hopsworks Connector API, which is used to mount a table in an external data source as an external feature group in Hopsworks.
Discover how you can easily make the journey from ML models to putting prediction services in production by choosing best-of-breed technologies.
Learn how the Hopsworks feature store APIs work and what it takes to go from a Pandas DataFrame to features used by models for both training and inference.
Hopsworks Serverless is the first serverless feature store for ML, allowing you to manage features and models seamlessly without worrying about scaling, configuration or management of servers.
Hopsworks is the first feature store to extend its support from the traditional Big Data platforms to the Pandas-sized data realm, where Python reigns supreme. A new Python API is also provided.
Hopsworks 3.0 is a new release focused on best-in-class Python support, Feature Views unifying the Offline and Online read APIs to the Feature Store, Great Expectations support, and model serving with KServe.
Operational machine learning requires the offline and online testing of both features and models. In this article, we show you how to design, build, and run tests for features.
Learn how to connect Hopsworks to Snowflake and create features and make them available both offline in Snowflake and online in Hopsworks.
Learn how to set up customized alerts in Hopsworks for different events that are triggered as part of the ingestion pipeline.
With support for Apache Hudi, the Hopsworks Feature Store offers lakehouse capabilities to improve automated feature pipelines and training pipelines (MLOps).
Read about how the Hopsworks Feature Store abstracts away the complexity of a dual database system, unifying feature access for online and batch applications.
Recently, one of Sweden’s largest banks trained generative adversarial neural networks (GANs) using NVIDIA GPUs as part of its fraud and money-laundering prevention strategy.
Since Redis is a popular open-source store used for feature serving, with capabilities similar to RonDB's, we compared the internals of RonDB's multithreading architecture with those of the commercial Redis products.
Learn how to design and ingest features, browse existing features, create training datasets as DataFrames or as files on Azure Blob storage.
Connect the Hopsworks Feature Store to Amazon Redshift to transform your data into features to train models and make predictions.
Use JOINs for feature reuse to save on infrastructure and the number of feature pipelines needed to maintain models in production.
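The reuse pattern above can be sketched with a simple join: two feature groups, each computed once by its own pipeline, are combined on a shared key to build a training view. A minimal Pandas illustration (the feature group names and columns are hypothetical):

```python
import pandas as pd

# Two reusable feature groups keyed on customer_id (names are illustrative).
profile_features = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "age": [34, 51, 27],
})
transaction_features = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "txn_count_7d": [4, 0, 9],
})

# One JOIN produces the training view; neither feature group is recomputed
# or duplicated per model.
training_df = profile_features.merge(transaction_features, on="customer_id")
print(list(training_df.columns))  # ['customer_id', 'age', 'txn_count_7d']
```

Each new model that needs these features adds only a join, not another feature pipeline.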
The feature store is a data warehouse of features for machine learning (ML). Architecturally, it differs from the traditional data warehouse in that it is a dual-database.
Integrate with third-party security standards and take advantage of our project-based multi-tenancy model to host data in one single shared cluster.
This blog introduces the feature store as a new element in automotive machine learning (ML) systems, and as a new data science tool and process for building and deploying better machine learning models.
Learn how to integrate Kubeflow with Hopsworks and take advantage of its Feature Store and scale-out deep learning capabilities.
We have many conversations with companies and organizations who are deciding between building their own feature store and buying one. We thought we would share our experience of building one.
Integrate AWS SageMaker with Hopsworks to manage, discover and use features for creating training datasets and for serving features to operational models.
This blog introduces the Hopsworks Feature Store for Databricks, and how it can accelerate and govern your model development and operations on Databricks.
Introducing the feature store which is a new data science tool for building and deploying better AI models in the gambling and casino business.
This blog introduces platforms and methods for continuous integration (CI), delivery (CD), and training (CT) with ML platforms, with details on how to do CI/CD MLOps with a Feature Store.
Deep learning is now the state-of-the-art technique for identifying financial transactions suspected of money laundering. It delivers fewer false positives with higher accuracy.