A feature view is a selection of features (and labels) from one or more feature groups. You create a feature view by joining together features from existing feature groups and optionally performing any of the following steps: defining one or more of the selected features as labels, declaring a transformation (feature encoding) for one or more selected features, and returning only certain feature values by applying a user-supplied filter condition.
As a feature view can include model-dependent transformation functions for features, it can be said to be aware of each feature’s feature type. The feature view also knows about the entity_id (primary key) and event_time columns for each feature in the feature view.
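The join-and-label step described above can be sketched in plain Python. This is a toy illustration, not the Hopsworks API: lists of dictionaries stand in for feature groups, and the names `build_view`, `customer_id`, `age`, and `total_spend` are hypothetical.

```python
# Toy illustration only: lists of dicts stand in for feature groups.
# All column names below are hypothetical.
customer_information = [
    {"customer_id": 1, "age": 34},
    {"customer_id": 2, "age": 51},
    {"customer_id": 3, "age": 27},
]
purchase_history = [
    {"customer_id": 1, "total_spend": 120.0, "has_churned": 0},
    {"customer_id": 2, "total_spend": 15.5, "has_churned": 1},
]


def build_view(left, right, key, label):
    """Inner-join two feature groups on `key` and split the label from the features."""
    right_by_key = {row[key]: row for row in right}
    features, labels = [], []
    for row in left:
        match = right_by_key.get(row[key])
        if match is None:
            continue  # no matching entity in the right-hand group: drop the row
        joined = {**row, **match}
        labels.append(joined.pop(label))  # the label column is kept apart from the features
        features.append(joined)
    return features, labels


features, labels = build_view(
    customer_information, purchase_history, key="customer_id", label="has_churned"
)
```

Joining on the entity key is what lets features from different feature groups line up row by row, and separating the label at this point means downstream code never has to guess which column is the prediction target.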
Why are feature views important?
The feature view is a representation of the features (and labels) used by one or more models. As such, it is a model-specific view of features/labels in the feature store. The feature view:
provides a unified interface for features used by a model for training and inference;
prevents training/inference skew for model-dependent transformations by providing declarative support for specifying transformations, and executing model-dependent transformations when reading feature data for training and inference;
supports filters when creating training data and batch inference data, enabling support for specialized models that share the same set of features. For example, if you want to train a separate model for users in each geographic region (e.g., USA, Europe, Asia), you can create training data with a filter on the region, returning only the training data for users in that region.
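A minimal sketch of the last two points, assuming a hypothetical `ToyFeatureView` class written in plain Python (it is not part of Hopsworks). The scaling statistics are computed once, when training data is created, and the same statistics are reused at inference time, so both paths apply an identical transformation. The optional filter narrows the training data to one region.

```python
from statistics import mean, pstdev


class ToyFeatureView:
    """Hypothetical stand-in for a feature view; not the Hopsworks API."""

    def __init__(self, scaled_features):
        self.scaled_features = list(scaled_features)
        self._stats = {}  # feature name -> (mean, standard deviation)

    def training_data(self, rows, row_filter=None):
        # Keep only the rows that pass the optional filter.
        selected = [r for r in rows if row_filter is None or row_filter(r)]
        # Fit the scaling statistics on the filtered training rows only.
        for name in self.scaled_features:
            values = [r[name] for r in selected]
            self._stats[name] = (mean(values), pstdev(values) or 1.0)
        return [self._transform(r) for r in selected]

    def feature_vector(self, row):
        # Inference reuses the statistics fitted in training_data().
        return self._transform(row)

    def _transform(self, row):
        out = dict(row)
        for name, (mu, sigma) in self._stats.items():
            out[name] = (row[name] - mu) / sigma
        return out


rows = [
    {"region": "Europe", "total_spend": 10.0},
    {"region": "Europe", "total_spend": 30.0},
    {"region": "USA", "total_spend": 500.0},
]
view = ToyFeatureView(scaled_features=["total_spend"])
train = view.training_data(rows, row_filter=lambda r: r["region"] == "Europe")
online = view.feature_vector({"region": "Europe", "total_spend": 20.0})
```

Because the transformation lives in one place, there is no second, hand-written copy in the serving code that could drift away from what the model saw during training.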
Example of a feature view
Suppose you have two feature groups for an e-commerce platform: customer_information and purchase_history. You want to create a feature view for predicting customer churn that combines relevant features from both feature groups.
# connect to the feature store
import hopsworks
project = hopsworks.login()
fs = project.get_feature_store()
# get the feature group instances
fg1 = fs.get_or_create_feature_group(
    name="customer_information",
    version=1
)
fg2 = fs.get_or_create_feature_group(
    name="purchase_history",
    version=1
)
# select the features that will be used by the model
query = fg1.select_all().join(fg2.select_all())
# get the standard_scaler transformation function
standard_scaler = fs.get_transformation_function(name='standard_scaler')
# dictionary of "feature - transformation function" pairs
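# names of the numerical features to scale (hypothetical; replace with columns from your feature groups)
numerical_features = ['age', 'total_spend']
transformation_functions = {col_name: standard_scaler for col_name in numerical_features}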
feature_view = fs.create_feature_view(
    name='customer_churn',
    version=1,
    labels=['has_churned'],
    transformation_functions=transformation_functions,
    query=query
)