Feature skew occurs when the feature logic executed in an offline ML pipeline (the feature or training pipeline) differs significantly from the feature logic executed in the corresponding online inference pipeline.
Feature skew can arise in two ways:

- On-demand feature transformations are applied in both the feature pipeline and the online inference pipeline, and differing implementations of those transformations result in on-demand feature skew.
- Model-dependent transformations are implemented differently in the training pipeline and the online inference pipeline, resulting in training-inference skew.
In either case, feature skew silently degrades model performance and is difficult to discover. It shows up during online inference, where the model fails to generalize to new data because of the discrepancies in feature transformations.
Consider a feature transformation in which the raw data is scaled using the mean and standard deviation calculated from the training dataset. Suppose the following code snippets implement the transformation in the feature pipeline and in the online inference pipeline:
Feature pipeline:
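A minimal sketch in Python, assuming a pandas DataFrame with a single column named `feature`; the file paths and the column name are hypothetical:

```python
import numpy as np
import pandas as pd

# Read the training dataset (hypothetical path).
training_df = pd.read_csv("training_data.csv")

# Compute the scaling parameters from the training data.
mean = training_df["feature"].mean()
std = training_df["feature"].std()

# Standardize the feature using the training statistics.
training_df["feature_scaled"] = (training_df["feature"] - mean) / std

# Persist the statistics so the online inference pipeline can reuse them.
np.save("scaling_params.npy", np.array([mean, std]))
```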
Online inference pipeline:
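A matching sketch of the flawed inference-side transformation, under the same assumptions:

```python
import pandas as pd

# Read the feature values arriving at inference time (hypothetical path).
new_df = pd.read_csv("inference_requests.csv")

# BUG: the mean and standard deviation are recomputed from the new data
# instead of being loaded from the values saved by the feature pipeline.
mean = new_df["feature"].mean()
std = new_df["feature"].std()

new_df["feature_scaled"] = (new_df["feature"] - mean) / std
```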
In this example, the transformation in the online inference pipeline incorrectly calculates the mean and standard deviation from the new data instead of using the values computed from the training dataset. This discrepancy causes training-inference skew and can degrade the model's performance during inference.
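Continuing the sketch above, one way to remove the skew is to load the statistics persisted by the feature pipeline instead of recomputing them from the incoming data:

```python
import numpy as np
import pandas as pd

new_df = pd.read_csv("inference_requests.csv")  # hypothetical path

# Load the statistics computed and saved by the feature pipeline,
# so training and inference apply the identical transformation.
mean, std = np.load("scaling_params.npy")

new_df["feature_scaled"] = (new_df["feature"] - mean) / std
```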