
Training-Inference Skew

What is training-inference skew?

Model-dependent transformations (such as feature scaling or encoding) are applied in both the training and inference pipelines. Training-inference skew occurs when a transformation is implemented (even slightly) differently in the training pipeline than in the inference pipeline. Because nothing fails outright, the skew can silently degrade model performance and is a hard bug to detect.
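The sketch below illustrates one common way skew creeps in, assuming a simple standardization step; the function and variable names are hypothetical rather than taken from any particular library. The inference-side code "looks the same" but recomputes statistics from the incoming batch instead of reusing the statistics saved from training, so the model receives differently scaled values for the same raw inputs.

```python
import numpy as np

def transform_for_training(features: np.ndarray):
    # Standardize using statistics computed over the full training set.
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / std, mean, std

def transform_for_inference(features: np.ndarray):
    # Skewed reimplementation: recomputes statistics from the request
    # batch instead of reusing the training-set statistics.
    return (features - features.mean(axis=0)) / features.std(axis=0)

train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
_, train_mean, train_std = transform_for_training(train)

batch = np.array([[2.0, 25.0], [3.0, 15.0]])
consistent = (batch - train_mean) / train_std   # what the model was trained on
skewed = transform_for_inference(batch)         # what the model actually receives

print(consistent)
print(skewed)  # different values for the same raw inputs
```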

Why is it important to watch for training-inference skew?

Training-inference skew arises when the data preprocessing or feature transformation steps differ between the training and inference pipelines. Such inconsistencies can degrade model performance and cause hard-to-detect issues in production. It is crucial to watch for training-inference skew for several reasons:

  1. Model performance: Discrepancies between training and inference pipelines can result in the model performing poorly when deployed, even if it performed well during training and validation.
  2. Debugging and troubleshooting: Training-inference skew can be challenging to identify and diagnose, as the issues often stem from subtle differences in the implementation of data preprocessing or feature transformations.
  3. Reproducibility: Ensuring that the same data preprocessing and feature transformation steps are used in both pipelines is essential for achieving reproducible results.
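A common mitigation is to fit the transformation once during training and serialize it alongside the model, so that training and serving share the exact same code path and parameters. The sketch below uses scikit-learn's StandardScaler and joblib as one illustrative option; the file name and overall layout are assumptions, not a prescribed setup.

```python
import joblib
import numpy as np
from sklearn.preprocessing import StandardScaler

# --- training pipeline ---
train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
scaler = StandardScaler().fit(train)          # fit on training data only
scaled_train = scaler.transform(train)        # used to train the model
joblib.dump(scaler, "feature_scaler.joblib")  # ship with the model artifact

# --- inference service ---
serving_scaler = joblib.load("feature_scaler.joblib")
request = np.array([[2.0, 25.0]])
model_input = serving_scaler.transform(request)  # identical transformation
```

Reusing one fitted, versioned artifact in both pipelines removes the opportunity for two implementations to drift apart, which also makes results easier to reproduce and debug.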