Model-dependent transformations are applied in both the training and inference pipelines. Training-inference skew is when there are (even slightly) different implementations of a transformation between the training and inference pipelines. Training-inference skew can silently and negatively affect model performance and is a hard bug to detect.
Training-inference skew is a discrepancy that arises when the data preprocessing or feature transformation steps differ between the training and inference pipelines. Such inconsistencies can lead to degraded model performance and hard-to-detect issues in real-world applications. It is crucial to watch for training-inference skew for several reasons: