← Back to Dictionary

Training-Inference Skew

What is training-inference skew?

Model-dependent transformations are applied in both the training and inference pipelines. Training-inference skew is when there are (even slightly) different implementations of a transformation between the training and inference pipelines. Training-inference skew can silently and negatively affect model performance and is a hard bug to detect.

Why is it important to watch for training-inference skew?

Training-inference skew is a discrepancy that arises when the data preprocessing or feature transformation steps differ between the training and inference pipelines. Such inconsistencies can lead to degraded model performance and hard-to-detect issues in real-world applications. It is crucial to watch for training-inference skew for several reasons:

  1. Model performance: Discrepancies between training and inference pipelines can result in the model performing poorly when deployed, even if it performed well during training and validation.
  2. Debugging and troubleshooting: Training-inference skew can be challenging to identify and diagnose, as the issues often stem from subtle differences in the implementation of data preprocessing or feature transformations.
  3. Reproducibility: Ensuring that the same data preprocessing and feature transformation steps are used in both pipelines is essential for achieving reproducible results.