In the data augmentation session, why don't we apply data augmentation to the test set?

It was said in the lecture that we can’t apply data augmentation to the test set, even though the augmentation would stay the same if we wanted to change the model for different results.

There’s no point in augmenting a test set.

Augmentation is used to increase the number of possible training examples.
Possible augmentations (in the case of images) include cropping, flipping images vertically/horizontally, zooming in/out, and so on.

There’s no point in doing this for a test set, because we don’t train the model on it, so we don’t need more examples to increase how well the model generalizes to unseen data.
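To make the distinction concrete, here is a minimal sketch (pure Python, with a hypothetical 4x4 "image" and an illustrative `augment` helper of my own naming): augmentation produces a new, slightly different view of each training example, while the test set passes through unchanged.

```python
import random

# A hypothetical 4x4 grayscale "image" as nested lists (values are illustrative).
img = [[r * 4 + c for c in range(4)] for r in range(4)]

def augment(image, rng):
    """Random horizontal flip plus a random 3x3 crop -- training-time only."""
    out = image
    if rng.random() < 0.5:
        out = [row[::-1] for row in out]   # horizontal flip
    r = rng.randrange(len(out) - 2)        # top-left corner of a 3x3 crop
    c = rng.randrange(len(out[0]) - 2)
    return [row[c:c + 3] for row in out[r:r + 3]]

rng = random.Random(0)
train_view = augment(img, rng)  # a fresh, perturbed example for training
test_view = img                 # the test set is evaluated as-is
```

Each call to `augment` on the same image can yield a different crop/flip, which is exactly why it multiplies the effective training data; applying it at test time would just mean evaluating on images you won't see in the real world.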

But I think augmentation will challenge the model to do its best so it learns better, right?

What kind of learning are you expecting during the testing phase?
Everything’s already learned. No point in producing horizontally flipped cats to detect :stuck_out_tongue:

It does help for the training set, because the horizontally flipped cat introduces an example that was in theory never seen by the model. If it can learn that the side the cat looks at doesn’t matter, then the generalization is better.

Well, if we use data augmentation to produce more (harder) data, the loss may get higher, so the model learns more.

Test sets are a way to see how your model will perform on real-world data, which would not be augmented in real life. So it’s best not to augment our test sets, since we won’t be augmenting the real-world data we want our model to predict on.
