@aakashns I got the following error while working with the MNIST dataset using a ResNet:
RuntimeError: Function AddmmBackward returned an invalid gradient at index 1 - got [300, 512] but expected shape compatible with [300, 2048]
Any help would be highly appreciated. The file is available at the following link:
[sumera-rounaq381/resnet9-mnist - Jovian]
You are resizing the training dataset but leaving the validation set untouched. This means the model sees two different image sizes: 32x32 and 28x28 (probably, since native MNIST images are 28x28).
My suspicion (I'm not sure, because it failed in the backward pass, while it should have thrown an error earlier):
The linear layer in your classifier accepts 512 inputs.
This is OK for 28x28 inputs, since they get max-pooled down to 1x1x512.
For 32x32 inputs it's 2x2x512 → 2048 (sound familiar from the error?).
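The arithmetic behind that suspicion can be checked with a small sketch. It assumes a network whose convolutions are padded so they keep the spatial size, with four 2x2 max-pools in total before the flatten and 512 channels at the end. The pool count and the helper names are assumptions for illustration; the actual layers in the notebook are not shown in this thread.

```python
def pooled_size(size: int, kernel: int = 2) -> int:
    """Side length after a max-pool whose stride equals its kernel size.

    Uses floor division, which matches PyTorch's default (ceil_mode=False).
    """
    return size // kernel


def flattened_features(image_size: int, channels: int = 512, num_pools: int = 4) -> int:
    """Number of values per image that reach the linear layer after flattening.

    num_pools=4 is an assumed architecture, not taken from the notebook.
    """
    side = image_size
    for _ in range(num_pools):
        side = pooled_size(side)
    return channels * side * side


for size in (24, 28, 32):
    print(f"{size}x{size} input -> {flattened_features(size)} features")
```

Under these assumptions the smaller inputs end up with 512 features and the 32x32 inputs with 2048, which are the two numbers in the error message. Applying the same resize to both the training and validation sets makes every batch produce the same feature count.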
It would also be a good idea to convert the images to grayscale: you are using 3 channels, which wastes memory for this dataset.
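To put a rough number on the memory point, here is a minimal sketch that compares the size of one input batch with 3 channels and with 1 channel. It assumes float32 tensors, 32x32 images, and a batch size of 300 (guessed from the first dimension in the error message); `batch_bytes` is a hypothetical helper, not a library function.

```python
def batch_bytes(batch_size: int, channels: int, height: int, width: int,
                bytes_per_value: int = 4) -> int:
    """Bytes needed to store one batch of images as float32 values."""
    return batch_size * channels * height * width * bytes_per_value


# Assumed batch size of 300 and 32x32 images.
rgb = batch_bytes(300, 3, 32, 32)
gray = batch_bytes(300, 1, 32, 32)

print(f"3-channel batch: {rgb} bytes")
print(f"1-channel batch: {gray} bytes")
print(f"ratio: {rgb // gray}x")
```

The input batch is three times larger with 3 channels. In torchvision, `transforms.Grayscale(num_output_channels=1)` does the conversion; the first convolution in the model would then need `in_channels=1`.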
@Sebgolos Thanks a million for pointing out my mistake. I corrected it and my model reached 99% accuracy. Credit goes to you!