PyTorch GPU Memory Error

I have a model that ran fine on Colab last night but stopped working today because of memory constraints.

I am getting this error:
RuntimeError: CUDA out of memory. Tried to allocate 496.00 MiB (GPU 0; 11.17 GiB total capacity; 10.44 GiB already allocated; 150.81 MiB free; 10.57 GiB reserved in total by PyTorch)

Even after I run “torch.cuda.empty_cache()”, I still get the same error message.
Doesn’t that method clean up the memory on my GPU? Shouldn’t it free this “10.57 GiB reserved”?

Also, “nvidia-smi” shows nothing running on the GPU right now.



The available VRAM probably fluctuates as more or fewer users use the service. Yesterday the limits were probably a bit higher.
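
To see what the current session actually gives you, it can help to compare the GPU's total capacity with what PyTorch has allocated and reserved. Below is a minimal sketch; the helper names gpu_memory_report and free_unused_cache are made up for illustration, and the CPU fallback that returns zeros is my own assumption so the snippet also runs without a GPU. As far as I know, torch.cuda.empty_cache() only returns cached blocks that no live tensor occupies, so memory counted as "allocated" stays in use. In the error above, 10.44 GiB of the 10.57 GiB reserved is allocated, which would leave only about 0.13 GiB for empty_cache() to release.

```python
import gc

import torch


def gpu_memory_report(device: int = 0) -> dict:
    """Return total, allocated and reserved memory (in MiB) for one CUDA device.

    Hypothetical helper; returns zeros when no GPU is visible.
    """
    if not torch.cuda.is_available():
        return {"total_mib": 0.0, "allocated_mib": 0.0, "reserved_mib": 0.0}
    mib = 1024 ** 2
    return {
        # Capacity of the GPU assigned to this session
        "total_mib": torch.cuda.get_device_properties(device).total_memory / mib,
        # Memory occupied by tensors that are still referenced
        "allocated_mib": torch.cuda.memory_allocated(device) / mib,
        # Allocated memory plus blocks cached by PyTorch's allocator
        "reserved_mib": torch.cuda.memory_reserved(device) / mib,
    }


def free_unused_cache() -> None:
    """Drop unreachable Python objects, then release unused cached GPU blocks."""
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()


print(gpu_memory_report())
free_unused_cache()
print(gpu_memory_report())
```

If "allocated" stays high after this, the memory is held by tensors that are still referenced (model, optimizer state, stored outputs), and deleting those references or restarting the runtime is what frees it.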

I think you could try restarting the runtime, but overall it’s better to think about how you can decrease memory consumption:

  • decrease complexity of the model
  • decrease batch size
  • decrease size of the input
  • use mixed precision (not that great, though; it didn’t help me as much as the other options)
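
As a rough sketch of two of the options above (smaller batch size and mixed precision), a training loop could look like this. The toy data, the tiny model and the batch size of 16 are placeholders I made up, not the original poster's setup; mixed precision is only switched on when a GPU is present, so the same code also runs on CPU.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
use_amp = device.type == "cuda"  # mixed precision only on GPU

# Hypothetical toy data: 64 RGB images of size 32x32, 10 classes
images = torch.randn(64, 3, 32, 32)
labels = torch.randint(0, 10, (64,))

# Smaller batches mean fewer feature maps kept in memory at once
loader = DataLoader(TensorDataset(images, labels), batch_size=16)

# Hypothetical small convolutional model
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
).to(device)

optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
# Loss scaling guards against float16 gradient underflow; no-op when disabled
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)

losses = []
for x, y in loader:
    x, y = x.to(device), y.to(device)
    optimizer.zero_grad(set_to_none=True)
    # Run the forward pass in reduced precision where it is safe to do so
    with torch.autocast(device_type=device.type, enabled=use_amp):
        loss = loss_fn(model(x), y)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    losses.append(loss.item())

print(losses)
```

Newer PyTorch releases may print a deprecation warning for torch.cuda.amp.GradScaler and suggest torch.amp.GradScaler instead; the loop itself stays the same.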

Oh OK, so you are saying the 10 GB already in use on the GPU is actually used by the model, rather than by old data from a previous run?

Probably :) (even more likely if you do a fresh run after restarting the runtime)
It depends on how big your images are. If you use convolutional layers, the parameters are only a small part of the memory needed to train the model; most of it is taken by the feature maps each layer produces in the forward pass, because they have to be kept for backprop. Halving the batch size usually means half as many intermediate results, so the model might start working again.
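
To put rough numbers on that, here is a back-of-the-envelope sketch for a single convolutional layer. The layer shape (3 input channels, 64 output channels, 3x3 kernel, 224x224 output, float32) is an assumption picked for illustration, not taken from the model in question, and the function names are made up.

```python
def conv_param_bytes(in_ch: int, out_ch: int, kernel: int, bytes_per_value: int = 4) -> int:
    """Bytes for the weights and biases of one 2D convolution."""
    return (in_ch * out_ch * kernel * kernel + out_ch) * bytes_per_value


def feature_map_bytes(batch: int, out_ch: int, height: int, width: int,
                      bytes_per_value: int = 4) -> int:
    """Bytes for the output feature maps of one layer for a whole batch."""
    return batch * out_ch * height * width * bytes_per_value


MIB = 1024 ** 2

params = conv_param_bytes(in_ch=3, out_ch=64, kernel=3)
acts_32 = feature_map_bytes(batch=32, out_ch=64, height=224, width=224)
acts_16 = feature_map_bytes(batch=16, out_ch=64, height=224, width=224)

print(f"parameters:              {params / 1024:.0f} KiB")   # 7 KiB
print(f"feature maps, batch 32:  {acts_32 / MIB:.0f} MiB")   # 392 MiB
print(f"feature maps, batch 16:  {acts_16 / MIB:.0f} MiB")   # 196 MiB
```

The parameters do not depend on the batch size at all, while the feature maps scale linearly with it, which is why halving the batch roughly halves this part of the memory use.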

I see, let me try playing around with it. Thanks, Sebastian.