What exactly do we mean by transfer learning?
I know that it means taking a pre-trained model with its learned parameters (weights and biases) and then adjusting hyperparameters such as the number of epochs and the learning rate.
But can we use it for a totally different dataset? If not, what is the benefit of using pre-trained models other than reducing the training time?
Can we call every CNN transfer learning? After all, a CNN turns pixel information into matrices (feature maps) while still preserving the spatial and pattern relationships, which seems like a strong form of transfer learning.
The idea behind transfer learning is that some features learned from one image dataset (edges, shapes, etc.) also show up in a different dataset.
By using pre-trained parameters, you avoid having to relearn the “general” parameters that detect such common features, and you train only the part needed for classification.
For example, a car has two circular objects: its wheels. A face also has such objects: its eyes. While this is a bit of a far-fetched example, you get the idea.
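To make the "shared features" point concrete, here is a minimal sketch in plain Python. It uses a fixed, hand-written vertical-edge filter as a stand-in for a learned low-level filter (an assumption for illustration, since real CNN filters are learned, and an edge is simpler than a circle). The two toy 5x5 "images" and all names (`KERNEL`, `filter_response`, `max_response`) are made up for this example.

```python
# A fixed 3x3 vertical-edge filter (Sobel-like). It stands in for a
# low-level filter that a CNN might learn; it is hand-written here.
KERNEL = [[-1, 0, 1],
          [-2, 0, 2],
          [-1, 0, 1]]


def filter_response(image, kernel):
    """Slide the kernel over the image (no padding) and return the response map."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j]
                for i in range(k) for j in range(k))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def max_response(image):
    """Strongest absolute filter response anywhere in the image."""
    return max(abs(v) for row in filter_response(image, KERNEL) for v in row)


# Toy image from "dataset A": dark on the left, bright on the right.
image_a = [[0, 0, 1, 1, 1] for _ in range(5)]
# Toy image from "dataset B": a bright vertical bar in the middle.
image_b = [[0, 0, 1, 0, 0] for _ in range(5)]
# A flat image with no edges at all.
flat = [[1, 1, 1, 1, 1] for _ in range(5)]

print(max_response(image_a))  # 4
print(max_response(image_b))  # 4
print(max_response(flat))     # 0
```

The same filter responds strongly to both toy images even though they look different, and stays silent on the flat one. That is the sense in which a low-level feature detector trained on one dataset can still be useful on another.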
You also get another nice bonus from transfer learning - you don’t have to train the whole model. The parts responsible for finding such features are already trained, so you can freeze them (“disable” them from training), which means less memory usage and faster training.
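A minimal sketch of "disabling" the already-trained part, assuming PyTorch. The tiny `backbone` below is only a stand-in for a real pre-trained network (its weights are random here, whereas in practice they would be loaded from a model trained on a large dataset). The two-class `head` and the random batch are also made up for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in for a pre-trained feature extractor (hypothetical: in practice
# these weights would come from a model trained on a large dataset).
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)

# Freeze the backbone: no gradients are computed or stored for it.
for param in backbone.parameters():
    param.requires_grad = False

# New classification head for the new task (2 classes, an assumption).
head = nn.Linear(8, 2)
model = nn.Sequential(backbone, head)

# Only the head's parameters are handed to the optimizer.
optimizer = torch.optim.SGD(head.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# A random batch standing in for images from the new dataset.
images = torch.randn(4, 3, 16, 16)
labels = torch.tensor([0, 1, 0, 1])

backbone_before = backbone[0].weight.clone()
head_before = head.weight.clone()

# One training step.
optimizer.zero_grad()
loss = loss_fn(model(images), labels)
loss.backward()
optimizer.step()

backbone_unchanged = torch.equal(backbone_before, backbone[0].weight)
head_changed = not torch.equal(head_before, head.weight)
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())

print(f"backbone unchanged: {backbone_unchanged}, head changed: {head_changed}")
print(f"trainable parameters: {trainable} of {total}")
```

After the step, the backbone weights are identical to what they were before and only the head has moved. In this toy model the head holds 18 of the 242 parameters. With a real pre-trained network the trainable share is typically far smaller, which is where the memory and speed savings come from.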