`len(val_ds.dataset)` and `len(val_ds)` return different things, and the same happens with the train split - I don't understand why.
If anyone can help, I couldn't figure it out from the PyTorch documentation.
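For context, here is a minimal sketch of what I mean (the `TensorDataset` and the 80/20 split sizes are just placeholders for my actual data):

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Hypothetical 100-sample dataset split 80/20, just for illustration
full_ds = TensorDataset(torch.randn(100, 3), torch.randint(0, 2, (100,)))
train_ds, val_ds = random_split(full_ds, [80, 20])

print(len(val_ds))          # 20  -> size of the validation split
print(len(val_ds.dataset))  # 100 -> size of the original, unsplit dataset
```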
From what I remember, the datasets made by `random_split` keep a reference to the original dataset, from which the examples are then chosen. So `train_ds` and `val_ds` have the same data inside (the `.dataset` field), but they return different examples when asked for a batch.
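Something like this sketch illustrates the idea (reusing the made-up dataset and split sizes from your example):

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Hypothetical 100-sample dataset split 80/20, just to illustrate
full_ds = TensorDataset(torch.randn(100, 3), torch.randint(0, 2, (100,)))
train_ds, val_ds = random_split(full_ds, [80, 20])

# Both splits are Subsets wrapping the *same* underlying dataset ...
print(train_ds.dataset is val_ds.dataset)   # True
# ... and they only differ in which indices they draw from
print(len(train_ds), len(val_ds))           # 80 20
print(len(val_ds.dataset))                  # 100

# Indexing a Subset maps through its index list into the shared dataset
x, y = val_ds[0]   # same item as full_ds[val_ds.indices[0]]
```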
I'd suggest looking at the source of this function, because this behavior surprised me as well: I had defined different transforms for the train and validation sets, but when I changed one transform, the other changed too, because both splits share the same underlying dataset (see the sketch below).
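For instance, something along these lines (using torchvision's `FakeData` as a stand-in dataset; the transforms are just examples):

```python
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Stand-in dataset; both splits point at this same object
full_ds = datasets.FakeData(size=100, transform=transforms.ToTensor())
train_ds, val_ds = random_split(full_ds, [80, 20])

# Changing the transform "for the train split" mutates the shared dataset,
# so the validation split silently picks up the augmentation too
train_ds.dataset.transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

print(val_ds.dataset.transform is train_ds.dataset.transform)  # True
```

One common workaround is to create two separate dataset instances with their own transforms and split them with the same indices (or the same generator), so the splits no longer share a single dataset object.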