I've seen many problems where there are two separate files for the train and test datasets. The target is given in the train dataset but is not present in the test dataset. What should our approach be in this case, and how do we assign inputs and targets for the test data?
Not sure if I understand.
According to the topic title:
If you have separate files for training and testing, then you just don’t split the training dataset (because you already have something to test on).
If you're asking what to do when your test dataset doesn't have a target column, then this is probably part of some sort of competition, and the targets are hidden on purpose.
Your task is to predict them, and then they’re evaluated and you get a score representing how you compare to other competitors.
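As a rough sketch of what "predict them and hand them in" can look like: the snippet below pairs each test row's identifier with a predicted target and writes the result as CSV. The column names `id` and `target`, the ids, and the prediction values are all made up for illustration; a real competition defines its own submission format.

```python
import csv
import io

# Hypothetical test rows: an identifier per row, but no target column
test_ids = [101, 102, 103]
# Hypothetical model outputs for those rows, in the same order
predictions = [0, 1, 1]

buffer = io.StringIO()  # stands in for an open file such as open("submission.csv", "w")
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["id", "target"])             # assumed header, check the competition rules
writer.writerows(zip(test_ids, predictions))  # one line per test row
submission_text = buffer.getvalue()
```

The test file only ever supplies inputs here; the target column in the output comes entirely from the model.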
If you don’t want to participate in such a competition (for whatever reason), you can just use the training dataset (ignoring the one marked as test) and then split it to obtain a test dataset with targets present.
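A minimal sketch of that split, using only the standard library. The function name `split_rows`, the 20% fraction, and the toy data are assumptions for illustration; in practice a library helper such as scikit-learn's `train_test_split` does the same job.

```python
import random

def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and split off a held-out part that keeps its targets."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split reproducible
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical labelled data: (input, target) pairs from the training file
labelled = [(x, 2 * x + 1) for x in range(10)]
train_rows, heldout_rows = split_rows(labelled, test_fraction=0.2, seed=42)
```

Because every held-out row still carries its target, you can score the model on it yourself instead of relying on a hidden leaderboard.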
To add to what @Sebgolos said, testing and validation sets are different.
We use a validation set to check how well our model performs on unseen data, and whether it has overfitted to the training data.
The test data, on the other hand, is what we use to make the final predictions on new data, so it will not always have a target column. If it does, you use it to compare your model's predictions against the actual values. If it doesn't, it is probably your job to produce the target column for the test data with the model you created.
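To make the difference concrete, here is a small sketch with a toy 1-nearest-neighbour model. The data, the labels, and the helper `predict_1nn` are invented for illustration and stand in for whatever model and files you actually have.

```python
def predict_1nn(train_inputs, train_targets, query):
    """Return the target of the training input closest to query (1-nearest neighbour)."""
    best = min(range(len(train_inputs)), key=lambda i: abs(train_inputs[i] - query))
    return train_targets[best]

# Hypothetical labelled training data: inputs with known targets
train_inputs = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
train_targets = ["low", "low", "low", "high", "high", "high"]

# Validation set: taken from the labelled data, so its targets are known
val_inputs = [2.5, 8.5]
val_targets = ["low", "high"]

# Test data: inputs only, no target column
test_inputs = [1.5, 9.5, 4.0]

# Validation: compare predictions against the known targets to get a score
val_predictions = [predict_1nn(train_inputs, train_targets, x) for x in val_inputs]
val_accuracy = sum(p == t for p, t in zip(val_predictions, val_targets)) / len(val_targets)

# Test: there is nothing to compare against, the predictions ARE the target column
test_predictions = [predict_1nn(train_inputs, train_targets, x) for x in test_inputs]
```

So for the test data you only assign inputs; the targets are whatever the model outputs.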