As mentioned, when separate test data is available we can split the train-validation data 75-25 or 70-30. I am confused about how to split in this situation.
If you have a separate test data set available, divide the training set into training and validation sets, for example 75-25 or 70-30.
Keep the majority chunk for training and a small portion for validation.
Thank you for your quick reply. I got your point, but my question is: when we have separate test data, how do we assign the input and target columns?
We don't need to split the test data; we can directly assign the inputs and targets for the test set.
Only the training data is split, to form the train and validation sets.
from sklearn.model_selection import train_test_split

# Create training and validation sets
train_inputs, val_inputs, train_targets, val_targets = train_test_split(
    inputs_df[numeric_cols + encoded_cols], targets, test_size=0.25, random_state=42)

# For test set (if we don't have the target variable available)
test_input = test_df[numeric_cols + encoded_cols]

# For test set (if the target variable is available)
test_input = test_df[numeric_cols + encoded_cols]
test_target = test_df[target]
It's just for illustration, though. Hope it helps.
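If it helps to see the whole thing run end to end, here is a minimal sketch using made-up data. The DataFrames and the column names (`age`, `income`, `label`) are hypothetical and only stand in for your own files; the split itself is the same 75-25 `train_test_split` call as above.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy training data; every column name here is a hypothetical placeholder.
train_df = pd.DataFrame({
    "age": range(20, 40),
    "income": range(1000, 21000, 1000),
    "label": [0, 1] * 10,
})

# Toy test data without a target column (assumed here for illustration).
test_df = pd.DataFrame({
    "age": [25, 31, 44],
    "income": [3000, 7000, 12000],
})

input_cols = ["age", "income"]
target_col = "label"

# Only the training data is split: 75% train, 25% validation.
train_inputs, val_inputs, train_targets, val_targets = train_test_split(
    train_df[input_cols], train_df[target_col], test_size=0.25, random_state=42
)

# The test set is used as-is: select the same input columns,
# and pick up the target only if the file actually contains it.
test_inputs = test_df[input_cols]
test_targets = test_df[target_col] if target_col in test_df.columns else None

print(len(train_inputs), len(val_inputs), len(test_inputs))
```

The important part is that the test set uses exactly the same input columns as the training set, so any scaling or encoding fitted on the training data can be applied to it unchanged.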