When training an ML or DL model, it is crucial to have a separate validation data set for tuning hyperparameters and preventing overfitting. The validation data set should come from the original training data, which is divided into a new, smaller training data set and a validation data set.
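As a minimal sketch of this split, assuming scikit-learn and NumPy arrays standing in for the real training data (all variable names here are illustrative):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data in place of the original training set.
X_train_full = np.random.rand(1000, 20)
y_train_full = np.random.randint(0, 2, 1000)

# Hold out 20% of the training data as a validation set.
X_train, X_val, y_train, y_val = train_test_split(
    X_train_full, y_train_full, test_size=0.2, random_state=42
)
```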
A validation data set is a data set of examples used to tune the hyperparameters (i.e. the architecture) of a classifier. It is sometimes also called the development set or the "dev set"... In order to get more stable results and use all valuable data for training, a data set can be repeatedly split into several training and validation data sets. This is known as cross-validation. To confirm the model's performance, an additional test data set held out from cross-validation is normally used.
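A minimal sketch of k-fold cross-validation, again assuming scikit-learn; the choice of logistic regression is illustrative only:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Placeholder data in place of the original training set.
X_train_full = np.random.rand(1000, 20)
y_train_full = np.random.randint(0, 2, 1000)

model = LogisticRegression(max_iter=1000)

# Each of the 5 folds serves once as the validation set while the
# remaining 4 folds are used for training.
scores = cross_val_score(model, X_train_full, y_train_full, cv=5)
print(f"mean accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```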
The original test data set is used to evaluate the final model's performance on unseen data. It is kept separate and used only for the final evaluation, after all training and validation steps are completed. Using part of the test set for validation would mean the test set is no longer truly unseen, leading to potential data leakage and optimistically biased performance metrics.
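Putting the pieces together, a minimal sketch of the full workflow: the test set is split off first and never touched during tuning, cross-validation runs on the training portion only, and the test set is evaluated exactly once at the end (all names are illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Placeholder data in place of a real data set.
X = np.random.rand(1000, 20)
y = np.random.randint(0, 2, 1000)

# Split off the test set first; it plays no role in tuning.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LogisticRegression(max_iter=1000)
cv_scores = cross_val_score(model, X_train, y_train, cv=5)  # tuning signal

model.fit(X_train, y_train)             # retrain on all training data
test_acc = model.score(X_test, y_test)  # single final evaluation
print(f"CV accuracy: {cv_scores.mean():.3f}, test accuracy: {test_acc:.3f}")
```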